Correction of uneven illumination in color microscopic image based on fully convolutional network

Jianhang Wang; Jianhang Wang; Xin Wang; Xin Wang; Ping Zhang; Shiling Xie; Shujun Fu; Yuliang Li; Hongbin Han; Hongbin Han

doi:10.1364/OE.433064

1. Introduction

Microscope is one of the most basic and widely used observation tools in medical research and clinical practice. The acquired images can display the fine structure or functional spatial distribution of optical or transformable non-optical information, which plays an important role for doctors or scientific researchers to understand imaged objects [1,2]. Unfortunately, mostly images acquired through a microscope have non-uniform illumination [3].

Many factors cause uneven illumination, including imperfect illumination of specimen, vignetting, misalignment of optical devices, dust and operator expertise, etc [4–8]. The inherent poor characteristics of this optical system have become one of the most common problems affecting microscopic imaging [9]. From a visual point of view, a microscopic image with uneven background illumination, usually displayed as a brighter area towards the center of the image, and a darker area towards its edges [1,10–13].

In practice, if the image is qualitatively analyzed (that is, viewed by an expert), the uneven distribution of light intensity can usually be tolerated. Conversely, when quantitative measurements are required, uneven lighting can mask real quantitative differences and endanger biological experiments [14,15]. For instance, if illumination correction is ignored, false detection and missed detection rate of the yeast cell image will increase by $35\%$ when measured by CellProfiler [16]. Other routine measurements will also be affected [3]. Besides, many other computer vision and pattern recognition tasks become more difficult due to inhomogeneity of illumination intensity, especially in the processing stage of microscopic image segmentation [17], target tracking [18], and mosaicking [19,20]. When several microscopic images are stitched together to increase the field of view, if the illumination is not corrected, the seams in the stitched area will become obvious, which can mislead vision and lead to poor medical diagnostics. Therefore, effectively overcoming the influence of uneven illumination is an indispensable link in medical microscopic image processing, and it is also a technical difficulty that has not been completely resolved in computer vision.

Various pretreatments of the extraction of illumination information from images through low-pass filtering methods are reported based on the assumption that the uneven illumination only originates from additive low-frequency signals [21,22]. Zheng et al. [23] and Lyu [24] tried to estimate the vignetting function of a single image through a strong prior to correct the uneven illumination of the image. The vignetting function is estimated based on multiple images [8,25], which makes the estimation result faster and more reliable. Morphological, polynomial fit, and Gaussian blurring methods are evaluated for correction of uneven illumination on pathological microscopy images [13]. These three classic algorithms are embedded in user-friendly and powerful open-source software [26–28]. Chernavskaia et al. [29] analyzed different flat-field correction methods and applied them for correcting mosaic artifacts in multimodal images. Lee and Kim [30] proposed a penalized nonlinear least squares function to correct the effect of non-uniform illumination on a bi-level image. A simple two-step method is proposed to estimate the flat-field distortion model of brightfield WSI [31]. As the two state-of-the-art methods in the field of microscopic image illumination correction, the background and shading correction (BaSiC) [32] and the regularized energy minimization (CIDRE) [3] have been tested on multiple datasets, and have achieved the best performance among all the compared methods. However, most of the above methods are designed to process monochromatic image. Color microscopic image’s channels can be separated, and each corrected separately. This not only adds to the workload and complexity of the illumination correction, but also sometimes leads to unexpected mistakes.

In recent years, the vigorous development of computer software and hardware technology has promoted the development of deep learning [33]. Much progress has been taken in applying deep convolutional neural networks (CNN) to color image lighting processing. For example, low-light image enhancement is one of the most active research directions in color image illumination processing, and many excellent CNN algorithms [34–40] have been proposed. Similarly, some CNN algorithms have also been reported in the processing of uneven illumination of color images, such as non-uniform illumination correction of underwater images [41], uneven illumination removal of simulated dermoscopy images [42], wavefront correction of aberrations in the fluorescence images [43], illumination correction of either one of over illuminated or under illuminated color images of paintings [44], etc. However, these CNN-based image lighting processing algorithms have the following characteristics:

• most of them focus on natural image processing,
• difficult to apply to new datasets,
• cause image distortion. For example, the simulated dermoscopy images have a serious color shift after being processed [42], and an obvious detail loss can be found in the corrected color images of paintings [44].

Motivated by the above discussions and also the vanilla U-Net structure [45], we propose a fully convolutional network (FCN) to directly process color microscopic images with uneven illumination. The whole process includes three modules to complete two tasks of original image illumination information prediction and correction. We also implement the proposed algorithm through parallelization, and verify that this network design idea has an exciting effect in reflecting the light distribution. In addition, a non-reference indicator that quantifies the uniformity of image illumination is proposed, which matches well with human visual perception. The proposed algorithm is applied to the WSI dataset [31] for preprocessing, which significantly improves the quality of image mosaicking. To the best of our knowledge, this work is the first to introduce deep learning into real medical microscopic images to correct uneven illumination. This network design idea is expected to inject new research vitality into this field.

The remainder of the paper is organized as follows. In Section 2., the proposed method is introduced in detail. Section 3 presents the experimental results. In Section 4., an application example is given. Section 5 discusses the parallel implementation and encoder-decoder architecture of proposed method. The conclusion is drawn in Section 6.

2. Methodology

The proposed algorithm consists of three major parts: feature encoder module, feature decoder module and detail supplement module. The first two modules are to predict the light distribution of microscopic images, and the third module is to fill in details. The outline of whole network structure is shown in Fig. 1.

Fig. 1. Architecture of proposed fully convolutional network method. Feature encoder module performs down-sampling through the same convolution with stride 2, and feature decoder module performs up-sampling through resize convolution. The corresponding feature maps of the two are directly added through skip connections and then aggregated to form illumination prediction. Original image and illumination information feature maps are concatenated and input into a detail supplement module to obtain the illumination correction results. In this process, overlapping residual blocks are designed to better transfer the illumination information.

Download Full Size | PDF

2.1 Feature encoder module

In the feature encoder module of the vanilla U-Net, each down-sampling includes two $3\times 3$ convolutions (valid convolutions), activated by the rectified linear unit (ReLU), and a $2\times 2$ max pooling layer with stride $2$. However, the structure has some drawbacks for uneven illumination imaging. For instance, it makes the vanilla U-Net lack global color information, which leads to the generated image colors inconsistent [36]. In the proposed method, we scale the original images to a specific size using bilinear interpolation at first. Then the approach of the vanilla U-Net is replaced by a $3\times 3$ convolution (same convolution) with stride $2$ and a ReLU. At each down-sampling step the feature map size will be reduced by half and no pooling operation is needed, which cuts the number of network layers and improves network efficiency.

After several times of the same down-sampling, the feature encoder module reaches the bottleneck layer. At this time, the receptive field of the bottleneck layer covers the whole input image, which can transmit the global illumination information. Since the encoder and decoder modules are to predict the approximate lighting distribution of the input image, the number of feature channels does not increase like the vanilla U-Net during the encoder stage.

2.2 Feature decoder module

In the feature decoder module, our main purpose is not to predict semantic information, but to estimate the illumination distribution of the entire image. After down-sampling by the feature encoder module, the illumination information passed in from the bottleneck layer has a global view. In order to transform low-resolution features into high-resolution features in up-sampling, vanilla U-Net employs deconvolution [46,47] (also called transposed convolution). Unfortunately, the high-resolution features reconstructed by deconvolution usually have "uneven overlap", resulting in high-frequency checkerboard artifacts or low-frequency artifacts in the later reconstructed image [48].

In the proposed method, we use the nearest-neighbor (NN) interpolation for up-sampling instead of regular deconvolution, and then perform convolutional layers (same convolution). This is a natural operation that can cause the artifact of different frequencies in the reconstructed image to disappear. The same convolution can also avoid the artifact from the image boundary.

As shown in Fig. 2 [49], deconvolution has a unique entry in each output window, and weight tying is implicit in resize-convolution to prevent the high-frequency artifact. NN-resize convolution (NN interpolation with same convolution) achieves the best results in preventing the image artifact. Most importantly, NN-resize convolution has better performance in maintaining the illumination information transmitted by the bottleneck layer, which is considered to be helpful for the prediction of the image illumination distribution.

Fig. 2. Images in different interpolation.

Download Full Size | PDF

In feature decoder module, another essential operation is to introduce skip connection from feature encoder module. Vanilla U-Net completes the fusion of the corresponding channel feature information in the encoder-decoder module by using the copy and crop operation. The crop operation makes the size of the corresponding feature maps in the encoder-decoder module consistent. The copy operation is the corresponding feature concatenation [50]. They enable the following convolution layer to freely choose between shallow and deep features, which is conducive to the prediction of pixel semantic information. Different from U-Net, the skip connection in the proposed method makes the feature maps in the feature decoder module and the feature maps (activated by ReLU) in the feature encoder module at the corresponding position add directly. By adding the local features of the encoder network, the decoder network is forced to predict more feature information than to predict specific semantic pixel values. This is beneficial to the characterization of the light distribution of the input image.

The last up-sampling resizes all feature maps to the input image size through bilinear interpolation. By aggregating all the output feature maps of the feature decoder module, the illumination estimation image of the input image is obtained.

2.3 Detail supplement module

Based on NN-resize convolutions and skip connections, the feature decoder module finally outputs feature maps of the same size as the original input through bilinear interpolation. These feature maps contain the illumination information, which can describe the illumination distribution of the original image. However, up-sampling by interpolation algorithms will inevitably lose the details of the input image. Obviously, the original input contains the most image detail information. Therefore, in order to merge the illumination information output by the feature decoder module and the detail information of the original image, we concatenate the two as input feature maps of the detail supplement module.

Afterwards, the final output image is reconstructed through five convolutional layers with kernel size of $3\times 3$. In this process, we perform two skip connections (also called residual connections [51]) to form overlapping residual blocks to ensure the transmission of illumination information.

As shown in Fig. 3, the residual connection here constructs a residual block through a natural identity mapping. It directly sums the previous feature x and the feature F(x) after two convolutions and then activates it through ReLU. The size and the number of channels of x and F(x) are the same. Some articles have conducted theoretical analysis and experimental demonstrations on the role of residual blocks [52–54].

Fig. 3. Residual block.

Download Full Size | PDF

In this paper, without special explanation, each convolution layer consists of the same convolution of stride $1$ and kernel size $3\times 3$, which is activated by ReLU.

2.4 Loss function

The proposed end-to-end FCN has two primary goals. The first and most basic thing is that the image after illumination correction must conform to human visual perception; Secondly, important details in the original image should be preserved as much as possible. There is no doubt that the loss function plays an important role in these two objectives. Therefore, a loss function that can measure the visual quality and the image detail should be investigated [55].

The Structural SIMilarity (SSIM) index is a perception-based metric [56]. It defines structure information (that is, the pixels that are close to each other in space are highly correlated) as an attribute independent of illumination and contrast information to reflect the structure of objects in the scene. Then, the image distortion is modeled as a combination of illumination, contrast, and structure.

The SSIM index is defined as follows:

(1)$$SSIM(x,y)=[l(x,y)]^\alpha\cdot[c(x,y)]^\beta\cdot[s(x,y)]^\gamma ,$$

where $x$ is the reconstructed image and $y$ is the ground truth image. $\alpha$, $\beta$, and $\gamma$ are the weight factors, all of which are greater than zero. $l(x,y)$, $c(x,y)$, and $s(x,y)$ are used to measure the illumination, contrast, and structure between two images, respectively, and are defined as follows:

(2)$$l(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^2+\mu_y^2+C_1},$$

(3)$$c(x,y)=\frac{2\delta_x\delta_y+C_2}{\delta_x^2+\delta_y^2+C_2},$$

(4)$$s(x,y)=\frac{\delta_{xy}+C_3}{\delta_x\delta_y+C_3},$$

where $C_1$, $C_2$, and $C_3$ are very small constants to avoid system errors caused when the corresponding denominator is $0$. In practical applications,

(5)$$C_3=\frac{1}{2}C_2,$$

when $\alpha$, $\beta$ and $\gamma$ are all taken as $1$, then Eq. (1) can be expressed as:

(6)$$SSIM(x,y)=\frac{2\mu_x\mu_y+C_1}{\mu_x^2+\mu_y^2+C_1}\cdot\frac{2\delta_{xy}+C_2}{\delta_x^2+\delta_y^2+C_2},$$

where $\mu _x$ and $\mu _y$, $\delta _x$ and $\delta _y$, and $\delta _{xy}$ are the means, standard deviations, and covariance between $x$ and $y$, respectively. In actual programming, they are calculated through a Gaussian function with a mean value of $0$ and a standard variance of $\sigma _g$ in exchange for higher efficiency. $C_1$, $C_2$ are calculated as follows:

(7) $$C_1=(K_1L)^2,$$

(8) $$C_2=(K_2L)^2,$$

where $L$ is the maximum value of the dynamic range of the image pixel value, generally $1$ or $255$ ($1$ in our experiments). $K_1$ and $K_2$ are positive constants much smaller than $1$.

The SSIM index is a scalar between $0$ and $1$. The larger the SSIM value, the smaller the structural difference between $x$ and $y$. Therefore, the structural similarity loss function used to optimize the network is defined as follows:

(9)$$L_{SSIM}(x,y)=1-SSIM(x,y).$$

$L_{SSIM}$ can retain high-frequency information (the edges and details of the image), making human visual perception more pleasant. However, $L_{SSIM}$ alone is not enough for the task of light equalization of microscopic images, and it is easy to cause the brightness or color of the reconstructed image to change compared with the ground truth image. Therefore, we introduce the $L_{l1}$ loss function to further optimize the brightness and color of the reconstructed image [57]. The $L_{l1}$ loss function is given as follows:

(10)$$L_{l1}(x,y)=||x-y||_1.$$

Formally, combined with Eq. (9) and Eq. (10), the final loss function in this paper is defined by:

(11)$$L_{loss}(x,y)=\varepsilon L_{SSIM}(x,y)+(1-\varepsilon)L_{l1}(x,y),$$

where $\varepsilon \in [0,1]$ is the weight to balance $L_{SSIM}$ and $L_{l1}$.

3. Experimental results

In this section, the proposed algorithm is compared with four algorithms to verify its effectiveness and superiority. These include two classic algorithms of morphological filtering (MF) [13] and polynomial fitting (PF) [28], and two state-of-the-art algorithms CIDRE [3] and BaSiC [32].

3.1 Dataset

We use a real microscopic cell image dataset, it is composed of images of female reproductive tract pathological cells (FRTPC). All $300$ cell images are obtained under the same conditions using a $10$-fold magnification, $250$ of which have been processed by experts as ground truth images. We use them for training together with the original images. The remaining $50$ images without any processing, will be used to test the trained model. The image size of the camera used in the experiments is $1600 \times 1200$ pixels, and $8$-bit RGB color images are generated. The collected cell images all contain uneven illumination distribution, which causes slight color distortion in their upper right corner.

3.2 Implementation details

Figure 1 shows the number of feature maps of each layer of the proposed network. Input the original image into the network and then resize it to $96\times 96$ through bilinear interpolation. Next, convolutions with a stride 2 are used for down-sampling. The size of feature maps is halved for each down-sampling, and the down-sampling of the last layer is scaled to $1\times 1$.

We train the model from scratch using Adam [58] and the initial learning rate is set at $10^{-4}$. The learning rate is multiplied by $0.9$ each after $100$ batch, and a batch size of $4$ is applied. The image pairs input to the network are randomly rotated or mirrored, which makes the image illumination more extensive and increases the applicability of the model. We set the $\varepsilon$ in Eq. (11) to be $0.4$ and train the network for $120$ epochs. We implement our framework with Tensorflow on an NVIDIA Tesla T$4$ GPU.

3.3 Qualitative results

We present visual comparisons on typical uneven illumination images of FRTPC dataset in Fig. 4. Fig. 4(a) is input five typical uneven illumination images, Fig. 4(b) is our algorithm’s prediction of the corresponding image illumination distribution. It can be seen from Fig. 4(c) that, visually, the five input images have not achieved light balance through the morphological filtering algorithm. The light intensity near the center of the five images is still stronger than the surroundings. Each image in Fig. 4(d) is the result of fitting by using a polynomial shading corrector in Fiji [28] to attempt to flatten the brightness of the entire image. But in the process, the overall color distribution of the image is distorted. For example, the position near the center of the fourth image and the lower left and upper right corners of the fifth image. Fig. 4(e)-(g) achieved better illumination correction on five input images.

Fig. 4. Visual comparison among competitors on FRTPC dataset. (b) is the illumination prediction of input images (a) by our method. MF method’s illumination correction failed in (c). PF method caused image color distortion in (d). In (e)-(f), CIDRE, BaSiC, and Ours realized the illumination correction of input images.

Download Full Size | PDF

Figure 5 shows the detailed comparison with CIDRE and BaSiC on two typical input images of Fig. 4. In the top row of Fig. 5, the enlarged area of the red box shows that all the three algorithms have, to some extent, corrected the color distortion area in the upper right corner of the original image, and the image details have also been well preserved. However, as can be seen from the top row of Fig. 5(b) and Fig. 5(c), artifacts composed of the image foreground colors appear in the image background processed by CIDRE and BaSiC, respectively, which reduces the image perception quality. From the enlarged area of the bottom row in Fig. 5, the bottom and right edges of Fig. 5(b) and Fig. 5(c) have brightness distortion, but the proposed algorithm handles the image edges very well, as shown in Fig. 5(d).

Fig. 5. Detail comparison with CIDRE and BaSiC on two typical test images. The edge of images processed by BaSiC and CIDRE has brightness distortion, and their background is mixed with the artifact of foreground color on the top row. Our method performs better in illumination correction and detail keeping.

Download Full Size | PDF

In FRTPC dataset experiment, the experimental results of the proposed illumination correction network are visually superior to the traditional methods, and also have sufficient competitiveness compared with the results of two state-of-the-art methods. Meanwhile, as shown in Fig. 4(b), the illumination information of the input image can be well described by the proposed method. Both CIDRE and BaSiC result images for comparison are already the best results obtained after estimation through the complete FRTPC dataset, and the R, G, and B channels are individually corrected through Matlab R2019a. The black areas on the edges of all images in Fig. 5 are extensions of the corresponding red box and are not part of the image itself.

3.4 Quantitative results

The subjective evaluation reflects the perception of an image by the human visual system. It is hard to find objective indicators consistent with personal evaluation [59]. Therefore, in different visual processing tasks, objective indicators are usually used to explain some important characteristics in the image. We evaluate the test image results of FRTPC dataset from three aspects of image information, distortion, and illumination uniformity by four non-reference evaluation indicators.

3.4.1 Assess the information amount of typical images

The information and details of the processed image should be preserved as much as possible. Entropy indicates how much information the image carries. It objectively evaluates the image from the perspective of information theory. Higher entropy value usually characterizes more details [60]. The image gray entropy is defined as follows:

(12)$$H(x) = \sum_{i=0}^{255}p_i log_2\frac{1}{p_i}={-}\sum_{i=0}^{255}p_ilog_2p_i,$$

(13)$$\sum_{i=0}^{255}p_i=1,$$

where $x$ is the gray image corresponding to the color image, $i$ is the gray level, and $p_i$ is the probability of the level $i$.

Table 1 demonstrates the calculation results of the information entropy of images in Fig. 4. The proposed method obtained the highest entropy value except for five original input images, which shows that ours can retain more information and details among competitors.

Table 1. Quantitative measurement results of entropy

View Table | View all tables in this article

3.4.2 Assess the distortion of typical images

There is visible color distortion in the upper right corner of each input image, for example, the image patch of the red box in the top row of Fig. 5(a). Consequently, there should be indicators to measure the degree of distortion of the image. But to our knowledge, there is no blind image quality assessment indicator that can specifically evaluate the distortion of medical images. We introduce two natural image evaluation indicators, the Natural Image Quality Evaluator (NIQE) [61] index, and the Spatial-Spectral Entropy-based Quality (SSEQ) [62] index to quantify the distortion of test images.

The NIQE index expresses the quality of the distorted image as a simple distance metric between the statistical features of the model and the distorted image, is formulated as:

(14)$$D(v_1,v_2,\Sigma_1,\Sigma_2)= \sqrt{(v_1-v_2)^T(\frac{\Sigma_1+\Sigma_2}{2})^{{-}1}(v_1-v_2)} ,$$

where $v_1,v_2$ and $\Sigma _1,\Sigma _2$ are the mean vectors and covariance matrices of the natural multivariate Gaussian (MVG) model and the distorted image’s MVG model [61]. Higher value represents a lower quality.

The SSEQ index matches well with human subjective viewpoints on image quality. It extracts a $12$-dimensional local entropy feature vector from the inputs and can assess the quality of distorted images across multiple distortion categories [62]. The SSEQ score typically has a value between $0$ and $100$ ($0$ represents the best quality, $100$ the worst).

As can be seen from Table 2 and Table 3, for the processing of five input images of Fig. 4, the proposed algorithm performs better than other four algorithms on these two indicators.

Table 2. Quantitative measurement results of NIQE

View Table | View all tables in this article

Table 3. Quantitative measurement results of SSEQ

View Table | View all tables in this article

3.4.3 Assess the illumination uniformity of typical images

Solving the uneven illumination of the original image is the purpose of the proposed algorithm. Fig. 4 and Fig. 5 demonstrate the superiority of our algorithm from the perspective of visual perception.

Inspired by the non-reference uneven illumination evaluation method for dermoscopy images [63], we take the Standard Deviation of Value (SDV) channel of the HSV(hue-saturation-value) color space as an objective indicator to assess the image illumination. In HSV color space, V represents the brightness of pixel points. The SDV score is defined as follows:

(15)$$SDV = \sqrt{\frac{\sum_{i=1}^{m}\sum_{j=1}^{n}(V(i,j)-\mu_v)^2}{m\times n-1}},$$

where $m,n$ represent the size of V channel of input image, $\mu _v$ represents its pixel mean, and $V(i,j)$ is the pixel value of the $i_{th}$ row and $j_{th}$ column. As shown in the bottom row of Fig. 6, it is capable of depicting the distribution of image illumination. The standard deviation reflects the degree of dispersion between the image pixel value and its pixel mean value. The smaller the standard deviation, the smaller the image contrast. Therefore, the smaller the SDV score, the more uniform the image illumination distribution.

Fig. 6. An example of V-channel image in HSV color space. In (f), the pixel brightness’s change of the V channel is smoother and there is no mutation compared with (a)-(e).

Download Full Size | PDF

Table 4 shows the SDV scores of corresponding images in Fig. 4. From Table 4, we can see that original images have the highest scores. The CIDRE and BaSiC have similar scores, and the SDV scores of MF are almost all the highest among compared algorithms. These match well with the visual effect presented in Fig. 4. It shows that SDV can accurately represent human visual perception. The proposed algorithm achieves the smallest SDV scores among competitors.

Table 4. Quantitative measurement results of SDV

View Table | View all tables in this article

3.4.4 Average scores of all test images of FRTPC dataset on these indicators

Table 5 lists the average scores of all test images of FRTPC dataset on four indicators. The proposed method achieves the best performance on all objective indicators.

Table 5. Average score of each indicator of all test images in FRTPC dataset

View Table | View all tables in this article

In summary, whether it is subjective visual perception or objective index scores, the proposed algorithm performs better than competitors.

4. Application on image mosaicking

Illumination correction of microscopic images is the basis of many computer vision tasks and pathological image analysis. In this section, we select two similar subsets of WSI dataset as the verification of the proposed algorithm in the application of image mosaicking.

4.1 WSI dataset

The WSI dataset is a public brightfield whole slide imaging dataset in $2020$. It contains many subsets, and each subset contains $100$ microscopic images. The image of WSI dataset produced shadow distortion due to uneven illumination and was acquired using a camera (CP$80$-$4$-C$500$, Optronis, Kehl, Germany) with a pixel size of $7\mu m$. They are $8$-bit RGB color images with a resolution of $2304 \times 1720$ pixels [31]. We select two similar subsets of them as our algorithm verification. We use the model trained in one of the subsets to predict the other subset, to obtain all the illumination-corrected images used for stitching.

Figure 7 presents the images of two WSI subsets and the results obtained by the proposed method. As shown in Fig. 7(b), our algorithm successfully depicts the light distribution of input images. The resulting images in Fig. 7(c) show a pleasant visual effect.

Fig. 7. Image tiles of two WSI subsets and results obtained by proposed method.

Download Full Size | PDF

4.2 Results of image mosaicking

Image mosaicking is to stitch two or more small field images into a large field image [64]. Its purpose is to enlarge the visual field of the image and obtain the complete information of the scanned sample. Well-stitched pathological images can facilitate the observation and diagnosis of pathologists and reduce their workload. However, if the illumination of images used for stitching is uneven, it will cause obvious seams in the stitching area.

Figure 8 exhibits two whole-slide images after the two WSI subsets are stitched separately. Fig. 9 is the results of separately stitching each image in the two subsets after illumination correction by our algorithm. Comparing Fig. 8 and Fig. 9, after processing by our method and then stitching, the visual effect of two whole-slide images are significantly improved, which can better assist the doctor in diagnosis.

Fig. 8. Two WSI subsets are stitched respectively.

Download Full Size | PDF

Fig. 9. Stitched results after the illumination correction of image tiles of two WSI subsets by proposed method respectively.

Download Full Size | PDF

In this section, the stitching tool used is the Grid stitching plugin in Fiji and all used images are processed the same way.

5. Discussion

The proposed method predicts the illumination information of the input image through the feature encoder module and the feature decoder module firstly. Moreover, the illumination information and original image features are combined through the detail supplement module to obtain an image with uniform illumination. In this section, we first implement the proposed algorithm in parallel to further verify the effectiveness of this network design idea for the treatment of uneven illumination of images. Then, we have a discussion between the proposed encoder-decoder structure and vanilla U-Net.

5.1 Parallel implementation (PI) of proposed method

The parallelized network structure is shown in Fig. 10. Fig. 11(b) presents the prediction of the image illumination distribution by the parallelized network, and Fig. 11(c) is its final outputs. Compared to Fig. 4(b), Fig. 11(b) depicts the illumination information in reverse. The feature encoder module and the feature decoder module can only see the places with strong light, and the places with weak light are filled by the detail supplement module. The loss function plays a vital role in this process. From the visual effect, the illumination in the upper right corner of the image in Fig. 11(c) is darker than that in Fig. 4(g). It reduces the perceived quality of the image.

Fig. 10. Parallelized network architecture of proposed method.

Download Full Size | PDF

Fig. 11. Visual outputs of proposed parallelized network.

Download Full Size | PDF

The third column of Table 6 shows the quantitative results of the parallelized network in test images of FRTPC dataset. Compared with the proposed network, the parallelized network achieves higher amount of image information and less average running time per test image. In other objective indicators, it performs worse than the proposed method.

Table 6. Average score by parallel implementation and different encoder-decoder structures in FRTPC dataset test images

View Table | View all tables in this article

Although the results of parallelized network do not surpass the proposed method in visual perception and objective indicators, it still has advantages compared with competitors. It further demonstrates the effectiveness and potential of this network architecture and design idea in dealing with uneven illumination of microscopic images.

The parameters of parallelized network are consistent with the network parameters in Fig. 1. Actually, the residual connection of the detail supplement module in Fig. 10 is not needed. This is just to correspond with Fig. 1.

5.2 Discussion of encoder-decoder architecture

Vanilla U-Net has achieved a great success in medical image segmentation, but it is not suitable to use it directly in the low-level vision tasks, due to the lack of global information [36]. Even in the framework of the proposed method, it is not acceptable to replace the feature encoder module and the feature decoder module with vanilla U-Net architecture, which will increase the huge amount of computation. In order to compare the proposed encoder-decoder structure with the traditional U-Net architecture, we design gradually in-depth comparative experiments, as shown in Table 7.

Table 7. Gradually replace the important operation in proposed encoder-decoder structure through the corresponding operation in vanilla U-Net

View Table | View all tables in this article

Where Deconv means regular deconvolution, Conv represents same convolution. Concat, SC, NN-Resize represent Concatenate, Skip Connection and NN-Resize Conv$\&$ReLU in Fig. 1, respectively. Before doing Max pooling, a convolution layer will be executed first. According to Table 7, we conducted comparative experiments. Except for the replaced operation, other parameters are consistent with Ours. The qualitative and quantitative results are shown in Fig. 12 and Table 6, respectively.

Fig. 12. Visual detail comparison with replaced encoder-decoder structures on two typical test images. Black artifacts can be seen in enlarged red boxes of (c), (d) and (f), and high-frequency checkerboard artifacts appear in their backgrounds. (b) shows better visual effect.

Download Full Size | PDF

From Fig. 12(c)-(e), we can find that there are high-frequency checkerboard artifacts in their background, and as the proposed encoder-decoder structure is gradually replaced, the checkerboard artifacts become more and more serious. Furthermore, we can see black artifacts in enlarged red boxes of Fig. 12(c)-(e). These two problems do not appear in Ours, Fig. 12(b) shows the best visual effect. As can be seen from Table 6, the proposed method is slightly faster than other replaced encoder-decoder structures in image processing speed. Meanwhile, it remains the best average scores on NIQE, SSEQ and SDV. Therefore, compared with replaced structures by vanilla U-Net, the proposed encoder-decoder structure achieves better results both qualitatively and quantitatively in deal with the uneven illumination of pathological image.

6. Conclusion

In this paper, we propose an effective FCN architecture to directly process color microscopic image with uneven illumination. Our method achieves the prediction of the illumination information distribution of input image and the corresponding correction of uneven illumination through three modules connected in series. Compared with the classic methods and the state-of-the-art methods, the proposed algorithm is very competitive in visual perception and objective evaluation. By preprocessing image tiles of WSI sequences though our method, the visual effects after image mosaicking are greatly improved. The parallel implementation of proposed method further proves the effectiveness of this network architecture in deal with the uneven illumination of microscopic image. Gradually in-depth comparative experiments show that the proposed encoder-decoder structure is better than vanilla U-Net architecture in the correction of uneven illumination of pathological image.

In the future, we will continue to explore the potential application of this network. Meanwhile, self-supervised learning to achieve uneven illumination correction of microscopic image will be an exciting research direction.

Funding

National Natural Science Foundation of China (11971269, 12071263, 61671276); Natural Science Foundation of Shandong Province (ZR2019MF045); National Science Fund for Distinguished Young Scholars (61625102).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the FRTPC dataset results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request. Data underlying the WSI dataset results presented in this paper are available in Ref. [31].

References

1. B. Likar, J. B. Maintz, M. A. Viergever, and F. Pernus, “Retrospective shading correction based on entropy minimization,” J. Microsc. 197(3), 285–295 (2000). [CrossRef]

2. H. K. Lee, M. S. Uddin, S. Sankaran, S. Hariharan, and S. Ahmed, “A field theoretical restoration method for images degraded by non-uniform light attenuation : an application for light microscopy,” Opt. Express 17(14), 11294–11308 (2009). [CrossRef]

3. K. Smith, Y. Li, F. Piccinini, G. Csucs, C. Balazs, A. Bevilacqua, and P. Horvath, “CIDRE: an illumination-correction method for optical microscopy,” Nat. Methods 12(5), 404–406 (2015). [CrossRef]

4. D. B. Goldman, “Vignette and exposure calibration and compensation,” IEEE Transactions on Pattern Analysis Mach. Intell. 32(12), 2276–2288 (2010). [CrossRef]

5. N. Dey, “Uneven illumination correction of digital images: A survey of the state-of-the-art,” Optik 183, 483–495 (2019). [CrossRef]

6. W. Pei, Y. Zhu, C. Liu, and Z. Xia, “Non-uniformity correction for slm microscopic images,” Image Vis. Comput. 27(6), 782–789 (2009). [CrossRef]

7. F. Al-Tam, A. dos Anjos, and H. R. Shahbazkia, “Iterative illumination correction with implicit regularization,” Signal, Image Video Process. 10(5), 967–974 (2016). [CrossRef]

8. F. Piccinini, E. Lucarelli, A. Gherardi, and A. Bevilacqua, “Multi-image based method to correct vignetting effect in light microscopy images,” J. Microsc. 248(1), 6–22 (2012). [CrossRef]

9. F. J. W.-M. Leong, M. Brady, and J. O. D. Mcgee, “Correction of uneven illumination (vignetting) in digital microscopy images,” J. Clin. Pathol. 56(8), 619–621 (2003). [CrossRef]

10. I. T. Young, “Shading correction: Compensation for illumination and sensor inhomogeneities,” Curr. protocols immunology 14 (2000).

11. D. Tomazevic, B. Likar, and F. Pernus, “Comparative evaluation of retrospective shading correction methods,” J. Microsc. 208(3), 212–223 (2002). [CrossRef]

12. G. D. Marty, “Blank-field correction for achieving a uniform white background in brightfield digital photomicrographs,” BioTechniques 42(6), 716–720 (2007). [CrossRef]

13. G. Babaloukas, N. Tentolouris, S. Liatis, A. Sklavounou, and D. Perrea, “Evaluation of three methods for retrospective correction of vignetting on medical microscopy images utilizing two open source software tools,” J. Microsc. 244(3), 320–324 (2011). [CrossRef]

14. S. Singh, M. Bray, T. Jones, and A. Carpenter, “Pipeline for illumination correction of images for high-throughput microscopy,” J. Microsc. 256(3), 231–236 (2014). [CrossRef]

15. F. Piccinini and A. Bevilacqua, “Colour vignetting correction for microscopy image mosaics used for quantitative analyses,” BioMed Res. Int. 2018, 1–15 (2018). [CrossRef]

16. A. E. Carpenter, T. R. Jones, M. R. Lamprecht, C. Clarke, I. H. Kang, O. Friman, D. A. Guertin, J. H. Chang, R. A. Lindquist, J. Moffat, P. Golland, and D. M. Sabatini, “CellProfiler: image analysis software for identifying and quantifying cell phenotypes,” Genome Biol. 7(10), R100–11 (2006). [CrossRef]

17. Y.-N. Sun, C. H. Lin, C. C. Kuo, C.-L. Ho, and C.-J. Lin, “Live cell tracking based on cellular state recognition from microscopic images,” J. Microsc. 235(1), 94–105 (2009). [CrossRef]

18. H. Xiao, Y. Li, J. Du, and A. Mosig, “Ct3d: tracking microglia motility in 3D using a novel cosegmentation approach,” Bioinformatics 27(4), 564–571 (2011). [CrossRef]

19. C. Sun, R. Beare, V. Hilsenstein, and P. T. Jackway, “Mosaicing of microscope images with global geometric and radiometric corrections,” J. Microsc. 224(2), 158–165 (2006). [CrossRef]

20. D. Gareau, Y. Patel, Y. Li, I. Aranda, A. Halpern, K. Nehal, and M. Rajadhyaksha, “Confocal mosaicing microscopy in skin excisions: a demonstration of rapid surgical pathology,” J. Microsc. 233(1), 149–159 (2009). [CrossRef]

21. K. Li, E. D. Miller, M. Chen, T. Kanade, L. E. Weiss, and P. G. Campbell, “Cell population tracking and lineage construction with spatiotemporal context,” Med. Image Anal. 12(5), 546–566 (2008). [CrossRef]

22. F. L. Gimeno, S. C. Gatto, J. O. Croxatto, J. I. Ferro, and J. E. Gallo, “In vivo lamellar keratoplasty using platelet-rich plasma as a bioadhesive,” Eye 24(2), 368–375 (2010). [CrossRef]

23. Y. Zheng, J. Yu, S. B. Kang, S. Lin, and C. Kambhamettu, “Single-image vignetting correction using radial gradient symmetry,” in IEEE Conference on Computer Vision and Pattern Recognition (2008), pp. 1–8.

24. S. Lyu, “Estimating vignetting function from a single image for image authentication,” in Proceedings of the 12th ACM Workshop on Multimedia and Security (2010), pp. 3–12.

25. V. Ljosa and A. E. Carpenter, “Introduction to the quantitative analysis of two-dimensional fluorescence microscopy images for cell-based screening,” PLoS Comput. Biol. 5(12), e1000603 (2009). [CrossRef]

26. T. J. Collins, “ImageJ for microscopy,” BioTechniques 43(1S), S25–S30 (2007). [CrossRef]

27. R. W. Solomon, “Free and open source software for the manipulation of digital images,” AJR, Am. J. Roentgenol. 192(6), W330–W334 (2009). [CrossRef]

28. J. Schindelin, I. Arganda-Carreras, E. Frise, V. Kaynig, M. Longair, T. Pietzsch, S. Preibisch, C. T. Rueden, S. Saalfeld, B. Schmid, J.-Y. Tinevez, D. J. White, V. Hartenstein, K. Eliceiri, P. Tomancak, and A. Cardona, “Fiji: an open-source platform for biological-image analysis,” Nat. Methods 9(7), 676–682 (2012). [CrossRef]

29. O. Chernavskaia, S. Guo, T. Meyer, N. Vogler, D. Akimov, S. Heuke, R. Heintzmann, T. Bocklitz, and J. Popp, “Correction of mosaicking artifacts in multimodal images caused by uneven illumination,” J. Chemom. 31(6), e2901 (2017). [CrossRef]

30. H. Lee and J. Kim, “Retrospective correction of nonuniform illumination on bi-level images,” Opt. Express 17(26), 23880–23893 (2009). [CrossRef]

31. Y.-O. Tak, A. Park, J. Choi, J. Eom, H.-S. Kwon, and J. B. Eom, “Simple shading correction method for brightfield whole slide imaging,” Sensors 20(11), 3084 (2020). [CrossRef]

32. T. Peng, K. Thorn, T. Schroeder, L. Wang, F. J. Theis, C. Marr, and N. Navab, “A BaSiC tool for background and shading correction of optical microscopy images,” Nat. Commun. 8(1), 14836 (2017). [CrossRef]

33. B. Lin, S. Fu, C. Zhang, F. Wang, and Y. Li, “Optical fringe patterns filtering based on multi-stage convolution neural network,” Opt. Lasers Eng. 126, 105853 (2020). [CrossRef]

34. K. G. Lore, A. Akintayo, and S. Sarkar, “LLNet: A deep autoencoder approach to natural low-light image enhancement,” Pattern Recognit. 61, 650–662 (2017). [CrossRef]

35. C. Zhang, K. Wang, Y. An, K. He, T. Tong, and J. Tian, “Improved generative adversarial networks using the total gradient loss for the resolution enhancement of fluorescence images,” Biomed. Opt. Express 10(9), 4742–4756 (2019). [CrossRef]

36. Z. Meng, R. Xu, and C. M. Ho, “GIA-Net: Global information aware network for low-light imaging,” ECCV Workshops 327–342 (2020).

37. C. Chen, Q. Chen, J. Xu, and V. Koltun, “Learning to see in the dark,” in IEEE Conference on Computer Vision and Pattern Recognition (2018), pp. 3291–3300.

38. W. Wang, C. Wei, W. Yang, and J. Liu, “GLADNet: Low-light enhancement network with global awareness,” in 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018) (2018), pp. 751–755.

39. Y. Zhang, J. Zhang, and X. Guo, “Kindling the Darkness: A practical low-light image enhancer,” in Proceedings of the 27th ACM International Conference on Multimedia (2019), pp. 1632–1640.

40. L. Hu, S. Hu, W. Gong, and K. Si, “Image enhancement for fluorescence microscopy based on deep learning with prior knowledge of aberration,” Opt. Lett. 46(9), 2055–2058 (2021). [CrossRef]

41. X. Cao, S. Rong, Y. Liu, T. Li, Q. Wang, and B. He, “NUICNet: Non-uniform illumination correction for underwater image using fully convolutional network,” IEEE Access 8, 109989–110002 (2020). [CrossRef]

42. X.-F. Mei, F.-Y. Xie, and Z.-G. Jiang, “Uneven illumination removal based on fully convolutional network for dermoscopy images,” in 13th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (2016), pp. 243–247.

43. I. Vishniakou and J. D. Seelig, “Wavefront correction for adaptive optics with reflected light and deep neural networks,” Opt. Express 28(10), 15459–15471 (2020). [CrossRef]

44. S. Goswami and S. K. Singh, “A simple deep learning based image illumination correction method for paintings,” Pattern Recognit. Lett. 138, 392–396 (2020). [CrossRef]

45. O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional networks for biomedical image segmentation,” in International Conference on Medical Image Computing and Computer-Assisted Intervention (2015), pp. 234–241.

46. V. Dumoulin and F. Visin, “A guide to convolution arithmetic for deep learning,” arXiv preprint arXiv:1603.07285 (2016).

47. W. Shi, J. Caballero, L. Theis, F. Huszar, A. P. Aitken, C. Ledig, and Z. Wang, “Is the deconvolution layer the same as a convolutional layer,” (2016).

48. X. Gu, J. Liu, X. Zou, and P. Kuang, “Using checkerboard rendering and deconvolution to eliminate checkerboard artifacts in images generated by neural networks,” in 14th International Computer Conference on Wavelet Active Media Technology and Information Processing (ICCWAMTIP) (2017), pp. 197–200.

49. A. Odena, V. Dumoulin, and C. Olah, “Deconvolution and checkerboard artifacts, ’ http://distill.pub/2016/deconv-checkerboard,”.

50. C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2015), pp. 1–9.

51. K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in IEEE Conference on Computer Vision and Pattern Recognition (CVPR) (2016), pp. 770–778.

52. K. He, X. Zhang, S. Ren, and J. Sun, “Identity mappings in deep residual networks,” in European Conference on Computer Vision (2016), pp. 630–645.

53. A. Veit, M. Wilber, and S. Belongie, “Residual networks behave like ensembles of relatively shallow networks,” in NIPS Proceedings of the 30th International Conference on Neural Information Processing Systems, vol. 29 (2016), pp. 550–558.

54. D. Balduzzi, M. Frean, L. Leary, J. P. Lewis, K. W.-D. Ma, and B. McWilliams, “The shattered gradients problem: if resnets are the answer, then what is the question,” in ICML Proceedings of the 34th International Conference on Machine Learning, vol. 70 (2017), pp. 342–350.

55. H. Chen, S. Fu, H. Wang, H. Lv, C. Zhang, F. Wang, and Y. Li, “Feature-oriented singular value shrinkage for optical coherence tomography image,” Opt. Lasers Eng. 114, 111–120 (2019). [CrossRef]

56. Z. Wang, A. Bovik, H. Sheikh, and E. Simoncelli, “Image quality assessment: from error visibility to structural similarity,” IEEE Transactions on Image Process. 13(4), 600–612 (2004). [CrossRef]

57. H. Zhao, O. Gallo, I. Frosio, and J. Kautz, “Loss functions for image restoration with neural networks,” IEEE Transactions on Comput. Imaging 3(1), 47–57 (2017). [CrossRef]

58. D. P. Kingma and J. L. Ba, “Adam: A method for stochastic optimization,” in International Conference on Learning Representations (2015).

59. S. Wang, J. Zheng, H.-M. Hu, and B. Li, “Naturalness preserved enhancement algorithm for non-uniform illumination images,” IEEE Transactions on Image Process. 22(9), 3538–3548 (2013). [CrossRef]

60. Z. Ye, H. Mohamadian, and Y. Ye, “Discrete entropy and relative entropy study on nonlinear clustering of underwater and arial images,” in IEEE International Conference on Control Applications (2007), pp. 313–318.

61. A. Mittal, R. Soundararajan, and A. C. Bovik, “Making a completely blind image quality analyzer,” IEEE Signal Process. Lett. 20(3), 209–212 (2013). [CrossRef]

62. L. Liu, B. Liu, H. Huang, and A. C. Bovik, “No-reference image quality assessment based on spatial and spectral entropies,” Signal Process. Commun. 29(8), 856–863 (2014). [CrossRef]

63. Y. Lu, F. Xie, Y. Wu, Z. Jiang, and R. Meng, “No reference uneven illumination assessment for dermoscopy images,” IEEE Signal Process. Lett. 22(5), 534–538 (2015). [CrossRef]

64. R. Szeliski, “Image alignment and stitching: A tutorial,” Foundations Trends Comput. Graph. Vis. 2(1), 1–104 (2007). [CrossRef]

Images	Input	MF	PF	CIDRE	BaSiC	Ours
i	6.9898	6.5969	6.4893	6.4459	6.4658	6.6623
ii	6.3061	4.8317	4.3885	4.2286	4.3074	5.0411
iii	6.6586	5.9706	5.8010	5.7183	5.7505	6.1527
iv	6.9434	6.5632	6.5436	6.4386	6.4590	6.6308
v	7.1425	6.7945	6.7609	6.7154	6.7332	6.8751

Images	Input	MF	PF	CIDRE	BaSiC	Ours
i	4.6171	4.6949	4.6040	4.5967	4.5997	4.5706
ii	6.4695	5.9587	5.9487	5.8353	5.8445	5.6839
iii	5.4235	5.2366	5.1471	5.1080	5.1523	5.0434
iv	4.4422	4.5044	4.4114	4.4072	4.4065	4.5623
v	5.1091	5.0732	5.0157	4.9985	5.0105	4.7416

Images	Input	MF	PF	CIDRE	BaSiC	Ours
i	31.9332	35.6416	36.1279	37.2845	35.9402	32.6816
ii	53.3642	49.1579	51.0982	51.0336	51.5610	47.3677
iii	35.4995	31.3112	31.1288	31.0675	30.9224	30.3380
iv	29.3889	29.4557	29.5667	29.9134	30.4084	28.9607
v	44.4081	44.2152	44.2851	44.0795	44.1288	40.9653

Images	Input	MF	PF	CIDRE	BaSiC	Ours
i	0.1033	0.0626	0.0589	0.0579	0.0574	0.0492
ii	0.0919	0.0359	0.0362	0.0354	0.0353	0.0245
iii	0.0932	0.0444	0.0429	0.0424	0.0421	0.0330
iv	0.0976	0.0559	0.0513	0.0504	0.0498	0.0428
v	0.0984	0.0514	0.0484	0.0477	0.0472	0.0405

Indicators	Input	MF	PF	CIDRE	BaSiC	Ours
Entropy	6.7469	5.9431	5.7650	5.6741	5.7166	6.0690
NIQE	5.3284	5.1558	5.1207	5.0876	5.0909	4.9238
SSEQ	39.6771	38.2165	38.7818	38.8223	39.0060	36.0620
SDV	0.0974	0.0488	0.0480	0.0474	0.0471	0.0382

Correction of uneven illumination in color microscopic image based on fully convolutional network

Abstract

1. Introduction

2. Methodology

2.1 Feature encoder module

2.2 Feature decoder module

2.3 Detail supplement module

2.4 Loss function

3. Experimental results

3.1 Dataset

3.2 Implementation details

3.3 Qualitative results

3.4 Quantitative results

3.4.1 Assess the information amount of typical images

3.4.2 Assess the distortion of typical images

3.4.3 Assess the illumination uniformity of typical images

3.4.4 Average scores of all test images of FRTPC dataset on these indicators

4. Application on image mosaicking

4.1 WSI dataset

4.2 Results of image mosaicking

5. Discussion

5.1 Parallel implementation (PI) of proposed method

5.2 Discussion of encoder-decoder architecture

6. Conclusion

Funding

Disclosures

Data availability

References

Data availability

Cited By

Figures (12)

Tables (7)

Equations (15)

Optics Express

Indicators	Ours	OursPI	OursDeconv	OursDeconvPool	OursU-net
Entropy	6.0690	6.1981	6.0493	6.0825	6.1053
NIQE	4.9238	4.9949	4.9504	4.9957	5.0085
SSEQ	36.0620	36.7793	37.8691	37.1735	36.4496
SDV	0.0382	0.0441	0.0398	0.0424	0.0422
Time	0.6429 s	0.6004 s	0.6613 s	0.6626 s	0.6436 s

	Up-sampling		Down-sampling		Feature merge
	Deconv	NN-Resize	Max pooling	Stride=2, Conv	Concat	SC
Vanilla U-Net	✓		✓		✓
Ours		✓		✓		✓
OursDeconv	✓			✓		✓
OursDeconvPool	✓		✓			✓
OursU-net	✓		✓		✓