
Retina inspired tone mapping method for high dynamic range images


Abstract

The limited dynamic range of regular screens restricts the display of high dynamic range (HDR) images. Inspired by retinal processing mechanisms, we propose a tone mapping method to address this problem. In the retina, horizontal cells (HCs) adaptively adjust their receptive field (RF) size based on the local stimuli to regulate the visual signals absorbed by photoreceptors. Using this adaptive mechanism, the proposed method compresses the dynamic range locally in different regions and avoids halo artifacts around edges of high luminance contrast. Moreover, the proposed method introduces the center-surround antagonistic RF structure of bipolar cells (BCs) to enhance local contrast and details. Extensive experiments show that the proposed method performs robustly on a wide variety of images, providing competitive results against state-of-the-art methods in terms of visual inspection, objective metrics, and observer scores.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

In natural scenes, the dynamic range of illumination may exceed 5 logarithmic units. Modern technologies enable us to capture this high dynamic range (HDR) with full precision and to preserve all relevant information in HDR images, which are much more precise than ordinary images [1]. Unfortunately, the dynamic range of regular screens, which usually use one byte per color channel per pixel to display images, is less than 2 log units. This dynamic range gap between the real world and regular screens prevents full use of the advantages of HDR images. As shown in Fig. 1(a)-(d), a shorter exposure makes the tree branches visible under the sunlight but leaves the shack overly dark. Increasing the exposure, on the other hand, reveals details far from the light source but saturates the regions near the sun. To make all parts of an HDR image simultaneously visible on a regular screen, the dynamic range of the HDR image needs to be adjusted to match the screen. As shown in Fig. 1(e), after adjustment by the proposed method, both the shack interior, which lacks illumination, and the treetop, which is under full illumination, are clearly visible. This adjustment, which aims to perceptually reproduce real scenes on displays of limited dynamic range, is usually called Tone Mapping (TM) [2].

Fig. 1. An HDR image displayed before and after TM. (a)-(d) The same HDR scene displayed directly by scaling its intensity with four different factors; (e) the same image displayed after tone mapping by our method.

Nowadays, thanks to various TM methods, the quality with which HDR images are represented on regular screens has been significantly improved [1]. However, existing TM methods still have drawbacks, which can be briefly summarized as follows. First, halo artifacts around high-contrast edges are a common problem. Second, in the results of some TM methods, especially the layer decomposition-based ones, small textural details may be lost. Third, it is hard to maintain stable performance across various scenes; for example, local TM methods work better on segment matching but worse on scene reproduction than global TM methods [2]. Fourth, some TM methods may cause unexpected changes in color appearance due to the distortion of contrast relationships.

To overcome these flaws, the proposed method attempts to mimic the retinal processing mechanisms that convey vast amounts of visual information with limited resources. According to recent physiological findings, the retina not only performs phototransduction but also executes a diverse set of visual tasks through the ingenious cooperation of its various neurons [3]. For example, horizontal cells (HCs) adaptively adjust their receptive field (RF) size based on the local brightness to regulate the signals, and bipolar cells (BCs) receive opposed signals from the RF center and surround to extract details from the background. Benefiting from these mechanisms, the proposed method enhances the visibility of regions under various lighting conditions, avoids halo artifacts, improves textural details, and provides robust performance for diverse HDR scenes. We believe that the proposed method establishes a link between the retinal mechanisms and existing TM methods. With this connection, apart from proposing a novel TM method for computer vision, this work also provides new computational insights into information compression mechanisms in biological vision.

This paper is based partly on our previous work presented at a conference [4] and is substantially refined and extended here as follows: 1) For greater physiological plausibility, the refined method takes into account the non-selective HCs ignored in the conference paper. 2) Several parameters that were set manually in the conference paper are now determined automatically from features of the input image. 3) The conference version enhanced color excessively and produced an oversaturated color appearance, whereas the refined method achieves higher scores on three different objective metrics. 4) The comparison in the conference paper was made on a limited set of scenes against classical TM methods proposed before 2010; we now compare the refined method against state-of-the-art methods, most of which were proposed between 2015 and 2019, on a relatively large dataset.

2. Related work

Early TM methods inspired by photographic practices [5] evolved into fusion-based approaches [6], while histogram-based methods developed in several directions, such as difference representation [7], clustering [8], and gamma adaptation [9]. To avoid color appearance changes caused by luminance compression, Wang et al. introduced a color correction method [10]. Some TM methods try to compress the dynamic range of the illumination only. Raman et al. [11] and Paris et al. [12] made use of the bilateral filter to preserve edges. Choudhury and Medioni used a hierarchical Laplacian pyramid framework to enhance sharpness [13]. Eilertsen et al. introduced noise-aware tone curves to reduce the visibility of noise [14]. Fu et al. proposed a probabilistic method and extended it to a weighted variational version [15,16]. Starting from the intrinsic constraints of reflectance and illumination, Yue et al. proposed an image decomposition prior [17]. Liang et al. compressed the dynamic range directly in the gradient field to eliminate contrast reversals [18]. In addition, because they handle illumination in a similar way, some low-light image enhancement techniques, such as LIME [19], can also be used as TM methods.

Along another line, several TM methods are inspired by the biological visual system. Many models are built on the well-known Retinex theory of human vision [20–23]. Laparra et al. derived a perceptual dissimilarity measure from the human visual system for evaluating TM [24]. Meylan et al. [25] and Tamburrino et al. [26] imitated the adaptation of photoreceptors with local operators, while Kim et al. proposed a consistency method based on global adaptation [27]. Alleysson et al. [28] and van Hateren [29] took HCs into account. Benoit et al. modeled the parvocellular pathway from the retina to the V1 cortex [30]. Banterle et al. segmented HDR images based on psychophysical experiments when performing dynamic range compression [31].

Recently, some deep learning-based TM methods have been reported [32,33]. These methods can be considered an imitation of the connectivity patterns of visual neurons. Obtaining enough training data is the main challenge faced by these methods; in addition, substantial computational resources are required during the training stage.

3. The proposed method

3.1 General description

When light enters the eye, it is converted into neural signals by retinal photoreceptors. These signals are then transmitted through bipolar cells (BCs) and other neurons to ganglion cells, under the regulation of horizontal cells (HCs) and amacrine cells, and are finally projected to the brain by the ganglion cells. The significant decrease in the number of neurons from photoreceptors to BCs implies that, in the retina, dynamic range compression of the visual signals occurs before the ganglion cell layer [35]. Therefore, we currently do not take amacrine cells and ganglion cells into account. The flowchart of our method is shown in Fig. 2. Basically, the vertical pathway from photoreceptors to BCs eliminates redundant information and enhances textural details, while the lateral pathway from the laterally extended HCs back to photoreceptors provides a locally adaptive gain adjustment. Although it is still a greatly simplified framework compared with the biological retina, the proposed method emulates the key points of the retinal information compression mechanisms and renders HDR images very well.

Fig. 2. The flowchart of the proposed method. A schematic of the retina [34] is placed on the left side. The input HDR image, the input of the BCs, and the final output are shown to the right of the corresponding levels.

3.2 Lateral pathway

3.2.1 Adaptive modulation from horizontal cells

In the retina, there are different types of HCs. HCs that contact photoreceptors with some degree of selectivity [36] are called selective HCs, while those that contact photoreceptors indiscriminately [37] are called non-selective HCs. We take both types into account. Denoting the input image as $f_{c}(x,y),c\in \{R,G,B\}$, the selective HC signals as $HCin_{c}(x,y),c\in \{R,G,B\}$, and the non-selective HC signal as $HCin_{L}(x,y)$, the HCs collect the photoreceptor signals according to

$$\begin{aligned}HCin_{R}&=R_{in}=f_R(x,y) \\ HCin_{G}&=G_{in}=f_G(x,y) \\ HCin_{B}&=B_{in}=f_B(x,y) \\ HCin_{L}&=L_{in}=\frac{f_R(x,y)+f_G(x,y)+f_B(x,y)}{3} \end{aligned}$$
and the HC feedback that will modulate the photoreceptor signals is computed as
$$\begin{aligned}HCbk_{c}(x,y)=&\log(L_{in})\times(HCin_{L}\ast G_{\sigma_L(x,y)})(x,y) \\ &+(1-\log(L_{in}))\times(HCin_{c}\ast G_{\sigma_c(x,y)})(x,y) \end{aligned}$$
where $c\in \{R,G,B\}$, $\ast$ is the convolution operator, and $G_{\sigma_c(x,y)}(x,y)$ is a 2D Gaussian filter written as
$$G_{\sigma_c(x,y)}(x,y)=\frac{1}{2\pi\sigma_{c}^{2}(x,y)}\exp \left( \frac{-(x^{2}+y^{2})}{2\sigma_{c}^{2}(x,y)} \right)$$
where $\sigma _{c}(x,y)$ is the standard deviation of the receptive field which takes shape through local gap junctional coupling.

The receptive field (RF) of HCs is modulated by light [38]. Under dim starlight or bright daylight conditions, a low conductance of the gap junctions results in a small RF size, while under twilight conditions, a high conductance leads to a large RF size [39]. For simplicity, we use a piecewise linear function with four subdomains to imitate this light-induced coupling of HCs, which is described as

$$\sigma_n(x,y) = \begin{cases} \frac{sigma}{5} ~~ & \frac{3s}{20}\!<\!\left|HCin_{n}(x,y)\!-\!m\right|\\ \frac{2sigma}{5} & \frac{2s}{20}\!<\!\left|HCin_{n}(x,y)\!-\!m\right|\!\leq\!\frac{3s}{20}\\ \frac{3sigma}{5} & \frac{s}{20}\!<\!\left|HCin_{n}(x,y)\!-\!m\right|\!\leq\!\frac{2s}{20}\\ sigma & \left| HCin_{n}(x,y)\!-\!m\right| \!\leq\!\frac{s}{20}\end{cases}$$
where $n\in \{R,G,B,L\}$, $m$ is the average intensity over all pixels, and $s$ is their standard deviation. $sigma$, corresponding to the strongest coupling, is set to 1.0 according to physiological recordings [38].
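
For concreteness, a minimal Python/NumPy sketch of Eqs. (1)-(4) is given below. It assumes a floating-point RGB HDR image; the function names and the realization of the spatially varying Gaussian of Eq. (2) as a per-pixel selection among four fixed-sigma blurs are our own illustrative choices, not part of the published implementation.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def sigma_map(hc_in, m, s, sigma_max=1.0):
    """Eq. (4): quantize the RF size into four levels from the deviation |HCin - m|."""
    d = np.abs(hc_in - m)
    sig = np.full_like(hc_in, sigma_max)        # |HCin - m| <= s/20           -> sigma
    sig[d > s / 20] = 3 * sigma_max / 5         # s/20  < |HCin - m| <= 2s/20  -> 3*sigma/5
    sig[d > 2 * s / 20] = 2 * sigma_max / 5     # 2s/20 < |HCin - m| <= 3s/20  -> 2*sigma/5
    sig[d > 3 * s / 20] = sigma_max / 5         # |HCin - m| > 3s/20           -> sigma/5
    return sig

def adaptive_blur(img, sig, sigma_max=1.0):
    """Spatially varying Gaussian of Eq. (2): per pixel, pick the blur whose sigma matches Eq. (4)."""
    levels = [sigma_max / 5, 2 * sigma_max / 5, 3 * sigma_max / 5, sigma_max]
    blurred = np.stack([gaussian_filter(img, sigma=lv) for lv in levels])
    idx = np.searchsorted(levels, sig)          # index of each pixel's sigma level
    return np.take_along_axis(blurred, idx[None], axis=0)[0]

def hc_feedback(f_rgb, sigma_max=1.0, eps=1e-6):
    """Eqs. (1)-(2); f_rgb is a float HDR image of shape (H, W, 3).
    The input scaling that makes log(L_in) behave as a blend weight is left as an assumption."""
    L = f_rgb.mean(axis=2)                      # Eq. (1): HCin_L
    m, s = f_rgb.mean(), f_rgb.std()            # global mean and std of all pixels (our reading)
    w = np.log(L + eps)                         # per-pixel weight log(L_in)
    hc_L = adaptive_blur(L, sigma_map(L, m, s, sigma_max), sigma_max)
    hcbk = np.empty_like(f_rgb)
    for c in range(3):                          # selective HC path, one channel at a time
        hc_c = adaptive_blur(f_rgb[..., c],
                             sigma_map(f_rgb[..., c], m, s, sigma_max), sigma_max)
        hcbk[..., c] = w * hc_L + (1.0 - w) * hc_c   # Eq. (2)
    return hcbk
```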

3.2.2 Adaptive photoreceptor responses

The input signal of the BCs is

$$BCin_{c}(x,y)\!=\!\cfrac{f_{c}^{l}(x,y)}{m^{l}\!+\!HCbk_{c}^{l}(x,y)}$$
where $c\in \{R,G,B\}$. This function is inspired mainly by the Naka-Rushton equation, an empirical description of the S-potential response to light stimuli recorded in distal retinal neurons [1]. The average intensity of the input pixels, $m$, and the parameter $l$ correspond, respectively, to the semi-saturation constant and the sensitivity control exponent of the Naka-Rushton equation. In this work, $l$ is an image-specific parameter adaptively computed as
$$l=0.8+\frac{0.4}{\exp(s)}$$
A higher $s$ value, which means a higher dynamic range, leads to a lower $l$ value, which means a higher compression ratio. In contrast, a lower $s$ value leads to a higher $l$ value, which helps preserve more contrast.
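
This stage can be sketched in the same vein as the previous one, reusing the hc_feedback() function above; the exponent $l$ is computed from the global standard deviation as in Eq. (6), and the clipping of the feedback to non-negative values is our own numerical safeguard rather than part of Eq. (5).

```python
import numpy as np

def bc_input(f_rgb, hcbk, eps=1e-6):
    """Eqs. (5)-(6): Naka-Rushton-style compression of the photoreceptor signals."""
    m, s = f_rgb.mean(), f_rgb.std()            # same global statistics as in Eq. (4)
    l = 0.8 + 0.4 / np.exp(s)                   # Eq. (6): larger s -> smaller l -> stronger compression
    fb = np.maximum(hcbk, 0.0)                  # keep the feedback non-negative before the fractional power
    return f_rgb ** l / (m ** l + fb ** l + eps)   # Eq. (5)

# Hypothetical usage with the previous sketch:
# bcin = bc_input(f, hc_feedback(f))
```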

3.3 Vertical pathway

3.3.1 Process in bipolar cells

Signals received from the RF center and surround make opposing contributions to the output of bipolar cells (BCs) [40]. This center-surround antagonistic mechanism helps compress redundant signals and boost spatial resolution. The BC output $BCout_{c}(x,y),c\in \{R,G,B\}$, which is the final output of the proposed method, is computed as

$$BCout_{c}(x,y)\!=\!\max\Bigl[0,(BCin_{c}\!\ast\!DoG_{\sigma_{cen},\sigma_{sur}})(x,y)\Bigr]$$
where
$$DoG_{\sigma_{cen},\sigma_{sur}}(x,y)=G_{\sigma_{cen}}(x,y)-k\cdot G_{\sigma_{sur}}(x,y)$$
where $k$ is the inhibitory coefficient of the RF surround, and $\sigma _{cen}$ and $\sigma _{sur}$ define the sizes of the RF center and its surround. In this work, these parameters are empirically set to 0.3, 0.5, and 1.0, respectively.
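
The BC stage can be sketched as follows; the defaults map the values quoted above to $k$, $\sigma_{cen}$, and $\sigma_{sur}$ in the order they are listed, and the DoG convolution of Eq. (7) is expanded, by linearity, into two Gaussian blurs.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def bc_output(bc_in, k=0.3, sigma_cen=0.5, sigma_sur=1.0):
    """Eqs. (7)-(8): center-surround (DoG) antagonism followed by half-wave rectification."""
    out = np.empty_like(bc_in)
    for c in range(bc_in.shape[2]):
        center = gaussian_filter(bc_in[..., c], sigma=sigma_cen)
        surround = gaussian_filter(bc_in[..., c], sigma=sigma_sur)
        out[..., c] = np.maximum(0.0, center - k * surround)   # Eq. (7) using the DoG of Eq. (8)
    return out
```

Chaining the three sketches (hc_feedback, bc_input, bc_output) and rescaling the result to [0, 1] for display gives a rough end-to-end approximation of the pipeline in Fig. 2, under the assumptions stated above.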

4. Experiments

4.1 Adaptive HCs RF size

The proposed method introduces the adaptive RF mechanism of HCs to avoid halo artifacts, one of the greatest challenges for TM methods. We use a piecewise linear function to quantify this adaptation. To determine a proper piecewise function, we tested it with different numbers of sub-domains. Figure 3 shows the relationship among performance, efficiency, and the number of piecewise function sub-domains. Blue circles in Fig. 3 denote the performance of the proposed method quantified by TMQI [41]. The number of sub-domains determines how many different RF sizes are involved. When there is only one sub-domain, the RF size is constant, which results in conspicuous dark halos around the candlelights and the lowest TMQI. When the number of sub-domains is greater than one, the conspicuous dark halos gradually fade owing to the localized HC interaction. TMQI increases noticeably as the number of sub-domains rises from 1 to 3, while it remains almost unchanged when the number of sub-domains exceeds 4. On the other hand, as represented by the red asterisks in Fig. 3, the calculation time increases linearly with the number of sub-domains. To balance performance and efficiency, we set the number of sub-domains to 4, as described by Eq. (4).
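
The exact piecewise functions used for sub-domain counts other than four are not spelled out above; the sketch below is one possible parameterization, chosen only so that $n=4$ reproduces the levels of Eq. (4), together with a simple timing loop for the sweep.

```python
import time
import numpy as np

def sigma_map_n(hc_in, m, s, n, sigma_max=1.0):
    """Quantize |HCin - m| into n bins of width s/20 and map each bin to an RF size.
    For n = 4 this reproduces Eq. (4); other values of n are our own extrapolation."""
    d = np.abs(hc_in - m)
    idx = np.minimum((d / (s / 20.0)).astype(int), n - 1)    # bin index, capped at n - 1
    return np.where(idx == 0, sigma_max, sigma_max * (n - idx) / (n + 1.0))

def sweep_subdomains(luminance, n_values=(1, 2, 3, 4, 5, 6)):
    """Time the sigma-map construction for each candidate number of sub-domains."""
    m, s = luminance.mean(), luminance.std()
    for n in n_values:
        t0 = time.perf_counter()
        sig = sigma_map_n(luminance, m, s, n)
        # ...run the remaining stages with this sigma map and score the result with TMQI...
        print(n, np.unique(sig), f"{time.perf_counter() - t0:.4f} s")
```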

Fig. 3. The relationship among performance, efficiency, and number of piecewise function sub-domains.

4.2 Visual comparison

Figure 4 shows visual comparisons between the results of our method and some recent TM methods. The original HDR images are taken from Mark Fairchild’s HDR Photographic Survey [42]. Generally speaking, in our results the structures are clearly visible and the color appearance is as natural as the scenes should be. For example, in the night shot of the BMW Zentrum in the last column, the indoor areas near the lights are overexposed in the results of some methods [18,23], while, due to the weak brightness, the details of the grassland are not visible in Fu’s result [16]. In contrast, only our method, Yang’s method [9], Guo’s method [19], and Eilertsen’s method [14] display both the bright and dark regions well. Yang’s method [9] compresses the dynamic range well but suffers from over-saturation of the color appearance; in some cases, e.g., the night shot of the Waffle House in the fourth column, the noise in the dark areas is undesirably amplified. Guo’s method [19] handles contrast well but may introduce some artifacts, e.g., the dark shadows around the "FRONTIER" sign in the first image and the abnormally bright sky in the second column, labeled by red-dashed circles. Eilertsen’s method [14] introduces similar distortions, e.g., the dark halos around the light bulbs in the third and fifth columns, labeled by yellow-dashed circles. Comparatively, our algorithm achieves a good balance among dynamic range compression, detail enhancement, and naturalness preservation.

Fig. 4. Comparisons with some representative tone mapping methods on several HDR images from the dataset of Mark Fairchild’s Survey [42]. Note that the brightness of the original images listed in the first row is globally increased for better visualization.

4.3 Observer test

We invited 23 observers, seven female and sixteen male graduate students, to rank the compared methods. Their ages were between 22 and 28 years, and they had normal color vision and normal or corrected-to-normal visual acuity. The compared images were displayed on a Dell 19.5-inch E2016HL LCD display (typical luminance output $250~cd/m^{2}$) with its default color settings, which are close to the sRGB model. The experiment was conducted in a dark room to minimize the effects of ambient light. The subjects were required to view the images at a perpendicular distance of 0.5 m from the display and to grade the compared images of each scene from the best to the worst, scored from 0 to 6. If subjects judged two images to be of equal quality, they gave both the same score. The subjects were allowed to take as much time as needed to decide their scores and to revise them if necessary. The test covered 20 scenes from the HDR Photographic Survey [42], 5 of which are shown in Fig. 4. Subjects took approximately 50 minutes on average to finish the whole test. Afterwards, we averaged the scores of each method to obtain the ranking.

As shown in Table 1, on average both the female and the male groups rated the results of our proposed method as the best and the results of Liang’s method [18] as the second best. The difference between the scores of the proposed method and Liang’s method [18] is statistically significant ($p=0.0051$). In contrast, the difference between the scores of Guo’s method [19], which was considered the third best by the female group, and Eilertsen’s method [14], which was considered the third best by the male group, is not statistically significant ($p=0.2978$).

Table 1. Results of the observer tests. Each subject’s listed score is averaged over the 20 scenes.

4.4 Quantitative comparison

We further conducted a quantitative comparison against several state-of-the-art TM methods on Mark Fairchild’s HDR Photographic Survey [42]. This dataset contains 105 HDR images, some of which are shown in Fig. 4.

We quantified this comparison with three metrics: TMQI [41], BTMQI [43], and HIGRADE [44]. For TMQI and HIGRADE, a higher score means a better tone-mapped image, while for BTMQI, a smaller value represents better image quality. Following the original literature, the corresponding HDR images are used as the reference images when computing TMQI.

Figure 5 shows, as violin plots, the distributions of these three metrics obtained by each method over all the images in the dataset. In terms of TMQI, the median of the proposed method is 0.91, the mean is 0.88, the maximum is 0.98, and the minimum is 0.71. In terms of BTMQI, the median of the proposed method is 3.68, the mean is 3.76, the maximum is 7.30, and the minimum is 1.0. In terms of HIGRADE-1, the median of the proposed method is 0.17, the mean is 0.15, the maximum is 1.28, and the minimum is -1.06. On all three objective metrics, our method ranks first, showing robust performance consistent with the subjective scores.

Fig. 5. Quantitative comparisons on all the 105 images of Mark Fairchild’s HDR Survey dataset.

4.5 Computational time

The computational complexity of the proposed method is $O(n)$. Here we compare the computational time of multiple TM methods. For each method, the processing time on images of different sizes was recorded with MATLAB timing functions (Intel i7-6700K 4.0 GHz CPU, 16 GB RAM, Windows 7 Pro 64-bit) and is shown in logarithmic form. As shown in Fig. 6, our method takes a moderate computational time, which increases nearly linearly with the image size and ranks 7th among the 22 TM methods compared here.

Fig. 6. Comparison of the processing time on images with different sizes. Histograms of different colors represent the processing time. The numbers on the right of the histograms are the processing time in seconds on a $4288\times 2846$ pixel image.

5. Discussion and conclusion

Although hardware solutions have made great progress [45], HDR display devices with a dynamic range of more than 5 logarithmic units are still too expensive to be widely used because they require high-amplitude drive circuits and high-luminance light sources [46]. To date, the most affordable approach to displaying HDR images is still to use TM techniques to map them to one byte per color channel per pixel and present them on regular screens. This paper proposes a robust TM method inspired by retinal processing mechanisms, especially the adaptive size of the HC RF and the antagonistic structure of the BC RF. In the proposed method, HCs shift the photoreceptor sensitivity to regulate the signals, while BCs partially filter out redundant information. The works most similar to the proposed method are perhaps those of [28] and [29], which also introduce HC feedback into TM. However, they do not take the light-induced HC coupling into account, which results in a fixed HC RF size and hence possible distortions around high-contrast borders, as shown in Fig. 3. Different from these earlier works, our method introduces dynamic gap junctions between HCs, an important mode of retinal neuronal communication [39], to automatically regulate the signals by varying the RF size. This regulation greatly reduces the global interaction among HCs and hence prevents halo artifacts. Besides that, the proposed method adaptively sets its parameters based on image features, providing robust performance on diverse scenes.

Although our method is inspired by mechanisms in the retina, it does not exactly mimic all retinal details. The proposed method focuses on the retinal mechanisms involved in information compression, such as the feedback circuit from HCs to photoreceptors and the RF structure of BCs. Our method does not address some concepts in visual perception, such as viewing distance, absolute units, and the difference between RGB and LMS space. Although the proposed method performs well in dynamic range compression, it is a greatly simplified framework compared with the biological retina and therefore leaves ample room for further improvement. Physiologically, the major omission in our method is the OFF pathway, which is also considered important to the human visual system [35]. Introducing this pathway into our method may improve performance in extremely high dynamic range scenes. Another important direction for future work is to reduce the computation time. As described in Section 4.5, the proposed method takes nearly 0.2 seconds to process an image of $536\times 356$ pixels. Despite the efforts still needed, the proposed method is simple yet novel and has great potential for real-time applications, such as the tone mapping of HDR videos.

Funding

National Natural Science Foundation of China (61703075, 61806041); Sichuan Province Science and Technology Support Program (2017SZDZX0019); Guangdong Key R&D Project (2018B030338001).

Acknowledgements

The authors are very grateful to Mark D. Fairchild for the original HDR images.

Disclosures

The authors declare no conflicts of interest.

References

1. E. Reinhard, W. Heidrich, P. Debevec, S. Pattanaik, G. Ward, and K. Myszkowski, High dynamic range imaging: acquisition, display, and image-based lighting (Morgan Kaufmann, 2010).

2. X. Cerda-Company, C. A. Parraga, and X. Otazu, “Which tone-mapping operator is the best? a comparative study of perceptual quality,” J. Opt. Soc. Am. A 35(4), 626–638 (2018). [CrossRef]  

3. T. Gollisch and M. Meister, “Eye smarter than scientists believed: neural computations in circuits of the retina,” Neuron 65(2), 150–164 (2010). [CrossRef]  

4. X.-S. Zhang and Y.-J. Li, “A retina inspired model for high dynamic range image rendering,” in International Conference on Brain Inspired Cognitive Systems, (Springer, 2016), pp. 68–79.

5. D. Lischinski, Z. Farbman, M. Uyttendaele, and R. Szeliski, “Interactive local adjustment of tonal values,” ACM Trans. Graph. 25(3), 646–653 (2006). [CrossRef]  

6. N. D. B. Bruce, “Expoblend: Information preserving exposure blending based on normalized log-domain entropy,” Comput. & Graph. 39, 12–23 (2014). [CrossRef]  

7. C. Lee, C. Lee, and C.-S. Kim, “Contrast enhancement based on layered difference representation of 2d histograms,” IEEE Trans. on Image Process. 22(12), 5372–5384 (2013). [CrossRef]  

8. M. Oskarsson, “Democratic tone mapping using optimal k-means clustering,” in Image Analysis Scandinavian Conference, R. R. Paulsen and K. S. Pedersen, eds. (Springer, 2015), pp. 354–365.

9. K.-F. Yang, H. Li, H. Kuang, C.-Y. Li, and Y.-J. Li, “An adaptive method for image dynamic range adjustment,” IEEE Trans. Circuits Syst. Video Technol. 29(3), 640–652 (2019). [CrossRef]  

10. S. Wang, J. Zheng, H.-M. Hu, and B. Li, “Naturalness preserved enhancement algorithm for non-uniform illumination images,” IEEE Trans. on Image Process. 22(9), 3538–3548 (2013). [CrossRef]  

11. S. Raman and S. Chaudhuri, “Bilateral filter based compositing for variable exposure photography,” in Eurographics 2009 - Short Papers, P. Alliez and M. A. Magnor, eds. (Eurographics Association, 2009), pp. 1–4.

12. S. Paris, S. W. Hasinoff, and J. Kautz, “Local laplacian filters: Edge-aware image processing with a laplacian pyramid,” ACM Trans. Graph. 30(4), 1–68 (2011). [CrossRef]  

13. A. Choudhury and G. Medioni, “Hierarchy of nonlocal means for preferred automatic sharpness enhancement and tone mapping,” J. Opt. Soc. Am. A 30(3), 353–366 (2013). [CrossRef]  

14. G. Eilertsen, R. K. Mantiuk, and J. Unger, “Real-time noise-aware tone mapping,” ACM Trans. Graph. 34(6), 1–15 (2015). [CrossRef]  

15. X. Fu, Y. Liao, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, “A probabilistic method for image enhancement with simultaneous illumination and reflectance estimation,” IEEE Trans. on Image Process. 24(12), 4965–4977 (2015). [CrossRef]  

16. X. Fu, D. Zeng, Y. Huang, X.-P. Zhang, and X. Ding, “A weighted variational model for simultaneous reflectance and illumination estimation,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2016), pp. 2782–2790.

17. H. Yue, J. Yang, X. Sun, F. Wu, and C. Hou, “Contrast enhancement based on intrinsic image decomposition,” IEEE Trans. on Image Process. 26(8), 3981–3994 (2017). [CrossRef]  

18. Z. Liang, J. Xu, D. Zhang, Z. Cao, and L. Zhang, “A hybrid l1-l0 layer decomposition model for tone mapping,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, (IEEE, 2018), pp. 4758–4766.

19. X. Guo, Y. Li, and H. Ling, “Lime: Low-light image enhancement via illumination map estimation,” IEEE Trans. on Image Process. 26(2), 982–993 (2017). [CrossRef]  

20. L. Meylan and S. Susstrunk, “High dynamic range image rendering with a retinex-based adaptive filter,” IEEE Trans. on Image Process. 15(9), 2820–2830 (2006). [CrossRef]  

21. S. Park, S. Yu, B. Moon, S. Ko, and J. Paik, “Low-light image enhancement using variational optimization-based retinex model,” IEEE Transactions on Consumer Electronics 63(2), 178–184 (2017). [CrossRef]  

22. X. Ren, M. Li, W.-H. Cheng, and J. Liu, “Joint enhancement and denoising method via sequential decomposition,” in IEEE International Symposium on Circuits and Systems, (IEEE, 2018), pp. 1–5.

23. M. Li, J. Liu, W. Yang, X. Sun, and Z. Guo, “Structure-revealing low-light image enhancement via robust retinex model,” IEEE Trans. on Image Process. 27(6), 2828–2841 (2018). [CrossRef]  

24. V. Laparra, A. Berardino, J. Ballé, and E. Simoncelli, “Perceptually optimized image rendering,” J. Opt. Soc. Am. A 34(9), 1511–1525 (2017). [CrossRef]  

25. L. Meylan, D. Alleysson, and S. Süsstrunk, “Model of retinal local adaptation for the tone mapping of color filter array images,” J. Opt. Soc. Am. A 24(9), 2807–2816 (2007). [CrossRef]  

26. D. Tamburrino, D. Alleysson, L. Meylan, and S. Süsstrunk, “Digital camera workflow for high dynamic range images using a model of retinal processing,” Proc. SPIE 6817, 68170J (2008). [CrossRef]  

27. M. H. Kim and J. Kautz, “Consistent tone reproduction,” in Proceedings of the Tenth International Conference on Computer Graphics and Imaging, (IASTED/ACTA Press, 2008), pp. 152–159.

28. D. Alleysson, L. Meylan, and S. Süsstrunk, “HDR CFA image rendering,” in European Signal Processing Conference, (IEEE, 2006), pp. 1–4.

29. J. Van Hateren, “Encoding of high dynamic range video with a model of human cones,” ACM Trans. Graph. 25(4), 1380–1399 (2006). [CrossRef]  

30. A. Benoit, A. Caplier, B. Durette, and J. Hérault, “Using human visual system modeling for bio-inspired low level image processing,” Comput. Vis. Image Underst. 114(7), 758–773 (2010). [CrossRef]  

31. F. Banterle, A. Artusi, E. Sikudová, T. Bashford-Rogers, P. Ledda, M. Bloj, and A. Chalmers, “Dynamic range compression by differential zone mapping based on psychophysical experiments,” in Symposium on Applied Perception, P. Khooshabeh, M. Harders, R. McDonnell, and V. Sundstedt, eds. (ACM, 2012), pp. 39–46.

32. M. Gharbi, J. Chen, J. T. Barron, S. W. Hasinoff, and F. Durand, “Deep bilateral learning for real-time image enhancement,” ACM Trans. Graph. 36(4), 1–12 (2017). [CrossRef]  

33. J. Cai, S. Gu, and L. Zhang, “Learning a deep single image contrast enhancer from multi-exposure images,” IEEE Trans. on Image Process. 27(4), 2049–2062 (2018). [CrossRef]  

34. P. Berens and T. Euler, “Neuronal diversity in the retina,” e-Neuroforum 23(2), 93–101 (2017). [CrossRef]  

35. C. Joselevitch, “Human retinal circuitry and physiology,” Psychol. & Neurosci. 1(2), 141–165 (2008). [CrossRef]  

36. D. M. Dacey, B. B. Lee, D. K. Stafford, J. Pokorny, and V. C. Smith, “Horizontal cells of the primate retina: cone specificity without spectral opponency,” Science 271(5249), 656–659 (1996). [CrossRef]  

37. J. D. Crook, M. B. Manookin, O. S. Packer, and D. M. Dacey, “Horizontal cell feedback without cone type-selective inhibition mediates "red-green" color opponency in midget ganglion cells of the primate retina,” J. Neurosci. 31(5), 1762–1772 (2011). [CrossRef]  

38. M. Srinivas, M. Costa, Y. Gao, A. Fort, G. I. Fishman, and D. C. Spray, “Voltage dependence of macroscopic and unitary currents of gap junction channels formed by mouse connexin50 expressed in rat neuroblastoma cells,” The J. Physiol. 517(3), 673–689 (1999). [CrossRef]  

39. S. A. Bloomfield and B. Völgyi, “The diverse functional roles and regulation of neuronal gap junctions in the retina,” Nat. Rev. Neurosci. 10(7), 495–506 (2009). [CrossRef]  

40. R. W. Rodieck and J. Stone, “Response of cat retinal ganglion cells to moving visual patterns,” J. Neurophysiol. 28(5), 819–832 (1965). [CrossRef]  

41. H. Yeganeh and Z. Wang, “Objective quality assessment of tone-mapped images,” IEEE Trans. on Image Process. 22(2), 657–667 (2013). [CrossRef]  

42. M. Fairchild, “HDR Photographic Survey,” http://rit-mcsl.org/fairchild//HDR.html.

43. K. Gu, S. Wang, G. Zhai, S. Ma, X. Yang, W. Lin, W. Zhang, and W. Gao, “Blind quality assessment of tone-mapped images via analysis of information, naturalness, and structure,” IEEE Transactions on Multimedia 18(3), 432–443 (2016). [CrossRef]  

44. D. Kundu, D. Ghadiyaram, A. C. Bovik, and B. L. Evans, “No-reference quality assessment of tone-mapped HDR pictures,” IEEE Trans. on Image Process. 26(6), 2957–2971 (2017). [CrossRef]  

45. G. Tan, Y. Huang, M.-C. Li, S.-L. Lee, and S.-T. Wu, “High dynamic range liquid crystal displays with a mini-led backlight,” Opt. Express 26(13), 16572–16584 (2018). [CrossRef]  

46. M. Xu and H. Hua, “High dynamic range head mounted display based on dual-layer spatial modulation,” Opt. Express 25(19), 23320–23333 (2017). [CrossRef]  
