Robust disparity estimation based on color monogenic curvature phase


Abstract

Disparity estimation for binocular images is an important problem for many visual tasks such as 3D environment reconstruction, digital holography, virtual reality, and robot navigation. Conventional approaches rely on the brightness constancy assumption to establish spatial correspondences between a pair of images. However, in the presence of large illumination variation and severe noise contamination, conventional approaches fail to generate accurate disparity maps. To obtain robust disparity estimates in these situations, we first propose a model, the color monogenic curvature phase, which describes local features of color images by embedding the monogenic curvature signal into the quaternion representation. We then propose a multiscale framework that estimates disparities by coupling the advantages of the color monogenic curvature phase and mutual information. Both indoor and outdoor images with large brightness variation are used in the experiments, and the results demonstrate that our approach achieves good performance even under large illumination changes and severe noise contamination.

© 2012 Optical Society of America

1. Introduction

Disparity estimation for binocular images is an important problem for many visual tasks such as 3D environment reconstruction, digital holography, virtual reality, and robot navigation. Typically, a matching cost is calculated at every pixel for all disparities under consideration. Conventional approaches usually assume constant intensities at matching image positions. Commonly used pixel-based matching costs are absolute differences, squared differences, and sampling-insensitive absolute differences [1]. Window-based matching costs include the sums of absolute and squared differences and normalized cross-correlation [2]. However, in the presence of illumination change, the constant-intensity constraint no longer holds, and the resulting disparity map contains many errors. Mutual information, as an alternative matching cost, has been used to compute visual correspondence because of its ability to handle some brightness variations [3, 4]. In [5], Geiger et al. proposed a fast and efficient large-scale stereo matching approach. This method achieves state-of-the-art performance without the need for global optimization; however, it suffers under large illumination changes.

In contrast to intensity, phase information, as an important feature of an image, has the advantage of being invariant to illumination change. Unlike gradient information, phase responds differently to lines and edges. It carries the most significant structural information, and the original image can be reconstructed from phase information alone [6]. In [7, 8], the rotationally invariant monogenic phase model was proposed for gray images. Later, Demarcq et al. [9] generalized it to handle color images. Unfortunately, the monogenic phase cannot yield accurate results for highly curved lines and edges. In our previous work [10], we proposed the monogenic curvature phase to model curved lines and edges. Although it has been applied to visual correspondence with good performance [11], it processes only gray images and does not take multiscale information into consideration.

The main goal of this paper is to estimate robust disparities under large illumination changes and severe noise contamination. To this end, we first propose a model, the color monogenic curvature phase, which describes the features of color images by embedding the monogenic curvature signal into the quaternion framework. We then present a multiscale method that computes the disparity map by coupling the advantages of mutual information and the color monogenic curvature phase. To demonstrate the effectiveness of the proposed approach, we include both indoor and outdoor images with large illumination changes in the experiments. The presented results demonstrate that our approach achieves good performance even under large brightness variation and severe noise corruption.

2. Color monogenic curvature phase

2.1. Monogenic curvature signal

Given a 2D gray image $f(x, y)$, $(x, y) \in \mathbb{R}^2$, the monogenic curvature signal [10] is defined as

$$f_{mc} = [\,f_1 \;\; f_2 \;\; f_3\,]^T.$$

The first component $f_1$ is obtained as

$$f_1 = [(f * h_1) * h_1]\,[(f * h_2) * h_2] - [(f * h_1) * h_2]^2,$$

where $*$ denotes the convolution operator, and $h_1 = \frac{x}{2\pi(x^2 + y^2)^{3/2}}$ and $h_2 = \frac{y}{2\pi(x^2 + y^2)^{3/2}}$ are the two components of the Riesz kernel [7]. Applying the second-order Hilbert transform [12] to $f_1$ yields the other two components $f_2$ and $f_3$ of the monogenic curvature signal. In the frequency domain, the second-order Hilbert transform reads $H_2 = [\cos 2\alpha \;\; \sin 2\alpha]^T$, where $\alpha$ is the polar angle. The other two components of the monogenic curvature signal are respectively given by

$$f_2 = \mathcal{F}^{-1}\{F_1 \cos 2\alpha\},$$
$$f_3 = \mathcal{F}^{-1}\{F_1 \sin 2\alpha\},$$

where $\mathcal{F}^{-1}$ denotes the inverse Fourier transform and $F_1$ is the Fourier transform of $f_1$.
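For illustration, the following is a minimal sketch of these steps, assuming an FFT-based discretization of the Riesz kernels and of the second-order Hilbert transform; the function name and discretization details are our own choices, not the authors' implementation (the sign convention of the Riesz kernels cancels in the products that form $f_1$).

```python
# A minimal sketch of the monogenic curvature signal, assuming
# frequency-domain forms of the Riesz kernels h1, h2 and of the
# second-order Hilbert transform; names are illustrative only.
import numpy as np

def monogenic_curvature_signal(f):
    """Return the components (f1, f2, f3) for a 2D gray image f."""
    v = np.fft.fftfreq(f.shape[0])[:, None]       # vertical frequencies
    u = np.fft.fftfreq(f.shape[1])[None, :]       # horizontal frequencies
    radius = np.hypot(u, v)
    radius[0, 0] = 1.0                            # avoid division by zero at DC

    F = np.fft.fft2(f)
    H1 = -1j * u / radius                         # Riesz kernel h1 (freq. domain)
    H2 = -1j * v / radius                         # Riesz kernel h2 (freq. domain)
    f_h1h1 = np.real(np.fft.ifft2(F * H1 * H1))   # (f * h1) * h1
    f_h2h2 = np.real(np.fft.ifft2(F * H2 * H2))   # (f * h2) * h2
    f_h1h2 = np.real(np.fft.ifft2(F * H1 * H2))   # (f * h1) * h2
    f1 = f_h1h1 * f_h2h2 - f_h1h2 ** 2            # determinant-like first component

    alpha = np.arctan2(v, u)                      # polar angle in the freq. domain
    F1 = np.fft.fft2(f1)
    f2 = np.real(np.fft.ifft2(F1 * np.cos(2 * alpha)))   # second-order Hilbert
    f3 = np.real(np.fft.ifft2(F1 * np.sin(2 * alpha)))
    return f1, f2, f3
```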

Convolving the monogenic curvature signal with the Poisson kernel $h_p = \frac{s}{2\pi(x^2 + y^2 + s^2)^{3/2}}$ yields the monogenic curvature scale-space $f_{mc}(x, y, s)$, with $s$ being the scale parameter. The monogenic curvature scale-space performs a split of identity: three independent local features, i.e. the amplitude, main orientation, and monogenic curvature phase, can be obtained from it simultaneously as

$$A(x, y, s) = \sqrt{f_1(x, y, s)^2 + f_2(x, y, s)^2 + f_3(x, y, s)^2},$$
$$\theta(x, y, s) = \tfrac{1}{2}\,\mathrm{atan2}(f_3(x, y, s), f_2(x, y, s)), \quad \theta \in (-\tfrac{\pi}{2}, \tfrac{\pi}{2}],$$
$$\Phi(x, y, s) = \frac{u(x, y, s)}{|u(x, y, s)|}\,\mathrm{atan2}(|u(x, y, s)|, f_1(x, y, s)), \quad \Phi \in (-\pi, \pi],$$

where $\mathrm{atan2}(\cdot) \in (-\pi, \pi]$ and $u(x, y, s) = [f_2(x, y, s) \;\; f_3(x, y, s)]^T$.
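A sketch of the scale-space features follows, assuming the known Fourier pair $e^{-2\pi|\omega|s}$ for the Poisson kernel and keeping the phase in the vector-valued form $\frac{u}{|u|}\,\mathrm{atan2}(|u|, f_1)$ from the equation above; helper names are illustrative.

```python
# A sketch of the three local features at scale s, assuming the
# frequency response exp(-2*pi*|w|*s) of the 2D Poisson kernel.
import numpy as np

def poisson_smooth(f, s):
    """Convolve f with the Poisson kernel h_p at scale s (via FFT)."""
    v = np.fft.fftfreq(f.shape[0])[:, None]
    u = np.fft.fftfreq(f.shape[1])[None, :]
    lowpass = np.exp(-2 * np.pi * np.hypot(u, v) * s)
    return np.real(np.fft.ifft2(np.fft.fft2(f) * lowpass))

def local_features(f1, f2, f3, eps=1e-12):
    """Amplitude, main orientation and monogenic curvature phase from
    the Poisson-smoothed signal components at one scale."""
    amplitude = np.sqrt(f1**2 + f2**2 + f3**2)
    orientation = 0.5 * np.arctan2(f3, f2)       # theta in (-pi/2, pi/2]
    u_norm = np.hypot(f2, f3)                    # |u| with u = [f2 f3]^T
    angle = np.arctan2(u_norm, f1)               # atan2(|u|, f1)
    phase = np.stack([f2, f3]) * (angle / (u_norm + eps))  # (u/|u|) * angle
    return amplitude, orientation, phase
```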

2.2. Color monogenic curvature scale-space

The monogenic curvature scale-space was designed to describe the characteristics of gray images; unfortunately, color information is not incorporated in this model. In [13, 14], quaternions were introduced to represent color images. A color image $f(x, y)$ in the RGB color space can be represented by encoding its three channels as a pure quaternion

$$f(x, y) = f_r(x, y)\,i + f_g(x, y)\,j + f_b(x, y)\,k,$$

where $i$, $j$ and $k$ are the three imaginary units, and $f_r$, $f_g$ and $f_b$ denote the red, green and blue channels of the color image. This inspires us to extend the monogenic curvature scale-space to the color domain by embedding it into the quaternion framework. Analogously to the color image representation, the corresponding monogenic curvature scale-spaces of the three channels are encoded in a pure quaternion. The color monogenic curvature scale-space $f_{cmc}$ is therefore constructed as

$$f_{cmc} = f_r^{mc}\,i + f_g^{mc}\,j + f_b^{mc}\,k \quad \text{with}$$
$$f_n^{mc} = [\,f_{n1}(x, y, s) \;\; f_{n2}(x, y, s) \;\; f_{n3}(x, y, s)\,]^T, \quad n \in \{r, g, b\},$$

where $f_n^{mc}$ refers to the monogenic curvature scale-space of the $n$th color channel.

The corresponding color monogenic curvature phase Φcmc is given by

$$\Phi_{cmc} = \Phi_r\,i + \Phi_g\,j + \Phi_b\,k \quad \text{with}$$
$$\Phi_n = \frac{u_n(x, y, s)}{|u_n(x, y, s)|}\,\mathrm{atan2}(|u_n(x, y, s)|, f_{n1}(x, y, s)), \quad \Phi_n \in (-\pi, \pi], \; n \in \{r, g, b\}.$$
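Channel-wise, the construction can be sketched as below, reusing the helpers sketched above; for simplicity this sketch stores the scalar phase magnitude $\mathrm{atan2}(|u_n|, f_{n1})$ per channel rather than the full vector-valued phase.

```python
# A sketch of the color monogenic curvature phase: each RGB channel
# becomes one pure-quaternion component (i, j, k). Assumes the
# monogenic_curvature_signal and poisson_smooth helpers above.
import numpy as np

def color_monogenic_curvature_phase(img_rgb, s):
    """Return an (H, W, 3) array holding (Phi_r, Phi_g, Phi_b) at scale s."""
    channels = []
    for n in range(3):                                       # r, g, b
        f1, f2, f3 = monogenic_curvature_signal(img_rgb[..., n].astype(float))
        f1, f2, f3 = (poisson_smooth(c, s) for c in (f1, f2, f3))
        channels.append(np.arctan2(np.hypot(f2, f3), f1))    # atan2(|u_n|, f_n1)
    return np.stack(channels, axis=-1)
```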
Figure 1 illustrates the color monogenic curvature phase computed at the first scale. The top row contains three test images taken from [15]; they were captured under different camera exposure and lighting conditions. The bottom row shows the corresponding color monogenic curvature phase images. The color monogenic curvature phase is evidently very robust against large illumination variation.

Fig. 1 Top row: Test images taken from [15]. Bottom row: Corresponding color monogenic curvature phase images.

3. Disparity estimation

To deal with stereo analysis under large brightness variation and noise corruption, we propose a multiscale method that combines the advantages of mutual information and the color monogenic curvature phase. Figure 2 illustrates the structure of the proposed multiscale disparity estimation approach. Given a color image pair $I_l$ and $I_r$, the corresponding phase information $\Phi_l$ and $\Phi_r$ is extracted using the color monogenic curvature phase model. Based on $\Phi_l$ and $\Phi_r$, two pyramids are constructed by down-sampling the original phase images. At each scale $s$, the disparity map is computed using the mutual information of the two phase images $\Phi_{l,s}$ and $\Phi_{r,s}$ as the matching cost. Starting from the coarsest scale, the estimated disparity map initializes the estimation at the next finer scale, and this continues to the finest scale (a sketch of this loop is given after the figure).

Fig. 2 Multiscale disparity estimation based on the color monogenic curvature phase.
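The coarse-to-fine scheme of Fig. 2 can be sketched as follows; the per-scale solver is passed in as a function (in this paper, the graph-cuts minimization described below), while the pyramid construction and factor-2 up-sampling are our illustrative assumptions.

```python
# A high-level sketch of the multiscale loop in Fig. 2; pyramid
# construction and the factor-2 up-sampling are assumed details.
import numpy as np

def build_pyramid(img, levels):
    """Down-sample by 2 per level; level 0 is the finest scale."""
    pyr = [img]
    for _ in range(levels - 1):
        pyr.append(pyr[-1][::2, ::2])
    return pyr

def multiscale_disparity(phi_l, phi_r, estimate_disparity, levels=4):
    """estimate_disparity(left, right, init) is the per-scale solver."""
    pyr_l = build_pyramid(phi_l, levels)
    pyr_r = build_pyramid(phi_r, levels)
    disparity = None
    for s in range(levels - 1, -1, -1):          # coarsest to finest
        init = None
        if disparity is not None:
            # up-sample the coarser estimate and double its magnitude
            init = 2 * np.kron(disparity, np.ones((2, 2)))
            init = init[:pyr_l[s].shape[0], :pyr_l[s].shape[1]]
        disparity = estimate_disparity(pyr_l[s], pyr_r[s], init)
    return disparity
```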

At a given scale, the mutual information of the color monogenic curvature phase image pair $\Phi_{l,s}$ and $\Phi_{r,s}$ is defined as

$$\mathrm{CPMI}(\Phi_{l,s}, \Phi_{r,s}) = H(\Phi_{l,s}) + H(\Phi_{r,s}) - H(\Phi_{l,s}, \Phi_{r,s}),$$

where $H(\Phi_{l,s})$ and $H(\Phi_{r,s})$ are Shannon entropies, given by

$$H(\Phi) = E_{\Phi}[-\log(P(\Phi))] = -\sum_{\phi_i \in \Omega_\phi} \log(P(\Phi = \phi_i))\,P(\Phi = \phi_i),$$

where $E_\Phi$ denotes the expected value with respect to $\Phi$, $P(\Phi)$ is the probability distribution of $\Phi$, $\Omega_\phi$ is the domain over which the random variable ranges, and $\phi_i$ is an event in this domain. $H(\Phi_{l,s}, \Phi_{r,s})$ denotes the joint entropy of $\Phi_{l,s}$ and $\Phi_{r,s}$, which takes the form

$$H(\Phi_{l,s}, \Phi_{r,s}) = E_{\Phi_{l,s}}\big[E_{\Phi_{r,s}}[-\log(P(\Phi_{l,s}, \Phi_{r,s}))]\big],$$

where $P(\Phi_{l,s}, \Phi_{r,s})$ is the joint distribution of $\Phi_{l,s}$ and $\Phi_{r,s}$. Since the CPMI above is defined over the whole phase image, similar to [16] we approximate it as a sum of pixel-wise mutual information terms and use this as a data cost, that is,

$$\mathrm{CPMI}(\Phi_{l,s}, \Phi_{r,s}) \approx \sum_p \mathrm{cpmi}(\Phi_{l,s}(p), \Phi_{r,s}(p + d_p)),$$

where $d_p$ refers to the disparity at pixel $p$.
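For a single phase channel, the pixel-wise term can be tabulated from a joint histogram as sketched below; the bin count and per-channel treatment are our assumptions (the quaternion-valued phase would contribute one such term per channel), and in [16] the table is re-estimated from images warped by the current disparity, iterating to convergence.

```python
# A sketch of the pixel-wise mutual-information lookup table for one
# phase channel; the 32-bin quantization is an assumed parameter.
import numpy as np

def cpmi_table(phi_l, phi_r, bins=32):
    """Return (t, ql, qr): t[a, b] is the MI contribution of left bin a
    co-occurring with right bin b; ql, qr are the quantized phases."""
    quantize = lambda x: np.clip(((x + np.pi) / (2 * np.pi) * bins).astype(int),
                                 0, bins - 1)
    ql, qr = quantize(phi_l), quantize(phi_r)
    joint = np.zeros((bins, bins))
    np.add.at(joint, (ql.ravel(), qr.ravel()), 1.0)      # joint histogram
    joint /= joint.sum()
    pl = joint.sum(axis=1, keepdims=True)                # marginal of phi_l
    pr = joint.sum(axis=0, keepdims=True)                # marginal of phi_r
    with np.errstate(divide="ignore", invalid="ignore"):
        t = joint * np.log(joint / (pl * pr))            # summing t gives CPMI
    return np.nan_to_num(t), ql, qr
```

The data cost at pixel $p$ for a candidate disparity $d_p$ is then read off as $-t[q_l(p), q_r(p + d_p)]$.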

Typically, the disparity is estimated by minimizing an energy of the form

$$E = E_{\mathrm{data}} + E_{\mathrm{smooth}},$$

where $E_{\mathrm{data}}$ is the matching cost, which acts as a similarity measure, and $E_{\mathrm{smooth}}$ is the smoothness energy, which penalizes disparity differences. In this paper, we use the mutual information of the color monogenic curvature phase images as the matching cost. Based on the approximation above, the pixel-wise data energy $E_{\mathrm{data}}$ can be formulated as

$$E_{\mathrm{data}} = -\sum_p \mathrm{cpmi}(\Phi_{l,s}(p), \Phi_{r,s}(p + d_p)).$$

We use a truncated quadratic function as the smoothness energy, defined as

$$E_{\mathrm{smooth}} = \sum_p \sum_{q \in \mathcal{N}(p)} V_{pq}(d_p, d_q),$$

where $\mathcal{N}(p)$ is the set of neighbouring pixels of $p$, and $V_{pq}$ is given by

$$V_{pq}(d_p, d_q) = \lambda \min(|d_p - d_q|^2, V_{\max}),$$

with $\lambda$ being a weighting parameter. The graph-cuts expansion algorithm of [17] is employed to minimize this energy for dense disparity computation.
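For completeness, the total energy for a candidate disparity map can be evaluated as sketched below, assuming a 4-neighbourhood; the values of lam and v_max are illustrative, and in practice the two terms are handed to the graph-cuts expansion solver of [17] rather than evaluated exhaustively.

```python
# A sketch of E = E_data + E_smooth for a candidate disparity map d,
# with a 4-neighbourhood and assumed values for lam and v_max.
import numpy as np

def total_energy(data_cost, d, lam=0.1, v_max=16.0):
    """data_cost[p] holds -cpmi(...) at the disparities stored in d."""
    e_data = data_cost.sum()
    # truncated quadratic over horizontal and vertical neighbour pairs
    dh = np.minimum((d[:, 1:] - d[:, :-1]) ** 2, v_max)
    dv = np.minimum((d[1:, :] - d[:-1, :]) ** 2, v_max)
    return e_data + lam * (dh.sum() + dv.sum())
```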

4. Experimental results

In this section, we present experimental results to demonstrate the efficiency of the proposed approach. First, we take the two datasets “baby1” and “lampshade1” from the Middlebury stereo benchmark [15, 18] as test images. These datasets contain rich and sparse texture, respectively. All images are captured under three different real lighting sources and three different exposures, and the illumination does not change uniformly across the images. The images are rectified, and radial distortion has been removed. The intensity-based approach of [5] has been shown to achieve state-of-the-art stereo matching performance; we therefore include, for comparison, the disparity results of this approach and of the gray monogenic curvature phase based approach [11].

Figure 3 shows the disparities estimated for “baby1” by the three methods. The top row, from left to right, shows two views of “baby1” with large brightness change, together with the ground-truth disparity. The bottom row shows the results of the intensity-based approach, the gray monogenic curvature phase based approach, and the proposed approach. For two images with large illumination variation, the intensity-based approach fails to generate good results because of its sensitivity to brightness change; the proposed method produces the best results, while the gray monogenic curvature phase based method performs slightly worse because it incorporates no color information and no multiscale implementation. The “lampshade1” dataset contains images with less texture, which makes disparity estimation more difficult than for richly textured images. Figure 4 shows the corresponding results of the three methods. The top row contains two views of “lampshade1” with some brightness change and the ground-truth disparity; the bottom row, from left to right, shows the disparities estimated by the intensity-based, gray monogenic curvature phase based, and proposed approaches. Owing to the low texture, none of the three approaches generates very good disparity maps; nevertheless, our approach still performs best among them.

Fig. 3 Top row from left to right: Two views of “baby1” taken under different lighting and camera exposure conditions and the disparity ground truth. Bottom row from left to right: Estimated disparity maps based on [5], gray monogenic curvature phase [11] and the proposed method.

Fig. 4 Top row from left to right: Two views of “lampshade1” taken under different lighting and camera exposure conditions and the disparity ground truth. Bottom row from left to right: Estimated disparity maps based on [5], gray monogenic curvature phase [11] and the proposed method.

To quantitatively evaluate the performance of our approach, we use different lighting combinations with the same camera exposure as input image pairs and compute errors in unoccluded areas. Figure 5 shows the disparity errors in unoccluded areas for different lighting combinations for “baby1” and “lampshade1”. The horizontal axis represents the combination of lighting conditions; e.g., “1/3” means the left image is taken under lighting condition 1 and the right image under lighting condition 3. In this figure, “Color phase” denotes the proposed method, “PMI” the gray monogenic curvature phase based approach [11], and “Intensity” the intensity-based approach [5]. The larger the difference between lighting conditions, the larger the errors for all approaches; however, our approach performs best. To test the robustness of the proposed approach, we use noise-contaminated image pairs under the same lighting condition and measure the estimation errors. Figure 6 shows the disparity errors in unoccluded areas with respect to the signal-to-noise ratio for “baby1” and “lampshade1”. As the signal-to-noise ratio increases, the errors decrease correspondingly, and our approach still outperforms the others.
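The reported errors follow the usual Middlebury convention of counting bad pixels in unoccluded areas; a sketch is given below, where the 1-pixel bad-match threshold is our assumption rather than a value stated in the paper.

```python
# A sketch of the "errors in unoccluded areas" metric; the 1-pixel
# threshold is an assumed choice.
import numpy as np

def unoccluded_error(d_est, d_gt, occluded, thresh=1.0):
    """Fraction of unoccluded pixels with |disparity error| above thresh."""
    valid = ~occluded                       # boolean mask of unoccluded pixels
    bad = np.abs(d_est - d_gt) > thresh
    return np.count_nonzero(bad & valid) / np.count_nonzero(valid)
```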

Fig. 5 Disparity errors in unoccluded areas with respect to different lighting combinations for “baby1” and “lampshade1”.

Fig. 6 Disparity errors in unoccluded areas with respect to different signal-to-noise ratios for “baby1” and “lampshade1”.

So far, the experimental image pairs have been captured indoors with ground truth available. To further investigate the performance of our approach outdoors, we use two different cameras arranged as a stereo vision system to capture outdoor images with strong lighting changes. Figures 7 and 8 show different views of outdoor images with large illumination variation and the disparity maps estimated by the intensity-based approach [5], the gray monogenic curvature phase based approach [11], and the proposed method. The intensity-based approach cannot generate a good disparity map owing to the large brightness change, the gray monogenic curvature phase performs much better than the intensity-based one, and our approach works best.

Fig. 7 Top row from left to right: Two views of outdoor images with large brightness change. Bottom row from left to right: Estimated disparity maps based on [5], gray monogenic curvature phase [11] and the proposed method.

Fig. 8 Top row from left to right: Two views of outdoor images with large brightness change. Bottom row from left to right: Estimated disparity maps based on [5], gray monogenic curvature phase [11] and the proposed method.

5. Conclusions

This paper addresses the problem of estimating robust disparity maps under large illumination changes and noise contamination. Conventional approaches are based on the brightness constancy assumption and fail to generate accurate disparities in this setting. To achieve robust disparity estimation, we first propose a model, the color monogenic curvature phase, by embedding the monogenic curvature signal into the quaternion representation; this generalizes the monogenic curvature phase to the color domain. We then propose a multiscale framework that estimates disparities by coupling the advantages of mutual information and the color monogenic curvature phase. Both indoor and outdoor images with large illumination changes are used in the experiments. The results demonstrate that our approach outperforms the intensity-based and gray monogenic curvature phase based approaches, and that it achieves good performance even under large brightness variation and noise corruption.

Acknowledgment

This work has been supported by the National Natural Science Foundation of China (61103071, 61105122, 61103072), the Natural Science Foundation of Shanghai, China (11ZR1440200), the Research Fund for the Doctoral Program of Higher Education of China (20110072120065), and the Key Basic Program of the Science and Technology Commission of Shanghai Municipality of China (10DJ1400300).

References and links

1. S. Birchfield and C. Tomasi, “A pixel dissimilarity measure that is insensitive to image sampling,” IEEE Trans. Pattern Anal. Mach. Intell. 20, 401–406 (1998).

2. H. Moravec, “Toward automatic visual obstacle avoidance,” in Proceedings of the 5th International Joint Conference on Artificial Intelligence (Morgan Kaufmann, 1977), pp. 584–590.

3. C. Fookes, M. Bennamoun, and A. Lamanna, “Improved stereo image matching using mutual information and hierarchical prior probabilities,” in Proceedings of the 16th International Conference on Pattern Recognition (IEEE, 2002), pp. 937–940.

4. I. Sarkar and M. Bansal, “A wavelet-based multiresolution approach to solve the stereo correspondence problem using mutual information,” IEEE Trans. Syst. Man Cybern. B: Cybern. 37, 1009–1014 (2007).

5. A. Geiger, M. Roser, and R. Urtasun, “Efficient large-scale stereo matching,” in Proceedings of the 10th Asian Conference on Computer Vision, Part I (Springer-Verlag, 2011), pp. 25–38.

6. A. V. Oppenheim, “The importance of phase in signals,” Proc. IEEE 69, 529–541 (1981).

7. M. Felsberg and G. Sommer, “The monogenic signal,” IEEE Trans. Signal Process. 49, 3136–3144 (2001).

8. M. Felsberg and G. Sommer, “The monogenic scale-space: a unifying approach to phase-based image processing in scale-space,” J. Math. Imaging Vision 21, 5–26 (2004).

9. G. Demarcq, L. Mascarilla, M. Berthier, and P. Courtellemont, “The color monogenic signal: application to color edge detection and color optical flow,” J. Math. Imaging Vision 40, 269–284 (2011).

10. D. Zang and G. Sommer, “Signal modeling for two-dimensional image structures,” J. Vis. Commun. Image Represent. 18, 81–99 (2007).

11. D. Zang, J. Li, and D. Zhang, “Robust visual correspondence computation using monogenic curvature phase based mutual information,” Opt. Lett. 37, 10–12 (2012).

12. F. Brackx, B. D. Knock, and H. D. Schepper, “Generalized multidimensional Hilbert transforms in Clifford analysis,” Int. J. Math. Math. Sci. 2006, 98145 (2006).

13. S. J. Sangwine, “Fourier transforms of color images using quaternion or hypercomplex numbers,” Electron. Lett. 32, 1979–1980 (1996).

14. N. L. Bihan and S. J. Sangwine, “Quaternion principal component analysis of color images,” in Proceedings of the IEEE International Conference on Image Processing (IEEE, 2003), pp. 809–812.

15. http://vision.middlebury.edu/stereo/.

16. J. Kim, V. Kolmogorov, and R. Zabih, “Visual correspondence using energy minimization and mutual information,” in Proceedings of the IEEE International Conference on Computer Vision (IEEE, 2003), pp. 1033–1040.

17. Y. Boykov, O. Veksler, and R. Zabih, “Fast approximate energy minimization via graph cuts,” IEEE Trans. Pattern Anal. Mach. Intell. 23, 1222–1239 (2001).

18. D. Scharstein and C. Pal, “Learning conditional random fields for stereo,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (IEEE, 2007), pp. 1–8.
