
Robust registration for ultra-field infrared and visible binocular images

Open Access

Abstract

Ultra-field infrared and visible image registration is a challenging task because of nonlinear imaging and multi-modal image features. In this paper, a robust registration method is proposed for ultra-field infrared and visible images. First, control points are extracted using phase congruency and optimized with a guidance map, which is constructed from significant structure information. Second, ROI pair matching is accomplished based on the epipolar curve. Its effect is equivalent to the search window commonly used in methods for the standard field of view, and it overcomes the content differences within the search window caused by nonlinear imaging and vision disparity. Third, a descriptor named the multiple phase congruency directional pattern (MPCDP) is established, composed of distribution information and a main direction. The phase congruency amplitudes are encoded as binary patterns and then represented as weighted histograms for the distribution information. Six pairs of ultra-field infrared and visible images are employed in registration experiments, and the results demonstrate that the proposed method is robust and accurate across five types of ultra-field scenes and two different camera relationships.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

With the rapid development of sensor technology and imaging techniques, more and more complementary information is provided by multispectral imaging and ultra-field imaging, which are popular in remote sensing [1], military reconnaissance [2], computer vision [3,4], and so on. However, owing to different viewpoints, sensors and lenses, multi-modal images are inevitably misaligned, so they cannot be used directly for image fusion, change detection and image mosaicking [5]. Image registration, as a fundamental step, has been studied for image alignment; many studies focus on multiple spectral bands, but few of them are related to the ultra-field imaging technique. Among them, infrared-visible registration is a research hotspot, yet it remains a difficult task because the infrared and visible images exhibit different gray-scale characteristics, such as nonlinear intensity variations and local gray inversion [6].

Currently, the main multi-modal registration methods can be categorized into area based and feature based methods [7–9]. Area based methods accomplish registration by calculating the image similarity between the input images. Commonly used area based similarity functions include mutual information (MI) [10], cross-correlation (CC) [11], normalized cross-correlation (NCC) [12], and so on. This type of method is easy to automate and works well when the input images are highly similar. However, misregistration frequently occurs when the input images differ, especially in multi-modal image registration. In contrast to area based methods, feature based methods do not work directly on the original images. They extract salient image features and match them based on their similarity. In the feature extraction procedure, the image features usually include point features, edge features, region features, and so on. Among these, point features are the most popular, including the scale-invariant feature transform (SIFT) [13], the Harris algorithm, binary robust independent elementary features (BRIEF) [14], features from accelerated segment test (FAST) [15], and so on. In addition, Zhao proposed a line feature based registration method for multi-modal images, in which the descriptor is named the multi-modality robust line segment descriptor (MRLSD) [16]. Song proposed a retrofitted SIFT algorithm and Lissajous-curve trajectory based features for registration [17]. Lv combined line and point features to realize automatic registration of airborne LiDAR point cloud data and optical imagery depth maps [18]. There are also many region feature based descriptors, such as the local binary pattern (LBP) [19], local self-similarity (LSS) [20], and the histogram of oriented gradients (HOG) [21]. Among them, the region centers are usually treated as control points (CP). Teng proposed an improved local feature descriptor that can improve the performance of all SIFT-based applications [22]. Tang proposed an adaptable local-global feature for rail registration in the infrared and visible bands [23]. One current trend in feature based registration is the structure based method, and phase congruency is one of the effective theories for multi-modal image registration [24]. Liu employed structure features by constructing a maximally stable phase congruency descriptor for registration [25]. Many other studies have also shown that the feature based method is better suited to various situations, including illumination changes, gray intensity changes, geometric distortion, and so on.

From the above analysis, the feature based method may satisfy the requirements of infrared and visible image registration. However, on the one hand, many CP detection methods depend on the gradient, which varies nonlinearly between infrared and visible images. Moreover, the detected CP should be not only sufficient in quantity, but also have a high correspondence rate for the transformation model. On the other hand, most registration methods focus on image registration with a standard field of view (FOV). In contrast, the literature on ultra-field image registration is sparse, especially for ultra-field infrared and visible images. The reason is that conventional methods cannot be applied directly to ultra-field images without considering the nonlinear imaging. In detail, three difficulties existing in ultra-field infrared and visible images are analyzed as follows.

  • (1) The distortion grows from the center to the border of the ultra-field image, i.e., straight structure lines become curves. Hence, corner extraction may be insufficient when using corner and edge based algorithms, such as the Harris and shape context (SC) algorithms. Moreover, the intensity variation between infrared and visible images makes the extraction even more difficult.
  • (2) The FOV of the ultra-field image is larger than that of commonly used images, so its scene cannot always be filled with salient structures, and the sky background is inevitably captured. Thus, uniform extraction over the whole image is unreasonable; in other words, only the image parts with obvious structures can be used for CP extraction. Moreover, owing to the different bands, some structures may not remain consistently salient in both the infrared and visible images.
  • (3) Owing to nonlinear imaging and vision disparity, the image contents differ between the two adjacent windows of the same CP in the ultra-field infrared and visible images. Therefore, template matching, adopted in most current methods, is inapplicable for identifying correspondences in ultra-field images.

To overcome the above difficulties, we propose a novel registration method for ultra-field infrared-visible images, and our contributions mainly consist of three parts. First, phase congruency is utilized to extract enough CP in both the ultra-field infrared and visible images. However, because the visible image has richer detail, many obvious isolated points in the visible image have no corresponding points in the infrared image. Therefore, nonlinear diffusion filtering is adopted to preserve significant structures, and their gradient maps are then calculated to guide the quantity optimization of the CP. Thus, the CP extraction is robust in both infrared and visible images. Second, the ultra-field infrared and visible binocular cameras are calibrated in advance, with an RBF network introduced to optimize the calibration results. On the basis of the calibration results, the epipolar curve is proposed as the equivalent of a search window for ROI matching. Thus, the interference caused by different imaging mechanisms and vision disparity can be greatly reduced. Third, a significant structure based descriptor, named multiple phase congruency directional patterns (MPCDP), is proposed for identifying correct CP pairs. Specifically, the phase congruency amplitudes are quantized to binary patterns at multiple orientations. The distribution information of the MPCDP is represented by spatial histograms of the binary patterns, weighted by the nonzero mean amplitude at every orientation. Similarly, the main direction of the MPCDP is calculated by combining the weighted multiple orientations. The experiments are carried out on six pairs of ultra-field infrared and visible images, covering five different scenes and two different camera relationships. The results demonstrate that the proposed method achieves robust and accurate registration of ultra-field infrared and visible images.

The remainder of the paper is organized as follows. Section 2 presents the methodology of the proposed method, which mainly consists of CP extraction, ROI matching and MPCDP descriptor construction. The experimental results and analysis are presented in Section 3. Finally, the conclusions are given in Section 4.

2. Methodology

2.1 Robust ultra-field infrared and visible CP extraction

The significance and quantity of the detected candidate CP have a large influence on the performance and accuracy of image registration [26]. In multi-modal image registration, the issue of significance becomes difficult, but it is important. The first step of nonrigid registration is to select a set of candidate CP in the ultra-field infrared and visible images. Considering the different imaging mechanisms, structure based methods are more robust than other feature based methods for infrared and visible images. We therefore take the phase congruency theory into account: it postulates that perceptually significant features are situated at image locations where the Fourier components are maximally in phase [26]. Compared with commonly used gradient based methods, phase congruency is more robust against gray-scale intensity variation, especially in multi-modal images, as shown in Fig. 1. Additionally, the logarithmic Gabor filter is utilized to extend the phase congruency to multiple scales and orientations, which is defined as follows.

$$P({x,y} )= \frac{{\sum\limits_n {W({x,y} )\lfloor{{A_n}({x,y} )\Delta \Phi ({x,y} )- T} \rfloor } }}{{\sum\limits_n {{A_n}({x,y} )+ \varepsilon } }}$$
$$\Delta \Phi ({x,y} )= \cos ({{\phi_n}({x,y} )- \bar{\phi }({x,y} )} )- |{\sin ({{\phi_n}({x,y} )- \bar{\phi }({x,y} )} )} |$$
where $({x,y} )$ represents the coordinate of the point in the image, n is the scale of the filter, $W({x,y} )$ is the weighting factor based on the frequency spread, ${A_n}({x,y} )$ is the amplitude, ${\phi _n}({x,y} )$ is the phase at scale n, $\bar{\phi }({x,y} )$ is the weighted mean phase, T represents the noise threshold and $\varepsilon$ is a small constant to avoid division by zero. $\lfloor{} \rfloor$ denotes that the enclosed quantity equals itself when its value is positive, and zero otherwise.
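
As an illustrative sketch only (the authors' implementation, written in MATLAB, is not reproduced here), Eqs. (1)-(2) can be evaluated for one orientation as follows, assuming the per-scale log-Gabor amplitudes and phases have already been computed by a filter bank. The frequency-spread weight W(x, y) is taken as 1 for brevity, and the function name and default parameter values are hypothetical.

import numpy as np

def phase_congruency(amplitudes, phases, T=0.1, eps=1e-4):
    """Minimal sketch of Eqs. (1)-(2) for one orientation.

    amplitudes, phases: arrays of shape (n_scales, H, W) from a
    log-Gabor filter bank (filter bank construction not shown).
    T is the noise threshold, eps avoids division by zero.
    The frequency-spread weight W(x, y) is set to 1 here.
    """
    A = np.asarray(amplitudes, dtype=float)
    phi = np.asarray(phases, dtype=float)

    # Amplitude-weighted mean phase over scales.
    phi_bar = np.angle((A * np.exp(1j * phi)).sum(axis=0))

    dphi = np.cos(phi - phi_bar) - np.abs(np.sin(phi - phi_bar))   # Eq. (2)
    energy = np.maximum(A * dphi - T, 0.0).sum(axis=0)             # floor operator
    return energy / (A.sum(axis=0) + eps)                          # Eq. (1)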

Fig. 1. Comparison of the phase congruency and gradient in infrared and visible images

Another issue to be considered is the quantity of candidate CP; that is, only a sufficient number of CP can guarantee the accuracy of the transformation model. From the viewpoint of CP detection in a single image, significant points can be detected effectively, and they embody the structure information in the ultra-field infrared and visible images well. However, many CP gather in the interior structures of the scene in the visible image, such as trees and grassland. These details are not obvious, or even disappear, in the infrared image. Therefore, from the viewpoint of detecting CP pairs, we select more robust structures for CP matching, such as the edges of buildings, the sky-ground line and others, which are distinctly distinguishable in both the infrared and visible images. To preserve enough candidate CP pairs, we introduce nonlinear diffusion to filter the original image and utilize the gradient map of the filtered image to guide the CP selection.

Nonlinear diffusion filtering has been proven to preserve edges adaptively according to the image structures themselves. Motivated by this characteristic, we perform nonlinear diffusion filtering on the visible image to preserve robust structures and suppress relatively weak details and noise, as shown in Fig. 2(a). Thus, the structural consistency of the infrared and visible images can be improved.

$$\frac{{\partial I({x,y} )}}{{\partial t}} = div({c({x,y,t} )\cdot \nabla I} )$$
$$c({x,y,t} )= g({|{\nabla I({x,y,t} )} |} )$$
$$g = \left\{ {\begin{array}{cc} 1&{{{|{\nabla I} |}^2} \le 0}\\ {1 - \exp \left( { - \frac{{3.315}}{{{{({|{\nabla I} |/k} )}^3}}}} \right)}&{{{|{\nabla I} |}^2} > 0} \end{array}} \right.$$
where t is the scale parameter (increasing t leads to a simpler image representation), $div$ is the divergence operator, $\nabla$ represents the gradient operator, $c({x,y,t} )$ is the diffusion coefficient, and k is the diffusion contrast factor.
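
A minimal explicit-scheme sketch of the diffusion in Eqs. (3)-(5) is given below; the parameter values (k, step size, iteration count) are illustrative assumptions and not the ones used in the paper.

import numpy as np

def conductivity(grad_mag, k):
    """Eq. (5): edge-stopping function; equals 1 where the gradient vanishes."""
    g = np.ones_like(grad_mag)
    nz = grad_mag > 0
    g[nz] = 1.0 - np.exp(-3.315 / (grad_mag[nz] / k) ** 3)
    return g

def nonlinear_diffusion(img, k=0.02, step=0.15, n_iter=20):
    """Explicit finite-difference sketch of Eqs. (3)-(4); k, step and
    n_iter are illustrative, not the values used in the paper."""
    I = img.astype(float).copy()
    for _ in range(n_iter):
        # Differences towards the four neighbours.
        dN = np.roll(I, -1, axis=0) - I
        dS = np.roll(I,  1, axis=0) - I
        dE = np.roll(I, -1, axis=1) - I
        dW = np.roll(I,  1, axis=1) - I
        I += step * (conductivity(np.abs(dN), k) * dN +
                     conductivity(np.abs(dS), k) * dS +
                     conductivity(np.abs(dE), k) * dE +
                     conductivity(np.abs(dW), k) * dW)
    return I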

Fig. 2. Principle of the proposed CP extraction

The gradient map is calculated by applying the $[{ - 1,0,1} ]$ and ${[{ - 1,0,1} ]^T}$ operators to the filtered image. Then, the gradient map is updated by setting values below the threshold Th to zero. Finally, the candidate CP in the visible image are optimized by removing the points of low significance, i.e., those where the gradient is zero. As shown in Fig. 2(b), the edges and structures with high significance are extracted in the guidance map, and most of the CP in the visible image lie on strong edges or structures, such as buildings and vehicles. In contrast, the isolated CP in grassland and trees are removed effectively, as can be observed in Figs. 2(c)-(d). The effective candidate CP pairs thus occupy a larger proportion of all detected points, which leads to higher matching efficiency.
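
The guidance-map pruning described above can be sketched as follows. How the horizontal and vertical responses are combined is not specified in the text, so the gradient magnitude is assumed here, and the function name and threshold are illustrative.

import numpy as np
from scipy.ndimage import convolve

def prune_cp(cp_list, filtered_img, th):
    """Sketch of the guidance-map based CP optimization.

    cp_list: iterable of (row, col) candidate control points in the
    visible image; filtered_img: nonlinear-diffusion filtered image;
    th: gradient threshold Th.  Names and threshold are illustrative.
    """
    img = filtered_img.astype(float)
    gx = convolve(img, np.array([[-1.0, 0.0, 1.0]]))    # [-1, 0, 1]
    gy = convolve(img, np.array([[-1.0], [0.0], [1.0]]))  # [-1, 0, 1]^T
    guidance = np.hypot(gx, gy)          # assumed combination: magnitude
    guidance[guidance < th] = 0.0        # suppress weak structures
    return [(r, c) for (r, c) in cp_list if guidance[r, c] > 0]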

2.2 ROI pairs matching based on epipolar curve

Instead of relying on georeferencing information or a search window over the whole image, we propose to extract the ROI pairs using the binocular calibration results, which reflect the spatial relationship and intrinsic imaging law of the ultra-field infrared and visible cameras. Specifically, for an image pair, the homologous point of each pixel in one image is located on a curve in the other image, named the epipolar curve [27,28]; that is, the corresponding point in the slave image always lies near this curve. In other words, the CP near the epipolar curve can be treated as the ROI, and the correct CP will be selected from these ROI. The main steps for extracting ROI pairs are described as follows.

First, the ultra-field visible and infrared cameras are calibrated using a calibration model that optimizes the Bouguet binocular calibration model [29]. The calibration process mainly consists of four steps: (1) the corners are located at the subpixel level in the infrared and visible bands synchronously using a purpose-designed thermal radiation checkerboard; (2) single camera calibration is performed in the infrared and visible bands, respectively; (3) the calibrated parameters of the binocular cameras are combined to calculate the sum of the reprojection errors; (4) the binocular camera parameters are refined by utilizing the RBF network to fit the mapping relationship from the camera coordinate system to the sensor coordinate system. The imaging model of the ultra-field infrared and visible binocular cameras is illustrated in Fig. 3. The designed checkerboard and calibration results are shown in Fig. 4, and the positional relationship of the binocular cameras and the checkerboard is shown in Fig. 4(c). The designed checkerboard is composed of black and white aluminium alloy squares with Peltier elements installed behind them; thus, they can be heated and cooled, and robust corners are formed along their edges. Note that the working time is restricted to within 30 min to guarantee accuracy.

Fig. 3. Ultra-field infrared and visible binocular model

Fig. 4. Ultra-field infrared and visible binocular calibration results

Second, the ultra-field infrared image is taken as the master image because its CP are fewer but more stable, and the ultra-field visible image is taken as the slave image. According to the model in Fig. 3, a CP in the master image is inversely mapped into world space as a ray using the initial parameters of the ultra-field infrared camera.

$${p_l}\textrm{ = }\left\{ {\begin{array}{c} {{\rho_l} = \sqrt {x_l^2 + y_l^2} }\\ {{\omega_l} = \arctan \frac{{{y_l}}}{{{x_l}}}} \end{array}} \right.,{P_l}\textrm{ = }\left\{ {\begin{array}{c} R\\ {{\varphi_l} = proj_l^{ - 1}({{\rho_l},{\omega_l}} )}\\ {{\theta_l}} \end{array}} \right.,{P_{l - ray}}\textrm{ = }{\lambda _l}\left\{ {\begin{array}{c} {R\sin {\varphi_l}\sin {\theta_l}}\\ {R\sin {\varphi_l}\cos {\theta_l}}\\ {R\cos {\varphi_l}} \end{array}} \right.$$
where ${p_l}$ is a CP in the ultra-field infrared image, ${P_l}$ is the corresponding coordinate on the lens hemisphere, ${P_{l - ray}}$ is the ray in camera space, $proj_l^{ - 1}$ represents the inverse mapping rule of the infrared system, and ${\lambda _l}$ is the scale factor.

The point coordinates on the ray are transformed into the ultra-field visible camera coordinate system using the extrinsic relationship, and they are then projected into the ultra-field visible image using its initial parameters.

$${P_{r - ray}}\textrm{ = }{R_{rotation}} \cdot {\left[ {{\lambda_l}\left\{ {\begin{array}{c} {R\sin {\varphi_l}\sin {\theta_l}}\\ {R\sin {\varphi_l}\cos {\theta_l}}\\ {R\cos {\varphi_l}} \end{array}} \right.} \right]^T} + {T_{translation}},{p_{epipolar}} = pro{j_r}({{P_{r - ray}}} )$$
where ${R_{rotation}}$ and ${T_{translation}}$ are the rotation and translation from the calibration results, respectively, and $pro{j_r}$ is the mapping rule of the visible camera.
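
A schematic sketch of Eqs. (6)-(7) is given below. The calibrated forward and inverse mapping rules (refined by the RBF network) are passed in as callables and are not reproduced here; sampling the scale factor along the back-projected ray traces the epipolar curve in the visible image. All names are illustrative.

import numpy as np

def epipolar_curve(p_ir, proj_inv_ir, proj_vis, R, T, lambdas):
    """Sketch of Eqs. (6)-(7): trace the epipolar curve of an infrared
    CP in the visible image.

    proj_inv_ir maps an infrared pixel to a unit viewing direction and
    proj_vis maps a 3-D point in the visible camera frame to a pixel;
    both stand in for the calibrated (RBF-refined) mapping rules.
    R, T are the calibrated rotation and translation; lambdas samples
    the scale factor along the ray.
    """
    ray_dir = proj_inv_ir(p_ir)            # viewing direction in the IR camera frame
    curve = []
    for lam in lambdas:
        P_ir = lam * ray_dir                # point on the back-projected ray
        P_vis = R @ P_ir + T                # transform to the visible camera frame
        curve.append(proj_vis(P_vis))       # project into the visible image
    return np.array(curve)                  # sampled epipolar curve (pixel coordinates)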

As shown in Fig. 5, the yellow curve is the epipolar curve of a CP in the infrared image. The corresponding CP in the visible image lies near the epipolar curve, with a small deviation resulting from the calibration accuracy.

Fig. 5. Epipolar curve based on calibration results

Third, the Euclidean distances between the CP and the epipolar curve are calculated in the ultra-field visible image. The distance threshold is set to 3 pixels to compensate for the calibration deviation.
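
Given the sampled epipolar curve, the ROI selection of this step reduces to a nearest-distance test, sketched below with the 3-pixel threshold mentioned above; variable names are illustrative.

import numpy as np

def select_roi(cp_vis, curve, dist_th=3.0):
    """Keep visible-image CPs within dist_th pixels of the sampled
    epipolar curve.  cp_vis: (N, 2) candidate CP coordinates;
    curve: (M, 2) sampled curve points."""
    cp_vis = np.asarray(cp_vis, dtype=float)
    curve = np.asarray(curve, dtype=float)
    # Distance from every CP to its nearest sampled curve point.
    d = np.linalg.norm(cp_vis[:, None, :] - curve[None, :, :], axis=2).min(axis=1)
    return cp_vis[d <= dist_th]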

In summary, the ROI candidates are extracted using the epipolar curve of the ultra-field infrared and visible cameras. As a result, the proposed method can adapt to various scenes and reduce the number of search iterations. Thus, ROI pair matching can be accomplished with high efficiency for further identification and registration.

2.3 Multiple phase congruency directional patterns descriptor

The main contribution of this subsection is a novel local structure based descriptor, named MPCDP, for identifying the real CP pairs among the ROI candidate pairs. The proposed MPCDP is motivated by the GDP descriptor [30]. The GDP overcomes the sensitivity of the local binary pattern (LBP) to noise and illumination variation and obtains good results in facial expression recognition. The MPCDP descriptor is constructed in an adjacent window by quantizing its phase congruency amplitudes at different orientations. The adjacent window is defined according to the calibration results in the ultra-field visible image. The steps for establishing the MPCDP descriptor are as follows.

First, the MPCDP operator selects a 3×3 neighborhood around each point of the phase congruency map at each orientation. The phase congruency amplitudes are quantized with respect to the center point using a threshold t. Here, a neighboring point is marked as 1 if the absolute difference between its amplitude and that of the center point is less than t, and marked as 0 otherwise, as shown in Fig. 6.
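
The quantization step can be sketched as follows for a single phase congruency map; the neighbour ordering is an assumption for illustration, since only the thresholding rule is specified in the text.

import numpy as np

def mpcdp_code(pc_map, r, c, t):
    """Encode the 3x3 neighbourhood of (r, c) in one phase-congruency
    map as an 8-bit pattern.  A neighbour is set to 1 when its amplitude
    differs from the centre by less than t (t is a free parameter).
    Border pixels are assumed to be excluded by the caller."""
    centre = pc_map[r, c]
    # Assumed clockwise neighbour order starting at the top-left pixel.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    code = 0
    for dr, dc in offsets:
        bit = int(abs(pc_map[r + dr, c + dc] - centre) < t)
        code = (code << 1) | bit
    return code   # integer in [0, 255]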

Fig. 6. The quantizing process of phase congruency

Second, the pattern of the center point is expressed as a binary code (10101010 in the example of Fig. 6), and the encoded map is obtained by applying the MPCDP operator. Then, the distribution information of the MPCDP is represented as a spatial histogram describing the characteristics of the adjacent window.

$${H_{MPCDP}}(i )= \sum\limits_{x = 1}^M {\sum\limits_{y = 1}^N {f({MPCDP({x,y} ),i} )} } ,\textrm{ }f({a,b} )= \left\{ {\begin{array}{cc} 1&{a = b}\\ 0&{otherwise} \end{array}} \right.$$

Third, the histograms of the different orientations are combined and weighted by the nonzero mean amplitudes of their phase congruency maps. Thus, the descriptor histogram is defined as follows:

$$MPCDP = \{{{\omega_1} \cdot PCD{P_1},{\omega_2} \cdot PCD{P_2}, \ldots {\omega_n} \cdot PCD{P_{nth - direction}}} \}$$
where ${\omega _1},{\omega _2}, \ldots {\omega _n}$ are L2-normalized weights of the nonzero mean values at different orientations.
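
A sketch of Eqs. (8)-(9) for one adjacent window is given below, assuming the per-orientation encoded maps and phase congruency maps are already available; the main-direction computation is omitted, and names are illustrative.

import numpy as np

def mpcdp_descriptor(code_maps, pc_maps):
    """Build the MPCDP histogram part for one window.

    code_maps: list of encoded windows (one per orientation, integer
    codes in [0, 255]); pc_maps: matching phase-congruency windows.
    Weights are the L2-normalized nonzero mean amplitudes."""
    hists, weights = [], []
    for codes, pc in zip(code_maps, pc_maps):
        h, _ = np.histogram(codes.ravel(), bins=256, range=(0, 256))  # Eq. (8)
        hists.append(h.astype(float))
        nz = pc[pc > 0]
        weights.append(nz.mean() if nz.size else 0.0)
    w = np.asarray(weights)
    w = w / (np.linalg.norm(w) + 1e-12)                                # L2 normalization
    return np.concatenate([wi * hi for wi, hi in zip(w, hists)])       # Eq. (9)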

Finally, the main direction of the descriptor is defined by introducing ${\omega _1},{\omega _2}, \ldots {\omega _n}$ into the orientation angles. Thus, the weighted histogram and the main direction together make up the MPCDP descriptor.

In summary, the MPCDP is built by utilizing both the amplitude and the orientation of the phase congruency. The amplitude information is converted to a binary code; therefore, the influence of contrast changes is eliminated and the MPCDP is more adaptable. After obtaining the MPCDP descriptors of the candidate CP and the ROI, their similarity is computed using the normalized correlation coefficient (NCC). If their NCC satisfies the similarity threshold, the CP pair is accepted as a real correspondence; otherwise, it is rejected.
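
The similarity check can be sketched as a standard zero-mean NCC between two descriptor vectors; the actual similarity threshold is not specified in the text, so the value used below is purely illustrative.

import numpy as np

def ncc(d1, d2):
    """Normalized correlation coefficient between two MPCDP descriptors."""
    d1 = d1 - d1.mean()
    d2 = d2 - d2.mean()
    return float((d1 * d2).sum() / (np.linalg.norm(d1) * np.linalg.norm(d2) + 1e-12))

# A CP pair is accepted when ncc(desc_ir, desc_vis) exceeds the similarity
# threshold (e.g., 0.8 here as a placeholder; the paper does not give the value).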

In summary, the flowchart of the proposed registration method is shown in Fig. 7. It mainly consists of the following six steps:

  • (1) The ultra-field infrared image is taken as the master image, since its significant structures are not as abundant as those of the ultra-field visible image.
  • (2) The CP in the ultra-field visible and infrared images are initially extracted by adopting the maximum moment of the phase congruency covariance. CP with weak significance are removed from the ultra-field visible image using the proposed guidance map.
  • (3) Taking the CP candidates in the ultra-field infrared image as reference points, the ROI are selected in the visible image based on the epipolar curve model. The adjacent window of each ROI in the visible image is defined according to the ultra-field binocular infrared-visible calibration results.
  • (4) The MPCDP descriptors of the CP and ROI are constructed based on their respective phase congruency maps. Each MPCDP descriptor is composed of a weighted histogram and a main direction.
  • (5) The similarity of two MPCDP descriptors is computed by employing the NCC. Matched pairs are kept as CP candidate pairs if their NCC satisfies the condition.
  • (6) Mismatched pairs are eliminated by using a global consistency check based on a global transformation [31].

Fig. 7. Flow of the proposed registration method

3. Experiment results and analysis

In this section, experiments are carried out on six ultra-field infrared and visible image pairs captured with our system. In the first four image pairs, the proportion of the scene within the whole field is varied to verify the flexibility of the proposed method under different conditions. To validate the universality of the proposed method, the binocular camera relationship is changed for the latter two image pairs. To evaluate the adaptability of the proposed method to illumination changes, the first and fourth image pairs were taken on a sunny morning, the second and third image pairs on a sunny afternoon, and the fifth and sixth image pairs on a lightly rainy day. The main phase congruency parameters are set as nscale = 4, norientation = 6, $\sigma = 0.55$ and ${\lambda _{\min }} = 3$ in the scale filter. The diffusivity speed of the nonlinear diffusion in the infrared image is set to 3. All experiments were implemented in MATLAB 2014b on a notebook PC with a 2.2 GHz Intel Core CPU and 4 GB memory.

3.1 Performance analysis of robust CP extraction

The performance of the proposed CP extraction method is evaluated by calculating the repeatability [32]. To further verify its superiority, the proposed method is compared with a phase congruency based method, the Harris corner detection method, and a variant of the proposed method in which the nonlinear diffusion filter is replaced by a Gaussian filter (denoted the Gaussian-replaced variant below). Following the definition of repeatability, the average repeatability under image rotation, scale change, illumination variation and viewpoint change is calculated over the six image pairs.
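
For reference, a sketch of the repeatability score in the sense of [32] is given below: the fraction of points that reappear within a small tolerance after the image change, normalized by the smaller detection count. The tolerance value and the assumption that both point sets are already mapped into a common frame are ours, not taken from the paper.

import numpy as np

def repeatability(pts_ref, pts_warp, eps=1.5):
    """pts_ref / pts_warp: (N, 2) detected points already mapped into a
    common frame; eps: matching tolerance in pixels (illustrative)."""
    pts_ref = np.asarray(pts_ref, dtype=float)
    pts_warp = np.asarray(pts_warp, dtype=float)
    d = np.linalg.norm(pts_ref[:, None, :] - pts_warp[None, :, :], axis=2)
    repeated = (d.min(axis=1) <= eps).sum()
    return repeated / min(len(pts_ref), len(pts_warp))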

The proposed method ranks roughly second among all tested methods, as shown in Fig. 8. However, in our view, relying on the repeatability results alone is not comprehensive, because the number of CP varies as the image morphology changes. For example, although the phase congruency based method always ranks first, image rotation causes an explosion in the number of CP, as shown in Fig. 9. The reason is that the Fourier components of the image change under rotation, so that more points satisfy the original threshold; in other words, there are too many invalid CP to guarantee robust detection. The Gaussian-replaced variant preserves part of the CP under image morphology changes without causing a CP explosion, but its performance is not as good as that of the proposed method, because nonlinear diffusion filtering extracts and preserves the significant structures and details better, and the guidance map based on nonlinear diffusion filtering is more distinguishable and robust than that based on the Gaussian filter. The performance of the Harris based method is inferior to that of the proposed method except for the fourth image pair. However, the Harris based method extracts only 24 CP in the fourth image pair; such a small number of CP results in poor registration accuracy. We can therefore conclude that the proposed method is more robust for extracting CP for ultra-field infrared and visible image registration, as shown in Fig. 10.

Fig. 8. Average repeatability of all the detection methods

Fig. 9. CP extraction results for rotation change

Fig. 10. Robust ultra-field infrared and visible CP extraction results

3.2 Performance analysis of ROI pairs matching

The ultra-field infrared and visible binocular cameras are calibrated to establish the correspondence between a CP in the infrared image and its epipolar curve in the visible image. The epipolar curve plays the role of the search window used in most current methods, so the calibration accuracy is important. Two groups of calibration results are listed in Table 1.

Table 1. Calibration results of infrared and visible ultra-field cameras

From the results above, it can be seen that the intrinsic parameters of the two cameras differ, and the difference in camera positions lies mainly along the X and Z directions, which causes the vision disparity. Thus, the imaging positions and characteristics of the same object differ between the two cameras owing to vision disparity, different imaging laws and wavebands. As shown in Fig. 11, the reprojection errors of the original calibration method are 0.48 pixel and 0.43 pixel, whereas those of the improved calibration method are 0.28 pixel and 0.26 pixel. It can be concluded that the improved method increases the calibration accuracy, which helps locate the epipolar curve more accurately in the binocular camera system.

Fig. 11. Reprojection errors comparison in ultra-field infrared and visible cameras

To further evaluate the performance of the proposed method for ROI pair matching, its error is defined as the Euclidean distance between the real CP coordinates and the calculated CP coordinates. The calculated CP coordinates are obtained by applying transformation models in the slave image. The transformation models selected for comparison are affine, projective, quadratic polynomial and cubic polynomial. In our method, the Euclidean distance is defined as the closest distance from the real CP coordinate to the epipolar curve. To make the comparison reliable, we establish a large set of CP pairs distributed on the edges or corners of the salient structures. Moreover, the training CP pairs are sampled uniformly from the CP pair set to prevent the transformation model from falling into local optima, and the test CP pairs are selected randomly to test the matching errors of the ROI pairs. In our experiments, the proportions of training CP and test CP within the CP set vary from 10% to 100%. The numbers of CP pairs selected for the experiment are 1269, 706, 1363, 1095, 915 and 928 in image pairs No. 1 to No. 6, respectively.

Low average matching errors imply that it is easier to locate the correct ROI with high accuracy. As shown in Fig. 12, the average matching errors of the proposed method are always the lowest; they are at least 50% lower than those of the comparison methods. In particular, large errors appear in the cubic polynomial result for the second image pair, because the scene occupies only one-fifth of the FOV in that pair, so there are not enough training CP to fit the model over the whole field. Moreover, the proposed method is more efficient, because the comparison methods still need to increase the number of training CP to optimize the transformation model. The standard deviations of the matching errors are shown in Fig. 13. Similarly, the proposed method obtains lower values than the comparison methods, which indicates that it is more robust for various application conditions and different camera relationships. A higher standard deviation means that the distance between the ROI and the correct CP is sometimes much larger than the average error. Thus, to keep the registration effective, the comparison methods need not only to increase the number of training CP but also to expand the search window size, which undoubtedly increases the computational cost. Therefore, it can be concluded that the proposed method is more efficient and robust in ROI pair matching.

Fig. 12. Average matching errors of image pairs No.1-6

Fig. 13. Standard deviations of matching errors of image pairs No.1-6

3.3 Performance analysis of MPCDP descriptor

To evaluate the performance of the MPCDP descriptor, comparison experiments are conducted in this section between the proposed MPCDP descriptor and four state-of-the-art descriptors: the gradient directional pattern (GDP) [30], the histogram of orientated phase congruency (HOPC) [5], LBP [19] and the local directional texture pattern (LDTP) [33]. They represent gradient based, phase congruency based, binary based and direction based features, respectively. All five descriptors, including MPCDP, are applied to correct CP pair identification tests. Note that the ROI pairs in all tests are those obtained in the previous section. The parameters of the comparison descriptors are set to the defaults in their respective papers, except that the window size is set to 20 pixels for all descriptors. Here we adopt the correct matching ratio (CMR) as the index, defined as follows.

$$CMR = \frac{{Nu{m_{correct}}}}{{Nu{m_{all}}}}$$
where $Nu{m_{correct}}$ is the number of correctly matched CP pairs, and $Nu{m_{all}}$ is the number of all candidate CP pairs.

As shown in Fig. 14, the CMR of all descriptors is calculated on the six ultra-field infrared and visible image pairs. The proposed MPCDP ranks first in all CMR tests. The CMR of LDTP is approximately equal to, but slightly lower than, ours. The CMR of the HOPC descriptor remains around 0.7; although HOPC only ranks third, its results indicate that it is stable under various conditions. Compared with the top three descriptors, the performance of the GDP and LBP descriptors is relatively unstable; in particular, their CMR falls below 50% on the second image pair.

Fig. 14. Five descriptors CMR of image pairs No.1-6

From the results above, we can conclude that phase congruency and directional characteristics are better suited to multi-modal feature matching, because the top three descriptors all rely on both, or at least one, of them. Note that the slightly worse result of HOPC arises because there are not enough salient structure features in the local window. This conclusion is therefore credible and accords exactly with the analysis in Section 2.1. In contrast, it is hard to obtain good results relying only on gradient or binary based methods, because nonlinear intensity variations interfere with the structure similarity calculation in the original infrared and visible images.

All the results and analysis above demonstrate that the proposed MPCDP descriptor achieves good performance in correct CP identification in ultra-field infrared and visible images. Its superior performance can be attributed to two aspects. On the one hand, the relative relationships of the phase congruency amplitudes are converted into a weighted histogram of binary codes. On the other hand, multiple directions of the phase congruency map are combined, and the L2-normalized weights contribute to the robustness of the method. Thus, the MPCDP descriptor can not only distinguish the significant structures accurately, but also remain robust against nonlinear intensity variations.

3.4 Registration results

To verify the overall performance of the proposed registration method, six pairs of ultra-field infrared and visible images are employed in the final registration experiments. The proportions of the scenes occupying the whole field are approximately 85%, 25%, 50%, 75%, 50% and 50%, respectively. To better distinguish the registration details, the outlines of the infrared images are extracted and superimposed on the visible images in yellow [5,9]. This choice is made because there are fewer outlines in the infrared images; otherwise, the yellow outlines of the visible images would cover up the infrared details, so this presentation is more appropriate for human observation and understanding. As shown in Fig. 15, the registration result of every image pair consists of three images: the CP correspondence image, the fusion image based on infrared outlines, and a locally enlarged part of the fusion image. Note that fifty CP pairs are selected randomly for display in each CP correspondence image for easier observation. It can be observed that the CP correspondences are all correct, and the structures and details of the ultra-field infrared and visible images are accurately aligned. Moreover, all the regions are well registered, whether at the edges or the center of the images, which indicates that the proposed method can eliminate the interference of image distortion and vision disparity. Meanwhile, the proposed method performs robustly in various scenes and under different camera relationships.

Fig. 15. Final registration results of six image pairs

In summary, the proposed method combines the binocular camera model and the phase congruency based feature to accomplish accurate registration. Hence, the accuracy will decrease when there is little structure information, because the construction of the phase congruency based feature is then disturbed. More effective feature extraction with less dependence on detail will be studied in the future.

4. Conclusions

In this paper, a robust registration method is presented for ultra-field infrared and visible images. The method combines computations on the images with the physical model of the binocular cameras to overcome the vision disparity and image distortion caused by nonlinear imaging. The CP extraction is made robust by utilizing phase congruency theory and employing the significant structures as guidance in both the ultra-field infrared and visible images. Then, ROI pairs are selected efficiently based on the epipolar curve, which is derived from the improved calibration results. The MPCDP descriptor is proposed to identify the correct CP pairs for registration. The experiments cover five types of scenes occupying different proportions of the whole FOV and two different camera relationships. The results for repeatability, matching errors and CMR prove that the proposed method is robust and accurate for ultra-field infrared and visible image registration. In future work, the proposed registration method will be studied for higher efficiency.

Funding

National Natural Science Foundation of China (61702347, 61801507); Army Engineering University Foundation Frontier Innovation Project (XK201933XQ15).

Disclosures

The authors declare no conflicts of interest.

References

1. W. M. Zhang, J. Zhao, M. Chen, Y. M. Chen, K. Yan, L. Y. Li, J. B. Qi, X. Y. Wang, J. H. Luo, and Q. Chu, “Registration of optical imagery and LiDAR data using an inherent geometrical constraint,” Opt. Express 23(6), 7694–7702 (2015). [CrossRef]  

2. Q. H. Yu, D. M. Wu, F. C. Chen, and S. L. Sun, “Design of a wide-field target detection and tracking system using the segmented planar imaging detector for electro-optical reconnaissance,” Chin. Opt. Lett. 16(7), 071101 (2018). [CrossRef]  

3. S. G. Li, “Binocular Spherical Stereo,” IEEE Trans. Intell. Transport. Syst. 9(4), 589–600 (2008). [CrossRef]  

4. X. H. Wang, K. H. Wu, and S. Z. Wang, “Research on Panoramic Image Registration Approach based on Spherical Model,” IJSIP 6(6), 297–308 (2013). [CrossRef]  

5. Y. X. Ye, J. Shan, L. Bruzzone, and L. Shen, “Robust Registration of Multimodal Remote Sensing Images Based on Structural Similarity,” IEEE Trans. Geosci. Electron. 55(5), 2941–2958 (2017). [CrossRef]  

6. H. D. Liang, C. L. Liu, B. He, T. Nie, G. L. Bi, and C. Su, “A Binary Method of Multisensor Image Registration Based on Angle Traversal,” Infrared Phys. Technol. 95, 189–198 (2018). [CrossRef]  

7. Y. Bentoutou, N. Taleb, K. Kpalma, and J. Ronsin, “An Automatic Image Registration for Applications in Remote Sensing,” IEEE Trans. Geosci. Electron. 43(9), 2127–2137 (2005). [CrossRef]  

8. Y. J. Zuo, J. H. Liu, M. Y. Yang, X. Wang, and M. C. Sun, “Algorithm for unmanned aerial vehicle aerial different-source image matching,” Opt. Eng. 55(12), 123111 (2016). [CrossRef]  

9. J. W. Fan, Y. Wu, M. Li, W. K. Liang, and Y. C. Cao, “SAR and Optical Image Registration Using Nonlinear Diffusion and Phase Congruency Structural Descriptor,” IEEE Trans. Geosci. Electron. 56(9), 5368–5379 (2018). [CrossRef]  

10. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, “Multimodality image registration by maximization of mutual information,” IEEE Trans. Med. Imaging 16(2), 187–198 (1997). [CrossRef]  

11. S. K. S. Fan, Y. C. Chuang, and J. R. Wu, “A New Cross-Correlation Based Image Registration Method,” Appl. Mech. Mater. 58-60(60), 1979–1984 (2011). [CrossRef]  

12. Y. Hel-Or, H. Hel-Or, and E. David, “Fast template matching in non-linear tone-mapped images,” in Proc. IEEE Int. Conf. Comput. Vis., (2011), pp. 1355–1362.

13. D. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis. 60(2), 91–110 (2004). [CrossRef]  

14. M. Calonder, V. Lepetit, C. Strecha, and P. Fua, “BRIEF: Binary Robust Independent Elementary Features,” In Proceedings of the European Conference on Computer Vision, (ECCV), (2010), pp. 1–4.

15. E. Rosten and T. Drummond, “Machine learning for high-speed corner detection,” In Proceedings of the European Conference on Computer Vision (ECCV), (2006), pp. 1–2.

16. C. Y. Zhao, H. C. Zhao, J. F. Lv, S. J. Sun, and B. Li, “Multimodal image matching based on Multimodality Robust Line Segment Descriptor,” Neurocomputing 177(6), 290–303 (2016). [CrossRef]  

17. Z. L. Song, S. Li, and Thomas F. George, “Remote sensing image registration approach based on a retrofitted SIFT algorithm and Lissajous-curve trajectories,” Opt. Express 18(2), 513–522 (2010). [CrossRef]  

18. F. Lv and K. Ren, “Automatic registration of airborne LiDAR point cloud data and optical imagery depth map based on line and points features,” Infrared Phys. Technol. 71, 457–463 (2015). [CrossRef]  

19. T. Ojala, M. Pietikainen, and T. Maenpaa, “Multiresolution gray-scale and rotation invariant texture classification with local binary patterns,” IEEE Trans. Pattern Anal. Machine Intell. 24(7), 971–987 (2002). [CrossRef]  

20. E. Shechtman and M. Irani, “Matching local self-similarities across images and videos,” in Proc. IEEE Comput. Vis. Pattern Recognit. (2007), pp. 1–8.

21. M. I. Patel, V. K. Thakar, and S. K. Shah, “Image Registration of Satellite Images with Varying Illumination Level Using HOG Descriptor Based SURF,” Procedia Comput. Sci. 93, 382–388 (2016). [CrossRef]  

22. S. W. Teng, M. T. Hossain, and G. J. Lu, “Multimodal image registration technique based on improved local feature descriptors,” J. Electron. Imaging 24(1), 013013 (2015). [CrossRef]  

23. C. Q. Tang, G. Y. Tian, X. T. Chen, J. B. Wu, K. J. Li, and H. Y. Meng, “Infrared and Visible Images Registration with Adaptable Local-Global Feature Integration for Rail Inspection,” Infrared Phys. Technol. 87, 31–39 (2017). [CrossRef]  

24. P. Kovesi, “Image features from phase congruency,” Videre: Journal of Computer Vision Research 1(3), 1–26 (1999).

25. X. Z. Liu, Y. F. Ai, J. L. Zhang, and Z. P. Wang, “A Novel Affine and Contrast Invariant Descriptor for Infrared and Visible Image Registration,” Remote Sens. 10(4), 658 (2018). [CrossRef]  

26. A. Wong and D. A. Clausi, “ARRSI: Automatic Registration of Remote-Sensing Images,” IEEE Trans. Geosci. Electron. 45(5), 1483–1493 (2007). [CrossRef]  

27. S. G. Li, “Real-Time Spherical Stereo,” in 18th International Conference on Pattern Recognition (ICPR’06), (2006), pp. 1–4.

28. J. Moreau, S. Ambellouis, and Y. Ruichek, “3D reconstruction of urban environments based on fisheye stereovision,” in 8th International Conference on Signal Image Technology and Internet Based System, (2012), pp. 36–41.

29. J. Bouguet, “Camera calibration toolbox for matlab,” (2014). http://www.vision.caltech.edu/bouguetj/calib_doc/index.html, (visited January 2020).

30. F. Ahmed, “Gradient directional pattern: a robust feature descriptor for facial expression recognition,” Electron. Lett. 48(19), 1203 (2012). [CrossRef]  

31. L. Yu, D. R. Zhang, and E. J. Holden, “A fast and fully automatic registration approach based on point features for multi-source remote-sensing images,” Comput. Geosci. 34(7), 838–848 (2008). [CrossRef]  

32. C. Schmid, R. Mohr, and C. Bauckhage, “Evaluation of interest point detectors,” Int. J. Comput. Vis. 37(2), 151–172 (2000). [CrossRef]  

33. A. R. Rivera, J. R. Castillo, and O. Chae, “Local directional texture pattern image descriptor,” Pattern Recognit. Lett. 51, 94–100 (2015). [CrossRef]  
