
Calibration method for large-field-of-view stereo vision system based on distance-related distortion model


Abstract

Large-field-of-view stereo vision systems lack flexible, high-precision calibration methods. To this end, we propose a new calibration method based on a distance-related distortion model that combines 3D points and checkerboards. Experiments indicate that the proposed method achieves a root mean square reprojection error of less than 0.08 pixels on the calibration dataset, and the mean relative error of length measurement in a volume of 5.0 m × 2.0 m × 16.0 m is 3.6‰. Compared with other distance-related models, the proposed model has the lowest reprojection error on the test dataset. Moreover, in contrast to other calibration methods, our method offers enhanced accuracy and greater flexibility.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Vision-based large-field-of-view (LFOV) precision measurement is widely used in many fields, including the measurement of large aviation components [1], propellers [2], and shield machines [3], among others [4,5]. The calibration of the LFOV stereo vision system plays an important role in ensuring measurement accuracy, and there has been extensive research into the calibration of vision sensors. For example, Abdel-Aziz et al. proposed the direct linear transformation (DLT) method [6]. This method requires at least 6 control points and directly solves the internal and external parameters of the camera from the 3D coordinates of the control points and their 2D image coordinates. However, the DLT method does not consider camera distortion, so its calibration accuracy is unsatisfactory. Yakimovsky [7] improved the DLT method by optimizing the camera distortion parameters. As 3D targets are difficult to manufacture, calibration methods based on 2D calibration objects are more flexible. Moreover, 2D targets can provide numerous reference points that effectively constrain the camera parameters. Tsai [8] proposed a two-step approach that performs camera calibration using radial constraints and optimization, and the Zhang method [9], which uses planar checkerboard targets, is widely used in close-range photogrammetry.

Calibration of the LFOV stereo vision system requires large calibration objects. However, large planar or 3D targets are costly, difficult to manufacture, and inconvenient to transport. Therefore, many researchers focus on calibrating the LFOV system with virtual stereo targets. These virtual 3D targets are constructed by measuring pre-arranged targets with three-coordinate instruments (total station [10], laser tracker [1,11], and visual three-coordinate system [2]). For example, Róg used a total station to obtain a large number of control points to calibrate the camera. According to the calibration report, the overall mean accuracy of this process is 0.24 mm, automatically calculated by the software from residuals at test points along the coordinate axes [10]. Yang used 3D targets obtained with a laser tracker and a cube-corner prism for calibration. This work constructs a virtual three-dimensional flexible control field and realizes precise calibration of a binocular vision system. The experimental results show that, with a baseline of 120 mm and a working distance of 3 m to 5 m, the residual error of spatial points is 0.3654 mm and the root mean square (RMS) error of the reconstructed distance is 1.0728 mm [11]. Hu calibrated the camera using a digital close-range industrial photogrammetry system. With a camera resolution of 2448 × 2048 pixels, the reprojection error is less than 0.11 pixels; for a measuring volume of 10.0 m × 8.4 m × 3.6 m, the maximum absolute error of 3D measurement is 0.42 mm [2]. Previous studies on calibration methods for LFOV stereo vision systems have achieved accuracies that satisfy the measurement requirements of specific environments. However, these methods also have limitations, such as the large number of control points required and the complexity of operation, which make them challenging to apply in scenarios such as scenes spanning tens of meters.

Small targets have many advantages over large ones [9], such as ease of use and lower cost, so calibrating an LFOV stereo vision system with a small target is of great significance. However, to cover the imaging view, small targets must be placed close to the vision sensors. As a result, the measurement field and the calibration field are not unified, which degrades the calibration result. Magill and Brown's studies [12,13] of the optical imaging model show that the lens distortion coefficients, primarily the radial ones, depend on the imaging magnification. However, in traditional camera models [8,9], camera distortion is described by constant parameters. This theoretically explains why, under the constant distortion model (CDM), the inconsistency between the calibration field and the measurement field leads to inaccurate results. Therefore, new distortion models have been established to resolve this inconsistency. For example, Magill proposed a distance-related model for different focusing planes [12]. By calibrating the distortion in two focal planes (both perpendicular to the optical axis), Brown [13] improved the generalization of Magill's model and gave a numerical derivation of the distortion coefficients in the unfocused and focused planes. Building on Brown's work, Fryer [14] proposed a distance-related model that overcomes the difficulty of Brown's model in describing large image distortions; however, its empirical coefficients make it hard to generalize to different types of lenses. Alvarez et al. [15] derived a radial distortion model suitable for planar scenes that eliminates the empirical coefficients and improves calibration flexibility compared with Fryer's model; they also proposed a camera calibration method using straight-line information in the planar scene. However, this method needs to estimate the location of the focal plane, which introduces parameters that are unnecessary to solve. Sun et al. [16] proposed a distance-related model that solves the camera distortion coefficients in an arbitrary plane, greatly improving the 3D measurement accuracy in a 7.0 m × 3.5 m × 2.5 m space at a 6 m object distance. However, Sun's calibration method requires an industrial robot and a large linear-beam calibration plate to provide precise constraints, which is inflexible. Combining the work of Fryer and Alvarez et al., Li et al. [17] proposed an equal-radius model that divides the image into several annular regions and models the camera distortion coefficients in each region; they demonstrated its validity experimentally under close-range, small depth-of-field (DOF) conditions. They also proposed a calibration method that relaxes the position requirements for the checkerboard, but it requires the checkerboard to fully occupy the field of view (FOV) and DOF, which is almost impossible for LFOV measurement.

Therefore, to improve the flexibility and precision of LFOV stereo vision system calibration, this paper proposes a new distance-related distortion model and a new flexible calibration method. Firstly, compared with the existing distance-related models [16,17], the proposed model has the lowest reprojection error on the test dataset and alleviates overfitting. Secondly, our method achieves satisfactory results in terms of reprojection error on 2D test datasets and 3D length measurement when calibrating with 5 3D points and 3 checkerboard images. Finally, our method is more flexible than the CDM-based method. For the CDM-based method, once sufficient 3D points are available, adding more checkerboards improves calibration accuracy only slightly. In contrast, our approach can further enhance calibration accuracy by adding close-range checkerboards when the 3D points are adequate, and the accuracy gain from adding nearby checkerboards is equivalent to that from adding 3D points at long range.

The rest of the paper is organized as follows. Section 2 introduces our distance-related model. Section 3 gives the details of the procedures of the proposed calibration method. In Section 4, the calibration experiments and the accuracy verification are designed to verify the validity of our distance-related distortion model and calibration method. Section 5 presents the conclusion.

2. Mathematical model

2.1. Camera pinhole model with lens distortion

The pinhole imaging model is the most common camera imaging model [9]. It can be described as:

$${\rho _i}{{\boldsymbol x}_i} = {\boldsymbol K}\left[ {\begin{array}{cc} {\boldsymbol R}&{\boldsymbol t} \end{array}} \right]{{\boldsymbol X}_{wi}} = \left[ {\begin{array}{ccc} {{f_x}}&0&{{u_0}}\\ 0&{{f_y}}&{{v_0}}\\ 0&0&1 \end{array}} \right]\left[ {\begin{array}{cc} {\boldsymbol R}&{\boldsymbol t} \end{array}} \right]{{\boldsymbol X}_{wi}}$$
where ${{\boldsymbol X}_{wi}} = {[{X_{wi}}\;\;{Y_{wi}}\;\;{Z_{wi}}\;\;1]^\textrm{T}}$ is the homogeneous coordinate in the world reference system. ${{\boldsymbol x}_i} = {[{u_i}\;\;{v_i}\;\;1]^\textrm{T}}$ is the homogeneous coordinate in the image frame. ${\rho _i}$ is an arbitrary scale factor and ${\boldsymbol K}$ is the internal parameter matrix of the camera. ${f_x}$ and ${f_y}$ are the effective focal lengths; ${u_0}$ and ${v_0}$ are the coordinates of the principal point; ${\boldsymbol R}$ and ${\boldsymbol t}$ are the rotation matrix and the translation vector that relate the world reference frame to the camera reference frame. According to Zhang [9], the first two orders of distortion coefficients are sufficient to describe camera distortion. Therefore, the distortion can be described as:
$$\left\{ {\begin{array}{l} {\Delta x = ({k_1}{r^2} + {k_2}{r^4})x + {p_1}({r^2} + 2{x^2}) + 2{p_2}xy}\\ {\Delta y = ({k_1}{r^2} + {k_2}{r^4})y + {p_2}({r^2} + 2{y^2}) + 2{p_1}xy}\\ {{r^2} = {x^2} + {y^2}} \end{array}} \right.$$

In Eq. (2), $x$ and $y$ are the normalized image coordinates without distortion. $\Delta x$ and $\Delta y$ represent the amount of distortion. ${k_1}$ and ${k_2}$ are the radial distortion coefficients, and ${p_1}$ and ${p_2}$ are the tangential distortion coefficients. $r$ is the radial distance on the normalized plane.
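As a concrete illustration of Eqs. (1) and (2), the following minimal Python sketch projects a world point to pixel coordinates; the function name and the toy parameter values are ours, not part of the paper's implementation.

```python
import numpy as np

def project_point(Xw, R, t, fx, fy, u0, v0, k1, k2, p1, p2):
    """Pinhole projection (Eq. 1) followed by the first-two-order
    radial plus tangential distortion of Eq. (2)."""
    Xc = R @ Xw + t                          # world frame -> camera frame
    x, y = Xc[0] / Xc[2], Xc[1] / Xc[2]      # undistorted normalized coords
    r2 = x * x + y * y                       # r^2 = x^2 + y^2
    radial = k1 * r2 + k2 * r2 * r2          # k1*r^2 + k2*r^4
    dx = radial * x + p1 * (r2 + 2 * x * x) + 2 * p2 * x * y
    dy = radial * y + p2 * (r2 + 2 * y * y) + 2 * p1 * x * y
    # apply the intrinsic parameters to the distorted normalized coordinates
    return np.array([fx * (x + dx) + u0, fy * (y + dy) + v0])

# toy example: identity pose, mild distortion (illustrative values only)
uv = project_point(np.array([0.2, 0.1, 5.0]), np.eye(3), np.zeros(3),
                   1500.0, 1500.0, 960.0, 540.0, -0.1, 0.01, 1e-4, 1e-4)
```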

2.2. Distance-related distortion model

The effect of object distance on imaging distortion, as shown in Fig. 1, can be divided into two categories [13]. One is caused by the focus position of the camera, as shown in Fig. 1(a); the other is caused by the position of the object within the DOF, as shown in Fig. 1(b). Figure 1(a) shows that when objects at distances ${s_1}$ and ${s_2}$ are each imaged on their respective focal planes, even though the radial distances ${r_{{s_1}}}$ and ${r_{{s_2}}}$ are numerically equal, there is a clear gap between the distortions $\Delta {r_{{s_1}}}$ and $\Delta {r_{{s_2}}}$. Figure 1(b) illustrates that when the focal length is fixed and the camera is focused at distance $s$, an object at distance ${s_1}$ is defocused on the image plane. In this case, the distortion $\Delta {r_s}_{,{s_1}}$ at the defocused position differs from the distortion $\Delta {r_{{s_1}}}$ at the focused position. Sun [16] and Li [17] have modeled the aforementioned scenario. Although both models outperform constant-parameter models on calibration datasets, they underperform on test datasets. Moreover, as the number of calibrators decreases, the RMS of reprojection errors on calibration datasets decreases while that on test datasets increases. These phenomena are analogous to model overfitting in machine learning. To alleviate this phenomenon and improve the calibration accuracy, we take inspiration from hyperparameter tuning in ensemble learning and propose a new distance-related distortion model (DRDM) by introducing a pair of hyperparameters into the Sun and Li models.

Fig. 1. Effect of object distance on imaging distortion. (a) Influence of the camera’s focus position. (b) Influence of object distance in the DOF.

According to the previous work [13], the distortion parameters of two focal planes s1 and s2 can be used to describe the distortion of any focal plane sx:

$${k_{i,{s_x}}} = \; {\alpha _{{s_1},{s_2},{s_x}}}{k_{i,{s_1}}} + (1 - {\alpha _{{s_1},{s_2},{s_x}}}){k_{i,{s_2}}}({i = 1,2} )$$
where ${\alpha _{{s_1},{s_2},{s_x}}} = {\; }\frac{{({{s_2} - {s_x}} )({{s_1} - f} )}}{{({{s_2} - {s_1}} )({{s_x} - f} )}}$, and f is the nominal focal length of the lens. Assuming that the focusing plane is located at s, the coefficient relationship between the focusing plane and the defocusing plane can be obtained:
$${k_{i,s,{s_x}}} = \frac{1}{{{\gamma _{s,{s_x}}}}}{k_{i,{s_x}}}({i = 1,2} )$$
where ${\gamma _{s,{s_x}}} = \frac{{s - f}}{{{s_x} - f}}\frac{{{s_x}}}{s}$. Therefore, substituting Eq. (4) into Eq. (3), the distortion coefficient at the defocus plane sx can be expressed by:
$${k_{i,s,{s_x}}} = \; ({{\alpha_{{s_1},{s_2},{s_x}}}{k_{i,{s_1}}} - ({\; {\alpha_{{s_1},{s_2},{s_x}}} - 1} ){k_{i,{s_2}}}} ){({\; {\gamma_{s,{s_x}}}} )^{2i}}({i = 1,2} )$$

When the focus position s is fixed, the distortion coefficient at any position sx in the DOF can be expressed by the coefficients of position s1 and position s2:

$${k_{i,s,{s_x}}} = {\gamma _{i,{s_1},{s_x}}}{\alpha _{{s_1},{s_2},{s_x}}}{k_{i,s,{s_1}}} + {\gamma _{i,{s_2},{s_x}}}({1 - {\alpha_{{s_1},{s_2},{s_x}}}} ){k_{i,s,{s_2}}}({i = 1,2} )$$
where ${\gamma _{i,{s_j},{s_x}}} = {({{\gamma_{{s_j},{s_x}}}} )^{(2i - 1)}}\;(i,j = 1,2)$. To simplify the expression, we define ${\alpha _{{s_x}}} = {\alpha _{{s_1},{s_2},{s_x}}}$. The distortion coefficient at the defocus plane can also be expressed as [14]:
$${k_{i,s,{s_x}}} = \; {k_{i,s}} + g \cdot ({k_{i,{s_x}}} - {k_{i,s}})({i = 1,2} )$$
where g is an empirical constant. By eliminating this coefficient, we obtain [15]:
$${k_{i,s,{s_x}}} = \; {k_{i,s,{s_1}}} + {\alpha _{s,{s_1},{s_x}}}({k_{i,s}} - {k_{i,s,{s_1}}})({i = 1,2} )$$

According to Eq. (8), we obtain:

$$\left\{ {\begin{array}{{c}} {{k_{i,s,{s_x}}} = \; {k_{i,s,{s_1}}} + {\alpha_{s,{s_1},{s_x}}}({k_{i,s}} - {k_{i,s,{s_1}}})}\\ {{k_{i,s,{s_x}}} = \; {k_{i,s,{s_2}}} + {\alpha_{s,{s_2},{s_x}}}({k_{i,s}} - {k_{i,s,{s_2}}})}\\ {{k_{i,s,{s_1}}} = {k_{i,s,{s_2}}} + {\alpha_{s,{s_2},{s_1}}}({k_{i,s}} - {k_{i,s,{s_2}}})} \end{array}} \right.({i = 1,2} )$$

If we eliminate the parameters related to s, such as ${\alpha _{s,{s_1},{s_x}}}$, we obtain:

$${k_{i,s,{s_x}}} = \; {\beta _{{s_x}}}{k_{i,s,{s_1}}} + ({1 - {\beta_{{s_x}}}} ){k_{i,s,{s_2}}}({i = 1,2} )$$
where ${\beta _{{s_x}}} = {\; }\frac{{({{s_2} - {s_x}} )({{s_2} - f} )}}{{({{s_1} - f} )({{s_2} - {s_1}} )}}$. According to Eq. (6) and Eq. (10), a new distance-related model can be obtained:
$$\left\{ {\begin{array}{{l}} {{k_{i,s,{s_x}}} = \; {\lambda_{i,{s_x}}}{k_{i,s,{s_1}}} + {\mu_{i,{s_x}}}{k_{i,s,{s_2}}}({i = 1,2} )}\\ {{\lambda_{i,{s_x}}} = \lambda {\gamma_{i,{s_1},{s_x}}}{\alpha_{{s_x}}} + \mu {\beta_{{s_x}}}}\\ {{\mu_{i,{s_x}}} = \lambda {\gamma_{i,{s_2},{s_x}}}({1 - {\alpha_{{s_x}}}} )+ \mu ({1 - {\beta_{{s_x}}}} )} \end{array}} \right.$$
where $\lambda$ and $\mu$ are empirical hyperparameters, and they satisfy $\lambda + \mu = 1({\lambda ,\mu > 0} )$. The above model shows that if the distances sx and the distortion coefficients ${k_{i,s,{s_j}}} \;(i,j = 1,2)$ under distances sj$(j = 1,2)$ are known, the distortion parameters ${k_{1,s,{s_x}}}$ and ${k_{2,s,{s_x}}}$ can be solved. Therefore, image distortion at any given depth can be described by the following formula.
$$\left\{ \begin{array}{c} \Delta x\textrm{ = }\left( {\sum {{k_{i,s,{s_x}}}{r^{2i}}} } \right)x + {p_1}({r^2} + 2{x^2}) + 2{p_2}(xy)\\ \Delta y\textrm{ = }\left( {\sum {{k_{i,s,{s_x}}}{r^{2i}}} } \right)y + {p_2}({r^2} + 2{y^2}) + 2{p_1}(xy) \end{array} \right.(i = 1,2)$$
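To make the model concrete, a minimal Python sketch of Eq. (11) follows; the function name is ours, and λ = 0.6, μ = 0.4 are the hyperparameter values used later in Section 3.2.

```python
import numpy as np

def drdm_radial_coeffs(sx, s1, s2, f, k_at_s1, k_at_s2, lam=0.6, mu=0.4):
    """Evaluate Eq. (11): interpolate k_{1,s,sx} and k_{2,s,sx} at depth sx
    from the coefficients at the two reference depths s1 and s2.
    k_at_s1 = (k_{1,s,s1}, k_{2,s,s1}); k_at_s2 likewise; lam + mu = 1."""
    alpha = (s2 - sx) * (s1 - f) / ((s2 - s1) * (sx - f))   # Eq. (3)
    beta = (s2 - sx) * (s2 - f) / ((s1 - f) * (s2 - s1))    # Eq. (10)
    g1 = (s1 - f) / (sx - f) * sx / s1                      # gamma_{s1,sx}
    g2 = (s2 - f) / (sx - f) * sx / s2                      # gamma_{s2,sx}
    out = []
    for i, (ka, kb) in enumerate(zip(k_at_s1, k_at_s2), start=1):
        # ka = k_{i,s,s1}, kb = k_{i,s,s2}; gamma_{i,sj,sx} = gamma^(2i-1)
        lam_i = lam * g1 ** (2 * i - 1) * alpha + mu * beta
        mu_i = lam * g2 ** (2 * i - 1) * (1 - alpha) + mu * (1 - beta)
        out.append(lam_i * ka + mu_i * kb)                  # k_{i,s,sx}
    return out

# illustrative call: coefficients at sx = 10 m from depths 0.5 m and 20 m,
# with an 8 mm (0.008 m) lens; all numbers are toy values
k1x, k2x = drdm_radial_coeffs(10.0, 0.5, 20.0, 0.008,
                              (-0.10, 0.010), (-0.09, 0.008))
```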

3. Calibration method

This paper proposes a new LFOV stereo vision system calibration method that integrates checkerboards with 3D points. The internal parameters of the LFOV cameras and the external structural parameters between the vision sensors are the main quantities to be solved. The calibration principle is shown in Fig. 2. In our method, checkerboards serve as close-range targets and 3D calibration points serve as long-range targets, and both are guaranteed to lie within the DOF of the LFOV cameras. The 3D calibration points are placed in the public FOV of the LFOV stereo vision system, while the checkerboard targets are placed in the non-public FOV.

Fig. 2. Calibration Principle.

The entire calibration procedure, shown in Fig. 3, can be divided into four steps. (1) We process the target images in the DOF and then use the image data to perform the initial calibration of the LFOV camera. (2) Next, we calculate the initial values of the external parameters and obtain the initial values of the distance-related distortion coefficients. (3) Then, we use the Levenberg-Marquardt (LM) nonlinear optimization algorithm [18] to obtain the optimal parameters of the LFOV camera based on the DRDM. As shown in Fig. 3, ${{\boldsymbol R}_{\textrm{CB}}}$, ${{\boldsymbol t}_{\textrm{CB}}}$ are the external parameters of the checkerboard, and ${{\boldsymbol R}_{\textrm{TS}}}$, ${{\boldsymbol t}_{\textrm{TS}}}$ are the external parameters of the total station. (4) Finally, we globally optimize the parameters of the left and right cameras; ${{\boldsymbol R}_{LR}}$, ${{\boldsymbol t}_{LR}}$ in Fig. 3 are the rotation matrix and translation vector from the left to the right camera frame. At this point, the calibration of the LFOV stereo vision system is complete.

Fig. 3. Calibration process.

3.1. Initial parameters estimation

The initial parameter estimation is a two-step method. First, when the image coordinates of points p1, p2, and p3 are given, it is easy to calculate the external camera parameters using the P3P method in the literature [19], taking the manufacturing parameters of the camera as the internal parameters. However, the P3P method has to solve a polynomial equation whose solution is not unique. Therefore, verification points p4i are added in this paper, and the reprojection errors of these points are used to filter the candidate external camera parameters. Usually, two verification points are sufficient. Second, since camera distortion is not considered in the calculation of the external parameters, further optimization is required. The optimization function has the form:

$$\arg \min \left( {\sum\limits_{i = 1}^n {{{\left\| {{{\hat{\boldsymbol p}}_i} - {\boldsymbol p}({\boldsymbol f},{\boldsymbol c},{\boldsymbol k})} \right\|}^2}} + \sum\limits_{j = 1}^m {{{\left\| {{{\hat{\boldsymbol p}}_j} - {\boldsymbol p}({\boldsymbol f},{\boldsymbol c},{\boldsymbol k})} \right\|}^2}} } \right)$$
where ${\boldsymbol f}$ and ${\boldsymbol c}$ denote the camera focal length and the camera principal point, respectively, and they are both two-dimensional vectors. ${\boldsymbol k}$ represents the camera distortion parameter, which is a 4-dimensional vector containing two radial distortion coefficients and two tangential distortion coefficients. ${\boldsymbol p}({\cdot} )$ is the projection function given by Eq. (1) and Eq. (2). ${\hat{{\boldsymbol p}}_i}$ is the image coordinate of the 3D calibration point, and ${\hat{{\boldsymbol p}}_j}$ is the image coordinate of the checkerboard corner.
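A minimal sketch of this initialization follows, using OpenCV's solveP3P as a stand-in for the P3P solver of [19]; the function name, array shapes, and selection rule are illustrative.

```python
import cv2
import numpy as np

def estimate_pose_p3p(obj3, img3, obj_verify, img_verify, K):
    """Step 1 of Section 3.1: solve P3P from three control points, then use
    the reprojection error of verification points to pick among the
    multiple candidate P3P solutions."""
    _, rvecs, tvecs = cv2.solveP3P(
        obj3.astype(np.float32), img3.astype(np.float32),
        K, None, flags=cv2.SOLVEPNP_P3P)
    best, best_err = None, np.inf
    for rvec, tvec in zip(rvecs, tvecs):
        # reproject the verification points with each candidate pose
        proj, _ = cv2.projectPoints(obj_verify, rvec, tvec, K, None)
        err = np.linalg.norm(proj.reshape(-1, 2) - img_verify, axis=1).mean()
        if err < best_err:
            best, best_err = (rvec, tvec), err
    return best  # rotation (Rodrigues vector) and translation
```

The selected pose and the nominal intrinsics then serve as the starting point for the nonlinear refinement of Eq. (13).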

3.2. Distance-related parameters estimation

According to the initial camera parameters estimated in the previous section, the depth values of the calibration points can be obtained from Eq. (14).

$$\left\{ {\begin{array}{{c}} {{{\boldsymbol q}_{\textrm{TS}}} = {{\boldsymbol R}_{\textrm{TS}}}{{\boldsymbol X}_{\textrm{TS}}}\textrm{ + }{{\boldsymbol t}_{\textrm{TS}}}}\\ {{{\boldsymbol q}_{\textrm{CB}}} = {{\boldsymbol R}_{\textrm{CB}}}{{\boldsymbol X}_{\textrm{CB}}}\textrm{ + }{{\boldsymbol t}_{\textrm{CB}}}} \end{array}} \right.$$
where ${{\boldsymbol X}_{\textrm{TS}}}$ and ${{\boldsymbol X}_{\textrm{CB}}}$ represent the coordinates of the 3D points and checkerboard corners in the world frame, respectively, and ${{\boldsymbol q}_{\textrm{TS}}}$ and ${{\boldsymbol q}_{\textrm{CB}}}$ represent their coordinates in the camera coordinate system. The depth of each calibrator is then its Z-axis value in the camera coordinate system. Consequently, the minimum depth smin and the maximum depth smax can be obtained easily. The $\lambda$ and $\mu$ are empirical hyperparameters, set to 0.6 and 0.4 in the experiment. By substituting smin and smax for s1 and s2 in Eq. (11), ${\lambda _{i,{s_x}}}$ and ${\mu _{i,{s_x}}}$ can be solved. Since the tangential distortion coefficients p1 and p2 were obtained in the previous section, $c = {p_1}({r^2} + 2{u^2}) + 2{p_2}(uv)$ and $d = {p_2}({r^2} + 2{v^2}) + 2{p_1}(uv)$ are known quantities. According to Eq. (12), we can derive:
$$\left[ {\begin{array}{cccc} {(u - {u_0}){\lambda _{1,{s_x}}}{r^2}}&{(u - {u_0}){\mu _{1,{s_x}}}{r^2}}&{(u - {u_0}){\lambda _{2,{s_x}}}{r^4}}&{(u - {u_0}){\mu _{2,{s_x}}}{r^4}}\\ {(v - {v_0}){\lambda _{1,{s_x}}}{r^2}}&{(v - {v_0}){\mu _{1,{s_x}}}{r^2}}&{(v - {v_0}){\lambda _{2,{s_x}}}{r^4}}&{(v - {v_0}){\mu _{2,{s_x}}}{r^4}} \end{array}} \right]\left[ {\begin{array}{c} {{k_{1,s,{s_1}}}}\\ {{k_{1,s,{s_2}}}}\\ {{k_{2,s,{s_1}}}}\\ {{k_{2,s,{s_2}}}} \end{array}} \right] = \left[ {\begin{array}{c} {\hat{u} - u + c}\\ {\hat{v} - v + d} \end{array}} \right]$$

If there are n points, the matrix equation can be obtained by combining all the equations like Eq. (15), as follows:

$$\left[ {\begin{array}{cccc} {({u_1} - {u_0}){\lambda _{1,{s_{x1}}}}r_1^2}&{({u_1} - {u_0}){\mu _{1,{s_{x1}}}}r_1^2}&{({u_1} - {u_0}){\lambda _{2,{s_{x1}}}}r_1^4}&{({u_1} - {u_0}){\mu _{2,{s_{x1}}}}r_1^4}\\ {({v_1} - {v_0}){\lambda _{1,{s_{x1}}}}r_1^2}&{({v_1} - {v_0}){\mu _{1,{s_{x1}}}}r_1^2}&{({v_1} - {v_0}){\lambda _{2,{s_{x1}}}}r_1^4}&{({v_1} - {v_0}){\mu _{2,{s_{x1}}}}r_1^4}\\ \vdots & \vdots & \vdots & \vdots \\ {({u_n} - {u_0}){\lambda _{1,{s_{xn}}}}r_n^2}&{({u_n} - {u_0}){\mu _{1,{s_{xn}}}}r_n^2}&{({u_n} - {u_0}){\lambda _{2,{s_{xn}}}}r_n^4}&{({u_n} - {u_0}){\mu _{2,{s_{xn}}}}r_n^4}\\ {({v_n} - {v_0}){\lambda _{1,{s_{xn}}}}r_n^2}&{({v_n} - {v_0}){\mu _{1,{s_{xn}}}}r_n^2}&{({v_n} - {v_0}){\lambda _{2,{s_{xn}}}}r_n^4}&{({v_n} - {v_0}){\mu _{2,{s_{xn}}}}r_n^4} \end{array}} \right]\left[ {\begin{array}{c} {{k_{1,s,{s_1}}}}\\ {{k_{1,s,{s_2}}}}\\ {{k_{2,s,{s_1}}}}\\ {{k_{2,s,{s_2}}}} \end{array}} \right] = \left[ {\begin{array}{c} {{{\hat{u}}_1} - {u_1} + {c_1}}\\ {{{\hat{v}}_1} - {v_1} + {d_1}}\\ \vdots \\ {{{\hat{u}}_n} - {u_n} + {c_n}}\\ {{{\hat{v}}_n} - {v_n} + {d_n}} \end{array}} \right]$$
where ${c_i}$ and ${d_i}$ are the tangential distortion terms for the ith point. Denoting the coefficient matrix of Eq. (16) as A and the non-homogeneous term on the right-hand side as B, the matrix form is obtained:
$${\boldsymbol A}{{\boldsymbol k}_{\boldsymbol d}} = {\boldsymbol B}$$

Then, the least squares method is used to solve for the coefficients:

$${{\boldsymbol k}_{\boldsymbol d}} = \left[ {\begin{array}{{c}} {{k_{1,s,{s_1}}}}\\ {{k_{1,s,{s_2}}}}\\ {{k_{2,s,{s_1}}}}\\ {{k_{2,s,{s_2}}}} \end{array}} \right] = {({{\boldsymbol A}^T}{\boldsymbol WA})^{ - 1}}{{\boldsymbol A}^T}{\boldsymbol B}$$
where W is the weight matrix, usually defined as the identity matrix. The above steps complete the calculation of the initial coefficients of the DRDM. In the process of calculating the distortion parameters, the coordinates of the ideal image points are approximate values; therefore, to obtain a better result, further optimization is necessary.
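The initial DRDM coefficients of Eq. (18) can be obtained with a few lines of linear algebra. The sketch below follows our reading of the row layout of Eq. (16); the function name and argument packing are illustrative.

```python
import numpy as np

def solve_initial_drdm(rows, u0, v0, W=None):
    """Assemble the stacked system of Eq. (16) and solve Eq. (18) for
    k_d = [k_{1,s,s1}, k_{1,s,s2}, k_{2,s,s1}, k_{2,s,s2}].

    Each entry of `rows` describes one point at its own depth s_x:
    (u, v, u_hat, v_hat, r2, lam1, mu1, lam2, mu2, c, d), where lam_i/mu_i
    are the Eq. (11) weights at that depth and c, d the known tangential
    terms."""
    A, B = [], []
    for (u, v, u_hat, v_hat, r2, l1, m1, l2, m2, c, d) in rows:
        r4 = r2 * r2
        A.append([(u - u0) * l1 * r2, (u - u0) * m1 * r2,
                  (u - u0) * l2 * r4, (u - u0) * m2 * r4])
        A.append([(v - v0) * l1 * r2, (v - v0) * m1 * r2,
                  (v - v0) * l2 * r4, (v - v0) * m2 * r4])
        B += [u_hat - u + c, v_hat - v + d]
    A, B = np.asarray(A), np.asarray(B)
    if W is None:
        W = np.eye(len(B))        # weight matrix defaults to the identity
    return np.linalg.solve(A.T @ W @ A, A.T @ B)   # Eq. (18)
```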

3.3. Optimization based on geometric constraints

To obtain more reliable camera parameters, optimization is essential; the primary problem is therefore how to design the objective function to be minimized. This paper proposes a multi-constraint optimization method based on the imaging model and geometric plane constraints.

First, the distance from the model-projected point to the extracted point should be minimized; therefore, for a single camera, the objective function is:

$${\boldsymbol{E}_{\boldsymbol{proj}}} = \sum\limits_{i = 1}^n {{{\left\| {{{\hat{\boldsymbol p}}_i} - {\boldsymbol p^{\prime}}({\boldsymbol f},{\boldsymbol c},{{\boldsymbol k}_{\textrm{dp}}})} \right\|}^2}} + \sum\limits_{j = 1}^m {{{\left\| {{{\hat{\boldsymbol p}}_j} - {\boldsymbol p^{\prime}}({\boldsymbol f},{\boldsymbol c},{{\boldsymbol k}_{\textrm{dp}}})} \right\|}^2}}$$
where ${{\boldsymbol k}_{\textrm{dp}}}$ represents the camera distortion parameter, which is a 6-dimensional vector containing ${{\boldsymbol k}_\textrm{d}}$ and two tangential distortion coefficients. ${\boldsymbol p^{\prime}}({\cdot} )$ is the projection function given by Eq. (1) and Eq. (12).

Second, according to the literature [20], the projection of a straight line is still a straight line. Therefore, we use the straightness of the corrected points as a geometric constraint. In this paper, two equations are used to describe straightness:

  • (1) As shown in Fig. 4(a), the distance ${d_s}$ between the correction point and its fitted line ${l_w}$ can be expressed by:
    $${d_s} = \frac{{|\overrightarrow {{p_{w,n}}{p_{w,m}}} \times \overrightarrow {{p_{w,n}}{p_{w,s}}} |}}{{|\overrightarrow {{p_{w,n}}{p_{w,m}}} |}}$$

where ${p_{w,n}}$, ${p_{w,m}}$, ${p_{w,s}}$ are three non-overlapping points on the wth line. All these undistorted points can be obtained from Eq. (15). Meanwhile, the checkerboards provide b straight lines, each containing t characteristic points. Therefore, the first minimal line-constraint objective function is:

$${{\boldsymbol E}_{{\boldsymbol line}}} = \sum\limits_{w = 1}^b {\sum\limits_{s = 1}^t {\frac{{|\overrightarrow {{p_{w,n}}{p_{w,m}}} \times \overrightarrow {{p_{w,n}}{p_{w,s}}} |}}{{|\overrightarrow {{p_{w,n}}{p_{w,m}}} |}}} }$$

  • (2) As shown in Fig. 4(a), the vector angle θ of any three points ${p_{w,n}}$, ${p_{w,m}}$, ${p_{w,s}}$ on the line segment can be expressed by:
    $$\theta = \arcsin \left( {\frac{{|\overrightarrow {{p_{w,n}}{p_{w,s}}} \times \overrightarrow {{p_{w,m}}{p_{w,s}}} |}}{{|\overrightarrow {{p_{w,n}}{p_{w,s}}} |\cdot |\overrightarrow {{p_{w,m}}{p_{w,s}}} |}}} \right)$$

Therefore, the second minimal line-constraint objective function is:

$${{\boldsymbol E}_{{\boldsymbol angle}}} = \sum\limits_{w = 1}^b {\sum\limits_{s = 1}^t {\arcsin \left( {\frac{{|\overrightarrow {{p_{w,n}}{p_{w,s}}} \times \overrightarrow {{p_{w,m}}{p_{w,s}}} |}}{{|\overrightarrow {{p_{w,n}}{p_{w,s}}} |\cdot |\overrightarrow {{p_{w,m}}{p_{w,s}}} |}}} \right)} }$$

Fig. 4. Geometric Constraints.
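For illustration, a minimal sketch of the straightness terms of Eqs. (21) and (23), assuming undistorted 2D corner points grouped by checkerboard line (the function name and the endpoint convention are ours):

```python
import numpy as np

def straightness_residuals(lines):
    """Residuals of Eqs. (21) and (23) for undistorted checkerboard corners.
    `lines` is a list of (t, 2) arrays, one per line; the endpoints play the
    roles of p_{w,n} and p_{w,m}, interior points the role of p_{w,s}."""
    res = []
    for pts in lines:
        n, m = pts[0], pts[-1]
        nm = m - n
        for s in pts[1:-1]:
            ns, ms = s - n, s - m
            # point-to-line distance d_s (Eq. 20) via the 2D cross product
            res.append((nm[0] * ns[1] - nm[1] * ns[0]) / np.linalg.norm(nm))
            # vector angle theta (Eq. 22); clip guards against rounding
            sin_t = (ns[0] * ms[1] - ns[1] * ms[0]) / (
                np.linalg.norm(ns) * np.linalg.norm(ms))
            res.append(np.arcsin(np.clip(sin_t, -1.0, 1.0)))
    return np.asarray(res)
```

Stacked with the reprojection residuals of Eq. (19) and weighted by the penalty factors, these residuals can be fed to an LM routine such as scipy.optimize.least_squares(..., method='lm'), in line with step (3) of Fig. 3.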

Thus, the minimum optimization function for the monocular camera has the form:

$${{\boldsymbol E}_{{\boldsymbol monocular }}}\textrm{ = }{M_\textrm{1}}{{\boldsymbol E}_{{\boldsymbol proj }}} + {M_2}{{\boldsymbol E}_{{\boldsymbol line}}} + {M_3}{{\boldsymbol E}_{{\boldsymbol angle}}}$$
where ${M_\textrm{1}}$, ${M_2}$, and ${M_3}$ are penalty factors representing the contribution of each component to the overall result; all are set to 1. The calibration of LFOV stereo vision systems requires a large number of homonymous points. However, in our method, most of the points involved in the calibration come from the checkerboards, which are placed in the non-public FOV. Therefore, this paper introduces the polar-plane constraint and the coplanarity constraint shown in Fig. 4(b) to improve the calibration accuracy.
  • (1) As shown in Fig. 4(b), the vectors $\overrightarrow {{O_L}{p_1}}$, $\overrightarrow {{O_R}{p_1}}$, $\overrightarrow {{O_L}{O_R}}$ satisfy the following relationship:
    $$\overrightarrow {{O_L}{O_R}} \cdot ({\overrightarrow {{O_L}{p_1}} \times \overrightarrow {{O_R}{p_1}} } )= 0$$
    where ${O_L} = R_{\textrm{TS}}^LO + T_{\textrm{TS}}^L$ and ${O_R} = R_{\textrm{TS}}^RO + T_{\textrm{TS}}^R$. O is [0,0,0]T. $R_{\textrm{TS}}^L$, $T_{\textrm{TS}}^L$ represent the rotation-translation relationship between the left camera frame and the world frame, and $R_{\textrm{TS}}^R$, $T_{\textrm{TS}}^R$ represent the rotation-translation relationship between the right camera frame and the world frame. Therefore, Eq. (25) only refers to external parameters and is not coupled with internal parameters. Thus, for n points, we have:
    $${{\boldsymbol E}_{{\boldsymbol vector}}} = \sum\limits_{i = 1}^n {\overrightarrow {{O_L}{O_R}} \cdot ({\overrightarrow {{O_L}{p_i}} \times \overrightarrow {{O_R}{p_i}} } )}$$
  • (2) As shown in Fig. 4(b), the vectors $\overrightarrow {{O_L}{p_2}}$, $\overrightarrow {{O_R}{p_2}}$, $\overrightarrow {{O_L}{p_3}}$, $\overrightarrow {{O_R}{p_3}}$, $\overrightarrow {{p_2}{p_3}}$ satisfy the following relationship:
    $$\left\{ {\begin{array}{{c}} {\overrightarrow {{p_2}{p_3}} \cdot ({\overrightarrow {{O_L}{p_2}} \times \overrightarrow {{O_L}{p_3}} } )= 0}\\ {\overrightarrow {{p_2}{p_3}} \cdot ({\overrightarrow {{O_R}{p_2}} \times \overrightarrow {{O_R}{p_3}} } )= 0} \end{array}} \right.$$
Thus, for any two points ia, ib $({i_a} \ne {i_b})$, we can obtain:
$$\begin{array}{l} {{\boldsymbol E}_{{\boldsymbol vector2}}} = \sum\limits_{{i_a} = 1}^n {\sum\limits_{{i_b} = 1}^n {\overrightarrow {{p_{{i_a}}}{p_{{i_b}}}} \cdot ({\overrightarrow {{O_L}{p_{{i_a}}}} \times \overrightarrow {{O_L}{p_{{i_b}}}} } )} } \\ + \sum\limits_{{i_a} = 1}^n {\sum\limits_{{i_b} = 1}^n {\overrightarrow {{p_{{i_a}}}{p_{{i_b}}}} \cdot ({\overrightarrow {{O_R}{p_{{i_a}}}} \times \overrightarrow {{O_R}{p_{{i_b}}}} } )} } \end{array}$$

Therefore, the global optimization function for the LFOV stereo vision system is as follows:

$$\boldsymbol{E}_{\boldsymbol{stereo}} = {M_L}\boldsymbol{E}_{\boldsymbol{monocular}}^L + {M_R}\boldsymbol{E}_{\boldsymbol{monocular}}^R + {M_{v1}}{\boldsymbol{E}_{\boldsymbol{vector}}} + {M_{v2}}{\boldsymbol{E}_{\boldsymbol{vector2}}}$$
where ${M_L}$, ${M_R}$, ${M_{v1}}$, and ${M_{v2}}$ are penalty factors representing the contribution of each component to the overall result; all are set to 1. ${\boldsymbol E}_{{\boldsymbol monocular}}^L$ and ${\boldsymbol E}_{{\boldsymbol monocular}}^R$ are the objective functions of the left and right cameras, respectively, obtained from Eq. (24).
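The stereo terms reduce to scalar triple products, as in the following sketch; it assumes the optical centres and 3D points are all expressed in the world frame, and the function names are ours.

```python
import numpy as np

def stereo_constraint_residuals(OL, OR, P):
    """Polar-plane (Eq. 26) and coplanarity (Eq. 28) residuals. OL and OR
    are the optical centres of the left and right cameras in the world
    frame; P is an (n, 3) array of homonymous 3D points."""
    def triple(a, b, c):
        return np.dot(a, np.cross(b, c))   # a . (b x c)

    base = OR - OL
    res = [triple(base, p - OL, p - OR) for p in P]        # Eq. (26)
    n = len(P)
    for ia in range(n):                                    # Eq. (28)
        for ib in range(ia + 1, n):
            d = P[ib] - P[ia]
            res.append(triple(d, P[ia] - OL, P[ib] - OL))
            res.append(triple(d, P[ia] - OR, P[ib] - OR))
    return np.asarray(res)
```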

4. Experiment and discussions

4.1. Calibration experiment

To verify the effectiveness of the calibration method and the accuracy of the proposed model, the following calibration experiment is conducted. First, an LFOV stereo vision system is built, as shown in Fig. 5. Then, we set up a test field of 5.0 m × 2.0 m × 16.0 m located about 5 m away from the LFOV stereo vision system. In this test field, 54 3D points are evenly arranged. The 3D view of the experimental space is shown in Fig. 6. Next, images of the calibrators are recorded, some of which are shown in Fig. 7, including 6 pictures of checkerboards and 2 pictures of 3D points. The pixel coordinates of the checkerboard corners and 3D points are extracted with the Harris algorithm [21]. Finally, using the calibration method introduced in Section 3, the calibration of the LFOV stereo vision system is completed.

Fig. 5. Experimental environment.

Fig. 6. Experimental space.

Fig. 7. Calibration Dataset.

The 3D points were divided into two parts: 15 were chosen randomly as the 3D calibration dataset, and the remaining 39 served as the 3D test dataset. In Fig. 6 and Fig. 7, the test dataset and calibration dataset are marked in cyan and blue, respectively, and the data marked in red are used for the proposed calibration method. The physical setup consists of two AVT1920 CCD cameras (resolution 1920 × 1080, cell size 5.5 µm), two 8 mm Schneider fixed-focus lenses, a laptop computer, a total station with an accuracy of 3 mm + 2 ppm, a 40 cm glass luminous mosaic grid target with a processing accuracy of 0.05 mm, several 3D points measured by the total station, and several power supply cables. Several algorithms are compared to verify the accuracy of our model and the effectiveness of the proposed method: (1) the Zhang + EPnP algorithm, where the internal camera parameters are solved by the Zhang method [9] and the external parameters between the world and camera coordinate systems are solved by the EPnP algorithm [22]; (2) the constant distortion model-based method (CDMM), which is based on the CDM and calibrated by the method in Section 3; (3) DRDM-Ours, based on the DRDM proposed in Section 2 and calibrated by the method proposed in Section 3; (4) DRDM-Sun, based on the DRDM proposed in the literature [16] and calibrated by the method proposed in Section 3; (5) DRDM-Li, based on the DRDM proposed in the literature [17] and calibrated by the method proposed in Section 3.

Since the Zhang method requires at least 12 checkerboard images to accurately calibrate the internal camera parameters [9], and the EPnP algorithm requires at least 10 points to obtain accurate external parameters [22], a total of 30 checkerboard images and 15 3D points are used in the Zhang + EPnP method. The calibration method proposed in this paper uses 5 3D points and 6 checkerboard images. The type and quantity of calibrators for each method are listed in Table 1. The internal parameters of the LFOV stereo vision system are shown in Table 2. For the left camera, s1 = 542.32 mm and s2 = 22810.26 mm; for the right camera, s1 = 661.69 mm and s2 = 22742.84 mm. The external parameters between the left and right cameras are shown in Table 3, where RLR and tLR are the structural parameters of the LFOV stereo vision system.

Table 1. Calibrators types

Table 2. Internal parameter calibration results

Table 3. External parameter calibration results

4.2. Model performance analysis

To demonstrate that our method can indeed mitigate some of the overfitting problems, we conducted three groups of calibration experiments: A (5p + 1pic, i.e., 5 3D calibration points and 1 checkerboard picture), B (10p + 1pic), and C (15p + 1pic). Each group includes 100 calibration experiments, and the experimental environment is the same as in Section 4.1. Figure 8 shows the calibration results for A, B, and C separately. For each group, row (1) shows the performance of the different methods on the calibration datasets, and row (2) shows their performance on the test datasets. Columns (a), (b), (c), and (d) represent the performance of CDMM, DRDM-Ours, DRDM-Sun, and DRDM-Li, respectively.

Fig. 8. Reprojection errors in different groups.

From Fig. 8, we find the following. (1) As the number of calibration points decreases, the reprojection error of all methods on the calibration datasets becomes smaller while the reprojection error on the test datasets increases, indicating that all methods suffer from overfitting. (2) In all experiment groups, DRDM-Ours, DRDM-Sun, and DRDM-Li have smaller reprojection errors on the calibration datasets than CDMM, and DRDM-Ours has the smallest reprojection error on the test datasets compared with DRDM-Sun and DRDM-Li. This illustrates that our model resists overfitting better than the Sun and Li models.

4.3. Calibration accuracy analysis

To establish the effectiveness of the proposed method, we conducted 10 LFOV camera calibration experiments. The calibration objects used in each experiment are shown in Table 1. The procedure involved constructing the space illustrated in Fig. 6, fixing the monocular camera (with an 8 mm lens and a resolution of 1920 × 1080), capturing images of the calibration objects, and extracting the image points with the Harris method. We categorize the points into three datasets: the 3D point test set, the checkerboard calibration dataset, and the 3D calibration dataset. Calibration was performed using the approach outlined in Section 3. The RMS of the reprojection error of each method on the 3D test dataset is used as the evaluation standard: the smaller the RMS, the higher the calibration accuracy. Figure 9 shows the RMS of the reprojection error of each method on the 3D test dataset and the 3D calibration dataset in all monocular camera calibration experiments. Rows (1) and (2) represent the reprojection errors on the 3D calibration dataset and the 3D test set, respectively. Columns (a) to (e) in Fig. 9 represent the reprojection errors of the different methods.

Fig. 9. Reprojection errors of different methods.

Firstly, the RMS of reprojection errors of CDMM is chosen as a baseline to evaluate the performance of DRDM-Sun, DRDM-Li, and DRDM-Ours. Our findings in Fig. 9 reveal that the Sun and Li models exhibit a considerable decrease in RMS on the calibration dataset, nearly 50%. However, their RMS increases on the test dataset: DRDM-Sun by 9.6% and DRDM-Li by 55%. In contrast, DRDM-Ours exhibits a decline in RMS on both the calibration and test datasets compared with CDMM: a 21% decrease on the calibration dataset and a 10% reduction on the test dataset. This experimental result further proves that our model can resist the overfitting seen in DRDM-Sun and DRDM-Li.

Secondly, it can be seen from Fig. 9 that DRDM-Ours has the lowest RMS on the 3D test dataset, reducing the test dataset reprojection error RMS by at least 18% compared with DRDM-Sun and DRDM-Li. In addition, DRDM-Ours performs better than CDMM on both the calibration and test datasets: compared with CDMM, its RMS is reduced by 21% and 10%, respectively. Compared with the Zhang + EPnP method, we achieve a 9% reduction in RMS on the test dataset. Consequently, the calibration accuracy of the proposed method on the test set is verified.

Finally, we observe that the reprojection error on the u-axis is consistently higher than that on the v-axis across all calibration methods. We hypothesize that this discrepancy may be attributed to additional errors introduced in the u-axis direction during the image feature point extraction process.

4.4. Length measurement accuracy verification

To verify the accuracy of DRDM-Ours, we reconstruct all points in the 3D test dataset and evaluate the accuracy of the different methods (DRDM-Sun, DRDM-Li, CDMM, and Zhang + EPnP) by comparing the length between any two points. DRDM-Ours, DRDM-Sun, and DRDM-Li adopt the reconstruction method in the literature [16], while CDMM and Zhang + EPnP adopt the reconstruction method in the literature [11]. In this experiment, the total station measurement is taken as the true value, from which the relative measurement error ${e_i}$ of each method is easily calculated. There are 39 points in the 3D test set and 741 ways to choose 2 of them, yielding 741 relative measurement errors. The RMS of these 741 relative measurement errors, ${e_{rms}}$, is used as the accuracy criterion in this paper.

$${e_{rms}} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{e_i}^2} }}{n}} (n = 741)$$

The smaller the ${e_{rms}}$, the more accurate the algorithm. To ensure the reliability of the experiment, 6 measurement experiments were performed randomly. The ${e_{rms}}$ values are shown in Table 4, where Li (i = 1∼6) denotes the ith experiment. Table 4 shows that DRDM-Ours has the highest measurement accuracy, improving on Zhang + EPnP and CDMM by around 25%. Other models, such as DRDM-Sun and DRDM-Li, do not perform well in terms of measurement error: although they perform well on the reprojection error of the calibration dataset, they tend to overfit and therefore perform poorly in relative measurement error. Notably, the standard deviation (std) of our method is larger than that of DRDM-Sun. However, this does not mean that our method is inferior to Sun's: the std in Table 4 mainly characterizes the stability of the errors across multiple measurements. Moreover, as Table 4 shows, even our highest RMS value is lower than Sun's lowest, which demonstrates that our method is better.
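For completeness, a small sketch of the ${e_{rms}}$ statistic of Eq. (30), assuming reconstructed and total-station point arrays in corresponding order (array names and the helper are illustrative):

```python
import numpy as np
from itertools import combinations

def length_error_rms(P_meas, P_true):
    """RMS of relative length errors (Eq. 30) over all point pairs of the
    3D test set; 39 points give C(39, 2) = 741 pairwise lengths."""
    errs = []
    for i, j in combinations(range(len(P_true)), 2):
        d_true = np.linalg.norm(P_true[i] - P_true[j])
        d_meas = np.linalg.norm(P_meas[i] - P_meas[j])
        errs.append((d_meas - d_true) / d_true)   # relative error e_i
    errs = np.asarray(errs)
    return np.sqrt(np.mean(errs ** 2))
```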

Table 4. Relative measurement errors

4.5. Calibration efficiency analysis

To verify the efficiency of our method, we conduct two sets of experiments. First, we fix the number of 3D calibration points at five and divide the experiments into 15 groups according to the number of checkerboards. Second, we fix the number of 3D calibration points at fifteen and again divide the experiments into 15 groups by the number of checkerboards. Each group contains 100 random LFOV camera calibration experiments, as in Section 4.2. We record the standard deviation of the 100 sets of calibration parameters (focal length, principal point, and radial distortion coefficients) and the RMS of reprojection errors on the 3D test data for each group. To illustrate the rationality of DRDM-Ours, we also conducted 100 calibration experiments with the Zhang + EPnP algorithm, using 15 3D calibration points and 30 checkerboard pictures. The correlation between the number of checkerboards and the standard deviations of the principal distance and principal point coordinates is illustrated in Fig. 10(a) and Fig. 10(c), respectively. Figure 10(b) and Fig. 10(d) depict the relationship between the number of checkerboards and the standard deviations of the distortion coefficients. The relationship between the number of checkerboards and the RMS of reprojection errors on the 3D test dataset is presented in Fig. 11(a) and Fig. 11(b) for the two experiments, respectively.

Fig. 10. Parameter stability analysis.

Fig. 11. Analysis of the Number of Checkerboards.

Firstly, we analyze the correlation between the number of checkerboards and parameter uncertainty in the proposed method. Figure 10 indicates that as the number of checkerboards increases, the standard deviations of the parameters decrease; the decreasing trend gradually levels off, and most parameters converge at 14 checkerboards. Figure 10(a) reveals that when the number of calibration objects is limited (5p + 3pic), our approach still yields acceptable parameter uncertainty. Notably, comparing Fig. 10(a) and (c) shows that integrating distant 3D points (15p + 3pic) significantly diminishes calibration uncertainty. Nevertheless, for LFOV measurements, integrating 3D points incurs considerable operational expense and is relatively rigid. By increasing the number of nearby checkerboards (5p + 15pic), our method achieves results identical to incorporating a greater number of distant 3D points. Consequently, our methodology offers a more adaptable alternative whereby the stability of the calibrated parameters can be improved simply by including additional checkerboards.

Secondly, we investigate the correlation between the number of checkerboards and the RMS of the reprojection error on the test set. Figure 11(a) illustrates that with five 3D calibration points, both DRDM-Ours and CDMM exhibit RMS convergence at around 14 checkerboards. In contrast, Fig. 11(b) shows that with 15 3D calibration points, the RMS of CDMM varies insignificantly as the number of checkerboards increases, while the RMS of DRDM-Ours stabilizes at around 12 checkerboards. When the number of checkerboards exceeds 2, our method consistently outperforms CDMM and Zhang + EPnP, suggesting that our approach attains excellent precision even with a limited number of calibration objects.

Finally, we further analyze the distinctive nature of our approach by selecting four scenarios: A (5p + 3pic), B (5p + 15pic), C (15p + 3pic), and D (15p + 15pic). Table 5 shows the RMS of reprojection errors and the calibrator composition for each scenario. Comparing A, B, and C, we find that DRDM-Ours can achieve the effect of adding distant 3D points by increasing the number of nearby checkerboards, whereas CDMM cannot. Moreover, C and D highlight that while additional checkerboards fail to enhance the accuracy of CDMM when numerous 3D points are involved in the calibration, our method can further improve precision by increasing the number of checkerboards. Consequently, calibration stability and accuracy can be improved by adding close-range checkerboards. Compared with arranging a large number of 3D points in the external field, our method maintains calibration accuracy while improving calibration efficiency to a certain extent.

Table 5. Calibration efficiency analysis

5. Conclusion

This study proposes a calibration method for the LFOV stereo vision system based on a distance-related distortion model (DRDM). The main contributions of our work are as follows. (1) We propose a new DRDM. Compared with the DRDMs proposed by Sun and Li, our model performs better on the reprojection error of the 3D test dataset and alleviates overfitting. (2) We propose a new calibration method that utilizes two types of calibration objects. Compared with Zhang + EPnP, our method requires fewer calibration objects and achieves satisfactory results in the experiments; compared with CDMM, our approach offers greater flexibility in improving calibration precision. (3) Our method introduces geometric constraints such as straight lines and vector cross products to improve calibration accuracy. Experiments demonstrate the effectiveness of the proposed method.

Although our method reduces the number of 3D calibration points compared with Zhang + EPnP, it still needs at least 5 long-range 3D calibration points to calibrate accurately. Also, while our method alleviates overfitting, it cannot eliminate this phenomenon entirely. Hence, several research directions can be explored in the future: (1) optimizing the 3D calibration points to achieve high measurement accuracy with a minimal number of points, and (2) addressing overfitting issues that stem from inadequate calibration objects. Such research can further improve calibration efficiency and accuracy, which is particularly meaningful for LFOV stereo vision systems.

Funding

National Natural Science Foundation of China (52005028, 52127809).

Disclosures

The authors declare that there are no conflicts of interest related to this article.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. Y. Zhang, W. Liu, F. Wang, Y. Lu, W. Wang, F. Yang, and Z. Jia, “Improved separated-parameter calibration method for binocular vision measurements with a large field of view,” Opt. Express 28(3), 2956–2974 (2020). [CrossRef]  

2. H. Hu, B. Wei, S. Mei, J. Liang, and Y. Zhang, “A two-step calibration method for vision measurement with large field of view,” IEEE Trans. Instrum. Meas. 71, 1–10 (2022). [CrossRef]  

3. J. Huo, H. Zhang, Z. Meng, F. Yang, and G. Yang, “A flexible calibration method based on small planar target for defocused cameras,” Opt. Lasers Eng. 157, 107125 (2022). [CrossRef]  

4. D. Li, X. Pan, Z. Fu, L. Chang, and G. Zhang, “Real-time accurate deep learning-based edge detection for 3-D pantograph pose status inspection,” IEEE Trans. Instrum. Meas. 71, 1–10 (2022). [CrossRef]  

5. X. Chen, J. Lin, Y. Sun, H. Ma, and J. Zhu, “Analytical solution of uncertainty with the GUM method for a dynamic stereo vision measurement system,” Opt. Express 29(6), 8967–8984 (2021). [CrossRef]  

6. Y.I. Abdel-Aziz, H.M. Karara, and M. Hauck, “Direct linear transformation from comparator coordinates into object space coordinates in close-range photogrammetry,” Photogramm. Eng. Remote Sens. 81(2), 103–107 (2015). [CrossRef]  

7. Y. Yakimovsky and R. Cunningham, “A system for extracting three-dimensional measurements from a stereo pair of TV cameras,” Comput. Graph. Image Process. 7(2), 195–210 (1978). [CrossRef]  

8. R. Tsai, “A versatile camera calibration technique for high-accuracy 3D machine vision metrology using off-the-shelf TV cameras and lenses,” IEEE J. Robot. Automat. 3(4), 323–344 (1987). [CrossRef]  

9. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Machine Intell. 22(11), 1330–1334 (2000). [CrossRef]  

10. M. Róg and A. Rzonca, “The impact of photo overlap, the number of control points, and the method of camera calibration on the accuracy of 3D model reconstruction,” Geomatics Environ. Eng. 15(2), 67–87 (2021). [CrossRef]  

11. S. Yang, Y. Gao, Z. Liu, and G. Zhang, “A calibration method for binocular stereo vision sensor with short-baseline based on 3D flexible control field,” Opt. Lasers Eng. 124, 105817 (2020). [CrossRef]  

12. A.A. Magill, “Variation in distortion with magnification,” J. Opt. Soc. Am. 45(3), 148–152 (1955). [CrossRef]  

13. D.C. Brown, “Close-range camera calibration,” Photogramm. Eng. 37(8), 855–866 (1971).

14. J.G. Fryer and D.C. Brown, “Lens distortion for close-range photogrammetry,” Photogramm. Eng. Remote Sens. 52, 51–58 (1986).

15. L. Alvarez, L. Gómez, and J.R. Sendra, “Accurate depth dependent lens distortion models: an application to planar view scenarios,” J. Math. Imaging Vis. 39(1), 75–85 (2011). [CrossRef]  

16. P. Sun, N. Lu, and M. Dong, “Modelling and calibration of depth-dependent distortion for large depth visual measurement cameras,” Opt. Express 25(9), 9834–9847 (2017). [CrossRef]  

17. X. Li, W. Li, X. Ma, X. Yin, X. Chen, and J. Zhao, “Spatial light path analysis and calibration of four-mirror-based monocular stereo vision,” Opt. Express 29(20), 31249–31269 (2021). [CrossRef]  

18. A. Ranganathan, “The Levenberg-Marquardt algorithm,” Tutorial on LM algorithm 11(1), 101–110 (2004).

19. G. Nakano, “A simple direct solution to the perspective-three-point problem,” in British Machine Vision Conference (BMVC) (2019), p. 26.

20. F. Devernay and O. Faugeras, “Straight lines have to be straight,” Mach. Vis. Appl. 13(1), 14–24 (2001). [CrossRef]  

21. C. Harris and M. Stephens, “A combined corner and edge detector,” in Alvey Vision Conference (AVC) (1988), pp. 147–151.

22. V. Lepetit, F. Moreno-Noguer, and P. Fua, “EPnP: An accurate O(n) solution to the PnP problem,” Int. J. Comput. Vis. 81(2), 155–166 (2009). [CrossRef]  
