## Abstract

In this paper, a new method to calibrate a trinocular vision sensor is presented. A planar target with several parallel lines is utilized. The trifocal tensor of the three image planes can be calculated from line correspondences, and a compatible essential matrix between each pair of cameras can then be obtained. The rotation matrix and translation matrix are deduced based on singular value decomposition of the corresponding essential matrix. In our proposed calibration method, image rectification is carried out to remove perspective distortion. As the features utilized are straight lines, precise point-to-point correspondence is not necessary. Experimental results show that our proposed calibration method obtains precise results. Moreover, the trifocal tensor can also give a strict constraint for feature matching, as described in our previous work. The root mean square error of measured distances is 0.029 mm with regard to a view field of about 250×250 mm. As parallel features exist widely in natural scenes, our calibration method also provides a new approach for self-calibration of a trinocular vision sensor.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

For a binocular vision sensor, the cameras are fixed during measurement. Determining the relationship between the two cameras is significant and is known as calibration. Heretofore, various calibration methods exist [1–3], such as planar target-based methods, 1D target-based methods and so on. In a planar target-based method [4–6], different features on the planar target are utilized, e.g. centers of circles, corners, cross points, and so on. In any case, the relationships of these features are known exactly. When images of these features are captured by two cameras simultaneously, the coordinates of points (or expressions of features) under each camera coordinate system can be deduced from the camera imaging model and the constraints of the features. Once these corresponding features are confirmed, the relationship between the two cameras can be determined. In a 1D target-based calibration method [7,8], geometrical features are normally utilized, such as the co-linearity feature, the invariance of the double cross ratio and so on. Similar to the planar target-based calibration method, as the relative locations of features are known beforehand, the relationship between the two cameras in a binocular vision sensor can be confirmed.

Based on a calibrated binocular vision sensor, 3D reconstruction gives good results, though mismatching and mismeasurement occur occasionally. For this reason, trinocular vision sensors have been designed [9,10]. As with a binocular vision sensor, the relationship between each pair of cameras (including rotation matrix and translation matrix) should be confirmed. Traditionally, each pair of cameras in the trinocular vision sensor is treated as one binocular vision sensor. The three binocular vision sensors obtain three sets of 3D data when an object is measured, and data fusion is conducted to get a relatively stable measurement result.

Liu et al. [11] proposed a calibration method for a trinocular vision sensor with non-overlapping views using a 1D target. The rotation matrix between two adjacent vision sensors is computed according to the co-linearity of feature points on the 1D target. Then the translation matrix can be deduced from the known distances between feature points on the target. Wei et al. [12] calibrated a multi-camera system based on laser scanning. In this method, two lasers are mounted on a turntable. The relationship between the lasers and the turntable can be confirmed based on a hand-eye calibration method. When the light planes are projected into the field of view of each camera, the relationship between the camera and the laser plane can be obtained. Then the external parameters between each camera and the turntable are calculated. As the turntable is fixed, the external parameters of the multi-camera system are confirmed. Abedi et al. [13] calibrated a multi-camera imaging system in circular arrangement using a three-dimensional calibration object with new patterns. The camera projection matrix is estimated based on the extrinsic parameters, which are calculated from a scheme of group geometric features. These traditional calibration methods for a trinocular vision sensor (or a multi-camera system) treat each pair of cameras as a binocular vision sensor. In this case, the relationship between each pair of cameras can be confirmed, but the results are incompatible. Moreover, as precise point-to-point correspondence is necessary, perspective distortion is easily involved.

In this paper, a new calibration method for a trinocular vision sensor is presented. A planar target with several parallel lines is utilized for the calibration. In our method, the three cameras are treated as one sensor. The trifocal tensor, which encodes the geometric relations between the three camera views, is first calculated. Compatible rotation matrices and translation matrices are then deduced from the trifocal tensor and the parallel features on the target. This paper is organized as follows. First, the trinocular vision sensor is introduced, including the traditional description and a new description that treats the three cameras as one. After the related principles of the trifocal tensor are detailed (in Section 2), a calibration method based on parallel features is proposed in Section 3 and the calibration procedure is detailed. Experiments (Section 4) are then conducted to analyze and verify our calibration method. Finally, a conclusion is given in Section 5.

## 2. Related description

#### 2.1. Trinocular vision sensor

A trinocular vision sensor consists of three cameras. The measurement area of the sensor is the overlapping FOV (field of view) of the cameras. When each pair of cameras in the trinocular vision sensor is treated as one binocular vision sensor (as illustrated in Fig. 1), we get three binocular vision sensors.

After calibration, the relationship between the two cameras in each binocular vision sensor can be confirmed separately.

When a target object is located in the measurement area of a trinocular vision sensor, 3D points $P_n^i$ are reconstructed by each binocular vision sensor. The final reconstructed data are then obtained by data fusion to remove additive noise; a simple method is the averaging algorithm, i.e.
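The per-point averaging fusion described above can be sketched in a few lines. The function name and the point-list layout are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fuse_by_averaging(pts_a, pts_b, pts_c):
    """Fuse three reconstructions of the same 3D points (one per
    binocular pairing) by simple per-point averaging."""
    pts = [np.asarray(p, dtype=float) for p in (pts_a, pts_b, pts_c)]
    assert pts[0].shape == pts[1].shape == pts[2].shape
    return sum(pts) / 3.0

# One 3D point reconstructed by the three pairings, with small noise:
p = fuse_by_averaging([[1.0, 2.0, 3.0]],
                      [[1.2, 2.1, 2.9]],
                      [[0.7, 1.9, 3.1]])
```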

When the three cameras in a trinocular vision sensor are treated as one sensor, corresponding features in the three images captured by the cameras satisfy the constraint of a trifocal tensor. Then we can get compatible transformation matrices between each pair of cameras. Define the camera matrix as ${P_n} = {\textbf{K}_\textbf{n}}[R_n^G|T_n^G]$, where ${\textbf{K}_\textbf{n}}$ is the intrinsic parameter matrix of camera *n*, $R_n^G$ is the rotation matrix from the global coordinate system (GCS) to the camera coordinate system (CCS), and $T_n^G$ is the translation matrix. The relationship of the trinocular vision sensor can be expressed as

Define the spatial point as *P* under GCS. Then we can get the expression as follows

where each row of matrix *A* is built from the *i*th row of camera matrix ${P_n}$. The spatial point can be confirmed by singular value decomposition (SVD) of matrix *A*, i.e. the solution corresponding to the smallest singular value of matrix *A*.
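The SVD-based reconstruction described above follows the standard linear (DLT) triangulation scheme. A minimal sketch, with hypothetical names and a synthetic two-camera check, might look like:

```python
import numpy as np

def triangulate(cam_matrices, image_points):
    """Linear (DLT) triangulation: stack two equations per view built
    from the rows of each 3x4 camera matrix, then take the right
    singular vector of A for the smallest singular value."""
    rows = []
    for P, (u, v) in zip(cam_matrices, image_points):
        P = np.asarray(P, dtype=float)
        rows.append(u * P[2] - P[0])   # u * p3^T - p1^T
        rows.append(v * P[2] - P[1])   # v * p3^T - p2^T
    A = np.array(rows)
    _, _, Vt = np.linalg.svd(A)
    X = Vt[-1]
    return X[:3] / X[3]                # de-homogenize

# Synthetic check with two simple cameras observing point (1, 2, 5):
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X = np.array([1.0, 2.0, 5.0, 1.0])
x1 = P1 @ X; x1 = x1[:2] / x1[2]       # projection in view 1
x2 = P2 @ X; x2 = x2[:2] / x2[2]       # projection in view 2
X_hat = triangulate([P1, P2], [x1, x2])
```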

#### 2.2. Trifocal tensor

Defining the trifocal tensor as $\textbf{T} = [{\textbf{T}_\textbf{1}}\textbf{,}{\textbf{T}_\textbf{2}}\textbf{,}{\textbf{T}_\textbf{3}}]$, we can get the compatible fundamental matrix of each pair of cameras as

Details can be found in Ref. [14].
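For illustration, the epipole-based recovery of compatible fundamental matrices from a trifocal tensor (the recipe in the cited Hartley & Zisserman text) can be sketched as follows. The function names and the synthetic canonical-camera example are our own assumptions:

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix with skew(v) @ x == cross(v, x)."""
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def null_vec(M):
    """Right singular vector for the smallest singular value of M."""
    return np.linalg.svd(np.asarray(M, dtype=float))[2][-1]

def fundamentals_from_trifocal(T):
    """Compatible F21, F31 from a 3x3x3 trifocal tensor T: epipoles
    come from the null spaces of the slices T_i, then
    F21 = [e2]_x [T_1, T_2, T_3] e3 and F31 = [e3]_x [T_1^T, ...] e2."""
    u = np.array([null_vec(T[i].T) for i in range(3)])  # left null vectors
    v = np.array([null_vec(T[i]) for i in range(3)])    # right null vectors
    e2 = null_vec(u)                                    # epipole in view 2
    e3 = null_vec(v)                                    # epipole in view 3
    F21 = skew(e2) @ np.stack([T[i] @ e3 for i in range(3)], axis=1)
    F31 = skew(e3) @ np.stack([T[i].T @ e2 for i in range(3)], axis=1)
    return F21, F31

# Synthetic tensor from canonical cameras P1=[I|0], P2=[I|a4], P3=[I|b4],
# where T_i = a_i b4^T - a4 b_i^T with a_i, b_i the columns of I:
a4, b4 = np.array([1., 2., 3.]), np.array([3., 1., 2.])
I = np.eye(3)
T = np.stack([np.outer(I[:, i], b4) - np.outer(a4, I[:, i]) for i in range(3)])
F21, F31 = fundamentals_from_trifocal(T)  # expected: F21 ∝ [a4]_x, F31 ∝ [b4]_x
```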

## 3. Calibration

In a trinocular vision sensor, the relationship between each pair of cameras is fixed. Determining this relationship is known as calibration of the trinocular vision sensor. In this section, a calibration method based on a planar target with several equally spaced parallel lines is detailed.

In the trinocular vision sensor, the coordinate system of camera one (CCS1) is defined as the global coordinate system. The target carries more than three equally spaced parallel lines as illustrated in Fig. 2 (in fact, at least 13 lines are needed to calculate the trifocal tensor), and the distance between each two adjacent lines is defined as *dis*. We define line *i* on the target plane (${\Pi _p}$) as ${L_i}$, and its projection on the image plane of camera *n* as $l_n^i$. Point ${O_n}$ is the optical center of camera *n*, i.e. the origin of the coordinate system of camera *n* (CCSn). In this case, point ${O_n}$, line ${L_i}$ and line $l_n^i$ are coplanar; this plane is denoted as $\Pi _n^i$.

#### 3.1. Image rectification

As the target used in our calibration method is planar, perspective distortion is inevitable and correction should be carried out. As is known, the projection of a straight line on an image plane is still straight. The first constraint function can be given as [15]:

where point *j* locates on the line ${l_i} = ({a_i},{b_i},{c_i}).$ In computer vision, a set of parallel lines in 3D space projects onto the image plane under perspective geometry and intersects at one point, i.e. the vanishing point. As described in Ref. [14], the vanishing line can also be deduced from the lines $l_n^i$:

Defining vanishing point *s* as $p_s^v = {(u_s^v,v_s^v,1)^\textbf{T}},$ we can get the constraint as

for vanishing point *s*. Based on Eq. (8), Eq. (10) and Eq. (11), the full constraint function to rectify the captured image can be given as
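As a small illustration of the vanishing-point constraint used here, the common intersection of a pencil of image lines can be estimated by SVD. The names and the toy data below are hypothetical, not the paper's:

```python
import numpy as np

def vanishing_point(lines):
    """Least-squares common point of image lines (a, b, c), each with
    a*u + b*v + c = 0: the right singular vector of the stacked line
    matrix for the smallest singular value."""
    L = np.asarray(lines, dtype=float)
    _, _, Vt = np.linalg.svd(L)
    p = Vt[-1]
    return p / p[2]   # assumes a finite intersection (p[2] != 0)

# Three lines through the point (2, 3): u = 2, v = 3, u + v = 5.
vp = vanishing_point([(1, 0, -2), (0, 1, -3), (1, 1, -5)])
```

With noisy extracted lines the same call returns the algebraic least-squares intersection rather than an exact one.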

#### 3.2. Rotation matrix

The essential matrix ${\textbf{E}_\textbf{n}}$ can be calculated according to Eq. (4) and Eq. (6). Furthermore, matrix ${\textbf{E}_\textbf{n}}$ can be decomposed as

where the singular values satisfy ${\sigma _1} = {\sigma _2} = \sigma ,$ namely matrix ${\textbf{E}_\textbf{n}}$ has two equal singular values and one zero singular value. The transformation matrices, including the rotation matrix and the translation matrix with a scale factor $\kappa $, can be deduced based on the SVD of matrix ${\textbf{E}_\textbf{n}}$. The four possible solutions of the rotation matrix and translation matrix are listed below, where *U* comes from the SVD of ${\textbf{E}_\textbf{n}}$ and $Z = \left[ {\begin{array}{ccc} 0&1&0\\ { - 1}&0&0\\ 0&0&1 \end{array}} \right].$ In Eq. (14), ${A_n}$ and ${C_n}$ (similarly ${B_n}$ and ${D_n}$) have reversed baselines, while ${A_n}$ and ${D_n}$ (similarly ${B_n}$ and ${C_n}$) are rotated 180° about the baseline. In a trinocular vision sensor, sixteen groups of solutions will be obtained. The possible groups of solutions are listed in Table 1, and six of these groups are illustrated in Fig. 3.

Among these sixteen groups, there is only one group of solutions satisfying the condition that all reconstructed points locate in front of all cameras. Then the compatible rotation matrices (from CCS2 to GCS and from CCS3 to GCS) and the compatible translation matrices (from CCS2 to GCS and from CCS3 to GCS) with a scale factor can be confirmed according to the procedure described in Ref. [16].
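The four-solution decomposition of an essential matrix can be sketched as below, using the common textbook W-matrix convention (which plays the role of the paper's Z up to transpose). The cheirality check that selects the single valid group is omitted, and all names are illustrative:

```python
import numpy as np

def skew(v):
    return np.array([[0., -v[2], v[1]],
                     [v[2], 0., -v[0]],
                     [-v[1], v[0], 0.]])

def decompose_essential(E):
    """Four (R, t) candidates from an essential matrix via SVD; the
    physically valid pair is chosen later by a points-in-front test."""
    U, _, Vt = np.linalg.svd(E)
    # Enforce proper rotations (det = +1) for the candidates below.
    if np.linalg.det(U) < 0: U = -U
    if np.linalg.det(Vt) < 0: Vt = -Vt
    W = np.array([[0., -1., 0.],
                  [1., 0., 0.],
                  [0., 0., 1.]])
    t = U[:, 2]                        # baseline, up to scale and sign
    return [(U @ W @ Vt, t), (U @ W @ Vt, -t),
            (U @ W.T @ Vt, t), (U @ W.T @ Vt, -t)]

# Round trip: E = [t]_x R should yield R (and ±t) among the candidates.
th = 0.3
R_true = np.array([[np.cos(th), -np.sin(th), 0.],
                   [np.sin(th), np.cos(th), 0.],
                   [0., 0., 1.]])
t_true = np.array([1., 0., 0.])
candidates = decompose_essential(skew(t_true) @ R_true)
```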

#### 3.3. Translation matrix

### 3.3.1. Determining expression of target plane

Taking the planes under CCS1 as an example, the related planes are expressed as

where **K** is the intrinsic parameter matrix of the camera, *l* is the homogeneous coordinate of the vanishing line, and $\vec{n}$ is the normal vector of the scene plane, i.e. the target plane in this paper.

In this case, normal vector ${\vec{n}_n}$ of the target plane under CCSn can be calculated from Eq. (16) and Eq. (9):

where ${\textbf{K}_n}$ is the intrinsic parameter matrix of camera *n*. When the target plane under GCS is defined, its normal vector is $\vec{n}_p^1 = (a,b,c).$ Then the direction vector of ${L_n}$ can be expressed in terms of $\vec{n}_n^i = ({A_i},{B_i},{C_i})$ defined in Eq. (15), and we can get two points ($P_i^1$ and $P_i^2$) locating on line ${L_i}$ according to Eq. (15) and Eq. (18). As the distance between each two adjacent lines is known (*dis*), we can get the relationship as follows, where the symbol *dot* indicates the dot product of two vectors and $\vec{v}$ is a normal vector expressed by $\vec{v} = {\vec{n}_{Ln}} \times \vec{n}_p^n.$ In this case, parameter *d* in Eq. (18) can be confirmed. Until now, the expression of the target plane has been worked out.

### 3.3.2. Determining translation matrix

As CCS1 is treated as the GCS, define the rotation matrix from CCSn to GCS as $R_n^G$ and the translation matrix as $T_n^G$ (n = 2, 3). The relationship between the coordinate systems can be expressed as
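Chaining the calibrated extrinsics also gives the relative transform between any two cameras. A minimal sketch under the assumed convention $x_n = R_n^G x_G + T_n^G$ (function and variable names are ours):

```python
import numpy as np

def relative_transform(R2, T2, R3, T3):
    """Given world-to-camera transforms x_n = R_n @ x_G + T_n for
    cameras 2 and 3, return (R23, T23) with x_3 = R23 @ x_2 + T23."""
    R23 = R3 @ R2.T
    T23 = T3 - R23 @ T2
    return R23, T23

# Synthetic check: map one world point into both camera frames.
th = 0.5
R2 = np.array([[np.cos(th), -np.sin(th), 0.],
               [np.sin(th), np.cos(th), 0.],
               [0., 0., 1.]])
T2 = np.array([0.1, -0.2, 1.0])
R3, T3 = np.eye(3), np.array([1.0, 2.0, 3.0])
x_g = np.array([0.3, 0.7, 2.0])
x2, x3 = R2 @ x_g + T2, R3 @ x_g + T3
R23, T23 = relative_transform(R2, T2, R3, T3)
```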

## 4. Experiments and discussion

In this section, a trinocular vision sensor is designed. Three CCD cameras (*AVT Stingray F-504B*) with a maximum resolution of 2452×2056 pixels are utilized (as illustrated in Fig. 4). In order to obtain a high frame rate, resolution of each camera is reduced to 1600×1200 pixels in our experiment.

In our experiment, each camera is calibrated by Zhang’s calibration method [17]. The obtained intrinsic parameters are listed in Table 2.

In Table 2, *f*_{x} and *f*_{y} represent the scale factor in *x*-coordinate direction and *y*-coordinate direction respectively. ${({u_0},{v_0})^\textbf{T}}$ are the coordinates of the principal point in terms of pixel dimensions. The planar target to calibrate a trinocular vision sensor in our paper and the target used to evaluate our calibration results are illustrated in Fig. 5.

#### 4.1. Image rectification

According to the rectification method detailed in Section 3.1, we first compensate for image distortion before calibration and measurement. Line features are extracted based on Steger's algorithm [18] (not shown) and the related images are illustrated in Fig. 6. The first row of Fig. 6 shows the original images, while the second row shows their rectified counterparts.

#### 4.2. Calibration results

The trifocal tensor of the three cameras in the trinocular vision sensor is determined based on line correspondences. The obtained trifocal tensor and its corresponding compatible fundamental matrices and transformation matrices are listed in Table 3.

#### 4.3. Accuracy evaluation

A one-dimensional (1D) target as illustrated in Fig. 5(b) is utilized to evaluate our calibration method. The distance between each two adjacent feature points (*D*_{real}) is known exactly. The feature points are reconstructed based on our calibration results, with feature matching conducted according to the algorithm in our previous work [8]. In this case, the measured distance (*D*_{mea}) can also be calculated. All measured values are compared with their corresponding true value (40.00 mm). When the 1D target is moved to more positions randomly, we obtain enough distances to evaluate our calibration results (as illustrated in Fig. 7). Part of the evaluation results are listed in Table 4.

As the planar target used in our calibration experiment has an accuracy of only 0.01 mm, the RMS error of measured distance is 0.029 mm and the relative calibration accuracy is around 0.116‰ with regard to the view field of about 250×250 mm.

#### 4.4. Comparison with traditional calibration method

One typical calibration method for a trinocular vision sensor is the planar target calibration method based on Zhang's algorithm [17]. In this calibration method, each pair of cameras is treated as one binocular vision sensor. The transformation matrix of each pair of cameras, including rotation matrix and translation matrix, can be confirmed easily as the relative positions of the feature points are known exactly.

For the purpose of comparison, we calibrate the trinocular vision sensor using the typical planar target calibration method described above (referred to as Zhang's method in this section). The utilized planar target carries a checkerboard pattern, and its material is the same as that of the planar target used in our calibration method (as illustrated in Fig. 5(c)). The relationship between each pair of cameras is confirmed separately. Feature matching and measurement are evaluated based on the calibration results obtained from Zhang's method and from our calibration method respectively.

A laser is projected onto the surface of the object and the light stripe is extracted based on Steger's extraction method [18]. As the extracted center points are dense, one point out of every thirty is taken, and the related feature matching results are illustrated in Fig. 8. The matching results based on Zhang's calibration results are illustrated in Fig. 8(a), while our matching results are illustrated in Fig. 8(b).

From Fig. 8, we can see that the traditional calibration results are incompatible, which leads to matching errors. In contrast, there is no mismatched point in the feature matching results based on our calibration results, which indicates a strict constraint. Moreover, we evaluated Zhang's calibration method following the steps detailed in Section 4.3. The evaluation results are listed in Table 5.

In Table 5, sensor 1 is the binocular vision sensor consisting of camera 1 and camera 2, sensor 2 consists of camera 2 and camera 3, and sensor 3 consists of camera 1 and camera 3. Sensor denotes the trinocular vision sensor, whose measurement results are the average of the measurements from sensor 1, sensor 2 and sensor 3. The RMS error of measured distance (of sensor) is 0.092 mm. As the RMS error of our calibration method is 0.029 mm (as listed in Table 4), measurement precision is improved by nearly 68.48%.
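The reported improvement figure can be checked directly from the two RMS values. The helper below is illustrative, not the paper's evaluation code:

```python
import numpy as np

def rms_error(measured, true_value):
    """Root mean square error of measured distances against a known
    ground-truth spacing (e.g. the 40.00 mm of the 1D target)."""
    d = np.asarray(measured, dtype=float) - true_value
    return float(np.sqrt(np.mean(d ** 2)))

# Relative improvement of the proposed method (RMS 0.029 mm) over the
# pairwise Zhang-based calibration (RMS 0.092 mm), as reported:
improvement = (0.092 - 0.029) / 0.092   # ≈ 0.6848, i.e. ~68.48%
```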

#### 4.5. Application in self-calibration

In many scenes, parallel features exist widely and can be extracted easily. For example, in some industrial areas of mechanical manufacturing and assembly, parallel features can often be detected from reference features such as parallel rows of holes, parallel edges, and parallel grooves or lines on a device (as illustrated in Fig. 9(a) and Fig. 9(b)). In addition, in some natural scenes, such as the sides of floor tiles in a corridor (Fig. 9(c)) or the parallel edges of walls and windows in buildings (Fig. 9(d)), parallel features can also be detected easily. In this case, the trifocal tensor can be calculated (with a minimum of 7 point correspondences across the 3 images, or 13 line correspondences, or a mixture of point and line correspondences).

Then self-calibration of a trinocular vision sensor can be conducted according to our proposed calibration method.

## 5. Conclusion

In this paper, a planar target with several parallel lines is utilized to calibrate a trinocular vision sensor. From images of the target captured by the three cameras simultaneously, the trifocal tensor can be determined. The rotation matrices and translation matrices with a scale factor are deduced by singular value decomposition of the compatible essential matrices, which are derived from the trifocal tensor. As the relationship of the parallel lines is known exactly, the scale factor can be confirmed. The related procedure is detailed. As the features utilized in our calibration method are straight lines, precise point-to-point correspondence is not necessary. Experiments show that our proposed calibration method is precise and robust. The root mean square error of measured distances is 0.029 mm with regard to a view field of about 250×250 mm. Compared with a typical calibration method, the proposed method is more precise and robust. Moreover, the trifocal tensor gives a strict constraint for feature matching. As parallel features exist widely in natural scenes, our calibration method also provides a new approach for self-calibration of a trinocular vision sensor.

## Funding

Natural Science Foundation of Shandong Province (ZR2014FQ023); National Natural Science Foundation of China (41927805, U1706218); Fundamental Research Funds for the Central Universities (201964022).

## Acknowledgments

We thank the Electronic Information Laboratory at Qingdao University of Technology for the use of their equipment.

## Disclosures

The authors declare no conflicts of interest.

## References

**1. **Y. Cui, F. Q. Zhou, Y. Wang, L. Liu, and H. Gao, “Precise calibration of binocular vision system used for vision measurement,” Opt. Express **22**(8), 9134–9149 (2014). [CrossRef]

**2. **J. H. Yang, Z. Y. Jia, W. Liu, C. N. Fan, P. T. Xu, F. J. Wang, and Y. Liu, “Precision calibration method for binocular vision measurement systems based on arbitrary translations and 3D-connection information,” Meas. Sci. Technol. **27**(10), 105009 (2016). [CrossRef]

**3. **Y. Zhang, W. Liu, F. J. Wang, Y. K. Lu, W. Q. Wang, F. Yang, and Z. Y. Jia, “Improved separated-parameter calibration method for binocular vision measurements with a large field of view,” Opt. Express **28**(3), 2956–2974 (2020). [CrossRef]

**4. **Z. Z. Wei, M. W. Shao, and G. J. Zhang, “Parallel-based calibration method for line-structured light vision sensor,” Opt. Eng. **53**(3), 033101 (2014). [CrossRef]

**5. **Z. Z. Wei and X. K. Liu, “Vanishing feature constraints calibration method for binocular vision sensor,” Opt. Express **23**(15), 18897–18914 (2015). [CrossRef]

**6. **B. L. Guan, Y. Shang, and Q. F. Yu, “Planar self-calibration for stereo cameras with radial distortion,” Appl. Opt. **56**(33), 9257–9267 (2017). [CrossRef]

**7. **Z. Z. Wei, L. J. Cao, and G. J. Zhang, “A novel 1D target-based calibration method with unknown orientation for structured light vision sensor,” Opt. Laser Technol. **42**(4), 570–574 (2010). [CrossRef]

**8. **J. H. Sun, Q. Z. Liu, Z. Liu, and G. J. Zhang, “A calibration method for stereo vision sensor with large FOV based on 1D targets,” Opt. Lasers Eng. **49**(11), 1245–1250 (2011). [CrossRef]

**9. **M. Agrawal and L. S. Davis, “Trinocular stereo using shortest paths and the ordering constraint,” Int. J. Comput. Vis. **47**(1/3), 43–50 (2002). [CrossRef]

**10. **Y. P. Ma, Q. W. Li, J. Xing, G. Y. Huo, and Y. Liu, “An intelligent object detection and measurement system based on trinocular vision,” IEEE Trans. Circuits Syst. Video Technol. **30**(3), 711–724 (2020). [CrossRef]

**11. **Z. Liu, G. J. Zhang, Z. Z. Wei, and J. H. Sun, “Novel calibration method for non-overlapping multiple vision sensors based on 1D target,” Opt. Lasers Eng. **49**(4), 570–577 (2011). [CrossRef]

**12. **Z. Z. Wei, W. Zou, G. J. Zhang, and K. Zhao, “Extrinsic parameters calibration of multi-camera with non-overlapping fields of view using laser scanning,” Opt. Express **27**(12), 16719–16737 (2019). [CrossRef]

**13. **F. Abedi, Y. Yang, and Q. Liu, “Group geometric calibration and rectification for circular multi-camera imaging system,” Opt. Express **26**(23), 30596–30613 (2018). [CrossRef]

**14. **R. Hartley and A. Zisserman, *Multiple View Geometry in Computer Vision* (Cambridge University, 2003), Chaps. 12–15.

**15. **C. Ricolfe-Viala, A. J. Sanchez-Salmeron, and A. Valera, “Calibration of a trinocular system formed with wide angle lens cameras,” Opt. Express **20**(25), 27691–27696 (2012). [CrossRef]

**16. **R. Lu and M. W. Shao, “Sphere-based calibration method for trinocular vision sensor,” Opt. Lasers Eng. **90**, 119–127 (2017). [CrossRef]

**17. **Z. Y. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern. Anal. Mach. Intell. **22**(11), 1330–1334 (2000). [CrossRef]

**18. **C. Steger, “Unbiased extraction of lines with parabolic and Gaussian profiles,” Comput. Vis. Image Underst. **117**(2), 97–112 (2013). [CrossRef]