
Adaptive milliseconds tracking and zooming optics based on a high-speed gaze controller and liquid lenses

Open Access

Abstract

High-speed gaze control and high resolution are critical factors for practical monitoring systems. However, the conventional method cannot track and zoom as fast as expected because of its large inertia, and it suffers low resolution because of digital zoom. In this paper, we propose coaxially designed high-speed tracking and zooming optics with an active tracking unit and an optical zooming unit to overcome these issues. The tracking unit keeps the object in the center of view using a pan-tilt mirror controller and a visual feedback tracking algorithm, with a response time on the order of 4 milliseconds. The zooming unit continuously changes the magnification from 1X to 2X with three liquid lenses within milliseconds. In addition, the zooming unit provides a compensation algorithm to achieve accurate zoom and focus.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Vision is one of the critical acquisition tools for monitoring systems [1,2], target tracking systems [3], telescope systems [4], and so on. Practical monitoring systems face several challenges because of the complexity and openness of the real monitoring environment [5,6], the diversity and dynamics of targets [7,8], and the uncertainty of interference factors [9]. High-quality monitoring depends on two major factors: high-speed gaze and high resolution.

The method based on a mechanical pan-tilt is currently the most popular and common solution [10,11]. Fixing the camera directly on a mechanical gimbal allows a wide range of gaze control [12–14]; it is therefore widely used in scenarios such as robot vision and monitoring cameras. However, the mass of the rotating parts, including the camera and the mechanical pan-tilt, produces a large inertia, and the response time of the pan-tilt is usually more than 20 ms [14]. Because of this large inertia and low dynamic performance, the traditional method faces a major technical bottleneck [15] in high-speed tracking. On the other hand, the common solution for magnification in traditional monitoring systems is zooming, either optical or digital. Classic monitoring systems implement digital zoom by interpolation [16]: apparent detail is increased, but image resolution and actual image quality decrease. Moreover, a conventional optical zoom system consists of several solid lenses, mechanical motors, and an image sensor; such a system is bulky and slow to respond [17]. A practical monitoring system, however, must provide both high speed and high resolution, which digital zooming cannot deliver.

In a previous study, an optical high-speed gaze controller system named “Saccade Mirror” was proposed [18]. It consisted of a high-speed mirror controller, a static camera, and pupil shift optics. By controlling the angle of the galvanometer mirrors, it achieved millisecond-order pan-tilt performance. The camera was fixed, and the optical path was adjusted by the mirrors to change the viewing angle [15,19,20]. This approach reduces image-processing computational complexity while enhancing real-time performance by utilizing high-speed visual servo hardware, effective image processing algorithms, parallel computing, and self-windowing. The gaze direction is within ±20°, and the response time can be less than 3 milliseconds [21,22]. This approach reduces inertia and realizes outstanding real-time tracking, but it cannot achieve optical zoom for higher imaging quality.

Besides, a traditional optical zoom system is built with several solid lenses moved along specific trajectories. Nowadays, the liquid-based variable focus lens has been proposed as a novel optical component that can dynamically change its focal length within a single lens cell, without the back-and-forth motion of several solid lenses. The focal length of a liquid lens is varied by adjusting the curvature of its refractive surface, and it can achieve a millisecond-order response time [23,24]. It can be employed in optical systems such as machine vision, microscopy, zoom optics, and projection optics [25–30]. The advantages of the liquid lens are fast response, light weight, and no mechanical movement. Therefore, in this study, we design a fast-focusing and zooming optics unit based on liquid lenses.

To solve the above problems, this study focuses on the adaptive zoom of high-speed tracking and zooming optics and on improving the tracking algorithm according to the object's high-speed, dynamic characteristics. The proposed high-speed tracking and zooming optics has two functional modules [18]: an active tracking unit and an optical zooming unit. The tracking unit keeps a high-speed object in the center of view. The zooming unit, consisting of three liquid lenses and one glass lens, accomplishes non-mechanical optical zoom. In addition, it provides a compensation algorithm to achieve adaptive zoom accurately.

2. Structure and mechanism

2.1 Conceived principle

We propose high-speed tracking and zooming optics consisting of an active tracking unit and an optical zooming unit. The tracking unit consists of a static high-speed camera, pupil shift optics, and a set of galvanometer mirrors. The camera's high frame rate enables high-speed visual feedback, which serves as the feedback signal for pan-tilt control. The camera is stationary, and the mirrors modify the optical path to vary the gaze direction. The goal of the tracking unit is to make the centroid of the object image coincide with the center of the camera view [31]. The pixel bias between the two is calculated, and a Proportional-Integral-Derivative (PID) controller adjusts the rotation angles of the mirrors to make them coincide [15,19].
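As a concrete illustration, the following Python sketch shows one visual-feedback servo step of the kind described above: the pixel bias between the object centroid and the image center is converted into pan/tilt mirror-angle commands by two PID controllers. This is a minimal sketch under assumptions; the class, gains, update period, and pixel-to-degree scale are illustrative placeholders, not the authors' implementation.

```python
class PID:
    """Discrete PID controller for one mirror axis."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, err):
        self.integral += err * self.dt
        deriv = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * deriv

# One controller per mirror axis; dt = 4 ms matches the paper's feedback order.
pan_pid = PID(kp=0.8, ki=0.05, kd=0.01, dt=0.004)    # gains are placeholders
tilt_pid = PID(kp=0.8, ki=0.05, kd=0.01, dt=0.004)

def servo_step(centroid, center, px_to_deg=0.01):
    """One feedback step: pixel bias -> mirror-angle commands in degrees."""
    pan_cmd = pan_pid.step((center[0] - centroid[0]) * px_to_deg)
    tilt_cmd = tilt_pid.step((center[1] - centroid[1]) * px_to_deg)
    return pan_cmd, tilt_cmd
```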

The zooming unit realizes zooming without mechanical movement. It includes three liquid lenses and one glass lens for detailed observation. The zoom group of the zoom objective is liquid lens 1, which adjusts the focal length by changing its optical power. The compensating group is liquid lenses 2 and 3, which offset the conjugate-distance shift of liquid lens 1 by changing their optical powers. In addition, the glass lens compensates for aberrations and provides additional optical power for the zoom objective. The two units are coaxially combined by a beam splitter so that they share the same view, as shown in Fig. 1.

Fig. 1. Schematic diagram of high-speed tracking and zooming optics based on the high-speed gaze controller and zoom objective. It consists of two modules: an active tracking unit and an optical zooming unit.

2.2 Zoom control and focus compensation algorithm

The focus compensation algorithm is described as follows. The image is first processed with a Laplacian of Gaussian (LoG) filter, a conventional edge-detection tool [32]: noise is filtered out with a Gaussian filter before edges are detected with the Laplacian operator. We use the resulting Laplacian values (after LoG processing), specifically their mean and standard deviation, to judge the clarity of images. The focus compensation algorithm is divided into two parts.
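A minimal sketch of such a LoG-based sharpness index using OpenCV is shown below; the kernel sizes and Gaussian sigma are assumptions, not the paper's values.

```python
import cv2
import numpy as np

def log_sharpness(image_bgr):
    """Mean and standard deviation of the LoG response as clarity indices."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    blurred = cv2.GaussianBlur(gray, (5, 5), 1.0)       # suppress noise first
    lap = cv2.Laplacian(blurred, cv2.CV_64F, ksize=3)   # then detect edges
    return float(np.abs(lap).mean()), float(lap.std())
```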

In the first part, the optical-power fitting curve of each liquid lens over the range of focal lengths is obtained by the least squares method (the specific fitted formulas appear as Eqs. (3)-(5) in Section 3.2.1). The least squares method is probably the most popular technique in statistics [33,34]; it is widely used to estimate the parameters that fit a function to a set of data and to characterize the statistical properties of the estimates.
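For illustration, a least-squares polynomial fit of this kind can be computed with np.polyfit; the data below are synthetic placeholders standing in for the measured (focal length, optical power) pairs of Section 3.2.1, not the paper's measurements.

```python
import numpy as np

# Placeholder data standing in for the 37 measured (focal length, power) pairs.
f_mm = np.linspace(40.0, 80.0, 37)                                   # focal lengths, mm
d1_meas = 0.5 * f_mm - 25.0 + np.random.normal(0, 0.05, f_mm.size)   # synthetic powers

coeffs = np.polyfit(f_mm, d1_meas, deg=3)   # least-squares fit, degree 3 as in Eq. (3)
d1_control = np.poly1d(coeffs)              # callable control function d1(f)
print(d1_control(60.0))                     # open-loop power command at f = 60 mm
```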

However, the Laplacian value is easily affected by the surrounding environment (such as light and temperature). Therefore, the optical powers obtained from the fitting curves alone may not yield the sharpest image, and the image may be out of focus.

The second part of the algorithm solves this out-of-focus problem by adjusting the optical powers: it adjusts the optical power of each liquid lens and cyclically approaches the optimum of the Laplacian value. The zooming unit performs zooming mainly through the zoom objective. According to our previous design of the zoom objective [30], liquid lens 1 mainly adjusts the focal length, and liquid lenses 2 and 3 are responsible for keeping the image in focus.

The detailed flow of the zoom control and focus compensation algorithm is shown in Fig. 2. First, the expected focal length of the zoom objective is obtained. The optical powers d1, d2, d3 of the liquid lenses corresponding to this focal length are read from the fitting curves, which also serve as the control functions of the liquid lenses. Next, the optical powers d3 and d2 are adjusted to focus the image, starting with d3. If d3 is no longer between d3,min and d3,max, the current optical power d3 has reached its best value and the algorithm proceeds to d2. Otherwise, d3 is adjusted by a step, and the current Laplacian value is compared with that of the previous frame; if it has decreased, the step value of d3 is reduced and its sign is inverted. Then the current Laplacian value is compared with the stored optimal Laplacian value: if they agree within a threshold, the current optimal optical power d3 has been found; otherwise, the d3 loop repeats. d2 is handled in the same way. Finally, the optimal optical power of each liquid lens is obtained.
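The sketch below outlines this cyclic search for one lens under assumed interfaces: set_power(lens_id, d) drives a liquid lens and laplacian() returns the current sharpness index; the step size, shrink factor, threshold, and iteration cap are placeholders. Per Fig. 2, d3 is refined first and d2 afterwards, each starting from its fitted value.

```python
def refine_power(lens_id, d, d_min, d_max, set_power, laplacian,
                 step=0.1, shrink=0.5, threshold=1e-3, max_iter=50):
    """Cyclically step one lens's optical power toward the Laplacian optimum."""
    set_power(lens_id, d)
    prev = best = laplacian()
    best_d = d
    for _ in range(max_iter):
        if not (d_min <= d + step <= d_max):   # range exhausted: best value reached
            break
        d += step
        set_power(lens_id, d)
        cur = laplacian()
        if cur < prev:                         # image got blurrier: shrink and reverse
            step *= -shrink
        if cur > best:                         # new optimum: remember it
            best, best_d = cur, d
        elif best - cur <= threshold:          # within threshold of stored optimum
            break
        prev = cur
    set_power(lens_id, best_d)                 # settle on the best power found
    return best_d

# Per Fig. 2 and the bounds of Section 3.2.1:
# refine_power(3, d3_fit, -1.5, 7.0, ...) first, then refine_power(2, d2_fit, 0.0, 5.0, ...).
```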

Fig. 2. The flowchart of the zoom control and focus compensation algorithm.

3. Experiments and analysis

3.1 Prototype

Figures 3(a) and 3(d) show the prototype of the high-speed tracking and zooming optics. The image processing workstation is a DELL 5810. As shown in the sketch in Fig. 1, the light is reflected by two galvanometer mirrors and divided into two light paths by the beam splitter. One light path passes through the pupil shift optics to reach the high-speed camera (Basler acA-640-750uc, 751 FPS). The pupil shift optics consists of several customized lenses and transfers the camera pupil to the proximity of the two mirrors; this extends the angle of view even with small mirrors. The two high-speed mirrors are controlled by mirror-controller driving boards, which adjust the gaze direction according to an external voltage. The control voltage is output via a D/A board (Interface, LPC-361216) to the driving boards.

Fig. 3. High-speed tracking and zooming optics experimental device. (a) Structure of the high-speed tracking and zooming optics. (b) The zooming unit, consisting of three liquid lenses, one doublet, and a high-speed camera. (c) Three lens drivers with a hub; each lens driver controls a liquid lens by serial commands. (d) Overall view of the high-speed tracking and zooming optics. (e) Tracked object: a blue circle with “abc” printed on A2 white paper, placed on a custom-made board with backlighting.

The other light path passes through the zoom objective, which consists of three liquid lenses (Optotune, EL-10-30-Ci-VIS-LD-MV [35]) and one doublet lens, and finally reaches the high-speed camera (Basler acA-1300-200uc, 203 FPS). Liquid lens 1 is primarily responsible for magnifying the view from 1X to 2X. Liquid lenses 2 and 3 are used for focus compensation to obtain a clear view. The liquid lenses are driven by controllers (Optotune, Lens Driver 4i), as shown in Fig. 3(c). Figure 3(e) shows one of the tracked objects: a board with backlighting and an A2 paper printed with a blue circle.

3.2 Optical zooming unit

3.2.1 Optical zoom

To correct the fabrication error of the zoom objective, we used a program to measure the focal lengths of the zoom objective against the optical powers of the liquid lenses and then found their relationship. The program controls the optical power of each liquid lens from the keyboard; the optical powers were increased or decreased by 0.1 diopters per step to obtain the focal length in real time.

First, we placed the tracked object at an object distance of 4100 mm from the high-speed tracking and zooming optics. The image captured by the zooming unit camera was converted into a grayscale image, which was Laplace transformed; the mean and standard deviation were calculated to derive the sharpness index. Next, the captured original image was processed by thresholding and binarization inversion. Guided by the Laplacian value, the optical power of each liquid lens was manually adjusted using the program. When the Laplacian value reached its peak, the radius P (in pixels) of the circumscribed circle of the tracked object's outline in the captured image was recorded. ${R_0}$, measured manually with a caliper, was the actual radius of the circumscribed circle of the tracked object. The horizontal/vertical pixel size of the zooming unit's high-speed camera was 0.0048 mm. Let m be the actual magnification of the zoom objective; the calibrated magnification at a focal length of 40 mm was 0.00985. Thus, the focal length F of the zoom objective could be expressed by the following Eqs. (1) and (2):

$$m = \frac{{P \times 0.0048}}{{{R_0}}}$$
and
$$F = \frac{{m \times 40}}{{0.00985}}. $$
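As a worked example of Eqs. (1) and (2) (the values of P and R_0 below are illustrative, not measured values from the paper):

```python
P = 300.0        # circumscribed-circle radius in the image, pixels (illustrative)
R_0 = 146.34     # actual radius measured by caliper, mm (illustrative)

m = P * 0.0048 / R_0        # Eq. (1): magnification from the 0.0048 mm pixel size
F = m * 40.0 / 0.00985      # Eq. (2): focal length via the f = 40 mm calibration
print(f"m = {m:.5f}, F = {F:.1f} mm")   # -> m = 0.00984, F = 40.0 mm
```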

Finally, 37 groups of optical-power data were obtained for focal lengths of the zoom objective ranging from 40 mm to 80 mm. The whole experiment was carried out at a room temperature of 25 °C and lasted about 2 hours.

The optical power of each liquid lens was treated as the dependent variable and F as the independent variable. The fitting curve of each liquid lens was computed by the least squares method from the obtained values; these curves are also the control functions of the liquid lenses mentioned in Section 2.2. The optical powers of the liquid lenses are d1, d2, d3, and f is the focal length of the zoom objective. The relationship between the optical powers and the focal length is shown in Fig. 4, and the fitted formulas are given by the following Eqs. (3)-(5):

$${d_1} = 7.85501 \times {10^{ - 5}} \times {f^3} - 1.458455 \times {10^{ - 2}} \times {f^2} + 1.041129 \times f - 2.46363 \times 10$$
$${d_2} = 7.84875 \times {10^{ - 6}} \times {f^4} - 2.0569 \times {10^{ - 3}} \times {f^3} + 1.9307 \times {10^{ - 1}} \times {f^2} - 7.77 \times f + 1.1914 \times {10^2}$$
$${d_3} = -4.949 \times {10^{ - 6}} \times {f^4} + 1.3358 \times {10^{ - 3}} \times {f^3} - 1.2716 \times {10^{ - 1}} \times {f^2} + 4.863 \times f - 5.811 \times 10$$
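Transcribed directly into Python, Eqs. (3)-(5) become the open-loop control functions: given a target focal length f in millimeters, they return the commanded optical powers in diopters.

```python
def d1(f):
    """Eq. (3): optical power of liquid lens 1, diopters."""
    return 7.85501e-5 * f**3 - 1.458455e-2 * f**2 + 1.041129 * f - 24.6363

def d2(f):
    """Eq. (4): optical power of liquid lens 2, diopters."""
    return 7.84875e-6 * f**4 - 2.0569e-3 * f**3 + 1.9307e-1 * f**2 - 7.77 * f + 119.14

def d3(f):
    """Eq. (5): optical power of liquid lens 3, diopters."""
    return -4.949e-6 * f**4 + 1.3358e-3 * f**3 - 1.2716e-1 * f**2 + 4.863 * f - 58.11

print(d1(60.0), d2(60.0), d3(60.0))   # starting powers for the focus search at f = 60 mm
```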

Fig. 4. The relationship between the optical powers of the liquid lenses and the focal length. The data were obtained by manually adjusting the optical powers to their optimal values. The curves were fitted by the least squares method and are also the control functions of the liquid lenses.

Figure 4 shows the optical powers of the liquid lenses at different focal lengths. When the focal length was between 40 mm and 65 mm, the optical powers of liquid lenses 1 and 3 changed significantly while that of liquid lens 2 remained stable. From 70 mm to 80 mm, the zooming was carried mainly by liquid lenses 1 and 2, while liquid lens 3 remained steady. With a focal length from 65 mm to 70 mm, all three liquid lenses acted together. Therefore, at the same focal length, the optical powers of lenses 2 and 3 must be adjusted separately to obtain a clear image.

In the design of the focus compensation algorithm, the curves of lenses 2 and 3 are divided into three parts according to the focal length: 40-65 mm, 65-70 mm, and 70-80 mm. In the 40-65 mm range, liquid lens 3 plays the major role in focus compensation, dropping from around 7 diopters at 40 mm to approximately -1.5 diopters at 65 mm. In the 70-80 mm range, liquid lens 2 decreases markedly, from around 5 diopters at 70 mm to about 0 diopters at 80 mm. Therefore, d2,min and d2,max are set to 0 and 5, respectively, and d3,min and d3,max are set to -1.5 and 7, respectively.

If the optical powers of the liquid lenses were taken from the control functions alone, the focusing accuracy of the zooming unit could be affected by the environment and by fabrication inaccuracy; this zoom control would be purely open loop. Therefore, we added the focus compensation algorithm to reduce these influences. We conducted a similar experiment under the same conditions (room temperature, duration, and object distance) and obtained 22 groups of optical-power data for focal lengths of the zoom objective from 40 mm to 80 mm, as shown in Fig. 5.

Fig. 5. Comparison of the optical powers of the three liquid lenses vs the focal length of the zoom objective. Simulation is the curve predicted by the simulation software; Manual is the data obtained by the manual adjustment procedure; Experiment is the curve obtained with the focus compensation algorithm.

In Fig. 5, the simulation curve is the theoretical optical power of each liquid lens at different focal lengths, computed with simulation software (Zemax). The manual curve is the experimental data obtained by manual adjustment with the program; it is the same data as in Fig. 4. The experiment curve is the experimental data obtained with the added focus compensation algorithm.

There is a difference between the manual optical powers and the simulation results, which may be caused by fabrication errors. Together, the three curves show that the manual curve corrects the error between the theoretical design and the fabricated system, while the experiment curve shows the further compensation relative to the manual curve, demonstrating the robustness of the zoom control against dynamic environmental effects.

The experiment demonstrates continuous zoom at an object distance of 4100 mm. The zooming unit combined with the focus compensation algorithm changes the focal length from 40 mm to 80 mm and the magnification from 1X to 2X. The experiment was recorded on video, and a series of captured frames is shown in Fig. 6; the video is provided as Visualization 1. Figure 7 shows photographs of a USAF 1951 chart [36] printed on A4 paper, captured by the zooming unit camera at f = 42 mm and f = 80 mm. Group 1 element 5 is the smallest element clearly resolvable at f = 42 mm, while group 2 element 3 is the smallest element recognizable at f = 80 mm.

Fig. 6. Actual effect diagram at different focal lengths (Visualization 1).

Fig. 7. Images of a resolution chart captured by the prototype (object distance: 4100 mm). The USAF 1951 chart was printed on a piece of A4 paper (as a reference) and captured at f = 42 mm and f = 80 mm.

3.2.2 Adaptive zoom performance

To demonstrate the adaptive zoom of the high-speed tracking and zooming optics, we used different templates: a blue circle with a radius of 80 mm, a green square with a circumcircle radius of 127 mm, and a red regular triangle with a circumcircle radius of 98.14 mm. Photographs of the templates are shown in Fig. 8.

Fig. 8. Photographs of the three templates. The blue circle has a radius of 80 mm, the green square has a circumcircle radius of 127 mm, and the red triangle has a circumcircle radius of 98.14 mm.

Fig. 9. Calculation of the screen-to-body ratio. The green circle is the circumscribed circle of the object. The blue circle shares the object's center, and its radius is half the horizontal pixel count of the screen (512 pixels). The screen-to-body ratio is therefore the pixel diameter of the object's circumscribed circle (green dotted line) divided by the horizontal pixel count of the screen (blue dotted line).

Because we need to magnify the view to see more detail of the object, the magnification should be neither too small nor too large: if it is too small, we cannot see the detail; if it is too large, we lose the overall shape. Therefore, different templates require different magnifications to obtain a suitable screen-to-body ratio and view information effectively.

The screen-to-body ratio is the ratio of the pixel diameter of the object's circumscribed circle to the horizontal pixel count of the screen (1024 pixels). Let d be the pixel diameter of the circumscribed circle (green dotted line) and l the horizontal pixel count of the screen (blue dotted line); then the screen-to-body ratio is $r = d/l$. Details are shown in Fig. 9.
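For example (the diameter below is illustrative, not a measured value):

```python
d_px = 700     # pixel diameter of the object's circumscribed circle (illustrative)
l_px = 1024    # horizontal pixel count of the zooming-unit screen
r = d_px / l_px
print(f"screen-to-body ratio r = {r:.1%}")   # 68.4%, within the 65-70% target range
```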

For each template, the appropriate magnification can be obtained by measuring the screen-to-body ratio, after which the view can be enlarged directly to the corresponding magnification. The experiment demonstrates the adaptive zoom performance for the different templates at an object distance of 4100 mm; the measured screen-to-body ratios are listed in Table 1, and the captured views are shown in Fig. 10. Combining Table 1 and Fig. 10, the object's overall information is seen clearly when the screen-to-body ratio is between 65% and 70%. The experiment was recorded on video and is included as Visualization 2.

Fig. 10. Captured views for the different templates at focal lengths of 44 mm and 80 mm, each with an appropriate screen-to-body ratio between 65% and 70% (Visualization 2).

Table 1. Screen-to-body ratio and the circumscribed circle radius of objects (circle, square, and triangle) at different focal lengths

3.3 Active tracking unit

In actual monitoring scenes, the tracked object generally moves at high speed. An adaptive tracking algorithm was proposed to improve the tracking speed. Self-windowing is an effective mechanism for improving acquisition efficiency in image processing [14,37]. The proposed adaptive tracking algorithm includes a PID controller and image processing consisting of an image pyramid, template matching, and self-windowing, as shown in Fig. 11.

Fig. 11. Adaptive tracking mechanisms for image processing. The solid box is the captured image, the dashed box is the region of interest (ROI), and the blue area is the Template. A is the upper-left corner of the ROI. R is the object's centroid in xRORyR. I is the object image's centroid in xIOIyI. M is the point in the Source that best matches the Template's centroid.

Fig. 12. Performance demonstration experiment. (a) Schematic diagram of the high-speed tracking and zooming optics, consisting of the active tracking unit and the optical zooming unit. (b) Movement of the tracked object: left, right, up, and down. (c) Actual shooting of the object; the moving range was ±60 cm. (d) Photo captured by the active tracking unit camera; the object is always in the center of view. (e) Photo captured by the optical zooming unit camera; the image is adjusted to a suitable screen-to-body ratio. (Visualization 3).

We first define the image coordinate system xIOIyI and the adaptive-window coordinate system xRORyR. An image of the object is captured in advance at a predetermined distance; its resolution is reduced by blurring and downsampling, and this image is used as the Template. Because the tracked object moves at high speed, we use adaptive self-windowing. Since the purpose of the tracking unit is to keep the tracked object centered, the region of interest (ROI) should appear at the center of the captured image. The ROI size is adapted to the template size to improve processing speed. The ROI is downsampled by the same ratio as the object image to obtain the Source. The point M = (xm, ym) in the Source that best matches the Template's centroid is found by template matching [38,39]. Scaling M back by the downsampling ratio (a constant of 2) gives the object's centroid R = (xr, yr) in xRORyR, as expressed by the following Eq. (6):

$$\left\{ {\begin{array}{{c}} {{x_r} = {x_m} \times ratio}\\ {{y_r} = {y_m} \times ratio} \end{array}} \right.. $$

The point A = (xa, ya) in xIOIyI is the upper-left corner of the ROI. To facilitate subsequent operations, R must be converted from the coordinate system xRORyR to xIOIyI. Thus, the object image's centroid I = (xi, yi) in xIOIyI can be expressed by the following Eq. (7):

$$\left\{ \begin{array}{l} {x_i} = {x_r} + {x_a}\\ {y_i} = {y_r} + {y_a} \end{array} \right.. $$
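A sketch of one such localization step using OpenCV is given below. The function name and the use of cv2.pyrDown for the blur-and-downsample stage are assumptions; note that cv2.matchTemplate returns the top-left corner of the best match, so half the template size is added to obtain the matched centroid M.

```python
import cv2

RATIO = 2   # constant downsampling ratio used by the algorithm

def locate_centroid(frame_gray, template_small, roi_xywh):
    """Find the object centroid I in image coordinates via Eqs. (6)-(7)."""
    xa, ya, w, h = roi_xywh                          # A = (xa, ya): ROI top-left
    roi = frame_gray[ya:ya + h, xa:xa + w]
    source = cv2.pyrDown(roi)                        # blur + downsample -> Source
    res = cv2.matchTemplate(source, template_small, cv2.TM_CCOEFF_NORMED)
    _, _, _, top_left = cv2.minMaxLoc(res)           # best-match top-left corner
    xm = top_left[0] + template_small.shape[1] // 2  # M = (xm, ym): matched centroid
    ym = top_left[1] + template_small.shape[0] // 2
    xr, yr = xm * RATIO, ym * RATIO                  # Eq. (6): back to ROI scale
    return xr + xa, yr + ya                          # Eq. (7): I = (xi, yi)
```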

Finally, a PID controller adjusts the mirrors to make I coincide with the center of the captured image. This adaptive tracking algorithm executes within 4 ms from image detection to gaze control. Because the high-speed feedback processing is independent of the zooming unit, the two units can be coaxially designed, and the zooming unit obtains a high-resolution, enlarged view with detailed information about the tracked object. The performance demonstration experiment is shown in Fig. 12 and Visualization 3.

Based on the adaptive tracking algorithm, an experiment was conducted to test the overall performance. The high-speed tracking and zooming optics performed real-time object tracking and adaptive zooming at an object distance of 4100 mm. The results show that the zooming unit performs optical zoom from f = 40 mm to f = 80 mm within 18 milliseconds (with the focus compensation algorithm), and the tracking unit recognizes the object with the adaptive tracking algorithm within 4 milliseconds. The experiment is recorded in Visualization 3.

4. Discussion

In Fig. 6 and Fig. 10, a black border appears around the image at short focal lengths (from 40 mm to 48 mm). We believe there are two reasons. One is that pupil shift optics is not used in the zooming unit. The other is that the galvanometer mirror stand (the white stand in Fig. 3(a)) and the object distance cannot be reduced further. As a result, the required field of view is larger than what the high-speed mirrors can reflect, and the unreflected part of the field appears as a black border around the image.

Figure 6 shows images taken by the high-speed camera of the zooming unit, whose resolution is not high. We nevertheless chose a high-speed camera because it was needed to integrate the focus compensation algorithm.

5. Conclusion

Traditional monitoring cameras are employed in robot vision, monitoring systems, and so on. However, they cannot track and zoom as fast as expected because of large inertia, and digital zoom on the region of interest decreases the effective resolution. Therefore, we proposed high-speed tracking and zooming optics consisting of an optical zooming unit and an active tracking unit, combined into coaxial optical paths by a beam splitter. The zooming unit is built with three liquid lenses, one glass lens, and a high-speed camera; it continuously changes the magnification from 1X to 2X, and by controlling the optical powers of the three liquid lenses, its focal length can be changed from 40 mm to 80 mm within milliseconds. The tracking unit, composed of a high-speed mirror-based gaze controller, a high-speed camera, and pupil shift optics, tracks the object and keeps it in the center of both views. In addition, a compensation algorithm enables the zooming unit to achieve adaptive zoom accurately. The experiments show that the zooming unit performs adaptive optical zoom and that the tracking unit recognizes the object with the adaptive tracking algorithm within 4 milliseconds. The proposed optics can be used in monitoring systems, target tracking systems, telescope systems, and so on.

Funding

Guangdong Academy of Sciences (2021GDASYL-20210102006); Natural Science Foundation of Guangdong Province (2021A1515012596, 2021B1515120064); National Natural Science Foundation of China (61927809, 61975139).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. I. Ghafir, V. Prenosil, J. Svoboda, et al., “A survey on network security monitoring systems,” in 2016 IEEE 4th International Conference on Future Internet of Things and Cloud Workshops (FiCloudW) (IEEE, 2016), pp. 77–82.

2. Y. Gu, M. Kim, Y. Cui, et al., “Design and implementation of UPnP-based surveillance camera system for home security,” in 2013 International Conference on Information Science and Applications (ICISA) (IEEE, 2013), pp. 1–4.

3. A. Specker, D. Stadler, L. Florin, et al., “An occlusion-aware multi-target multi-camera tracking system,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (IEEE, 2021), pp. 4168–4177.

4. C. W. Akerlof, R. L. Kehoe, T. A. McKay, et al., “The ROTSE-III robotic telescope system,” Publ. Astron. Soc. Pac. 115(803), 132–140 (2003). [CrossRef]  

5. S. Kumar and J. Sen Yadav, “Video object extraction and its tracking using background subtraction in complex environments,” Perspect. Sci. 8, 317–322 (2016). [CrossRef]  

6. E. Konstantinou, J. Lasenby, and I. Brilakis, “Adaptive computer vision-based 2D tracking of workers in complex environments,” Autom. Constr. 103, 168–184 (2019). [CrossRef]  

7. P. Ren, K. Lu, Y. Yang, et al., “Multi-camera vehicle tracking system based on spatial-temporal filtering,” in 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW) (IEEE, 2021), pp. 4208–4214.

8. E. Ristani, F. Solera, R. Zou, et al., “Performance measures and a data set for multi-target, multi-camera tracking,” in Lecture Notes in Computer Science (Including Subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) (Springer Verlag, 2016), 9914 LNCS, pp. 17–35.

9. S. Zhang, K. Zheng, and S. Huaiyuan, “Analysis of the Occlusion Interference Problem in Target Tracking,” Math. Probl. Eng. 2022, 1–10 (2022). [CrossRef]  

10. I. Viana, J.-J. Orteu, N. Cornille, et al., “Inspection of aeronautical mechanical parts with a pan-tilt-zoom camera: an approach guided by the computer-aided design model,” J. Electron. Imaging 24(6), 061118 (2015). [CrossRef]  

11. E Jinlong, L. He, Z. Li, et al., “WiseCam: wisely tuning wireless pan-tilt cameras for cost-effective moving object tracking,” in IEEE INFOCOM 2023 - IEEE Conference on Computer Communications (IEEE, 2023), 2023-May, pp. 1–10.

12. H. Yong, J. Huang, W. Xiang, et al., “Panoramic background image generation for PTZ cameras,” IEEE Trans. on Image Process. 28(7), 3162–3176 (2019). [CrossRef]  

13. Y. Ariki, S. Kubota, and M. Kumano, “Automatic production system of soccer sports video by digital camera work based on situation recognition,” in Eighth IEEE International Symposium on Multimedia (ISM’06) (IEEE, 2006), pp. 851–860.

14. Y. Nakabo, M. Ishikawa, H. Toyoda, et al., “1 ms column parallel vision system and its application of high speed target tracking,” in Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings (Cat. No.00CH37065) (IEEE, 2000), 1(April), pp. 650–655.

15. K. Okumura, H. Oku, and M. Ishikawa, “High-speed gaze controller for millisecond-order pan/tilt camera,” in Proceedings - IEEE International Conference on Robotics and Automation (2011), pp. 6186–6191.

16. C. Weerasinghe, M. Nilsson, S. Lichman, et al., “Digital zoom camera with image sharpening and noise suppression,” IEEE Trans. Consum. Electron. 50(3), 777–786 (2004). [CrossRef]  

17. T. Martinez, “Adaptive optical zoom,” Opt. Eng. 43(1), 8 (2004). [CrossRef]  

18. K. Okumura, K. Yokoyama, H. Oku, et al., “1 ms auto pan-tilt – video shooting technology for objects in motion based on Saccade Mirror with background subtraction,” Adv. Robot. 29(7), 457–468 (2015). [CrossRef]  

19. J. Sakakibara, J. Kita, and N. Osato, “Note: High-speed optical tracking of a flying insect,” Rev. Sci. Instrum. 83(3), 036103 (2012). [CrossRef]  

20. R. Cao, J. Fu, H. Yang, et al., “Robust optical axis control of monocular active gazing based on pan-tilt mirrors for high dynamic targets,” Opt. Express 29(24), 40214 (2021). [CrossRef]  

21. M. Kawakita, K. Iizuka, R. Iwama, et al., “Gain-modulated axi-vision camera (high speed high-accuracy depth-mapping camera),” Opt. Express 12(22), 5336 (2004). [CrossRef]  

22. T. Sueishi, M. Ishii, and M. Ishikawa, “Tracking background-oriented schlieren for observing shock oscillations of transonic flying objects,” Appl. Opt. 56(13), 3789 (2017). [CrossRef]  

23. H. Oku and M. Ishikawa, “High-speed liquid lens with 2 ms response and 80.3 nm root-mean-square wavefront error,” Appl. Phys. Lett. 94(22), 24–27 (2009). [CrossRef]  

24. H. Oku, K. Hashimoto, and M. Ishikawa, “Variable-focus lens with 1-kHz bandwidth,” Opt. Express 12(10), 2138 (2004). [CrossRef]  

25. D.-Y. Zhang, N. Justis, and Y.-H. Lo, “Fluidic adaptive zoom lens with high zoom ratio and widely tunable field of view,” Opt. Commun. 249(1-3), 175–182 (2005). [CrossRef]  

26. R. Peng, J. Chen, and S. Zhuang, “Electrowetting-actuated zoom lens with spherical-interface liquid lenses,” J. Opt. Soc. Am. A 25(11), 2644 (2008). [CrossRef]  

27. L. Wang, H. Oku, and M. Ishikawa, “An improved low-optical-power variable focus lens with a large aperture,” Opt. Express 22(16), 19448–19456 (2014). [CrossRef]  

28. L. Wang, S. Tabata, H. Xu, et al., “Dynamic depth-of-field projection mapping method based on a variable focus lens and visual feedback,” Opt. Express 31(3), 3945–3953 (2023). [CrossRef]  

29. L. Wang, H. Oku, and M. Ishikawa, “Variable-focus lens with 30 mm optical aperture based on liquid–membrane–liquid structure,” Appl. Phys. Lett. 102(13), 131111 (2013). [CrossRef]  

30. L. Li, N. Xie, J.-Q. Li, et al., “Optofluidic zoom system with increased field of view and less chromatic aberration,” Opt. Express 31(15), 25117 (2023). [CrossRef]  

31. S. Ali, A. G. Ramos, M. A. Carravilla, et al., “On-line three-dimensional packing problems: A review of off-line and on-line solution approaches,” Comput. Ind. Eng. 168, 108122 (2022). [CrossRef]  

32. F. Ulupinar and G. Medioni, “Refining edges detected by a LoG operator,” Comput. Vision, Graph. Image Process. 51(3), 275–298 (1990). [CrossRef]  

33. E. H. Thompson, “The theory of the method of least squares,” Photogramm. Rec. 4(19), 53–65 (1962). [CrossRef]  

34. R. Szeliski and S. B. Kang, “Recovering 3D shape and motion from image streams using nonlinear least squares,” in Proceedings of IEEE Conference on Computer Vision and Pattern Recognition (IEEE Comput. Soc. Press, 1994), pp. 752–753.

35. “EL-3-10 lens — Optotune,” https://www.optotune.com/el-3-10-lens.

36. A. Orych, “Review of methods for determining the spatial resolution of UAV sensors,” Int. Arch. Photogramm. Remote Sens. Spatial Inf. Sci. XL-1/W4, 391–395 (2015). [CrossRef]  

37. I. Ishii and M. Ishikawa, “Self windowing for high-speed vision,” Syst. Comput. Japan 32(10), 51–58 (2001). [CrossRef]  

38. R. Brunelli and T. Poggiot, “Template matching: matched spatial filters and beyond,” Pattern Recognit. 30(5), 751–768 (1997). [CrossRef]  

39. T. J. Darrell, I. A. Essa, and A. P. Pentland, “Task-specific gesture analysis in real-time using interpolated views,” IEEE Trans. Pattern Anal. Machine Intell. 18(12), 1236–1242 (1996). [CrossRef]  

Supplementary Material (3)

Visualization 1: The actual effect of the optical zooming unit at different focal lengths.
Visualization 2: The actual effect of the optical zooming unit at different templates.
Visualization 3: The actual effect of the high-speed tracking and zooming optics.
