Optica Publishing Group

Prism-based tri-aperture laparoscopic objective for multi-view acquisition

Open Access

Abstract

This paper presents the design and prototype of a novel tri-aperture monocular laparoscopic objective that can acquire both stereoscopic views for depth information and a wide field of view (FOV) for situational awareness. The stereoscopic views are simultaneously captured via a shared objective with two displaced apertures and a custom prism. Overlapping crosstalk between the stereoscopic views is diminished by incorporating a strategically placed vignetting aperture. Meanwhile, the wide FOV is captured via a central third aperture of the same objective and provides a 2D view of the surgical field 2x as large as the area imaged by the stereoscopic views. We also demonstrate how the wide FOV provides a reference data set for stereo calibration, which enables absolute depth mapping in our experimental prototype.

© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

The conventional optical design of a rigid laparoscope comprises an objective lens and rod lens relays. This form has been used for minimally invasive surgery since its beginnings, even as novel designs continue to be developed. The conventional design has provided surgeons with excellent two-dimensional (2D) image quality over the operative field. However, two major optical limitations arise with conventional 2D laparoscopes: (1) the absence of binocular vision results in restricted depth perception and (2) the field of view (FOV) and spatial resolution are inversely proportional. The lack of depth information requires extensive training for physicians to become efficient with a 2D operative view. Meanwhile, the second limitation constrains the FOV to cover just the surgical area to maintain sufficient image resolution. Hence, complications that occur outside the surgical area would not be seen unless the laparoscope is physically moved. These two limitations and their corresponding optical design solutions have been explored separately in the literature.

To recover depth perception, various methods have been investigated, including dual-sensor stereo, single-sensor stereo, single-sensor 3D imaging via structured light, and uniaxial 3D imaging [1]. Commercially, stereoscopic endoscopes with dual-channel objective-relay optics and dual imaging sensors, like the DaVinci and Endoeye Flex 3D [2,3], have been popularized through successful demonstration of 3D vision and depth perception in the live surgical setting. Academically, other types of stereo endoscopes have shown potential for the future. The ones that can acquire stereo vision in a uniaxial, single-camera system are attractive because they preserve the limited design volume constrained by the laparoscope housing. The 3D-MARVEL system highlights this by using a monocular system with a dual aperture comprising complementary multi-bandpass filters [4]. A generalized multi-aperture approach can acquire multiple views from different directions with varying baselines, leading to demonstrations of light field endoscopes [5,6]. For instance, Kwan et al. demonstrated a light field laparoscope design, which captures each view by sampling the entrance pupil with a programmable aperture placed at the stop of the laparoscope [5]. Instead of using multiple apertures, another way to acquire multiple views is to place a multifaceted prism [7,8] or a microprism array (MPA) [9] in front of the endoscope. These prisms bend the light rays from the stereo views towards the camera lens so that each stereo image can be captured on one half of the image sensor. Although all these types of laparoscopes can acquire depth information, they are still limited by the FOV and spatial resolution tradeoff.

To overcome the second major optical limitation, the FOV versus spatial resolution tradeoff has been eliminated using a multi-resolution foveated laparoscope (MRFL) [10–14]. An MRFL system simultaneously captures high-resolution zoomed-in and wide-angle zoomed-out views through a shared objective-relay tube and two separate imaging probes via beam splitting and 2D scanning. Surgeons can then use the high-resolution zoomed-in view for surgical operation while the wide-angle view provides peripheral awareness for preventing patient injury from accidental collisions of surgical instruments outside the surgical area. The MRFL has demonstrated successful 2D wide FOV (WFOV) minimally invasive surgery in animal trials, but additional implementation of depth perception recovery has not been attempted.

A laparoscope that can capture both 2D WFOV and 3D depth information has only been minimally explored and could be a promising development for minimally invasive surgery. Kit et al. recently attempted to utilize two independent stereo cameras to achieve 3D vision while creating a larger 2D FOV image by stitching the non-overlapping FOVs of the stereo cameras [15]. It required trading off a portion of the stereo FOV (SFOV) to gain a 2D WFOV. This situation is suboptimal because the surgeon would want to perform surgery with a maximal SFOV while the WFOV is only used periodically for peripheral awareness. Nonetheless, the concept was demonstrated and provides groundwork for future developments.

In this paper, we present the optical design and prototype of a novel tri-aperture monocular laparoscopic objective (TAMLO) for acquiring SFOV views for 3D vision and a 2D WFOV view for peripheral awareness. The monocular form factor with a sufficiently large clear aperture is chosen over a conventional stereo system with dual-channel objective-relay optics to preserve the design volume that is limited by the diameter of the laparoscope housing. More specifically, having two independent optical systems consumes more design volume due to the required edge apertures from each system and the opaque space between them. Instead, a monocular system can capture two stereo views from two laterally displaced aperture stops while preserving the central design volume for capturing the WFOV through a central aperture stop. The tri-aperture layout is a simplification of the multi-aperture approach of light field endoscopes [5] and retains the benefits of 3D depth acquisition without the cost of spatial or temporal tradeoffs. To enable simultaneous acquisition of the stereo views on a single sensor, we also incorporate a custom prism that can support the tri-aperture layout inside the TAMLO. As a result, the stereoscopic views are simultaneously captured via a shared objective with two displaced apertures and a custom prism while the 2D WFOV view is captured via a central third aperture of the same objective [16]. The rest of the paper is organized as follows. Section 2 presents the schematic design and key first-order analytical considerations for the proposed TAMLO optics. Section 3 discusses the optical design for a TAMLO prototype and the solution to resolve the crosstalk issue between the stereo views. Section 4 presents the opto-mechanical design and prototype assembly along with raw images of the SFOV and WFOV views captured by our experimental prototype. 
Section 5 demonstrates a method of stereo distortion calibration and camera modeling assisted by the WFOV view, the results of calibrated stereo views, and the absolute depth mapping results.

2. Schematic design of a TAMLO

Figure 1(a) shows the schematic layout of the proposed prism-based tri-aperture monocular laparoscope design, and Fig. 1(b) fully illustrates the optical layout and key parametric specifications of the TAMLO design, which is the main contribution of this paper. Adopting the convention of light traveling from left to right, the TAMLO images the object field through three laterally displaced apertures and forms three different views of the object on the intermediate image #1, corresponding to a wide FOV (WFOV) image of a large object field and two stereo FOV (SFOV) images of a smaller overlapping object field. Following the TAMLO, conventional laparoscope relay optics is utilized to relay the intermediate image #1 and forms an intermediate image #2 outside the patient’s body. After relaying the intermediate image #1 to the intermediate image #2, the spatial arrangement of the three views captured by the TAMLO is still preserved. Depending on the specific requirements and priorities of a particular laparoscope design, the three views may be captured by a single or multiple imaging sensors, which leads to different possible designs of the imaging probes. For instance, as illustrated in Fig. 1(a), we may adopt a scheme similar to the dual-channel imaging probes in our previous MRFL system. An eyepiece collimates the light from the intermediate image #2 and a beamsplitter then splits the collimated light into two imaging paths, one for capturing the WFOV image and one for the SFOV images. The type of beamsplitter used will depend on the type of tri-aperture selector used to separate the overlapping WFOV and SFOV images formed on the intermediate image #1. The two SFOV images through the two side-view apertures are recorded simultaneously on each half of the sensor #2, while the WFOV image through the center aperture is captured by the sensor #1. 
As long as the multiple views captured by the TAMLO are constrained within the maximally allowed diameter, IIW, of the intermediate image #1, the relay lens group only needs to be designed to support a FOV matching IIW. The relay lens must also support the maximal ray angle incident on the intermediate image #1 from all of the views. To avoid severe light loss from vignetting, it is preferred that the TAMLO and relay lens group are designed to be nearly telecentric at both intermediate image #1 and #2. It is worth noting that the TAMLO may be utilized alone for the option of a chip-on-tip form factor.


Fig. 1. (a) Proposed schematic layout of the prism-based tri-aperture monocular laparoscope design and (b) a magnified view of the TAMLO design with optical layout and key parametric specifications.


As schematically illustrated in Fig. 1(b), the TAMLO mainly consists of a front lens group with a focal length of fLG1, a tri-aperture selector, a prism deflector, and a back lens group with a focal length of fLG2. The two lens groups are placed in front and behind the selector-deflector assembly to provide sufficient degrees of freedom during lens design optimization. The distances L between adjacent components are constrained by the method of optomechanical mounting and then precisely determined by the optical optimization. The chief ray bundles for the three views, which are highlighted by the different shaded regions in Fig. 1(b), are ideally maintained in separate regions of the two lens groups so that the local regions of the lenses can be optimized to the respective viewing angle. The aperture stop plane of the TAMLO is located at the tri-aperture selector, which consists of an on-axis central aperture A0 and two decentered apertures of A1 and A2. A different viewing angle of the object field is seen by each of the three different aperture stops as indicated by the labeled optical axes for the wide center view and the stereo views #1 and #2, respectively. The prism deflector located adjacent to the tri-aperture selector is made up of individual prisms Di corresponding to each aperture stop Ai. The central prism D0 is effectively a thin plane parallel plate and does not change the outgoing ray angles from A0, so a WFOV image, IIW, of the object field is formed at the intermediate image #1 and is centered about the central optical axis. D0 could be removed leaving an air space, but instead is present to provide structural support and to manufacture the prism deflector as one piece. 
The side prisms, D1 and D2, bend rays transmitting through them by deflection angles of θD1 and θD2, respectively, in opposite directions so that the stereo view images, S1 and S2, also located at the intermediate image #1, are laterally translated apart to the opposite sides of the central optical axis. Without the side prisms, S1 and S2 would be spatially overlapping on the exact same region about the central optical axis.

The two lens groups with focal lengths of fLG1 and fLG2 are optimized to support the WFOV captured by A0 and the SFOV captured by apertures A1 and A2. To ensure that the WFOV and stereo views can be imaged by the same relay lens within the confined volume required by a rigid laparoscope, the dimensions of the stereo view images, S1 and S2, need to be constrained to the same circular region as the WFOV image, IIW. Furthermore, to avoid crosstalk between the two stereo views, S1 and S2 should not overlap at the intermediate image #1. Therefore, the maximum SFOV covered by both S1 and S2 along the direction of the stereo aperture displacement, $SFOV_{\overrightarrow{A_1 A_2}}$, shall satisfy:

$$SFOV_{\overrightarrow{A_1 A_2}} \le \frac{WFOV_{\overrightarrow{A_1 A_2}}}{2}, \tag{1}$$
where $WFOV$ is the maximum wide FOV of the objective. $WFOV$ is expressed as $2\tan^{-1}\frac{II_W}{2 f_{TAMLO}}$, where $f_{TAMLO}$ is the effective focal length of the objective, given by $\frac{1}{f_{TAMLO}} = \frac{1}{f_{LG1}} + \frac{1}{f_{LG2}} - \frac{L_{TS} + L_{PD} + L_{LG2}}{f_{LG1} f_{LG2}}$, and IIW is the maximally allowed diameter of the intermediate image #1. The maximum SFOV along the axis orthogonal to the stereo aperture displacement can be as large as the maximum WFOV. To maximize the SFOV without causing overlap between S1 and S2, the optimal values of the prism deflection angles, θD1 and θD2, are determined by:
$$\theta_{D1} = -\theta_{D2} \approx \tan^{-1}\frac{II_W}{4 f_{LG2}}. \tag{2}$$
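As a first-order check, the FOV bound and the deflection-angle relation above can be evaluated numerically. The following is a minimal sketch, assuming thin lens groups and using the prototype values quoted later in Section 3 (IIW = 4.4 mm, fTAMLO = 7 mm, fLG1 = 47.5 mm, fLG2 = 8 mm). The function names and the 1.2 mm group spacing are illustrative assumptions, not values taken from the design, and the focal-length helper treats the two-group expression as an optical power (the reciprocal of the focal length).

```python
import math

def effective_focal_length(f_lg1, f_lg2, spacing):
    """Thin-lens combination: 1/f = 1/f1 + 1/f2 - spacing/(f1*f2)."""
    power = 1.0 / f_lg1 + 1.0 / f_lg2 - spacing / (f_lg1 * f_lg2)
    return 1.0 / power

def wide_fov_deg(ii_w, f_tamlo):
    """Full paraxial wide FOV in degrees: 2*atan(II_W / (2*f_TAMLO))."""
    return math.degrees(2.0 * math.atan(ii_w / (2.0 * f_tamlo)))

def max_stereo_fov_deg(wfov_deg):
    """Upper bound on the SFOV along the A1-A2 direction: WFOV / 2."""
    return wfov_deg / 2.0

def prism_deflection_deg(ii_w, f_lg2):
    """Deflection that shifts each stereo image by II_W/4 off axis."""
    return math.degrees(math.atan(ii_w / (4.0 * f_lg2)))

II_W = 4.4      # mm, intermediate image diameter (Section 3)
F_TAMLO = 7.0   # mm, objective focal length (Section 3)
F_LG1 = 47.5    # mm, front group focal length (Section 3)
F_LG2 = 8.0     # mm, back group focal length (Section 3)
SPACING = 1.2   # mm, HYPOTHETICAL group spacing, for illustration only

wfov = wide_fov_deg(II_W, F_TAMLO)
sfov_limit = max_stereo_fov_deg(wfov)
theta_d = prism_deflection_deg(II_W, F_LG2)
efl = effective_focal_length(F_LG1, F_LG2, SPACING)

print(f"paraxial WFOV        : {wfov:.1f} deg")
print(f"SFOV upper bound     : {sfov_limit:.1f} deg")
print(f"prism deflection     : +/-{theta_d:.2f} deg")
print(f"thin-lens focal len. : {efl:.2f} mm (with assumed spacing)")
```

With these inputs the deflection evaluates to about 7.8°, in line with the value quoted for the prototype in Section 3. The paraxial WFOV is only a first-order estimate and should not be expected to match the FOV of the real thick-lens design.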

To separate the overlapping WFOV and SFOV images, the tri-aperture selector, which determines the choice of the beamsplitter to be used in Fig. 1(a), may be implemented from different types of technologies that either block or encode the transmitting light. Blocking technologies include a mechanical shutter or a liquid-crystal device (LCD) that allows localized control of light transmission through a sub-region by switching the corresponding region on or off. Encoding technologies include a custom polarization device or color filter that allows localized control of light transmission through a sub-region by encoding different polarization states or spectral filters across the tri-aperture selector. If a blocking technology is used, either A0 or A1,2 is blocked in a time-sequential fashion so that the WFOV and SFOV images can be alternately captured by a single sensor, which eliminates the need for a beamsplitter and a second imaging probe in Fig. 1(a) and leads to a simpler system design with lower hardware cost. If an encoding technology is used, A0 can be encoded oppositely to A1,2. For example, orthogonal polarizers may be utilized for the apertures A0 and A1,2. Then, a corresponding polarizing or dichroic beamsplitter matching the encoded tri-aperture selector is used. This results in simultaneous capture of the WFOV and SFOV images by the two sensors illustrated in Fig. 1(a). The tradeoff of using this technology is half of each view’s irradiance due to the encoding filters.

There are multiple variations of the prism deflector design that vary in manufacturability and light manipulation. For the example in Fig. 1(b), the back faces of the prisms D0 through D2 are co-aligned vertically while the front faces of D1 and D2 are tilted oppositely to achieve the proper amount of light ray bending. Since only one side of the prisms requires angled faces, the three prisms can be manufactured as one piece through diamond turning. After deflection, the light rays must pass through different portions of the TAMLO’s back lens group. Ideally, the ray bending by the prism deflectors is desired to be independent of ray incident angle. Realistically, the net ray bending by the prism deflectors is derived using Snell’s law [17]:

$$\theta_{D1} = -\theta_{D2} = \alpha - \sin^{-1}\left[\sqrt{n^2 - \sin^2\theta_i}\,\sin\alpha - \cos\alpha\,\sin\theta_i\right] - \theta_i, \tag{3}$$
where α is the angle of the prism, n is the index of the prism material, and θi is the incident angle of the incoming light ray. Since θD1,2 is dependent on a field or incident angle, S1,2 are distorted accordingly, but they will be calibrated in post-processing. Combining Eqs. (2) and (3), the prism design can be approximated.
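The Snell's-law expression above can be evaluated directly and inverted numerically for the prism angle. The sketch below is illustrative only: it assumes the isosceles case used later in Section 3 (θi = α), the N-BK7 index of 1.517 quoted there, and a simple bisection search, which is a choice made here and not a method described in the paper. The function names are hypothetical.

```python
import math

def prism_deflection_deg(alpha_deg, n, theta_i_deg):
    """Net ray bending from the Snell's-law expression (sign as in the text)."""
    a = math.radians(alpha_deg)
    ti = math.radians(theta_i_deg)
    s = math.sqrt(n * n - math.sin(ti) ** 2) * math.sin(a) - math.cos(a) * math.sin(ti)
    return math.degrees(a - math.asin(s) - ti)

def solve_prism_angle_deg(target_deg, n, lo=0.1, hi=40.0, iterations=60):
    """Bisection for the prism angle whose |deflection| equals target_deg,
    assuming theta_i = alpha. The magnitude grows monotonically with alpha
    over the bracket [lo, hi], so bisection converges."""
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if abs(prism_deflection_deg(mid, n, mid)) < target_deg:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

N_BK7 = 1.517  # refractive index quoted in Section 3

d15 = prism_deflection_deg(15.0, N_BK7, 15.0)
alpha = solve_prism_angle_deg(7.8, N_BK7)

print(f"deflection magnitude at alpha = 15 deg : {abs(d15):.2f} deg")
print(f"alpha for a 7.8 deg deflection          : {alpha:.2f} deg")
```

Under these assumptions a prism angle close to 15° gives a deflection magnitude close to 7.8°, which is consistent with the approximate values reported for the first-order design. The sign of the result depends only on the convention chosen for θD1 versus θD2.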

Acquiring good image performance and sufficient depth resolution is critical to the TAMLO’s functionality. The diameter of each aperture stop Ai determines the corresponding F/# or numerical aperture of each view, and thus the cut-off spatial frequency or limiting resolution of the objective. Therefore, within the optics volume constraints for a rigid laparoscope, the aperture diameters should be maximized for optimal optical resolution. Meanwhile, the lateral separation between the centers of the aperture stops A1 and A2, denoted as BLTS, ultimately determines the effective baseline, EBL, of the stereo views and thus the depth resolution of the system. The EBL is the lateral separation between the centers of the entrance pupils, which are optically conjugate to the aperture stops A1 and A2 through the first lens group, and is expressed as:

$$EBL = \frac{BL_{TS}\, f_{LG1}}{f_{LG1} - L_{TS}}, \tag{4}$$
where $B{L_{TS}} = \overline {{\textrm{A}_1}{\textrm{A}_2}} = 2\overline {{\textrm{A}_{1,2}}{\textrm{A}_0}}$ and LTS is the axial displacement of the aperture stop from the first lens group. The location of the entrance pupils, denoted by LEP, is found by imaging the tri-aperture selector through the first lens group and is expressed as ${L_{EP}} = {{{L_{TS}}{f_{LG1}}} / {({f_{LG1}} - {L_{TS}})}}$. The EBL can also be described in object space as:
$$EBL = (L_{WD} + L_{EP})\left(\left|\tan\theta_{OA1}\right| + \left|\tan\theta_{OA2}\right|\right), \tag{5}$$
where θOA1 and θOA2 define the optical axis directions of the stereo views with respect to the central optical axis, and LWD is the working distance optically conjugate to the intermediate image #1 through the TAMLO objective. Meanwhile, the depth resolution of the SFOV system is determined analogously to the light field laparoscope in [5]. The average depth resolution, d, is given by:
$$d \approx \frac{2 L_{WD}}{EBL}\left|\frac{-f_{LG1} f_{LG2} + t f_{LG1} + L_{WD}(f_{LG1} + f_{LG2} - t)}{-f_{LG1} f_{LG2}}\right| P, \tag{6}$$
where P is the limiting resolution or equivalent pixel size at the intermediate image #1 and $t = {L_{TS}} + {L_{PD}} + {L_{LG2}}$. The first fraction corresponds to the triangular geometry between the object field and the effective baseline while the second fraction corresponds to the magnification of the pixel from intermediate image #1 to the object field. The variables in this equation must be chosen properly to obtain adequate depth resolution for laparoscopic surgery. An EBL of 4 mm is standard for commercial stereo endoscopes and a pixel magnification of ∼18 from intermediate image #1 to the object field is reasonable for recording the object field with an appropriately sized sensor. Using these constants, Fig. 2 plots the depth resolution rendered by a TAMLO as a function of the equivalent pixel size at the intermediate image #1 for three different working distances of 30, 60, and 120 mm. With a typical working distance of about 50 mm, a standard 2D laparoscope with an HD resolution sensor covers a circular object field of about 60 mm in diameter at a spatial resolution of up to 16 lps/mm in the object space. Considering that the SFOV images are expected to provide the same circular field coverage as a standard 2D laparoscope with a single HD resolution sensor, the equivalent pixel size on the intermediate image #1 falls in the range from 1.5 to 3 µm, depending on the maximally allowed diameter of the intermediate image #1 due to package constraints. A pixel resolution between 0.5 and 4 µm on the intermediate image #1 can provide a depth resolution from 0.14 to 4.37 mm depending on the working distance. This indicates that a properly designed TAMLO can provide sufficient depth resolution for surgical guidance.
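The baseline and depth-resolution relations above lend themselves to a quick numerical check. The sketch below is a minimal illustration, assuming the two constants stated in the text (EBL = 4 mm and a pixel magnification of about 18, held fixed across working distances). The stop distance of 5 mm used in the baseline example is a hypothetical value chosen only to exercise the formula, fLG1 = 47.5 mm is taken from Section 3, and all function names are illustrative.

```python
def entrance_pupil_distance(l_ts, f_lg1):
    """Image of the aperture stop through the first lens group (thin lens)."""
    return l_ts * f_lg1 / (f_lg1 - l_ts)

def effective_baseline(bl_ts, l_ts, f_lg1):
    """EBL from the stop separation BL_TS, scaled by the pupil magnification."""
    return bl_ts * f_lg1 / (f_lg1 - l_ts)

def depth_resolution(l_wd, ebl, magnification, pixel):
    """d ~ (2 * L_WD / EBL) * |magnification| * P, all lengths in mm."""
    return 2.0 * l_wd / ebl * abs(magnification) * pixel

EBL = 4.0    # mm, value stated in the text
MAG = 18.0   # approximate pixel magnification stated in the text

# Depth resolution versus pixel size for the three working distances.
for l_wd in (30.0, 60.0, 120.0):
    row = [depth_resolution(l_wd, EBL, MAG, p_um * 1e-3) for p_um in (0.5, 2.2, 4.0)]
    print(f"L_WD = {l_wd:5.1f} mm -> " + ", ".join(f"{d:.3f} mm" for d in row))

# Stop separation needed for a 4 mm EBL, with a HYPOTHETICAL stop distance.
F_LG1 = 47.5   # mm, from Section 3
L_TS = 5.0     # mm, hypothetical, not a design value
bl_ts = EBL * (F_LG1 - L_TS) / F_LG1
print(f"BL_TS for EBL = 4 mm: {bl_ts:.3f} mm")
print(f"entrance pupil distance: {entrance_pupil_distance(L_TS, F_LG1):.3f} mm")
```

With the magnification fixed at 18, the extremes of this table (0.5 µm at 30 mm and 4 µm at 120 mm) land close to the 0.14 to 4.37 mm range quoted above; the small difference reflects the approximate magnification value.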


Fig. 2. The depth resolution range of the TAMLO as a function of pixel size at intermediate image #1 for three different working distances.


Finally, the TAMLO must be able to acquire the three views without interference. The stereo images S1 and S2 are translated apart by the prism deflector according to the designed SFOV. However, as illustrated by Fig. 1, the object field is larger than the SFOV, which results in image points present outside of each designed stereo image S1,2 due to lack of a field stop. The undesired image points from one stereo image will overlap the opposite stereo image across the central optical axis, resulting in crosstalk. To address this issue, a vignetting aperture can be placed after the prism deflector, as shown in Fig. 1(b). At this location, the stereo ray bundles have diverged enough so that individual fields can be vignetted. The shape of the vignetting aperture depends on the prism deflector design. For the example in Fig. 1(b), an annulus vignetting aperture is required to allow the WFOV ray bundle to pass while blocking any stereo rays outside of the illustrated stereo ray bundle regions on the sides toward the central optical axis. Alternatively, if the two stereo views were oppositely polarized, a sensor with matching polarization on each half of the sensing area can eliminate crosstalk.

3. Tri-aperture objective lens design

Based on the analytical relationships and various constraints described in Section 2, we derived the first order specifications of the TAMLO design, which are listed in Table 1. To ensure adequate field coverage by both the WFOV and SFOV images, we chose a working distance of 120 mm. With a full FOV of 39° and 26° diagonally for the wide and stereo views, the system captures a circular region with a diameter of about 85 mm and 54 mm, respectively. To ensure adequate depth resolution, an EBL of 4 mm was chosen. We aim to fit the TAMLO prototype for a standard laparoscopic trocar and thus limited the mechanical housing diameter for the objective to be 12 mm and the maximum optics diameter to be 8 mm to account for housing and fiber illumination of 1 mm thickness each. We further limited the maximally allowed diameter of the intermediate image #1, IIW, to be 4.4 mm to allow direct capture of the images with a standard 1/3” imaging sensor and future development of the relay and imaging optics. These FOV and image size constraints led to an effective focal length of 7 mm for the objective. Between the two lens groups, we set an fLG1 of 47.5 mm for aberration compensation and balancing out the ray bending throughout the system and an fLG2 of 8 mm for sufficient power in a Petzval objective design. Since prisms are commonly made from N-BK7 glass material, a refractive index of 1.517 was selected for the prism deflector. The design of the side prisms was approximated using Eqs. (2) and (3) and an isosceles shaped prism for simplicity such that ${\theta _i} = \alpha $ for an incident ray parallel to the central optical axis. Based on the defined specifications, the bending θD1,2 of the stereo chief rays is ±7.8° to adequately separate the stereo images, and the associated prism angle α was found to be ∼15°. Using an HD sensor with 2.2 µm pixels, Fig. 2 indicates that this design will provide ∼2.5 mm depth resolution at a working distance of 120 mm, and a finer depth resolution can be achieved at a shorter working distance. Compared to the MRFL prototypes [10–14], the main tradeoff of implementing the stereo apertures was the reduction of the WFOV to about half. In other words, the challenge of the TAMLO design is to balance the optical performance between the SFOV and WFOV. The entrance pupil diameter for all three apertures was set to 1.2 mm, leading to an F/# of 5.8. The entire objective, however, is effectively F/1.35 because it supports, in a monocular form factor, larger ray angles that come from the stereo aperture stops. The target spatial resolution in the object space was set to be 2.1 lps/mm and 6.25 lps/mm for the WFOV and SFOV, respectively. The object resolution specification is weighted lower for the WFOV than the SFOV because it is mainly used for peripheral awareness. Meanwhile, the constraints applicable to conventional rigid laparoscopes were also met. The optical design is constrained for image space telecentricity so that relay lenses can be easily inserted after the objective lens.
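Several of the first-order numbers in this paragraph can be cross-checked with elementary relations. The sketch below is a rough illustration under stated assumptions: field coverage is estimated with a simple pinhole model, 2·LWD·tan(FOV/2), and the effective F/# of the whole objective is estimated by assuming the full pupil spans the 4 mm baseline plus one 1.2 mm aperture diameter. That second reading is an assumption made here, not something stated in the text, and the function names are illustrative.

```python
import math

def field_diameter_mm(full_fov_deg, working_distance_mm):
    """Pinhole estimate of the object-field diameter at the working distance."""
    return 2.0 * working_distance_mm * math.tan(math.radians(full_fov_deg / 2.0))

def f_number(focal_length_mm, pupil_diameter_mm):
    """F/# as focal length divided by pupil diameter."""
    return focal_length_mm / pupil_diameter_mm

L_WD = 120.0   # mm, working distance
EFL = 7.0      # mm, objective focal length
EPD = 1.2      # mm, entrance pupil diameter of each aperture
EBL = 4.0      # mm, effective baseline

wide_field = field_diameter_mm(39.0, L_WD)
stereo_field = field_diameter_mm(26.0, L_WD)
per_view_fnum = f_number(EFL, EPD)
# ASSUMPTION: overall pupil extent taken as baseline + one aperture diameter.
overall_fnum = f_number(EFL, EBL + EPD)

print(f"wide-view field diameter   : {wide_field:.1f} mm")
print(f"stereo-view field diameter : {stereo_field:.1f} mm")
print(f"per-view F/#               : {per_view_fnum:.2f}")
print(f"overall F/# (assumed pupil): {overall_fnum:.2f}")
```

The wide-view estimate comes out at about 85 mm and the per-view F/# at about 5.8, matching the quoted values; the stereo-view estimate comes out near 55 mm, slightly above the quoted value of about 54 mm. Under the stated pupil assumption the overall F/# evaluates to roughly 1.35.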


Table 1. First order lens design specifications for TAMLO

The starting point of the TAMLO lens design was based on the existing MRFL and commercial 3D endoscope objectives. In all lenses and the prism deflector, rays were constrained within a 7.2 mm clear aperture diameter, or 90% of the maximum lens diameter. This diameter is slightly larger than that of the MRFL to provide additional design volume for the SFOV ray paths. For the SFOV, the object field was sampled across the +x and ±y region with an aspect ratio of 4:3 corresponding to the sensor because each stereo view system is bilaterally, rather than rotationally, symmetric. Some of these field points along the edge are noted in Fig. 3(d). In the middle stage of the design process, custom lenses and prism deflector designs were allowed to determine the maximum achievable image performance and avoid local minimum solutions. Throughout this phase, the size of the SFOV was maintained at the same size as the conventional 2D laparoscope while the other first order specifications were adjusted accordingly based on what was practical, given the required laparoscope constraints and the incorporation of stereo apertures. Since the size of the intermediate image IIW can vary and be magnified accordingly after being relayed, it was kept to < 5.6 mm diameter to avoid vignetting and ensure possible integration with relays. In the late stage, further constraints were implemented to convert the custom components into manufacturable ones. Tolerance sensitivity reduction was also applied to produce realistic lens shape factors. In addition, the lens housing would be 3D printed, so optomechanical tolerances were loosened accordingly to ensure assembly variation was accounted for. The custom lenses were then converted to stock lenses to lower costs and achieve rapid prototyping. Image performance was slightly reduced as a result but can be restored in future versions.


Fig. 3. Manufacturable TAMLO lens design for (a) WFOV and (b) SFOV acquisition. Corresponding (c, d) polychromatic MTFs and (e, f) tolerance analyses indicate sufficient performance for prototyping.


The manufacturable TAMLO lens design to be prototyped is shown in Figs. 3(a) and 3(b) for WFOV and SFOV acquisition, respectively, and the lens prescription is shown in Table 2. Both figures show the same set of monocular lenses, and all of them except one are stock components. It was found that the achromatic doublet near the middle and the field lens should be kept in the meniscus shape to maintain good image performance. Since meniscus lenses are uncommon in stock lenses, the achromat was custom made and the field lens was formed using two singlets of the same glass. For the prism deflector, a reversed deflection design was chosen so that it could be more easily manufactured as one piece. It is essentially an obtuse angle prism with the top flattened out. In this real lens design, θD1,2 is ±6.22° for a horizontal incident ray, and the associated prism angle α is 11.9°. Compared to these real values, the corresponding theoretical values from the beginning of this section are slightly different because they did not account for real thick optics and relied on approximations for simplification. Yet, those values were a good starting point for the prism design. Furthermore, the dispersion from the prism deflector was accounted for. The prism deflector design can be thought of as a segmented lens. For example, the one in Fig. 1(b) approximates a concave-plano lens, and the one in Fig. 3(b) approximates a convex-plano lens. Because the prism deflector looks like a conventional lens, its dispersion is similar to the lens it approximates. This dispersion was suppressed by using conventional dispersion compensation from the other monocular lenses with different glass types during the lens optimization process.


Table 2. Lens prescription for design in Fig. 3

Figure 3(a) also shows the light rays from the WFOV transmitting through the central aperture stop A0 and plane parallel plate D0. Because there is no deflection from D0, the WFOV system is modeled as rotationally symmetric, and the center of IIW is on the lens optical axis. As the chief rays travel to IIW, they are collimated. This indicates the system is image space telecentric. Telecentricity is one of the image quality limiting constraints, which can be removed if using the TAMLO as a chip-on-tip system, where the sensor is placed at intermediate image #1. Telecentricity is maintained in this prototype design to demonstrate design feasibility. Similarly, Fig. 3(b) shows the light rays from the SFOV transmitting through the top and bottom aperture stops A1,2. On the prism deflector’s left side, the ray bundles from each view were constrained so that they only interact with their corresponding prism surface. On the prism deflector’s right side, all ray bundles share the same flat surface. D1,2 bend the corresponding ray bundles according to Eqs. (2) and (3) so that the SFOV images are translated to the upper or lower side of the optical axis without surpassing the WFOV boundary, thus allowing for simultaneous stereo image pair acquisition on a single sensor. Compared to Fig. 1(b), the prism design here deflects the corresponding stereo images S1,2 to the opposite sides of the optical axis rather than to the same side.

By having lens groups in front and behind the prism deflector, the optical system has sufficient degrees of freedom to achieve a balanced image performance between the two imaging modalities. Comparing Figs. 3(a) and 3(b) further illustrates that for the lenses closest to the tri-aperture selector, the SFOV ray bundles only occupy the outer local portions of the lenses while the WFOV ray bundles mainly occupy the central local portion. This indicates that these lenses have more flexibility to impact the imaging modalities separately, and aspheric surfaces can add additional degrees of freedom. The polychromatic MTFs in Figs. 3(c) and 3(d) corresponding to the WFOV and SFOV, respectively, show that adequate image performance can be achieved with this lens design. Although astigmatism impacts the WFOV especially at the higher frequencies, only peripheral awareness is essential rather than high resolution. Thus, slightly lower contrast is acceptable in the WFOV system. Quantitatively, the MTFs indicate that the lowest modulation for the WFOV at 37 lps/mm (2.1 lps/mm in object space) is 0.7 and for the stereo view at 109 lps/mm (6.25 lps/mm in object space) is 0.24. The cutoff frequency at 227 lps/mm corresponds to the sensor (Allied Vision Alvium 1800 U-500c) used for capturing intermediate image #1. Using stock lens and 3D printing tolerances, the tolerance analyses for the WFOV and SFOV systems in Figs. 3(e) and 3(f), respectively, confirm that this design will maintain adequate performance after assembly. The modulation will be greater than 0.1 at 110 lps/mm (6.12 lps/mm in object space) for both systems. According to Table 1, this approximately meets the object resolution criteria for the SFOV and exceeds it for the WFOV.
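The spatial frequencies quoted in this paragraph are related by two simple conversions: the sensor Nyquist limit follows from the pixel pitch, and an image-space frequency maps to object space through the magnification between the object field and intermediate image #1. The sketch below is illustrative only; it uses the 2.2 µm pixel pitch mentioned earlier in this section, and the function names are assumptions introduced here.

```python
def nyquist_lp_per_mm(pixel_pitch_um):
    """Sensor Nyquist frequency in line pairs per mm for a given pixel pitch."""
    return 1000.0 / (2.0 * pixel_pitch_um)

def implied_magnification(image_freq, object_freq):
    """Magnification implied by a pair of image- and object-space frequencies."""
    return image_freq / object_freq

def object_space_frequency(image_freq, magnification):
    """Image-space frequency referred back to object space."""
    return image_freq / magnification

cutoff = nyquist_lp_per_mm(2.2)
print(f"sensor Nyquist limit: {cutoff:.0f} lps/mm")

# Frequency pairs quoted in the text: (image space, object space) in lps/mm.
for label, img, obj in (("WFOV", 37.0, 2.1), ("SFOV", 109.0, 6.25), ("tolerance", 110.0, 6.12)):
    m = implied_magnification(img, obj)
    print(f"{label:9s}: {img:5.1f} -> {obj:4.2f} lps/mm, implied magnification {m:.1f}")
```

The Nyquist limit evaluates to about 227 lps/mm, matching the quoted cutoff, and the implied magnifications all fall between roughly 17 and 18, close to the pixel magnification of about 18 used in Section 2.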

As described in Section 2, although S1,2 are translated apart, they are larger than designed in Fig. 3(b) because the object field is larger than the chosen SFOV. This results in overlapping crosstalk between S1,2. The amount of overlap is simulated by determining how far S1 crosses onto the upper half of the sensor when the SFOV is greatly extended, as shown in Fig. 4(a). The boundaries of the enlarged S1 indicate where the edge apertures of the TAMLO’s lenses begin to vignette S1. By symmetry, S2 would overlap just as much on the lower half of the sensor. The overlap past the midline is significant and would corrupt a major portion of the designed S1,2 area. Since it is not possible to limit the size of S1,2 with a field stop, a vignetting strategy is implemented here. Because the prism deflector deflects the images to opposite sides of the optical axis, a circular vignetting aperture can be inserted right after it to significantly reduce the overlapping crosstalk, as shown in Fig. 3(b). As a comparison with Fig. 3(a) shows, the placement of this vignetting aperture does not interfere with the rays from the WFOV system. This technique can preserve most of the designed S1,2, as simulated in Fig. 4(b), where most of the overlapping crosstalk is diminished after insertion of the vignetting aperture. Note that in this demonstration, the relative irradiance is 0.5 at the midline of intermediate image #1 because the circular vignetting aperture was designed to half vignette there. If the residual crosstalk needs to be further reduced, the vignetting can be increased, and a calibration in post-processing could recover the irradiance that was lost in the designed S1,2 area. For prism deflector designs that deflect to the same half of the sensor, such as the one in Fig. 1(b), the circular vignetting aperture would not work because it would vignette at the edges of the sensor instead of the center, where the overlapping crosstalk occurs. Instead, a similar vignetting solution could be achieved with an annular vignetting aperture.
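The post-processing recovery of lost irradiance mentioned above could take the form of a flat-field style correction. The sketch below is one possible approach and is not the authors' procedure: it assumes a relative-irradiance map is available (for example from simulation or from imaging a uniform target), and the function name and the `floor` parameter are hypothetical.

```python
import numpy as np


def correct_vignetting(image, relative_irradiance, floor=0.05):
    """Divide out a known relative-irradiance map.

    `floor` limits the gain so that heavily vignetted pixels are not
    amplified without bound (noise is amplified along with signal).
    """
    gain = 1.0 / np.clip(relative_irradiance, floor, 1.0)
    return image * gain


# Toy 1D profile: relative irradiance falls linearly to 0.5 at the midline,
# matching the half-vignetting condition described in the text.
rel = np.linspace(1.0, 0.5, 5)
scene = np.full(5, 100.0)      # uniform scene, arbitrary units
measured = scene * rel         # what the vignetted system would record
restored = correct_vignetting(measured, rel)
print(restored)
```

A gain of 2 at the midline also doubles the noise there, so stronger vignetting trades residual crosstalk against signal-to-noise ratio in the recovered region.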

Fig. 4. Simulation of S1 image size and overlap when SFOV is extended with (a) no vignetting and (b) inserted vignetting aperture.

4. Prototype assembly and raw data

A basic lens housing was designed and 3D printed for assembling the TAMLO prototype, as shown in Fig. 5. The second stock lens prescription did not come in the same diameter as the other lenses, resulting in the large housing in the front. The housing contains railings to align the tri-aperture selector, prism deflector, and sensor along the same axis. To separate the overlapping WFOV and SFOV images, the tri-aperture selector was a manual shutter that blocked either A0 or A1,2. Rectangular aperture blockers were simply inserted into a slot of the housing, resulting in time-sequential acquisition between the two imaging modalities. There is an additional slot after the prism to insert a vignetting aperture to reduce the overlapping stereo image crosstalk. For prototype evaluation, a real sensor was mounted at intermediate image #1. The entrance pupils corresponding to A0-2 can be seen clearly in the frontal view of Fig. 5(c).

Fig. 5. TAMLO (a, b) optomechanical housing design and (c, d) prototype assembly.

Figure 6 illustrates the raw data acquired from the working TAMLO prototype. The object field is a ruler lying on a planar checkerboard that is tilted so that the object depth linearly increases as a function of image height. For all the images, distortion can be observed by looking at the curvatures of lines that should be straight. Figure 6(a) shows the WFOV image while the SFOV apertures are blocked. Along the vertical axis, ∼5.5 cm of the ruler can be seen. Figure 6(b) shows the SFOV images captured by both stereo apertures simultaneously with the vignetting aperture in place. The stereo images were translated by the prism deflector to the top and bottom half of the sensor without exceeding the WFOV image. Each of the stereo images sees ∼2.5 cm of the ruler. Thus, in quantitative comparison, the WFOV shows twice the SFOV in the vertical or baseline direction when the stereo images are captured simultaneously. Figure 6(c) shows the same stereo images taken with the same exposure settings but without the vignetting aperture. Along the midline of the sensor, the strong presence of the overlapping crosstalk reduces the contrast and the sum of the irradiance results in saturated pixels. There is still some residual crosstalk in Fig. 6(b), but it has been significantly reduced, and the vignetting aperture size can be further optimized in future prototypes. Figure 6(d) shows the overlap between all three views without a method of blocking or encoding either A0 or A1,2, thus resulting in unusable data. Figures 6(e) and 6(f) show S1 and S2, respectively, captured independently without the vignetting aperture. They demonstrate the extent of overlap that causes the crosstalk. Overall, the image quality of these figures appears sufficient, as predicted during the lens design phase.

Fig. 6. TAMLO prototype raw data: (a) WFOV, SFOV simultaneous stereo image capture (b) with and (c) without vignetting aperture, (d) overlapping WFOV and SFOV images, (e) S1 and (f) S2 captured independently without vignetting aperture.

5. Calibration and absolute depth mapping

To calculate correct disparity and absolute depth maps, the stereo systems require camera parameter and distortion calibration. For a conventional stereo system with two independent cameras, methods for calibrating camera parameters and distortion have been thoroughly developed [18]. The conventional calibration assumes each of the cameras has rotational symmetry, so the lens distortion can be modeled with a radial polynomial. The TAMLO effectively creates two virtual stereo cameras with their optical axes tilted from each other, but their distortion model is no longer rotationally symmetric. Instead, because the TAMLO captures each stereo image with an off-axis aperture, the distortion model is bilaterally symmetric and can have additional distortion from the finite thickness of the prism deflector. Analytically calibrating the unique distortions in the TAMLO would require rigorous theoretical analysis. Alternatively, a numerical solution can be developed by taking advantage of the additional WFOV data, which was captured with rotational symmetry.
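To make the symmetry argument concrete, the sketch below applies a generic two-term radial polynomial to normalized image points. The coefficient value is hypothetical and the function is illustrative only. Because the displacement depends only on the radius, points at the same radius are displaced identically in every direction, which is why such a model cannot represent the bilaterally symmetric distortion of the off-axis stereo views.

```python
import numpy as np


def apply_radial_distortion(xy, k1, k2=0.0):
    """Map ideal normalized points to distorted points.

    x_d = x * (1 + k1 * r^2 + k2 * r^4), with r^2 = x^2 + y^2.
    """
    xy = np.asarray(xy, dtype=float)
    r2 = np.sum(xy ** 2, axis=-1, keepdims=True)
    return xy * (1.0 + k1 * r2 + k2 * r2 ** 2)


# Three points at the same radius but in different directions.
pts = np.array([[0.3, 0.0], [0.0, 0.3], [-0.3, 0.0]])
distorted = apply_radial_distortion(pts, k1=-0.2)  # hypothetical barrel distortion
radii = np.linalg.norm(distorted, axis=1)
print(radii)  # all three radii are equal
```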

The goal of our calibration was to obtain the intrinsic parameters of the TAMLO optics and model the imaging process as a projection by an ideal thin lens along with distortion correction. The process is summarized here and will be further discussed in future work solely focused on calibration. First, the WFOV system was calibrated using the well-established method in [18] so that it could be modeled as a pinhole camera with radial distortion correction. Because the apertures A1,2 are in the same plane as A0, it can be assumed that their representative pinhole models also lie in the same plane as the one for A0. A planar checkerboard was then placed perpendicular to the optical axis of the TAMLO lens and at the working distance conjugate to the image sensor. This object field was captured by IIW and S1,2, as shown in Figs. 7(a) and 7(b), respectively. S2 is similar to S1 and is therefore not shown. Using the WFOV calibration data, IIW was undistorted (IIWU), as shown in Fig. 7(c). Within the designed SFOV, corresponding image features outlined in red between IIW and S1,2 were determined so that the light rays in the SFOV system could be digitally bent by translating S1,2, pixel by pixel, to the corresponding pixel coordinates that contain the matching image in IIWU. In other words, the light rays from the SFOV system were digitally bent so that they focused with the calibrated chief rays in IIWU. This is illustrated in Fig. 7(c), where the calibrated stereo image S1U directly overlaps IIWU after digital bending and the two are summed for visualization. S1U and the area of IIWU underneath S1U look the same, so the brightness is doubled after summation. This technique effectively removes both the distortion from the stereo images and the translation from the prism deflector and converts the TAMLO into a thin lens model. The amount of digital bending is stored for each pixel of S1,2 in a lookup table for calibrating any subsequent stereo images. Although the lookup table was generated from a 2D object field, it applies to 3D object fields because each pixel of S1,2 corresponds to unique object angles defined by the 3D object point location and A1,2. The final step of this calibration was to determine the parameters of the ideal thin lens model. The image distance was already determined from the focal length of the WFOV system’s pinhole model. The object distance to any point on the planar checkerboard placed at the conjugate working distance could be determined using the extrinsic parameters from the pinhole model. Knowing the object and image distances, the effective focal length of the thin lens model was found from the thin lens equation. To find the baseline between the pinhole models of A1,2, two object points at different depths were captured by S1,2, which were then calibrated using the lookup table. First-order ray tracing was performed from the two known object points to the stereo pinhole models of unknown baseline, refracted by the effective focal length, and then further traced to the corresponding image points in S1U,2U. The baseline could then be algebraically solved. Conceptually, this calibration recovers an ideal thin lens model that obtains depth from defocus.
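The per-pixel lookup table described above can be pictured with a minimal sketch. It assumes the table stores an integer row and column shift for every stereo-image pixel and applies it as a forward mapping. The function name, the integer shifts, and the handling of out-of-bounds or colliding pixels are assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def apply_shift_lut(src, dy, dx):
    """Move each source pixel (y, x) to (y + dy[y, x], x + dx[y, x]).

    Pixels shifted outside the frame are dropped; if two source pixels land
    on the same target, the one written last wins. Unfilled targets stay 0.
    """
    h, w = src.shape[:2]
    ys, xs = np.indices((h, w))
    ty = ys + dy
    tx = xs + dx
    valid = (ty >= 0) & (ty < h) & (tx >= 0) & (tx < w)
    out = np.zeros_like(src)
    out[ty[valid], tx[valid]] = src[valid]
    return out


# Toy example: shift every pixel one column to the right.
src = np.arange(12).reshape(3, 4)
dy = np.zeros((3, 4), dtype=int)
dx = np.ones((3, 4), dtype=int)
out = apply_shift_lut(src, dy, dx)
print(out)
```

A practical implementation would likely use sub-pixel shifts with interpolation and fill any holes left by the forward mapping. Once built from the checkerboard capture, the same table would be reused for every subsequent stereo frame.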

Fig. 7. Images of (a) WFOV and (b) SFOV before and (c) after thin lens modeling and distortion calibration. Features outlined in red and the image region labeled S1U correspond to the same region of checkerboard squares in the object field.

After calibration, fully processed TAMLO results of the tilted ruler and a 3D bladder model were generated to complete the proof of concept. The following results were rotated counterclockwise by 90° from the original image orientation so that the stereo views can be displayed with parallax along the horizontal direction and can be viewed with 3D glasses. Figures 8(a) and 8(d) show the undistorted WFOV images, as indicated by the straightened lines of the ruler and checkerboard. Figures 8(b) and 8(e) show the calibrated SFOV images overlaid as a red and cyan anaglyph, which demonstrates parallax based on the difference in disparity between corresponding object points. Close observation of Fig. 8(b) illustrates a reversal in the arrangement of the cyan and red colors from the 4.5 to 7 cm tick marks. This indicates the center of the image has zero disparity and is the conjugate working distance to the image sensor while the right and left of the image are closer and farther away, respectively. Similarly, Fig. 8(e) illustrates large disparity at the screwdriver, indicating that it is much closer than the bladder model. Figures 8(d) and 8(e) demonstrate good image quality for both WFOV and SFOV imaging in a surgical setting. Finally, the calibrated stereo images were processed to produce accurate depth maps in Figs. 8(c) and 8(f). The color bars have units of pixel disparity, which were then converted to absolute depth values in millimeters, as shown on the right of the color bars, using the thin lens model parameters found during calibration. Although the original lens design had a 120 mm working distance, tolerances in the 3D printed housing resulted in a backward shift of the sensor, so the conjugate working distance, or 0 pixel disparity, in these figures is located at ∼71 mm. According to Fig. 2, the depth resolution improves to ∼1.5 mm at this closer working distance. Figure 8(c) shows the linear change in depth corresponding to the tilted ruler without any depth resolution artifacts, thus confirming depth mapping ability. Similarly, Fig. 8(f) shows the closer distance of the screwdriver and the correct surface profile of the bladder model.
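The conversion from disparity to absolute depth can be sketched with a simplified first-order model: two pinholes separated by a baseline b in the plane of an ideal thin lens, with the sensor at image distance s_i and conjugate to object distance z0. Under these assumptions the sensor-plane disparity is d = b · s_i · (1/z − 1/z0), which is zero at the conjugate working distance. This is a stand-in for the authors' ray tracing rather than a reproduction of it. The image distance and baseline values below are hypothetical, the sign convention (closer objects give positive disparity) is an assumption, and a disparity in pixels would first be multiplied by the pixel pitch. Only the ∼71 mm conjugate distance comes from the text.

```python
def thin_lens_focal_length(z0, s_i):
    """Thin lens equation, 1/f = 1/z0 + 1/s_i (all distances positive, in mm)."""
    return 1.0 / (1.0 / z0 + 1.0 / s_i)


def disparity_mm(z, z0, s_i, baseline):
    """Sensor-plane disparity between the two pinhole views of a point at depth z."""
    return baseline * s_i * (1.0 / z - 1.0 / z0)


def solve_baseline(z_a, d_a, z_b, d_b, s_i):
    """Baseline from two points at known depths; the z0 term cancels in d_a - d_b."""
    return (d_a - d_b) / (s_i * (1.0 / z_a - 1.0 / z_b))


def depth_from_disparity(d, z0, s_i, baseline):
    """Invert the disparity relation to recover absolute depth in mm."""
    return 1.0 / (d / (baseline * s_i) + 1.0 / z0)


Z0 = 71.0        # conjugate working distance from the text, mm
S_I = 10.0       # hypothetical image distance, mm
BASELINE = 2.0   # hypothetical baseline, mm

f_eff = thin_lens_focal_length(Z0, S_I)
d_near = disparity_mm(60.0, Z0, S_I, BASELINE)
d_far = disparity_mm(90.0, Z0, S_I, BASELINE)
b_est = solve_baseline(60.0, d_near, 90.0, d_far, S_I)
z_est = depth_from_disparity(d_near, Z0, S_I, BASELINE)
print(f_eff, b_est, z_est)
```

Because depth varies with the reciprocal of disparity, equal disparity steps correspond to larger depth steps at longer range, consistent with the finer depth resolution reported at the closer working distance.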

Fig. 8. Fully calibrated TAMLO results of a tilted ruler (top row) and a 3D bladder model (bottom row): (a, d) WFOV, (b, e) SFOV images overlaid as an anaglyph, (c, f) depth maps in units of pixel disparity and absolute depth.

6. Conclusion

In this paper, a novel prism-based tri-aperture monocular laparoscopic objective was conceptualized, designed, prototyped, and calibrated. This system achieved WFOV and SFOV imaging with sufficient image quality. Compared to the SFOV, the WFOV sees 2x the object field along the baseline axis. Overlapping crosstalk between the stereo images was also addressed with a vignetting aperture. The calibration of the stereo views using the rotationally symmetric WFOV image as a reference was then introduced. Completion of the calibration enabled removal of distortion from the WFOV and SFOV images, which were then processed to generate accurate, absolute depth maps. The TAMLO demonstrates the potential for optically combining WFOV and SFOV imaging in a compact system. Such a system may pave the way towards restoring the binocular and large, foveated FOV qualities of human vision within the minimally invasive surgical setting. In future work, the calibration details will be fully discussed, and the design of the relay lens group will be considered. Alternatively, the TAMLO can be used as a chip-on-tip system, as demonstrated by our prototype. In either case, the system also needs to be configured with proper tri-aperture selector hardware and automated in software.

Funding

National Institute of Biomedical Imaging and Bioengineering (1R01EB18921).

Disclosures

Dr. Hong Hua has a disclosed financial interest in Magic Leap Inc. The terms of this arrangement have been properly disclosed to The University of Arizona and reviewed by the Institutional Review Committee in accordance with its conflict of interest policies.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J. Geng and J. Xie, “Review of 3-D endoscopic surface imaging techniques,” IEEE Sens. J. 14(4), 945–960 (2014).

2. Intuitive, “Da Vinci Vision,” https://www.intuitive.com/en-us/products-and-services/da-vinci/vision#%23.

3. Olympus, “ENDOEYE FLEX 3D,” https://medical.olympusamerica.com/products/laparoscopes/endoeye-flex-3d.

4. S. Y. Bae, R. J. Korniski, M. Shearn, H. M. Manohara, and H. Shahinian, “4-mm-diameter three-dimensional imaging endoscope with steerable camera for minimally invasive surgery (3-D-MARVEL),” Neurophotonics 4(1), 011008 (2017).

5. E. Kwan, Y. Qin, and H. Hua, “High resolution, programmable aperture light field laparoscope for quantitative depth mapping,” OSA Continuum 3(2), 194–203 (2020).

6. J. Liu, D. Claus, T. Xu, T. Keßner, A. Herkommer, and W. Osten, “Light field endoscopy and its parametric description,” Opt. Lett. 42(9), 1804–1807 (2017).

7. V. I. Batshev, A. Machikhin, and Y. Kachurin, “Stereoscopic tip for a video endoscope: problems in design,” Proc. SPIE 10466, 23rd Int. Symp. Atmos. Ocean Opt. Atmos. Phys., 104664D (2017).

8. X. Cui, K. Bin Lim, Y. Zhao, and W. L. Kee, “Single-lens stereovision system using a prism: position estimation of a multi-ocular prism,” J. Opt. Soc. Am. A 31(5), 1074–1082 (2014).

9. S.-P. Yang, J.-J. Kim, K.-W. Jang, W.-K. Song, and K.-H. Jeong, “Compact stereo endoscopic camera using microprism arrays,” Opt. Lett. 41(6), 1285–1288 (2016).

10. Y. Qin, H. Hua, and M. Nguyen, “Multiresolution foveated laparoscope with high resolvability,” Opt. Lett. 38(13), 2191–2193 (2013).

11. Y. Qin, H. Hua, and M. Nguyen, “Characterization and in-vivo evaluation of a multi-resolution foveated laparoscope for minimally invasive surgery,” Biomed. Opt. Express 5(8), 2548–2562 (2014).

12. Y. Qin and H. Hua, “Optical design and system engineering of a multiresolution foveated laparoscope,” Appl. Opt. 55(11), 3058–3068 (2016).

13. Y. Qin and H. Hua, “Continuously zoom imaging probe for the multi-resolution foveated laparoscope,” Biomed. Opt. Express 7(4), 1175–1182 (2016).

14. J. I. Katz, S. Lee, and H. Hua, “Improved multi-resolution foveated laparoscope with real-time digital transverse chromatic correction,” Appl. Opt. 59(22), G79–G91 (2020).

15. D. T. Kim, C. H. Cheng, D. G. Liu, K. C. J. Liu, and W. S. W. Huang, “Designing a New Endoscope for Panoramic-View with Focus-Area 3D-Vision in Minimally Invasive Surgery,” J. Med. Biol. Eng. 40(2), 204–219 (2020).

16. E. Kwan and H. Hua, “Tri-Aperture Monocular Laparoscopic Objective for Stereoscopic and Wide Field of View Acquisition,” in OSA Imaging and Applied Optics Congress (2021), p. 3Th2D.6.

17. J. E. Greivenkamp, Field Guide to Geometrical Optics (SPIE, 2004).

18. Z. Zhang, “A flexible new technique for camera calibration,” IEEE Trans. Pattern Anal. Mach. Intell. 22(11), 1330–1334 (2000).





