Abstract
Lensless cameras are a class of imaging devices that shrink the physical dimensions of the optics to the close vicinity of the image sensor by replacing conventional compound lenses with integrated flat optics and computational algorithms. Here we report a diffractive lensless camera with a spatially-coded Voronoi-Fresnel phase that achieves superior image quality. We propose a design principle of maximizing the information acquired by the optics to facilitate the computational reconstruction. By introducing an easy-to-optimize Fourier-domain metric, the Modulation Transfer Function volume (MTFv), which is related to the Strehl ratio, we devise a framework to guide the optimization of the diffractive optical element. The resulting Voronoi-Fresnel phase features an irregular array of quasi-Centroidal Voronoi cells, each containing a base first-order Fresnel phase function. We demonstrate and verify the imaging performance for photography applications with a prototype Voronoi-Fresnel lensless camera on a 1.6-megapixel image sensor under various illumination conditions. Results show that the proposed design outperforms existing lensless cameras, and could benefit the development of compact imaging systems that work in extreme physical conditions.
© 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
1. Introduction
Imaging devices, such as photographic objectives and microscopes, have long relied on lenses to focus light and create a projection of the scene onto photosensitive sensors. In such designs, as in human eyes, the focal length of the lens presents a fundamental limit for the overall device form factor. In addition, various optical aberrations preclude the use of single lenses for wide field-of-view (FOV) imaging. Instead, carefully designed groups of compound lenses must be used to achieve good image quality.
With the ever-increasing demand for compact imaging optics, great effort has been devoted to developing lensless cameras over the past few years. To date, two prevailing lensless strategies exist: natural compound-eye mimicry [1] and heuristic point spread function (PSF) engineering [2].
Analogous to insect eyes, various artificial compound-eye designs have been proposed to directly mimic the eye structures in nature. Early artificial compound-eye structures tessellate regular Micro-Lens Arrays (MLAs) on planar or curved surfaces. Each lenslet is an independent unit that creates a small image. By tessellating an array of lenslets on a planar surface, TOMBO [3,4] produces final images with a backprojection algorithm that stitches sub-images together. To prevent optical cross-talk, a light-blocking layer has to be employed under each microlens. The number of output image pixels is equal to the number of lenslets. The Gabor superlens [5] employs sophisticated multi-layer planar MLAs and aperture arrays along the optical path to re-arrange light rays on the sensor. Follow-up implementations stitch the sub-images either algorithmically (eCley [6]) or optically (oCley [7]). A recent artificial-ommatidia lensless camera [8] employs plasmonic structures to allow a larger FOV for planar compound eyes. Tessellating lenslets on curved substrates is more challenging, but yields a wide FOV. A digital compound-eye camera [9] tessellates a uniform elastomeric micro-lens array with stretchable electronics on a hemisphere to fully resemble arthropod eyes. CurvACE [10] achieves 180$^{\circ }$ horizontal FOV and 60$^{\circ }$ vertical FOV with polymer microlenses and flexible printed circuits. These methods focus more on the optical tessellation, paying less attention to the computational reconstruction. The resolution is relatively low, and only simple scene targets can be imaged.
The other strategy considers the entire system from the perspective of the PSF it generates. Instead of mapping each scene point to a single focus spot on the image plane, as in conventional lens-based systems, such lensless cameras render each scene point as a distributed pattern that covers large areas of the image plane. The captured raw data is interpreted by computational imaging algorithms based on the physical model to reconstruct the latent image. Early implementations make use of amplitude masks [11] to create patterns induced by shadowing effects. In order to improve the numerical conditioning, separable masks have been proposed in a coded-aperture design [12] and in FlatCam [13] for optimal amplitude mask design. A Fresnel Zone Aperture (FZA) [14] has also been used for lensless imaging. These amplitude masks inherently suffer from low light efficiency. Phase-only lensless cameras improve the overall throughput without blocking light. A binary phase profile with tailored odd-symmetry gratings in PicoCam [15,16] produces spiral-shaped PSFs. An important pioneering work is DiffuserCam [17], where a non-optimized diffuser is used to generate caustic patterns as the PSF. A Perlin pattern is later introduced in PhlatCam [18] as a better heuristic PSF. Random lenslets have also been favored in 3D light field microscopy, such as Miniscope3D [19] and Fourier DiffuserScope [20]. A random lenslet diffuser has also been used as a lensless on-chip fluorescence microscope [21]. A custom microlens array (MLA) has been adopted as a single optical component in the computational miniature mesoscope [22,23] for fluorescence imaging. Very recently a learned 3D lensless camera [24] co-designs the MLA and the neural network to achieve single-shot 3D imaging without the need for PSF calibration.
Here we report a new diffractive lensless camera with spatially-coded Voronoi-Fresnel phase to improve the imaging performance by leveraging the benefits of both the compound-eye structure and PSF engineering. Inspired by the random yet uniform distribution of lenslets in apposition compound eyes [25], and motivated by an observation on the properties of engineered heuristic PSFs [18], we find that the image quality after algorithmic reconstruction is positively related to the sparse distribution of high-contrast bright spots in the PSFs, subject to the phase being physically realizable. In other words, lensless cameras favor PSFs with concentrated sparse patterns. From the perspective of Fourier optics, the Modulation Transfer Function (MTF) [26], the Fourier counterpart of the PSF, offers a more comprehensive measure to quantify this phenomenon. High-frequency details are better recoverable by the reconstruction algorithms if the cut-off frequency is kept as large as possible and the MTF is uniformly distributed in all directions in the Fourier domain. However, the MTF is a 2D function of spatial frequencies. We therefore define the MTF volume (MTFv) as a single-number metric for information maximization. A larger MTFv value encourages the system to transmit more information from hardware to software.
By applying this principle, we find that a certain number of individual point PSFs makes an excellent match to the preferred PSFs in lensless imaging. To this end, it is possible to engineer optimal PSFs from compound-eye mimicry. We devise a metric-guided optimization framework with modified Lloyd’s iterations to find non-trivial and non-heuristic solutions to the problem. The resulting PSF is a collection of diffraction-limited, directionally diverse spots with optimized spatial locations and total number. They do not work individually as in conventional compound eyes; rather, their union is the equivalent PSF to be coupled with computational algorithms to reconstruct sharp images. With significantly lower requirements on the optics and more flexibility in the computation, the proposed method offers a path to high-quality 2D photography devices that work in extreme physical conditions, such as wearable cameras, in-cabin monitoring, and capsule endoscopy. Given the emerging importance of flat-optics imaging, we expect unconventional imaging applications to emerge across disciplines.
2. Methods
2.1 Overview
An overview of the proposed Voronoi-Fresnel lensless camera is shown in Fig. 1. Inspired by the delicate distribution of the ant’s ommatidia (Fig. 1(a)), our lensless camera consists of an optimized phase element just a few millimeters above the image sensor (Fig. 1(b)). The phase features an irregular array of quasi-Centroidal Voronoi cells containing a base first-order Fresnel phase function (Fig. 1(c)). By applying the proposed design principle, our Voronoi-Fresnel phase yields an optimized PSF (Fig. 1(d)) and MTF (Fig. 1(e)). An image reconstruction pipeline (Fig. 1(f)) is then employed to recover high-quality images from the non-interpretable raw data.
2.2 Voronoi-Fresnel phase
Our Voronoi-Fresnel phase is composed of a base first-order Fresnel phase that is duplicated in various sub-regions on the 2D plane (Fig. 1(c)). The base Fresnel phase at the design wavelength $\lambda$ is defined as
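In conventional notation (the focal length $f$ and lateral coordinates $(\xi, \eta)$ are symbols assumed here, as the original expression is not reproduced), a first-order Fresnel phase wrapped to $2\pi$ takes the form

```latex
\phi_{0}(\xi, \eta) = \operatorname{mod}\!\left( -\frac{\pi}{\lambda f}\left( \xi^{2} + \eta^{2} \right),\; 2\pi \right).
```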
We divide the design space into regions that form a complete tessellation of the whole area. A typical tessellation is a Voronoi diagram, which is a collection of sub-regions that contain points closer to the corresponding generating sites than any other sites. Each sub-region, also known as a Voronoi cell $V_i$, features a center location and a few vertices. The origin of the Fresnel phase function coincides with the center location, and the polygon determined by the vertices creates a distinct aperture for that cell. We refer to each Voronoi cell with the Fresnel phase as a Voronoi-Fresnel cell. The entire Voronoi-Fresnel phase is a collection of all the constituent Voronoi-Fresnel cells,
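With the base phase $\phi_{0}$ re-centered at each cell center $(\xi_i, \eta_i)$ and restricted to its cell by an indicator function $\mathbb{1}_{V_i}$ (notation assumed here), the whole phase can be written as

```latex
\Phi(\xi, \eta) = \sum_{i=1}^{K} \phi_{0}\!\left( \xi - \xi_{i},\, \eta - \eta_{i} \right) \mathbb{1}_{V_{i}}(\xi, \eta).
```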
2.3 PSF and the MTFv metric
The PSF is characterized by the Fresnel diffraction pattern of the entire Voronoi-Fresnel phase on the image sensor, which is the squared magnitude of the diffracted optical field. The panchromatic PSF can then be calculated as an integral of spectral PSFs over the effective spectral range $[\lambda _1, \lambda _2]$,
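Denoting the Fresnel-diffracted field at wavelength $\lambda$ on the sensor by $U_{\lambda}(x, y)$, and weighting by an assumed spectral response $S(\lambda)$, the panchromatic PSF can be written as

```latex
\mathrm{PSF}(x, y) = \int_{\lambda_{1}}^{\lambda_{2}} S(\lambda)\, \left| U_{\lambda}(x, y) \right|^{2} \mathrm{d}\lambda.
```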
Since the entire phase function is a collection of the base Fresnel phase, we further investigate the individual PSFs that are generated from the polygonal sub-apertures. As demonstrated in Supplemental Document 1, when adjacent Fresnel phase centers are at the centroids of their cells, the cross-interference between individual cells is negligible, with a maximum error of less than 1%. The equivalent PSF can be approximated by the union of individual PSFs,
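Writing $\mathrm{PSF}_{i}$ for the pattern generated by the $i$-th Voronoi-Fresnel cell alone (notation assumed here), and noting that the individual spots barely overlap, the union can be approximated by a sum,

```latex
\mathrm{PSF}(x, y) \approx \sum_{i=1}^{K} \mathrm{PSF}_{i}(x, y).
```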
These irregular apertures are significant, since the resulting diffraction patterns generate a set of compact directional filters depending on the geometries. We show this directional filtering effect in Fig. 2. The center of a base Fresnel phase is first shifted randomly off the origin. A polygonal aperture is then imposed on the phase profile. Here, without loss of generality, we illustrate four simple polygons: a triangle, a rectangle, a pentagon, and a hexagon (Fig. 2(a)). The corresponding panchromatic PSFs are rendered in false color in Fig. 2(b). Long tails in the PSFs show up in the directions perpendicular to the polygon edges due to diffraction. A collection of such directional spots resulting from these cells makes up the effective PSF.
It is more effective to evaluate the PSF through its frequency counterpart, the MTF. With higher MTF across all frequencies, the imaging system preserves more details in the scene, a property that has long been used to evaluate the quality of lens-based optics. It can be seen from the log-scale panchromatic MTFs in Fig. 2(c) that the long tails in the spatial domain cause the MTF to drop in certain directions, while information is maintained in other directions. For example, in the triangle and rectangle cases, the elongated directions in the PSF are compressed in the MTF, whereas the squeezed directions in the PSF are stretched in the frequency domain. This is a result of the scaling property of the Fourier transform. As the polygon becomes more regular in all edges (e.g., the pentagon and hexagon examples), the directional filtering tends to be more isotropic. A circular aperture, as in conventional lenses, would result in no directional filtering effect at all.
The distribution of the panchromatic PSF, or equivalently the MTF, is key to the imaging performance. However, MTF is a 2D function in the Fourier domain. It is difficult to use directly as a metric to optimize the optical element. We propose to use normalized panchromatic MTF volume (MTFv) as a single-number metric to evaluate the system performance,
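Up to a normalization constant (which only rescales the figure-of-merit and is omitted in this reconstruction), the MTFv integrates the panchromatic MTF over all spatial frequencies,

```latex
\mathrm{MTFv} = \iint \mathrm{MTF}\!\left( f_{x}, f_{y} \right) \mathrm{d}f_{x}\, \mathrm{d}f_{y}.
```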
The MTFv is not only simple to compute, but also shares a connection with the well-established optics metric, the Strehl ratio, which is widely used in image quality evaluation. As derived in Supplemental Document 1, the Strehl ratio can be upper-bounded by an integral of the MTF [27]. Since the reference diffraction-limited quantity only affects the denominator, we can ignore it when used as a figure-of-merit, leading to the above MTFv metric. In addition, MTFv is in principle a generic metric that imposes no restrictions on the intensity distributions of the PSFs, so it is applicable to all phase-type lensless designs. A larger MTFv value encourages more information to be recorded by the optical system, so we can employ it as a guide to seek optimal PSFs generated by the Voronoi-Fresnel phase.
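On a discretized PSF, the metric reduces to the mean modulus of the PSF's Fourier transform. The following minimal sketch illustrates the monochromatic computation with NumPy; the normalization choice is an assumption, and the panchromatic case would additionally integrate over the spectrum as described above:

```python
import numpy as np

def mtfv(psf):
    """MTF-volume figure-of-merit for a discretized 2D PSF.

    The MTF is the modulus of the optical transfer function (the
    Fourier transform of the energy-normalized PSF); its "volume"
    is taken here as the mean over all sampled spatial frequencies.
    """
    psf = np.asarray(psf, dtype=np.float64)
    otf = np.fft.fft2(psf / psf.sum())  # energy-normalized PSF -> OTF
    return np.abs(otf).mean()           # discrete volume under the MTF

# A concentrated PSF preserves frequency content; a flat PSF does not.
delta = np.zeros((64, 64)); delta[32, 32] = 1.0  # single focused spot
flat = np.ones((64, 64))                         # fully spread PSF
```

For the single-spot PSF the MTF is unity at every frequency, so `mtfv(delta)` evaluates to 1, whereas the flat PSF retains only the DC term, illustrating why concentrated sparse patterns score higher.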
2.4 Optimization and properties
MTFv is a highly nonlinear and non-convex function of the number of Voronoi regions $K$ and their center locations $\left (\xi _i, \eta _i\right )$. It is challenging to optimize the MTFv over the entire parameter space. Therefore, we do not aim at finding a closed-form solution to maximizing MTFv. Instead, we devise an optimization framework to search for a feasible solution. We show in the Supplemental Document 1 that the effective MTF largely depends on three factors, the diffraction by each aperture; the phase delay terms in the Fourier domain that are introduced by the amount of spatial shifts from the centered PSFs; and the total number of the Voronoi-Fresnel cells $K$.
The smallest dimension in the aperture geometry plays an important role due to diffraction. The phase delay terms vary dramatically as the distances between each pair of center locations change. Given a certain number of sites $K$ within a fixed area, there are infinitely many Voronoi tessellations, whose MTFv values vary significantly. In the extreme case of $K = 1$, we get a compact single-spot PSF, but it is impractical to realize such an optical element covering the whole image sensor. On the other hand, when $K$ approaches infinity, the PSF becomes pixel-wise uniform and carries no useful information. Therefore, there exists an optimal number $K$ that maximizes the MTFv.
Our solution to the Voronoi-Fresnel phase optimization is based on the above conjecture that the optimal center locations would require uniform distributions of the Voronoi cells over the 2D design space, as well as keeping some degrees of irregularity of each Voronoi region to diversify the spatial filtering of individual PSFs. To take the above factors into account, our optimization framework employs a two-step search scheme. In the first step, we fix $K$, and adopt a quasi-CVT routine to maximize the panchromatic MTFv, as summarized in Algorithm 1. The second step is then a parameter sweep to find the best $K$.
In the first step, we initialize a set of $K$ random coordinates. A Voronoi tessellation is constructed to produce the Voronoi-Fresnel phase function by Eq. (2). The corresponding MTFv is computed by Eq. (5). In each iteration, we compute the centroid of each cell and update the center coordinates with the centroids. A new Voronoi tessellation is constructed, leading to a new MTFv for the updated Voronoi-Fresnel phase. The new site coordinates and the corresponding MTFv are recorded only if the current MTFv is larger than the existing maximum; otherwise they are discarded. After each iteration, we compute the root-mean-square (RMS) distance between the new sites and the previous sites as the residual error. The iterations terminate when the residual error falls below a preset tolerance tol, or after a maximum number of iterations maxiter is reached. To make the algorithm fabrication-aware, we set tol to a hundredth (0.01) of the smallest feature size that can be fabricated.
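The centroid-update loop above can be sketched as a plain Lloyd relaxation on a pixel grid. This is an illustrative simplification: the MTFv scoring and best-tessellation bookkeeping of Algorithm 1 are omitted, and all function and parameter names are assumptions of this sketch:

```python
import numpy as np

def lloyd_step(sites, H, W):
    """One Lloyd iteration on an H x W pixel grid: label every pixel
    with its nearest site, then move each site to the centroid of
    its Voronoi cell."""
    ys, xs = np.mgrid[0:H, 0:W]
    pix = np.stack([xs.ravel(), ys.ravel()], axis=1).astype(np.float64)
    # brute-force squared distance from every pixel to every site
    d2 = ((pix[:, None, :] - sites[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)          # nearest-site label per pixel
    new_sites = sites.copy()
    for i in range(len(sites)):
        cell = pix[labels == i]
        if len(cell):                   # guard against empty cells
            new_sites[i] = cell.mean(axis=0)
    return new_sites

def quasi_cvt(K, H, W, tol=0.01, maxiter=100, seed=0):
    """Relax K random sites toward a centroidal layout, stopping when
    the RMS site displacement falls below tol or after maxiter steps."""
    rng = np.random.default_rng(seed)
    sites = rng.uniform([0.0, 0.0], [W, H], size=(K, 2))
    for _ in range(maxiter):
        new_sites = lloyd_step(sites, H, W)
        rms = np.sqrt(((new_sites - sites) ** 2).sum(axis=1).mean())
        sites = new_sites
        if rms < tol:                   # fabrication-aware tolerance
            break
    return sites
```

In the full algorithm, each relaxed tessellation would be scored by the MTFv of its Voronoi-Fresnel phase, and only configurations that improve the running maximum would be kept.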
An exemplary quasi-CVT optimization is shown in Fig. 3(a) followed by a parameter sweep step in Fig. 3(b). This simulation uses 240 $\times$ 160 sensor pixels with a pixel pitch of 3.45 $\mu$m. The distance is 2 mm from the optical plane to the sensor plane. The quasi-CVT optimization is initialized by a random set of points as the center locations of the base Fresnel phase (green dots). The green edges mark the individual apertures. As the optimization evolves, the Voronoi-Fresnel cells tend to scatter uniformly in the design space, while a certain degree of irregularity of the apertures is maintained. When the optimization converges, an optimal phase profile is achieved. See Visualization 1 for an animation of the optimization process. To account for the random initialization effects, for each $K$ we run the quasi-CVT steps 100 times with different initializations, and take the mean value as the resulting MTFv. We take the standard deviation among the 100 runs as the uncertainty (Fig. 3(b)). After repeating optimization for different $K$ values in the same manner, the mean MTFv can be plotted against $K$, and a polynomial curve is fitted to the data. The maximum MTFv is found to occur when $K = 23$ in this example. The corresponding Voronoi-Fresnel phase is selected as the final result.
A notable difference between our quasi-CVT algorithm and conventional CVT is that a certain degree of irregularity should be preserved in quasi-CVT, while conventional CVT tends to converge to a regular hexagonal grid. Theoretically, the optimal CVT in a 2D plane results in a hexagonal pattern [28]. To demonstrate that the optimization is non-trivial, we compare the optimized result from the above algorithm against a regular rectangular and a hexagonal Voronoi-Fresnel phase in Fig. 4. The design area is 256 $\times$ 256 pixels, and the number of cells is 64 in all three cases. The two regular grids are equivalent to common off-the-shelf microlens arrays. The MTFv on the regular rectangular grid (MTFv = 1.14) or hexagonal grid (MTFv = 1.19) is significantly lower than that of the optimized result (MTFv = 2.29). Periodic patterns exist in both regular grids. This is attributed to the regular grids' lack of diversity in directional filtering. Therefore, the optimization is non-trivial, and simple, regular off-the-shelf microlens arrays are not well-suited for this purpose.
The parameter sweep in the second step of the algorithm is time-consuming, especially when the design space is large. We can, however, extrapolate the best number of sites at full scale from smaller scales. We evaluate the optimal $K$ at various common scales for aspect ratios of 1:1, 4:3, 3:2, 16:9, and 21:9, and plot the MTFv against the design area in Fig. 5. See the detailed analysis in Supplemental Document 1. The result indicates that our algorithm is linearly scalable with respect to the design area, regardless of the aspect ratio.
2.5 Image reconstruction
We formulate the image formation process as a convolution between the ground-truth image and the PSF. It can be expressed in the matrix-vector product form,
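In matrix-vector form, consistent with the variable definitions that follow, the model reads

```latex
\mathbf{y} = \mathbf{A} \mathbf{x} + \mathbf{n},
```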
where ${\mathbf {x}} \in \mathbb {R}^{3N}$ is the ground-truth image with $N$ pixels in three color channels, ${\mathbf {A}} \in \mathbb {R}^{3N \times 3N}$ is a matrix that represents the color PSF, and ${\mathbf {y}} \in \mathbb {R}^{3N}$ is the captured raw data. The data is degraded by additive noise ${\mathbf {n}}$. There are various methods to recover the image based on the above image formation model. A straightforward way is to solve an inverse problem with effective image priors in an optimization framework. Recent data-driven image reconstruction methods make use of end-to-end deep neural networks to infer the desired image. They usually require a large dataset captured with specific designs, such as DiffuserCam [29,30] or PhlatCam [31], and do not generalize from hardware to hardware. In this paper, we focus on the advantages of the optical optimization, so we adopt an image deconvolution method with Total Variation (TV) regularization to demonstrate and compare the optical advantages against prior works. The TV regularizer encourages the sparsity of edges in natural images, which has proved effective in lensless image reconstruction [11,17]. Specifically, we solve
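A standard TV-regularized deconvolution objective (the regularization weight $\tau$ is an assumed symbol in this reconstruction) is

```latex
\hat{\mathbf{x}} = \arg\min_{\mathbf{x}} \; \frac{1}{2} \left\| \mathbf{A} \mathbf{x} - \mathbf{y} \right\|_{2}^{2} + \tau \left\| \nabla \mathbf{x} \right\|_{1},
```

which is typically solved with proximal splitting methods such as ADMM or FISTA.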
3. Results
3.1 Simulation results
We first validate our methods in simulation, and compare their performance with recent phase-type lensless cameras that share the most similar inspirations, DiffuserCam [17] and PhlatCam [18]. To ensure a fair comparison, all simulations are performed under the same conditions. We assume the sensor pixel pitch is 4 $\mu$m, and the resolution of the phase element is 1 $\mu$m, which provides a good sampling rate between the two. The sensor has 256 $\times$ 256 pixels, and the Voronoi-Fresnel phase is 1024 $\times$ 1024. The distance between the optical element and the sensor is 2 mm. The PSFs for the Voronoi-Fresnel design and the PhlatCam design are obtained through a full-spectrum simulation, from 400 nm to 700 nm at an interval of 10 nm (31 spectral bands in total). We are not able to simulate a diffuser PSF, so we adopt the closest PSF from DiffuserCam, and assume the intensities are the same for the three color channels. To take the spectral variations into account, we use multispectral image data [33,34] for the full-spectrum simulation.
Example raw data with the respective PSFs are shown in Fig. 6(a). The raw data from our lensless camera are more structured than the flattened patterns of the other two cases. For a complete comparison, we also include the regular Fresnel array configurations on both the rectangular grid and the hexagonal grid shown in Fig. 4. The corresponding reconstructed images are shown in Fig. 6(b). We evaluate the performance by Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM) in Fig. 7. The results show that our Voronoi-Fresnel design outperforms existing methods in both PSNR and SSIM, owing to the MTFv-guided optimization. The optimized Voronoi-Fresnel configuration also performs better than the two reference regular Fresnel configurations.
3.2 Experimental results
3.2.1 Prototype design
A prototype Voronoi-Fresnel lensless camera is built with a board-level camera (FLIR BFS-GE-16S2C-BD2) and a custom-fabricated phase element. The image sensor (Sony IMX273) has 1440 $\times$ 1080 pixels, with a pixel pitch of 3.45 $\mu$m. The optical distance in-between is 3 mm, where the cover glass on the sensor has a thickness of 0.7 mm. The sensor’s angular response covers approximately $\pm \mathrm {20}^{\circ } \times \pm \mathrm {15}^{\circ }$ FOV, so the shift-invariance of the PSF should be maintained within this angular range. Marginal cells beyond the FOV are excluded for this purpose. The optimized phase profile is shown in Fig. 8(a). The excluded cells are empty of the Fresnel phase. A calibration PSF is shown in Fig. 8(b) with a zoom-in area highlighting three adjacent spots. The optical element is fabricated on a 0.5-mm-thick fused silica substrate by a combination of photolithography and reactive-ion etching techniques. At a nominal wavelength of 550 nm, the $2\pi$ phase modulation corresponds to a total depth of 1200 nm, which is discretized into 16 levels for fabrication. The lateral fabrication resolution is 1.15 $\mu$m, so each sensor pixel is sampled by 3 $\times$ 3 optical pixels. The optical element is assembled onto the sensor by 3D-printed mechanical mounts (Fig. 8(c)). An optical microscope image (Nikon Eclipse L200N, 5$\times$) and a 3D measurement (Zygo NewView 7300, 20$\times$) of the fabricated sample are shown in Fig. 8(d). Details of the prototype design and fabrication can be found in the Supplemental Document 1.
3.2.2 Prototype results
In the following, we present the raw data at full sensor resolution, and the reconstructed images are cropped to 640 $\times$ 480 pixels to cover the $\pm \mathrm {20}^{\circ } \times \pm \mathrm {15}^{\circ }$ FOV. The object distance is about 30 cm from the lensless camera. The exposure time is set so that no pixels saturate ($\sim$240/255 in 8-bit mode). Gamma correction is disabled during capture, and raw data are used in the reconstruction.
We evaluate the imaging performance in two illumination scenarios. The first is self-illuminating images displayed on a computer monitor. Self-illuminating objects emit light in a confined angular range, so little stray light or cross-talk is introduced into the captured data. Fluorescence imaging would be a good application for this mode. Figures 8(e) and 8(f) show the captured raw data. The reconstructed image in Fig. 8(i) reveals fine details in the camera and the human face. A zoom-in of the camera object is shown in Fig. 8(m). We evaluate the spatial resolution with a USAF resolution target in Fig. 8(j), with two cross lines (yellow and magenta) plotted in Fig. 8(n) for the RGB color channels. Line pairs in the central area are clearly visible.
The second scenario is real objects under ambient illumination (Fig. 8(g) and 8(h)). This is a more realistic setting for photography outside of a lab environment or in applications like endoscopy. Ambient illumination poses a severe challenge for existing flat or lensless imaging systems, since “stray” light can enter the camera at angles outside the nominal field of view, which is then not modeled by the reconstruction algorithm. We show the reconstructed car-toy image in Fig. 8(k) and a printed USAF resolution chart in Fig. 8(l). Since the ambient illumination is not uniform across the scene, the intensity fall-off from on-axis to off-axis (i.e., vignetting) is more obvious than in the self-illuminating case. The recovered image is still able to reveal sufficient details despite the complicated environmental light conditions. Similarly, we evaluate the spatial resolution with the printed USAF resolution target. The cross lines in Fig. 8(p) indicate that the line pairs can be discriminated very well in the green and blue channels, while the red channel becomes less reliable for the high-frequency line pairs. The differences between color channels also indicate that residual chromatic aberrations exist in the reconstructed images. We present a more thorough discussion of the prototype, including characteristic measurements, color reproduction, resolution analysis, and additional results, in Supplemental Document 1.
4. Discussion
4.1 Advantages over existing designs
The proposed Voronoi-Fresnel lensless camera differentiates itself from existing lensless technologies that focus on either biomimetic optics or PSF engineering. Our design not only makes use of the compound-eye structure optically, but also facilitates the subsequent algorithm computationally. The constituent cells in our design do not work individually as in compound eyes, but form an optimized PSF collectively. Optical cross-talk between adjacent cells is not a concern, so no blocking layer is necessary. The reconstruction algorithm does not stitch sub-images, but solves an inverse problem from the physical model. Most prominently, unlike regular tessellations, our method features an optimized irregular structure. The number of cells is determined by the imaging performance, not by the final image pixel count.
Our Voronoi-Fresnel phase is mostly inspired by phase-only designs for PSF engineering, but pushes the optimality of the PSF structure further. The resulting PSF not only possesses the properties assumed in existing works, but also exhibits some unique features, such as more compact directional filtering, a non-trivial random yet uniform distribution, and an optimal number of diffraction-limited spots. As we have demonstrated, the distribution of the focusing units matters significantly. A search over the number of units is also necessary to find the optimal solution. A related optimization metric is the auto-correlation of the PSF, as used in Miniscope3D [19]. Since the auto-correlation is not a single number, it cannot be used directly as an optimization metric; a diffraction-limited MTF is needed as a reference instead. We show a simple example in Supplemental Document 1 that, without a reference, this metric can lead to ambiguity. In contrast, our MTFv concept distills the spatial and spectral information of the PSF into a single figure-of-merit that can be used for numerical optimization. MTFv requires no reference value, and evaluates the amount of useful information by itself. Different from the custom MLAs in the computational miniature mesoscope [22,23] and the learned 3D lensless camera [24], the number and geometries of our Voronoi-Fresnel cells are determined automatically by the algorithm.
The proposed design brings additional benefits in various aspects. Since our design is a tessellation of the base Fresnel phase, it facilitates large-area design at high spatial resolution, which alleviates the sampling load of pixel-wise phase element optimization on megapixel sensors. The resulting smooth microstructures also ease fabrication requirements compared with the high-frequency random features in prior designs. The Voronoi-Fresnel design is thus an in-between strategy that takes advantage of both compound-eye and PSF-engineering methods.
4.2 Optical characteristics
Compared to its lens counterparts, the performance of Voronoi-Fresnel lensless imaging can also be characterized and analyzed in terms of effective focal length, spatial resolution, FOV, and so on. Since the Voronoi-Fresnel phase is a collection of the same base Fresnel phase, the effective focal length is the focal length of the base Fresnel phase. The FOV is determined by three factors, as shown in Fig. 9(a). First, the image sensor usually has a cut-off angle $\alpha$, beyond which light is not detected. Each Voronoi-Fresnel cell is limited by this cut-off angle. The outermost object that the central cell $V_{0}$ can see is $O_{0}$ in the object space. Second, the marginal cell has a lateral center displacement $h_{n}$ from the optical axis. This corresponds to an angular displacement of $\arctan {\left ( h_{n} / z_{o} \right ) }$. The equivalent half FOV is then
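Combining the two angular contributions above (a plausible reconstruction only; the original expression may include further terms for the remaining factor),

```latex
\theta_{\mathrm{half}} = \alpha + \arctan\!\left( \frac{h_{n}}{z_{o}} \right).
```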
The resolution of lensless cameras is usually object-dependent. Since the base Fresnel phase is a first-order approximation of an ideal lens, the individual cell is closely diffraction-limited. There exists a theoretical limiting resolution. However, variations in the aperture shapes do exist between different Voronoi-Fresnel cells. We define an effective diameter $\bar {d}$ for the base Fresnel phase by statistically calculating the Root-Mean-Square (RMS) distance of all the vertices to their respective center locations. Assuming the $i$-th cell has $M_{i}$ vertices, and there are $N$ cells in total, the RMS diameter is
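With $\mathbf{v}_{ij}$ the $j$-th vertex of the $i$-th cell and $\mathbf{c}_{i}$ its center location (notation assumed; the factor of 2 converting an RMS radius to a diameter is likewise an assumption of this reconstruction), the RMS diameter can be written as

```latex
\bar{d} = 2 \sqrt{ \frac{1}{\sum_{i=1}^{N} M_{i}} \sum_{i=1}^{N} \sum_{j=1}^{M_{i}} \left\| \mathbf{v}_{ij} - \mathbf{c}_{i} \right\|^{2} }.
```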
4.3 Depth-dependent PSFs
Similar to the PSFs of DiffuserCam [17] and PhlatCam [18], our PSF expands laterally as the object moves closer to the sensor, while remaining nearly constant when the object is far away. To demonstrate the PSF expansion, we simulate the depth-varying PSFs for the optimized example in Fig. 4(c). The results are shown in Fig. 10. We label the centroids of the spots in the PSF at infinity in red as a reference. The centroids of the PSFs at the evaluated depths are shown in green, and the corresponding displacements are denoted with blue lines. The spot locations displace much more for the small object distance of $d = 10$ mm than for the larger distances of 100 mm and 1000 mm, with respect to the infinity PSFs. The lateral displacements of the spots make the PSFs at different depths distinct and less correlated with each other.
Since the PSFs are depth-dependent, our Voronoi-Fresnel lensless camera is also capable of imaging 3D scenes. To demonstrate this, Fig. 11 shows an example of refocusing at various depths with the prototype. We place three objects in front of the lensless camera at 50 mm, 100 mm, and 200 mm, respectively. From a single shot, we can reconstruct images focused at the different depths. For example, Fig. 11(a) is focused on the letters “KAUST” at 50 mm, and the rest of the scene is severely blurred. In Fig. 11(b) the logo in the middle (100 mm) is in focus; the front letters are completely unrecognizable, while the drawing in the back is slightly blurred. Similarly, when the drawing in Fig. 11(c) is sharp, the other two objects are out of focus.
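Refocusing of this kind amounts to deconvolving the single measurement with the PSF calibrated at each candidate depth; objects at that depth come out sharp while the others remain blurred. The paper's reconstructions use an ADMM solver [32]; the Wiener-filter sketch below is only an illustration of the per-depth deconvolution idea, not the authors' implementation:

```python
import numpy as np

def wiener_refocus(measurement, psf, snr=100.0):
    """Deconvolve one lensless measurement with a depth-specific PSF
    via a Wiener filter: F = H* / (|H|^2 + 1/SNR).
    measurement, psf: 2D arrays of the same shape (psf centered)."""
    H = np.fft.rfft2(np.fft.ifftshift(psf), s=measurement.shape)
    G = np.fft.rfft2(measurement)
    F = np.conj(H) / (np.abs(H) ** 2 + 1.0 / snr)
    return np.fft.irfft2(F * G, s=measurement.shape)

# Usage: with a centered delta PSF the filter returns the measurement
# itself, attenuated only by the regularization factor 1/(1 + 1/snr).
meas = np.random.default_rng(0).random((16, 16))
delta = np.zeros((16, 16))
delta[8, 8] = 1.0
rec = wiener_refocus(meas, delta)
```

Running this once per calibrated depth, and keeping the sharpest regions from each result, yields the kind of focal stack shown in Fig. 11.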
We also note that in 2D photography applications the PSFs are nearly constant over depth. To encode depth, not only the geometric displacement of the PSF spots but also their spectral variations could be exploited; the above example only shows the geometric effect. It is also possible to optimize the base phase in each cell to make the PSFs more suitable for 3D applications, which requires depth-aware optimization of the phase jointly with the reconstruction scheme. Such end-to-end optimization in a deep learning framework will be explored in future work.
4.4 Limitations and future work
The proposed Voronoi-Fresnel lensless camera also exhibits a few limitations. First, the two-step optimization algorithm is time-consuming due to the parameter sweep step. Although the scalability property helps estimate the correct range, more efficient algorithms are still needed to accelerate the process for large-area designs. Second, the FOV is currently limited by the cut-off angle of the sensor, which is mainly due to the microlens array embedded in the image sensor. The intensity fall-off may be a concern for wide-angle applications. It may be possible to design telecentric Voronoi-Fresnel phases to mitigate this, although fabrication complexity would increase. Another limitation of the current prototype is stray light. A common practice in conventional lenses is to control stray light with baffles; however, it is difficult to implement such an idea without sacrificing compactness. In apposition-type compound eyes, evolution has developed a screening pigment around each ommatidium that blocks stray light and cross-talk from entering the photoabsorbing rhabdom. Similar structures could be fabricated by etching deep trenches around the boundaries of the Voronoi-Fresnel cells and filling them with black materials. Finally, an end-to-end framework that optimizes the optical element together with the reconstruction network is still lacking, and is worth exploring in future work.
5. Conclusion
We have demonstrated a compound-eye-inspired lensless camera with optimal information preservation in optics. Guided by the proposed Fourier domain metric MTFv, we are able to tailor a spatially-coded Voronoi-Fresnel phase for better computational image reconstruction. Experimental results show the superior image quality of the prototype lensless camera in various illumination conditions. The advantages of the proposed Voronoi-Fresnel lensless camera offer a simple yet cost-effective imaging solution that significantly reduces the volume of imaging devices. The possibility of mass production makes it a promising candidate for applications such as fluorescence imaging, endoscopy, and the internet of things.
Funding
King Abdullah University of Science and Technology (Individual Baseline Funding); National Natural Science Foundation of China (62172415); National Key Research and Development Program of China (2019YFB2204104).
Acknowledgments
This work was partly done in the Nanofabrication CoreLabs (NCL) at KAUST.
Disclosures
The authors declare no conflicts of interest.
Data availability
Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.
Supplemental document
See Supplement 1 for supporting content.
References
1. S. Wu, T. Jiang, G. Zhang, B. Schoenemann, F. Neri, M. Zhu, C. Bu, J. Han, and K. D. Kuhnert, “Artificial compound eye: a survey of the state-of-the-art,” Artif. Intell. Rev. 48(4), 573–603 (2017). [CrossRef]
2. V. Boominathan, J. T. Robinson, L. Waller, and A. Veeraraghavan, “Recent advances in lensless imaging,” Optica 9(1), 1–16 (2022). [CrossRef]
3. J. Tanida, T. Kumagai, K. Yamada, S. Miyatake, K. Ishida, T. Morimoto, N. Kondou, D. Miyazaki, and Y. Ichioka, “Thin observation module by bound optics (TOMBO): concept and experimental verification,” Appl. Opt. 40(11), 1806–1813 (2001). [CrossRef]
4. R. Horisaki, K. Kagawa, Y. Nakao, T. Toyoda, Y. Masaki, and J. Tanida, “Irregular lens arrangement design to improve imaging performance of compound-eye imaging systems,” Appl. Phys. Express 3(2), 022501 (2010). [CrossRef]
5. K. Stollberg, A. Brückner, J. Duparré, P. Dannberg, A. Bräuer, and A. Tünnermann, “The Gabor superlens as an alternative wafer-level camera approach inspired by superposition compound eyes of nocturnal insects,” Opt. Express 17(18), 15747–15759 (2009). [CrossRef]
6. A. Brückner, J. Duparré, R. Leitel, P. Dannberg, A. Bräuer, and A. Tünnermann, “Thin wafer-level camera lenses inspired by insect compound eyes,” Opt. Express 18(24), 24379–24394 (2010). [CrossRef]
7. J. Meyer, A. Brückner, R. Leitel, P. Dannberg, A. Bräuer, and A. Tünnermann, “Optical cluster eye fabricated on wafer-level,” Opt. Express 19(18), 17506–17519 (2011). [CrossRef]
8. L. C. Kogos, Y. Li, J. Liu, Y. Li, L. Tian, and R. Paiella, “Plasmonic ommatidia for lensless compound-eye vision,” Nat. Commun. 11(1), 1637 (2020). [CrossRef]
9. Y. M. Song, Y. Xie, V. Malyarchuk, J. Xiao, I. Jung, K. J. Choi, Z. Liu, H. Park, C. Lu, R. H. Kim, R. Li, K. Crozier, Y. Huang, and J. Rogers, “Digital cameras with designs inspired by the arthropod eye,” Nature 497(7447), 95–99 (2013). [CrossRef]
10. D. Floreano, R. Pericet-Camara, S. Viollet, F. Ruffier, A. Brückner, R. Leitel, W. Buss, M. Menouni, F. Expert, R. Juston, M. K. Dobrzynski, G. L’Eplattenier, F. Recktenwald, H. A. Mallot, and N. Franceschini, “Miniature curved artificial compound eyes,” Proc. Natl. Acad. Sci. U.S.A. 110(23), 9267–9272 (2013). [CrossRef]
11. V. Boominathan, J. K. Adams, M. S. Asif, B. W. Avants, J. T. Robinson, R. G. Baraniuk, A. C. Sankaranarayanan, and A. Veeraraghavan, “Lensless imaging: a computational renaissance,” IEEE Signal Process. Mag. 33(5), 23–35 (2016). [CrossRef]
12. M. J. DeWeert and B. P. Farm, “Lensless coded-aperture imaging with separable Doubly-Toeplitz masks,” Opt. Eng. 54(2), 023102 (2015). [CrossRef]
13. M. S. Asif, A. Ayremlou, A. Sankaranarayanan, A. Veeraraghavan, and R. G. Baraniuk, “FlatCam: thin, lensless cameras using coded aperture and computation,” IEEE Trans. Comput. Imaging 3(3), 384–397 (2016). [CrossRef]
14. J. Wu, H. Zhang, W. Zhang, G. Jin, L. Cao, and G. Barbastathis, “Single-shot lensless imaging with Fresnel zone aperture and incoherent illumination,” Light: Sci. Appl. 9(1), 53 (2020). [CrossRef]
15. P. R. Gill, “Odd-symmetry phase gratings produce optical nulls uniquely insensitive to wavelength and depth,” Opt. Lett. 38(12), 2074–2076 (2013). [CrossRef]
16. D. G. Stork and P. R. Gill, “Lensless ultra-miniature CMOS computational imagers and sensors,” in Proc. SENSORCOMM, (2013), pp. 186–190.
17. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “DiffuserCam: lensless single-exposure 3D imaging,” Optica 5(1), 1–9 (2018). [CrossRef]
18. V. Boominathan, J. Adams, J. Robinson, and A. Veeraraghavan, “PhlatCam: designed phase-mask based thin lensless camera,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1618–1629 (2020). [CrossRef]
19. K. Yanny, N. Antipa, W. Liberti, S. Dehaeck, K. Monakhova, F. L. Liu, K. Shen, R. Ng, and L. Waller, “Miniscope3D: optimized single-shot miniature 3D fluorescence microscopy,” Light: Sci. Appl. 9(1), 171 (2020). [CrossRef]
20. F. L. Liu, G. Kuo, N. Antipa, K. Yanny, and L. Waller, “Fourier DiffuserScope: single-shot 3D Fourier light field microscopy with a diffuser,” Opt. Express 28(20), 28969–28986 (2020). [CrossRef]
21. G. Kuo, F. L. Liu, I. Grossrubatscher, R. Ng, and L. Waller, “On-chip fluorescence microscopy with a random microlens diffuser,” Opt. Express 28(6), 8384–8399 (2020). [CrossRef]
22. Y. Xue, I. G. Davison, D. A. Boas, and L. Tian, “Single-shot 3D wide-field fluorescence imaging with a computational miniature mesoscope,” Sci. Adv. 6(43), eabb7508 (2020). [CrossRef]
23. Y. Xue, Q. Yang, G. Hu, K. Guo, and L. Tian, “Deep-learning-augmented computational miniature mesoscope,” Optica 9(9), 1009–1021 (2022). [CrossRef]
24. F. Tian and W. Yang, “Learned lensless 3D camera,” Opt. Express 30(19), 34479–34496 (2022). [CrossRef]
25. S. Kim, J. J. Cassidy, B. Yang, R. W. Carthew, and S. Hilgenfeldt, “Hexagonal patterning of the insect compound eye: facet area variation, defects, and disorder,” Biophys. J. 111(12), 2735–2746 (2016). [CrossRef]
26. J. W. Goodman, Introduction to Fourier optics (Roberts and Company Publishers, 2005).
27. L. C. Roberts and C. R. Neyman, “Characterization of the AEOS adaptive optics system,” Publ. Astron. Soc. Pac. 114(801), 1260–1266 (2002). [CrossRef]
28. D. M. Yan, K. Wang, B. Lévy, and L. Alonso, “Computing 2D periodic centroidal Voronoi tessellation,” in 2011 Eighth International Symposium on Voronoi Diagrams in Science and Engineering, (IEEE, 2011), pp. 177–184.
29. K. Monakhova, J. Yurtsever, G. Kuo, N. Antipa, K. Yanny, and L. Waller, “Learned reconstructions for practical mask-based lensless imaging,” Opt. Express 27(20), 28075–28090 (2019). [CrossRef]
30. D. Bae, J. Jung, N. Baek, and S. A. Lee, “Lensless imaging with an end-to-end deep neural network,” in 2020 IEEE International Conference on Consumer Electronics-Asia (ICCE-Asia), (IEEE, 2020), pp. 1–5.
31. S. S. Khan, V. Sundar, V. Boominathan, A. Veeraraghavan, and K. Mitra, “FlatNet: towards photorealistic scene reconstruction from lensless measurements,” IEEE Trans. Pattern Anal. Mach. Intell. 44(4), 1934–1948 (2020). [CrossRef]
32. S. Boyd, N. Parikh, and E. Chu, Distributed optimization and statistical learning via the alternating direction method of multipliers (NOW Publishers Inc, 2011).
33. F. Yasuma, T. Mitsunaga, D. Iso, and S. K. Nayar, “Generalized assorted pixel camera: postcapture control of resolution, dynamic range, and spectrum,” IEEE Trans. Image Process. 19(9), 2241–2253 (2010). [CrossRef]
34. Y. Li, Q. Fu, and W. Heidrich, “Multispectral illumination estimation using deep unrolling network,” in Proceedings of the IEEE/CVF International Conference on Computer Vision, (2021), pp. 2672–2681.