Optica Publishing Group

Super-resolution lensless imaging system based on a fast anti-diffraction algorithm

Open Access

Abstract

Conventional lens imaging systems modulate incident rays with a set of lenses and focus these rays on their imaging planes. A lensless imaging system uses a single mask instead of lenses to project incident rays onto the imaging plane; the rays pass through or are blocked according to the binary mask pattern. Such systems are thin, lightweight, and inexpensive. However, they do not converge the rays, causing the local images corresponding to individual light transmission units to overlap heavily in a global scene, which requires a specific algorithm for decoding. Additionally, diffraction is unavoidable when the holes in the mask are extremely small, which can degrade the imaging quality. To address these difficulties, we propose a decoding algorithm, the Fourier-ADMM algorithm, to unwrap the overlapped images rapidly. In addition to providing high decoding speed, the proposed technique can suppress the diffraction from the tiny holes, owing to its conjugated structure. Based on this novel decoding algorithm, a lensless imaging system is proposed that can handle overlapped and diffracted images with a single random mask. The camera can work beyond the theoretical diffraction limit and tremendously enhance the resolution. In summary, the super-resolution lensless camera provides users with additional options to suit different situations. It facilitates robust, high-resolution, fast decoding without sophisticated calibration.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Coded aperture imaging systems (CAIS) have their origins in $x$-ray and $\gamma$-ray imaging [1–4], as these ultrashort waves cannot be refracted by glass or crystal lenses. Therefore, CAIS were developed as an alternative to the typical optical design using common optical lenses. CAIS modulate the wavefront with a coded multiaperture mask pattern. The rays are not subjected to rigorous conditions as they propagate, and thus, the images from different light transmission units overlap on the image plane. This has also been considered a form of multiplexing [5,6]. A specific design for the mask pattern and a suitable algorithm to arrange the incident light for decoding are required. Patterns and algorithms are thus the fundamental factors that identify the class of a CAIS. On this basis, CAIS can be classified into two versions.

The first version of CAIS focuses on patterns based on periodic sequences, including uniformly redundant arrays (URAs) [7–9], modified uniformly redundant arrays (MURAs) [10,11], maximal-length linear shift register sequences (MLSs) [12], and some product matrices based on the above. The patterns invented for the first version of CAIS are varied but share a fundamental law: the result of the correlation operation between the mask pattern array and its auxiliary matrix (usually the mask pattern array itself) must be an impulse response function. For convenience, we use the $\delta$ function instead in the following. This indicates that the pattern has a good theoretical signal-to-noise ratio (SNR) [13]. This straightforward principle is still a fundamental law of mask pattern design today [14,15]. However, the fabrication of binary masks relied on mechanical manufacturing, which limited the mask in terms of its size and tolerance [16]. Even in 2001, fabricating random mask patterns was a laborious task; therefore, investigations of random masks were rare at the time [12,17]. This was not the only problem associated with CAIS. Although the first version of CAIS defined a classical decoding theory, the Fourier theory and correlation-operations model [8], the image quality of the decoded picture was not satisfactory [18]. Studies did not address the inherent ill-posed problem of CAIS because of the underdeveloped computer performance at that time. Even so, the first version of CAIS provided a large number of options for the mask pattern.

The second version of CAIS is a reinvention of the first. In this version, CAIS has a more vivid name: the lensless imaging system [19]. It inherits the mask patterns created in the last century and concentrates on solving the ill-posed problem of the system. The solution to the ill-posed problem is the regularization algorithm, which can decode the overlapped images obtained from the camera [20–22]. A typical example is the total variation augmented Lagrangian alternating direction algorithm (TVAL3) [23–25]. Some studies have concluded that TVAL3 is a type of alternating direction method of multipliers (ADMM) [21]. Even though these regularization algorithms help suppress the ill-posed problem, the nonlinearity of lensless imaging systems limits the improvement of the image quality. Thus, deep learning was introduced [26,27]. However, although typical deep-learning programs are suitable for fitting nonlinear systems, they do not fully suit lensless imaging systems. Iterative regularization algorithms such as TVAL3 serve as their prior knowledge and optimizer [28]. In this way, deep-learning programs for lensless imaging systems act as supplements to, and upgrades of, the regularization algorithms. The Fresnel zone aperture (FZA) lensless camera has become quite popular recently. Several studies have focused on improving the resolution of the FZA camera with phase-shift methods [29,30].

It can be concluded from this historical introduction that improving the decoding algorithm can improve the image quality while preserving the intrinsic advantages of modern lensless imaging systems. In this paper, we propose a simple but highly efficient decoding method, the Fourier-ADMM algorithm, and develop a lensless imaging system based on it. The Fourier-ADMM algorithm can unwrap overlapped images with a single random mask and suppress the diffraction from the tiny holes in the mask. Thus, it enables a super-resolution procedure that enhances the resolution of the camera to a value significantly better than the theoretical one. The proposed lensless imaging system exhibits robustness, high resolution, and fast decoding without any sophisticated calibration.

2. Methods

2.1 Decoding algorithm of the lensless camera

Typically, in a lens system, each point on the imaging plane corresponds to a point on the object plane. Because of the imperfections of the transfer function of a realistic imaging system, a point on the imaging plane becomes a broadened disk, referred to as the Airy disk, and the map of its distributed energy is usually called the point spread function (PSF). However, lensless systems do not use a lens to focus rays; thus, their PSF is neither a point nor a disk, but the mask pattern A itself,

$${\mathbf{A}}*\delta= {\mathbf{A}} {\text{,}}$$
where * is the correlation operator and $\delta$ function stands for an ideal impulse response function. A vivid explanation for Eq. (1) is that, when a point source illuminates the mask pattern A, the image on the image plane is the pattern A itself. Of course, a meaningful object is never a point, but the object can be regarded as a point set. Eq. (1) is the basis of the convolution model for the lensless imaging system. In this way, the image of the lensless camera can be predicted using the Fourier transform. Therefore, the design of the mask pattern A is essential for a lensless imaging system, which should benefit both the data collection and decoding processes. As has been explained in the introduction, the result of the correlation operation between the mask pattern A and its auxiliary matrix G is expected to be a $\delta$ function. However, the correlation result is never a perfect $\delta$ function in reality, and we call the result the central peak. The sharpness of the peak determines the image quality when decoding and ensures that the decoding process has a unique solution,
$${\mathbf{A}}* {\mathbf{G}}=peak {\text{.}}$$

The sharper the peak, the weaker the sidelobes and background noise contained in the response function of the mask pattern in the lensless system. In other words, the mask exhibits a good SNR. This allows us to set up the convolution model of the lensless system. The image function $I$ is the result of the mask pattern A correlating with the object function $O$. The correlation style can be transformed into the Fourier style,

$$I= {\mathbf{A}}*O =\mathcal{F}^{{-}1}[\mathcal{F}( {\mathbf{A}})\mathcal{F}(O)] =\mathcal{F}^{{-}1}[\mathcal{F}(PSF)\mathcal{F}(O)] {\text{,}}$$
where $\mathcal {F}$ is the Fourier operator. In this way, a straightforward way to solve Eq. (3) is
$$O =\mathcal{F}^{{-}1}\frac{\mathcal{F}(I)\mathcal{F}( {\mathbf{A}})}{\mathcal{F}( {\mathbf{A}})\overline{\mathcal{F}( {\mathbf{A}})}} {\text{,}}$$
where $\overline {\mathcal {F}}$ is the conjugate Fourier operator. Eq. (4) is the reliable decoding process of the first-version lensless imaging system. The simple algorithm framework and the fast Fourier transform (FFT) computation module provide a fast decoding speed. However, although reasonably good decoded pictures can be derived, this primitive decoding algorithm cannot suppress the background noise or extract high-frequency information. Considering that the whole lensless system is always ill-conditioned, the image quality is never satisfactory.
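As a concrete sketch of Eqs. (3) and (4), the forward model and the spectral decoding step can be written with NumPy FFTs. The toy $64\times 64$ mask and object, and the small stabilizer `eps` added to avoid division by zero, are our assumptions rather than part of the original system; `*` is implemented as correlation, following the definition below Eq. (1).

```python
import numpy as np

rng = np.random.default_rng(0)

def correlate(A, O):
    # forward model: F(A * O) = conj(F(A)) F(O) for a correlation
    return np.real(np.fft.ifft2(np.conj(np.fft.fft2(A)) * np.fft.fft2(O)))

def decode(I, A, eps=1e-8):
    # Eq. (4): multiply by F(A), divide by F(A) conj(F(A));
    # eps is an added stabilizer, not part of the original equation
    FA = np.fft.fft2(A)
    FI = np.fft.fft2(I)
    return np.real(np.fft.ifft2(FI * FA / (FA * np.conj(FA) + eps)))

A = (rng.random((64, 64)) > 0.5).astype(float)  # toy random binary mask
O = rng.random((64, 64))                        # toy object
I = correlate(A, O)                             # overlapped image
O_hat = decode(I, A)                            # noiseless recovery
```

In the noiseless case the recovery is nearly exact; with measurement noise, the division amplifies weak spectral components, which is precisely the ill-conditioned behavior discussed above.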

The second version of the lensless imaging system aims at high-resolution decoded pictures. Therefore, it must be capable of handling the ill-conditioned equations that arise when unwrapping overlapped images. Fortunately, ill-conditioned equations are not a new problem in mathematics, and the set of solutions is known as regularization methods. When applying regularization to a lensless imaging system, a mathematical transform is necessary, as regularization methods cannot work with correlations directly.

$$I= {\boldsymbol{\Phi}}\cdot O {\text{,}}$$
where ${\boldsymbol {\Phi }}$ is the transfer matrix of the optical system, expressed in product form rather than the typical convolution form. In this paper, we propose a new algorithm to quickly decode the overlapped images of the lensless imaging system. The algorithm combines the FFT and a typical global optimization method, the ADMM algorithm. Thus, we name it the Fourier-ADMM algorithm. It features a simple algorithm framework but produces decoded pictures at high resolutions. A more original presentation of the optimization function is
$$y=\frac{1}{2}|| {\boldsymbol{\Phi}} \cdot O-I||^{2}_2 {\text{.}}$$

The variables in Eq. (6) are the same as those in Eq. (5). According to the principle of ADMM, the expression (${\boldsymbol {\Phi }} \cdot O-I$) is divided into two parallel parts, ($I-x$) and (${\boldsymbol {\Phi }} \cdot O-x$), to achieve faster computation and higher accuracy. Here, $x$ is an intermediate variable; the variables $x$ and $O$ converge within the constraints of the hyperparameters $\zeta$ and $\xi$; and $\mu ^{}_1$ and $\mu ^{}_2$ are self-defined coefficients that control the convergence accuracy.

$$y=\frac{\mu^{}_1}{2}||I-x||^{2}_2+\zeta^{T}(I-x)+\frac{\mu^{}_2}{2}|| {\boldsymbol{\Phi}} \cdot O-x||^{2}_2+\xi^{T}( {\boldsymbol{\Phi}} \cdot O-x) {\text{,}}$$
where the superscript $T$ in Eq. (7) indicates the matrix transposition. Finally, to ensure that the variable $O$ somehow satisfies the nonlinearity of the lensless imaging system, an activation function sgn($\omega$) is introduced. The optimization function of the Fourier-ADMM algorithm is
$$\begin{aligned} y= & \frac{\mu^{}_1}{2}||I-x||^{2}_2+\zeta^{T}(I-x)+\frac{\mu^{}_2}{2}|| {\boldsymbol{\Phi}} \cdot O-x||^{2}_2+\xi^{T}( {\boldsymbol{\Phi}} \cdot O-x)+ \\ & \frac{\mu^{}_3}{2}||O-\omega||^{2}_2+\rho^{T}(O-\omega)+sgn(\omega) {\text{,}} \end{aligned}$$
$$sgn(\omega)=\left\{ \begin{aligned} & \infty & \omega< 0\\ & 0 & \omega\geq 0 {\text{,}}\\ \end{aligned} \right.$$
and the solutions of the three variables $x$, $O$, and $\omega$ are
$$\left\{ \begin{aligned} x & = (\mu^{}_1\textbf{1}+\mu^{}_2\textbf{1})^{{-}1}(\mu^{}_2 {\boldsymbol{\Phi}}\cdot O+\mu^{}_1\textit{I}+\zeta+\xi)\\ O & =(\mu^{}_1 {\boldsymbol{\Phi}}^{H} {\boldsymbol{\Phi}}+\mu^{}_2\textbf{1})^{{-}1}[ {\boldsymbol{\Phi}}^{H}(\mu^{}_1x-\xi)-\rho]\\ \omega & =max(\rho/\mu^{}_3+O,0) {\text{,}}\\ \end{aligned} \right.$$
where the symbol $\textbf {1}$ represents an all-ones’ matrix and the superscript $H$ indicates the conjugate transpose of the matrix. The solutions of the three hyperparameters, $\zeta$, $\xi$, and $\rho$, are
$$\left\{ \begin{aligned} \zeta & = \zeta+\mu^{}_1(I-x)\\ \xi & =\xi+\mu^{}_2( {\boldsymbol{\Phi}}\cdot O-x)\\ \rho & =\rho+\mu^{}_3(O-\omega) {\text{.}}\\ \end{aligned} \right.$$

The traditional ADMM algorithm cannot complete the transformation from matrix correlation into matrix multiplication; however, as shown above, the Fourier transform can substitute for a correlation operation. This means that a zero-padding function is no longer needed. If we treat the Fourier style as an operator, the computation processes for $\boldsymbol {\Phi }$ are quite simple.

$$\left\{ \begin{aligned} f_\Phi(x) & = {\boldsymbol{\Phi}}\cdot x=PSF*x=\mathcal{F}^{{-}1}[\mathcal{F}(PSF)\mathcal{F}(x)]\\ f_{\Phi^{H}}(x) & = {\boldsymbol{\Phi}}^{H}\cdot x=\overline{PSF}*x=\mathcal{F}^{{-}1}[\overline{\mathcal{F}(PSF)}\mathcal{F}(x)]\\ f_{\Phi^{H}\Phi}(x) & = {\boldsymbol{\Phi}}^{H} {\boldsymbol{\Phi}}\cdot x=\overline{PSF}*PSF*x=x {\text{,}}\\ \end{aligned} \right.$$
where $f_\Phi$, $f_{\Phi ^{H}}$, and $f_{\Phi ^{H}\Phi }$ are the Fourier style operators of $\boldsymbol {\Phi }$, $\boldsymbol {\Phi }^{H}$, and $\boldsymbol {\Phi }^{H}\boldsymbol {\Phi }$, respectively. The operator system in Eq. (12) represents an ideal situation and does not take diffraction into consideration. Thus, Eq. (10) can be simplified as
$$\left\{ \begin{aligned} x & = (\mu^{}_1\textbf{1}+\mu^{}_2\textbf{1})^{{-}1}(\mu^{}_2\mathcal{F}^{{-}1}[\mathcal{F}(PSF)\mathcal{F}(O)]+\mu^{}_1\textit{I}+\zeta+\xi)\\ O & =(\mu^{}_1\textbf{1}+\mu^{}_2\textbf{1})^{{-}1}[\mathcal{F}^{{-}1}[\overline{\mathcal{F}(PSF)}\mathcal{F}(\mu^{}_1x-\xi)]-\rho]\\ \omega & =max(\rho/\mu^{}_3+O,0) {\text{.}}\\ \end{aligned} \right.$$

In this equation, nearly all computational processes have been rewritten in the Fourier style, which indicates that the decoding speed can be quite fast. Another interesting fact is that, when the hyperparameters and self-defined parameters are eliminated, Eq. (13) is almost the same as Eq. (4). This is essential for the super-resolution process.
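The update loop of Eqs. (11) and (13) can be transcribed almost line by line. The sketch below rests on our own assumptions (zero initialization, iteration count, and a final nonnegativity clamp mirroring the $\omega$ update); whether the loop converges depends on tuning the $\mu$ coefficients, so it illustrates the FFT-based data flow rather than a calibrated reconstruction.

```python
import numpy as np

def fourier_admm(I, psf, n_iter=50, mu1=1e-6, mu2=1e-6, mu3=4e-5):
    # mu1..mu3 follow the initial configurations quoted in Section 3.2
    Fpsf = np.fft.fft2(psf)
    conv = lambda v: np.real(np.fft.ifft2(Fpsf * np.fft.fft2(v)))           # f_Phi
    corr = lambda v: np.real(np.fft.ifft2(np.conj(Fpsf) * np.fft.fft2(v)))  # f_Phi^H
    O = np.zeros_like(I)
    zeta, xi, rho = np.zeros_like(I), np.zeros_like(I), np.zeros_like(I)
    for _ in range(n_iter):
        # Eq. (13): x, O, and omega updates
        x = (mu2 * conv(O) + mu1 * I + zeta + xi) / (mu1 + mu2)
        O = (corr(mu1 * x - xi) - rho) / (mu1 + mu2)
        omega = np.maximum(rho / mu3 + O, 0.0)
        # Eq. (11): hyperparameter updates
        zeta = zeta + mu1 * (I - x)
        xi = xi + mu2 * (conv(O) - x)
        rho = rho + mu3 * (O - omega)
    return np.maximum(O, 0.0)  # assumed final nonnegativity clamp
```

Note that every application of $\boldsymbol{\Phi}$ and $\boldsymbol{\Phi}^{H}$ reduces to an FFT pair, which is the source of the speed advantage claimed above.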

Considering the diffraction limit, the mask pixel size of the mask pattern is usually restricted. A previous work [20] followed the rule of

$$p_m=\sqrt{1.22\lambda d} {\text{,}}$$
where the size of the Airy disk was set equal to that of the hole on the mask. This rule certainly conforms to intuition and experiment, but it is not suitable for our lensless imaging system.

Super-resolution technology makes optical imaging systems work beyond their theoretical resolution, and a main topic in this area is breaking the diffraction limit. For a lensless imaging system, the diffraction from the tiny holes in the mask is obvious and unavoidable. If we follow the Fourier-ADMM decoding algorithm, the diffraction can hardly influence the decoding process. Suppose $t$ is the transmission function of a hole on the mask; the PSF of the optical system is then

$$PSF=\mathcal{F}^{{-}1}[\mathcal{F}( {\mathbf{A}})\mathcal{F}(t)] {\text{,}}$$
where $t$, the transmission function of the light transmission unit, can be regarded as the diffraction factor. Diffraction here can hardly be regarded as simple noise because it actually acts as a part of the Fourier operator in Eq. (12). When the lensless camera is considered as a convolution model, diffraction is part of the imaging process. The decoding process according to Eq. (4) is
$$\left. \begin{aligned} & \mathcal{F}^{{-}1}\frac{\mathcal{F}(I)\mathcal{F}(PSF)}{\mathcal{F}(PSF) \overline{\mathcal{F}(PSF)}}\\ = & \mathcal{F}^{{-}1}\frac{\mathcal{F}(O)\mathcal{F}(PSF)\overline{\mathcal{F}(PSF)}}{\mathcal{F}(PSF) \overline{\mathcal{F}(PSF)}}\\ = & \mathcal{F}^{{-}1}\mathcal{F}(O)=O {\text{.}} \end{aligned} \right.$$

Although it seems reasonable to divide by $\overline {\mathcal {F}(PSF)}\mathcal {F}(PSF)$ and eliminate the diffraction directly, doing so actually causes the ill-posed problem mentioned earlier. Moreover, the result is similar to Eq. (13) if we leave out the hyperparameters and some constants. The equation is

$$\left\{ \begin{aligned} x\approx & \frac{1}{\mu^{}_1}\mathcal{F}^{{-}1}[\mathcal{F}(PSF)\mathcal{F}(O)]\\ O\approx & \frac{1}{\mu^{}_1+\mu^{}_2}\mathcal{F}^{{-}1}[\overline{\mathcal{F}(PSF)}\mathcal{F}(PSF)\mathcal{F}(O)]\\ = & {\frac{1}{\mu^{}_1+\mu^{}_2}\mathcal{F}^{{-}1}[\overline{\mathcal{F}(\mathbf{A})\mathcal{F}(t)}\mathcal{F}(\mathbf{A})\mathcal{F}(t)\mathcal{F}(O)]}\\ = & \frac{ {|\mathcal{F}(t)|^{2}}}{\mu^{}_1+\mu^{}_2}\mathcal{F}^{{-}1}[\mathcal{F}(O)] {\text{,}} \end{aligned} \right.$$
where $|\mathcal {F}(t)|^{2}$ is a constant array that is exactly the diffraction pattern. In contrast to Eq. (16), the result is not divided by $\overline {\mathcal {F}(PSF)}\mathcal {F}(PSF)$. Instead, it depends on the hyperparameters and gradually approaches the true value over several iterations, which solves the ill-posed problem. The additional coefficient, $|\mathcal {F}(t)|^{2}$, is a fixed, constant factor and can be removed through optimization after considerable iterations. Again, this proves that diffraction here acts as a constant coefficient rather than as simple noise. Thus, when TV regularization is applied in the case of severe diffraction, it becomes ambiguous because it handles the problem in an unsuitable way. Additionally, in a severe diffraction situation where the operator $\mathcal {F}(t)$ cannot be ignored, the diffraction factor in the 1-norm term $||\cdot {}||^{}_1$ cannot be transferred to a constant, as no conjugate term is available to couple with it. Eqs. (15) to (17) show how diffraction affects the optical system and why the Fourier-ADMM algorithm can lessen the diffraction effect. The mathematical conjugacy of the equations transfers the diffraction factor $\mathcal {F}(t)$ into a constant, rather than keeping it as a partial Fourier operator, in every iteration step. This principle enables the camera to operate with superhigh resolution.
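The key step, the diffraction factor collapsing into the real constant array $|\mathcal{F}(t)|^{2}$ through conjugate pairing, can be checked numerically. The toy mask and the small square transmission kernel below are assumed stand-ins for $\mathbf{A}$ and $t$.

```python
import numpy as np

rng = np.random.default_rng(1)
A = (rng.random((32, 32)) > 0.5).astype(float)  # toy random binary mask
t = np.zeros((32, 32))
t[:3, :3] = 1.0                                 # assumed hole transmission kernel

FA, Ft = np.fft.fft2(A), np.fft.fft2(t)
Fpsf = FA * Ft                                  # Eq. (15) in the Fourier domain
pairing = Fpsf * np.conj(Fpsf)                  # conjugate pairing from Eq. (17)

# the phase of F(t) cancels: the pairing equals the real, nonnegative
# array |F(A)|^2 |F(t)|^2, so diffraction survives only as a constant
assert np.allclose(pairing.imag, 0.0)
assert np.allclose(pairing.real, (np.abs(FA) * np.abs(Ft)) ** 2)
```

By contrast, a 1-norm term sees $\mathcal{F}(t)$ unpaired, which is why its phase cannot be absorbed into a constant there.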

The Fourier-ADMM algorithm has a simple framework and avoids complex intermediate computation processes. The whole algorithm is based on FFT, without any matrix padding, matrix diagonalization, or matrix vectorization. This enables the algorithm to run at a fast speed with a higher degree of regularity and convergence. Its conjugated structure can suppress the diffraction activated by tiny holes.

2.2 Design of the lensless camera

The lensless imaging system is an achievement of minimalism in engineering, containing only two elements: a detector chip and a mask. It projects incident rays onto the imaging plane, and the rays pass through or are blocked according to the binary mask pattern. The images from different light transmission holes overlap on the image plane, requiring a specific algorithm, such as the Fourier-ADMM algorithm described in Section 2.1, to decode them. The principles of the lensless camera are presented in Fig. 1.

Fig. 1. Lensless imaging system

The simple structure of the lensless camera does not imply an easy design. Although the Fourier-ADMM algorithm is a fundamental solution to the decoding problem, in realistic lensless imaging systems the overlapping images cause another problem, namely, mask pattern degeneration. Typically, each hole on the mask corresponds to an image that occupies a certain area. These images are so close that they overlap each other. This causes the mask pattern to be no longer as clear as designed, but inevitably blurred. A straightforward analogy is that the PSF of a lensed system with optical aberrations is blurred; the same is true for a lensless imaging system.

Before we start, it is better to clarify the definition of the field of view (FoV) for lensless imaging systems.

$$FoV=\tan^{{-}1}(\frac{l+\omega_m}{2d}) {,}$$
where $l$ is the object height, $\omega _m$ is the mask size, and $d$ is the mask-detector distance. Eq. (18) defines the maximum sensitive region through the field angle of the lensless system. It is typical but not sufficient for the simulation of mask pattern degeneration. In this paper, we follow the physical model presented in Fig. 1 and express the mask pattern degeneration as
$$A_{degrade}=\sum_{x}^{}\sum_{y}^{}\mathrm{circshift}(A,\Delta x,\Delta y) {,}$$
$$\Delta x=\Delta y=\frac{ld}{2Dp_m} {,}$$
where $\mathrm{circshift}$ is a cyclic shift function; $\Delta x$ and $\Delta y$ are the shifts along the $x$- and $y$-axes, respectively; $l$ and $d$ have the same meanings as in Fig. 1; $D$ is the objective distance; and $p_m$ is the pixel size of the mask. The mask pattern degeneration $A_{degrade}$ can be verified quantitatively through simulations and experiments.
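Eq. (19) can be simulated directly with cyclic shifts. A symmetric shift range and an averaging normalization are our reading of the summation; both are assumptions made for illustration.

```python
import numpy as np

def degrade_mask(A, dx, dy):
    # Eq. (19): superpose cyclically shifted copies of the mask pattern;
    # the symmetric range and the 1/N normalization are assumed here
    shifted = [np.roll(np.roll(A, sy, axis=0), sx, axis=1)
               for sx in range(-dx, dx + 1)
               for sy in range(-dy, dy + 1)]
    return sum(shifted) / len(shifted)

rng = np.random.default_rng(3)
A = (rng.random((32, 32)) > 0.5).astype(float)  # toy random binary mask
A_deg = degrade_mask(A, 3, 3)                   # the shift case of Fig. 2(b)
```

Comparing the autocorrelation peaks of `A` and `A_deg` reproduces the qualitative behavior of Fig. 2: the degraded pattern's peak broadens, while a random mask's peak stays comparatively sharp.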

Because of mask pattern degeneration, some mask patterns are no longer fit for realistic situations, especially those that follow a periodic sequence such as URA, MURA, or MLS. Previous papers on mask design focused on the theoretical SNR, and the only reference for judgment was Eq. (2). In other words, if the result of the auto-correlation operation for a mask pattern is a sharpened peak, the mask pattern is “good”. Although periodic sequences have an ideal $\delta$ function in theory, their periodic impulse components are submerged in background noise triggered by PSF shifts. Thus, mask pattern degeneration finally degrades the $\delta$ function. For example, the mask pattern of MLS degenerates visibly. Fig. 2 shows the comparison of mask degeneration between MLS and a random mask. Subgraphs (a) and (b) are the results of simulations based on a $255\times 255$ MLS matrix, and the highlighted line is a transverse contour of the degenerated $\delta$ function. Fig. 2(a) shows an ideal situation where the peak is sharpened, corresponding to an excellent image quality; subgraph (b) presents a mask degeneration situation with PSF blurring, which satisfies Eq. (20) with $\Delta x=\Delta y=3$. In Fig. 2(b), both sidelobes and background noise are visible. For the random mask, even if the mask degeneration increases significantly, the peak is still sharp. Subgraphs (c) and (d) present the results of simulations based on a $219\times 219$ random mask. The random mask maintains the sharpened peak and good image quality in both the ideal and realistic situations.

Fig. 2. Comparison of mask degeneration for MLS and random mask

Thus, although the matrix size of the random mask is smaller than that of the MLS, the random mask still performs better when mask pattern degeneration is considered. The random mask relies on no periodic sequence; thus, sequence warping and recombination have almost no effect on the result.

3. Experiment

3.1 Lensless camera and experimental result

The lensless camera was developed with a 1/1.8" Sony IMX334 CMOS and a random mask. The detector comprised $3840\times 2160$ pixels with a pixel size of $2.0\times 2.0$ $\mu$m. The thickness of the cover glass was 0.33 mm and its refractive index was 1.525. The cover glass–CMOS distance was 0.93 mm. The random mask was created through lithography and had $270\times 270$ pixels with a pixel size of $8.0\times 8.0$ $\mu$m. The thickness of the glass mask was approximately 2.4 mm. More details are presented in Table 1.

Table 1. Parameters of the lensless camera

Additionally, the lensless camera was thin, small, and light. The volume of the camera was only $32.5\times 32.5\times 8.1$ $\mathrm {mm}^{3}$ and its weight was 32 g. Fig. 3 shows the lensless camera. The decoded pictures that the camera captured are presented below. Fig. 4(a) shows the experimental scene; subgraphs (b) to (i) are targets of different colors.

Fig. 3. Lensless camera

Fig. 4. Experimental results of the lensless camera. Subgraph (a) is the experimental scene. Subgraphs from (b) to (i) are targets of different colors.

3.2 Comparison between the Fourier-ADMM algorithm and other iterative regularization algorithms

In Section 2, we introduced the Fourier-ADMM algorithm and explained how it improves the computation speed and suppresses diffraction using its simple but conjugated framework. In this section, we prove its superiority in terms of resolution, computation speed, and image quality by comparing the Fourier-ADMM algorithm with other typical iterative regularization algorithms. The mask and camera used in this section are the same as those mentioned previously.

In Fig. 5, we compare the resolutions of the Fourier-ADMM algorithm, the TVAL3 algorithm [21,23–25], and the gradient descent (GD) method [31] at an objective distance of 320 mm. The Fourier-ADMM algorithm’s resolution is better than 10.9 mrad and its convergence time is 37.0 s when operating on an 11th Gen Intel Core i9-11900K CPU. The resolution of TVAL3 is no better than 21.9 mrad because its nonconjugated framework cannot eliminate the diffraction factor of the system. Although the GD method, known for its accuracy, exhibits the same resolution as the Fourier-ADMM algorithm, its computation time is more than twelve times longer.

Fig. 5. Comparison of the resolution, computation speed, and image quality of Fourier-ADMM algorithm, TVAL3, and the gradient-descent (GD) method with the mask pixel size of 8 $\mathrm {\mu } m$.

Some frequently used image-quality evaluation functions, such as the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM), are not suitable here. PSNR and SSIM are two indexes used to evaluate image noise. In a simulation they may be useful, because the target and decoded pictures come from the same source and the only requirement is to compare their noise. However, PSNR and SSIM do not work well when shooting pictures of a real object. The target and decoded pictures are entirely different, and it is wiser to compare their similar features instead. Thus, we introduce the scale-invariant feature transform (SIFT) [32] to evaluate the decoded image quality.

SIFT is an algorithm that can match an object or scene between two different pictures. It has two indexes: matched points and the SIFT score. Matched points are the common features in the two pictures distinguished by the algorithm, and the SIFT score offers a quantization reference. Typically, two graphs are considered well-matched, or “similar”, when more than ten matched points are picked up and the SIFT score is greater than 1%. In this paper, we use the lensless camera to capture a target image on a screen at a distance of 320 mm. Then, the SIFT algorithm compares the decoded picture with the target to measure the ability of the algorithm to restore the image.

With the SIFT algorithm, we can compare the abilities of different decoding algorithms to recover overlapped pictures. The Fourier-ADMM algorithm ranks first in terms of both matched points and SIFT scores. For some complex images, such as faces, the Fourier-ADMM algorithm can even recover exact details. For example, “Girl with a Pearl Earring” (Fig. 5, Line 2) has 30 matched points and a score of 9.17%. Thus, we can say that it is perfectly recovered and satisfies the visual requirement. TVAL3, however, constantly blurs the recovered picture, irrespective of whether the target is complex or simple. The GD method can achieve a reasonable resolution when no detail is required, for example, the pepper (Fig. 5, Line 4). However, it cannot recover some complex scenarios owing to its limited computational accuracy.

Fig. 6 presents the comparison of image quality between the Fourier-ADMM algorithm, TVAL3, and the GD method with a mask pixel size of 32 $\mathrm {\mu } m$. When the mask pixel size is 32 $\mathrm {\mu }m$, within the diffraction limit, the results of the algorithms are fairly good. This proves that the unsatisfactory results of the TVAL3 and GD algorithms in Fig. 5 are due to diffraction. Additionally, all algorithms based on the ADMM framework must be driven with initial configurations. The initial configurations of the Fourier-ADMM algorithm for the masks with pixel sizes of 8 $\mathrm {\mu }m$ and 32 $\mathrm {\mu }m$ are $\mu ^{}_1=10^{-6}$, $\mu ^{}_2=10^{-6}$, and $\mu ^{}_3=4\times 10^{-5}$. Those of TVAL3 are $\mu ^{}_1=10^{-6}$, $\mu ^{}_2=10^{-6}$, $\mu ^{}_3=4\times 10^{-5}$, and ${\tau }=10^{-4}$, where ${\tau }$ belongs to a 1-norm term $||\cdot {}||^{}_1$. In fact, the two algorithms concerned are not very sensitive to the configurations when their parameters are small enough.

Fig. 6. Comparison of image quality of the Fourier-ADMM algorithm, TVAL3 algorithm, and GD method with a mask pixel size of 32 $\mathrm {\mu } m$.

In summary, the Fourier-ADMM algorithm achieves better resolution, faster computation, and higher image quality than TVAL3 and other decoding algorithms.

3.3 Anti-diffraction ability of the super-resolution lensless camera

In this section, we demonstrate the super-resolution ability of the lensless camera, for which previous sections have presented indirect evidence. Super-resolution is a class of methods that allow imaging systems to operate beyond the diffraction limit. Here, we realize super-resolution for the lensless camera with the Fourier-ADMM algorithm introduced above.

According to Eq. (14), with a central wavelength $\lambda$ of 550 nm and an optical path $d$ of approximately 1.5 mm, the ideal mask pixel size $p^{}_m$ is 31.7 $\mu$m. Once the system works beyond the diffraction limit, every point on the imaging plane is broadened, causing the instant picture captured by the detector to overlap heavily. As a result, it is nearly impossible for most regularization algorithms to recover the images. In Fig. 7, the decoding results of TVAL3 and the Fourier-ADMM algorithm are the same when the mask pixel size is 32 $\mu$m, within the diffraction limit. The situation changes when the mask pixel size is 16 $\mu$m or 8 $\mu$m, smaller than the theoretical size. The TVAL3 results are blurred and few details can be distinguished. However, the Fourier-ADMM algorithm can still compute reasonable graphs with almost linear improvements in resolution. With the objective distance fixed at 320 mm, the resolution can be calculated as 40.62 mrad (@32 $\mu$m), 18.75 mrad (@16 $\mu$m), and 10.93 mrad (@8 $\mu$m). In other words, the Fourier-ADMM algorithm can unwrap overlapped images robustly beyond the diffraction limit. In this way, we prove the super-resolution capability of the lensless camera.
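The quoted ideal mask pixel size follows directly from Eq. (14) with the stated values ($\lambda$ = 550 nm, $d \approx$ 1.5 mm):

```python
import math

lam = 550e-9                      # central wavelength (m)
d = 1.5e-3                        # optical path (m)
p_m = math.sqrt(1.22 * lam * d)   # Eq. (14)
print(round(p_m * 1e6, 1))        # prints 31.7 (micrometres)
```

The 8 µm and 16 µm masks used in the experiment are therefore two and four times finer than this diffraction-limited size.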

Fig. 7. Comparison of anti-diffraction abilities of the Fourier-ADMM algorithm and TVAL3.

4. Conclusion

In this paper, we combined the Fourier transform and ADMM algorithm to improve the reconstruction speed and suppress the effects of diffraction in lensless imaging systems. We refer to the combined method as the Fourier-ADMM algorithm. This algorithm had a conjugated structure so that it could transfer the diffraction factor from a Fourier operator into a constant instead of keeping the diffraction factor as a partial Fourier operator in every iteration step, which could speed up the convergence of the algorithm. At the same time, we formulated the problem of mask pattern degeneration in practical situations and provided complete theoretical and simulation approaches for it. Following the theory and methodology, we invented a super-resolution lensless camera. The camera could work beyond the theoretical diffraction limit and dramatically enhance the resolution. The super-resolution lensless camera could suppress diffraction to achieve decoded pictures with high resolutions and superior image qualities. It could decode rapidly without any sophisticated calibration.

Funding

National Natural Science Foundation of China (61805025).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. R. Dicke, “Scatter-hole cameras for x-rays and gamma rays,” Astrophys. J. 153, L101 (1968). [CrossRef]

2. G. Stroke, G. Hayat, R. Hoover, and J. Underwood, “X-ray imaging with multiple-pinhole cameras using a posteriori holographic image synthesis,” Opt. Commun. 1(3), 138–140 (1969). [CrossRef]  

3. P. M. Charalambous, A. J. Dean, J. B. Stephen, and N. G. Young, “Aberrations in gamma-ray coded aperture imaging,” Appl. Opt. 23(22), 4118 (1984). [CrossRef]  

4. T. Zhang, L. Wang, J. Ning, W. Lu, X.-F. Wang, H.-W. Zhang, and X.-G. Tuo, “Simulation of an imaging system for internal contamination of lungs using mpa-mura coded-aperture collimator,” Nucl. Sci. Tech. 32(2), 17 (2021). [CrossRef]  

5. X. Pan, X. Chen, T. Nakamura, and M. Yamaguchi, “Incoherent reconstruction-free object recognition with mask-based lensless optics and the transformer,” Opt. Express 29(23), 37962–37978 (2021). [CrossRef]  

6. X. Pan, X. Chen, S. Takeyama, and M. Yamaguchi, “Image reconstruction with transformer for mask-based lensless imaging,” Opt. Lett. 47(7), 1843–1846 (2022). [CrossRef]  

7. E. E. Fenimore, “Coded aperture imaging: predicted performance of uniformly redundant arrays,” Appl. Opt. 17(22), 3562 (1978). [CrossRef]  

8. E. E. Fenimore and T. M. Cannon, “Coded aperture imaging with uniformly redundant arrays,” Appl. Opt. 17(3), 337–347 (1978). [CrossRef]  

9. T. Cannon and E. Fenimore, “Coded aperture imaging: Many holes make light work,” Opt. Eng. 19(3), 193283 (1980). [CrossRef]  

10. S. R. Gottesman and E. E. Fenimore, “New family of binary arrays for coded aperture imaging,” Appl. Opt. 28(20), 4344–4352 (1989). [CrossRef]  

11. P. Olmos, C. Cid, A. Bru, J. C. Oller, J. L. de Pablos, and J. M. Perez, “Design of a modified uniform redundant-array mask for portable gamma cameras,” Appl. Opt. 31(23), 4742 (1992). [CrossRef]  

12. R. Accorsi, “Design of near-field coded aperture cameras for high-resolution medical and industrial gamma-ray imaging,” Ph.D. thesis, Massachusetts Institute of Technology (2001).

13. E. E. Fenimore, “Coded aperture imaging: the modulation transfer function for uniformly redundant arrays,” Appl. Opt. 19(14), 2465 (1980). [CrossRef]  

14. G. Kuo, N. Antipa, R. Ng, and L. Waller, “Diffusercam: Diffuser-based lensless cameras,” in Computational Optical Sensing and Imaging 2017, (2017), p. CTu3B.2.

15. K. C. Lee, J. Bae, N. Baek, J. Jung, W. Park, and S. A. Lee, “Design and single-shot fabrication of lensless cameras with arbitrary point spread functions,” Optica 10(1), 72–80 (2023). [CrossRef]  

16. E. E. Fenimore and T. M. Cannon, “Uniformly redundant arrays: digital reconstruction methods,” Appl. Opt. 20(10), 1858 (1981). [CrossRef]  

17. C. Slinger, M. Eismann, N. T. Gordon, K. Lewis, and R. Wilson, “An investigation of the potential for the use of a high resolution adaptive coded aperture system in the mid-wave infrared,” Proc. SPIE 6714, 671408 (2007). [CrossRef]

18. D. G. Stork and P. R. Gill, “Lensless ultra-miniature cmos computational imagers and sensors,” in The Seventh International Conference on Sensor Technologies and Applications, (2014), pp. 186–190.

19. M. J. Deweert and B. P. Farm, “Lensless coded-aperture imaging with separable doubly-toeplitz masks,” Opt. Eng. 54(2), 023102 (2014). [CrossRef]  

20. M. S. Asif, A. Ayremlou, A. Sankaranarayanan, A. Veeraraghavan, and R. G. Baraniuk, “Flatcam: Thin, lensless cameras using coded aperture and computation,” IEEE Trans. Comput. Imaging 3(3), 384–397 (2017). [CrossRef]  

21. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “Diffusercam: lensless single-exposure 3d imaging,” Optica 5(1), 1–9 (2018). [CrossRef]  

22. V. Boominathan, J. K. Adams, J. T. Robinson, and A. Veeraraghavan, “Phlatcam: Designed phase-mask based thin lensless camera,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1618–1629 (2020). [CrossRef]  

23. C. Li, “An efficient algorithm for total variation regularization with applications to the single pixel camera and compressive sensing,” Thesis, Rice University (2010).

24. Y. Xiao, J. Yang, and X. Yuan, “Alternating algorithms for total variation image reconstruction,” Inverse Probl. Imaging 6(3), 547–563 (2012). [CrossRef]

25. C. Li, “Compressive sensing for 3d data processing tasks: Applications, models and algorithms,” Thesis, Rice University (2012).

26. J. C. Wu, L. C. Cao, and G. Barbastathis, “Dnn-fza camera: a deep learning approach toward broadband fza lensless imaging,” Opt. Lett. 46(1), 130–133 (2021). [CrossRef]  

27. Y. Ma, J. Wu, S. Chen, and L. Cao, “Explicit-restriction convolutional framework for lensless imaging,” Opt. Express 30(9), 15266–15278 (2022). [CrossRef]  

28. K. Monakhova, J. Yurtsever, G. Kuo, N. Antipa, K. Yanny, and L. Waller, “Learned reconstructions for practical mask-based lensless imaging,” Opt. Express 27(20), 28075–28090 (2019). [CrossRef]  

29. T. Nakamura, T. Watanabe, S. Igarashi, X. Chen, K. Tajima, K. Yamaguchi, T. Shimano, and M. Yamaguchi, “Superresolved image reconstruction in fza lensless camera by color-channel synthesis,” Opt. Express 28(26), 39137–39155 (2020). [CrossRef]  

30. X. Chen, X. Pan, T. Nakamura, S. Takeyama, T. Shimano, K. Tajima, and M. Yamaguchi, “Wave-optics-based image synthesis for super resolution reconstruction of a fza lensless camera,” Opt. Express 31(8), 12739–12755 (2023). [CrossRef]  

31. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers (Now Publishers, 2011).

32. D. G. Lowe, “Distinctive image features from scale-invariant keypoints,” Int. J. Comput. Vis. 60(2), 91–110 (2004). [CrossRef]



