
Wave-optics-based image synthesis for super resolution reconstruction of a FZA lensless camera

Open Access

Abstract

A Fresnel Zone Aperture (FZA) mask for a lensless camera, an ultra-thin and functional computational imaging system, is beneficial because the FZA pattern makes it easy to model the imaging process and reconstruct captured images through a simple and fast deconvolution. However, diffraction causes a mismatch between the forward model used in the reconstruction and the actual imaging process, which affects the recovered image’s resolution. This work theoretically analyzes the wave-optics imaging model of an FZA lensless camera and focuses on the zero points caused by diffraction in the frequency response. We propose a novel idea of image synthesis to compensate for the zero points through two different realizations based on the linear least-mean-square-error (LMSE) estimation. Results from computer simulation and optical experiments verify a nearly two-fold improvement in spatial resolution from the proposed methods compared with the conventional geometrical-optics-based method.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Nowadays, advanced imaging systems attract much attention because image information is becoming increasingly important in broader areas. In traditional lens-based imaging systems, optical lenses require sufficient dimensions for the focal length and demand high-precision assembly or manufacturing. To break such limitations, lensless cameras use a phase or amplitude mask instead of an optical lens in front of a bare image sensor, and the object information is reconstructed by digital computation. This novel computational imaging system is more compact and cheaper than a lens-based camera [1–7]. The point spread function (PSF) of a lensless camera is a magnified shadow of the mask rather than a converged point as in a lens-based camera. Therefore, the captured image is a superposition of shadows, and a reconstruction process is necessary [1–3,8–12]. In addition, depth information is simultaneously recorded in the captured image because each object point casts a specific pattern on the sensor, so the depth-dependent PSF can reconstruct the image at different depths. This characteristic can be applied to three-dimensional (3D) imaging [13,14] or post-capture refocusing [15,16].

Recently, a kind of lensless camera employing a Fresnel Zone Aperture (FZA) as a coded mask has been proposed [15]. The PSF of the FZA lensless camera can be calculated analytically at an arbitrary distance thanks to the special mask pattern. This feature allows the captured images to be reconstructed through simple deconvolution at a chosen focus distance. However, such a mask pattern causes some zero points in the frequency response of the imaging system [15]; in addition, a direct current (DC) component also exists in the frequency response because of the imaging principle of mask-based cameras [16]. The conventional method based on the geometrical-optics model employs four FZAs with different phases to generate a flat frequency response using fringe scanning (FS) [17]. The FS-based conventional method realizes a fast deconvolution reconstruction at an arbitrary depth. However, its forward model neglects diffraction propagation. Diffraction causes extra zero points in the frequency response even though FS eliminates the zero points caused by the mask itself.

Deep-learning-based methods have become popular in image processing and have been applied to the FZA lensless camera [12]. They can reconstruct images with fewer artifacts and higher quality; however, a large dataset and substantial computational power are necessary for model training, and the generalization ability of deep learning is still an open problem. The compressive sensing (CS) method is often used for image restoration problems. Reference [11] proposed a total variation (TV) based CS model for image reconstruction in the FZA lensless camera. It can achieve a high-quality result with a single shot. However, the regularization term cannot faithfully express a scene with dense textures or with multiple objects at different depths [18]. Moreover, the CS-based method works well only when the parameters are chosen correctly and a sufficient number of iterations is run, which is time-consuming [10]. Our group modeled the Optical Transfer Function (OTF) of the FZA lensless camera after FS based on wave-optics theory [19] and found that diffraction causes zero-crossings in the OTF, which limit the spatial resolution of the conventional method. Furthermore, the positions of the zero-crossings in the OTF after FS depend on the input wavelength and the mask-sensor distance. We initially proposed a wavelength-based image synthesis method that compensates for the lost information by synthesizing different color channels; however, this method can distort the color information of the recovered images.

In this work, we theoretically analyze the imaging model of the single-FZA lensless camera based on wave-optics theory. To the best of our knowledge, this paper is the first to derive the OTF of a single-FZA imaging system including diffraction propagation. We focus on the causes of the zero points in the OTF and propose a novel image synthesis approach to compensate for them. Compared with [11], our method takes advantage of the FZA pattern and realizes a faster reconstruction based on the physical model obtained from the theoretical analysis. After presenting the details of the theory, we describe two realizations of the new image synthesis strategy and then show simulation and experimental results to confirm their effectiveness. Section 2 introduces the geometrical sensing model of the FZA lensless camera and the conventional reconstruction algorithm. Section 3 describes the modified sensing model using wave-optics theory and the proposed methods. In Section 4, numerical simulations verify the effectiveness of the proposed methods. Section 5 shows optical experiment results with different settings, which also confirm the effectiveness of the proposed methods. Section 6 summarizes this work.

2. Conventional method based on the geometrical optics model

The intensity transmittance $T$ of an FZA mask is modeled as:

$${T}(x_{\rm p},y_{\rm p};\beta, \varphi)=\frac{1}{2}\Bigg[1+{\rm cos}\left\{\beta\left({x_{\rm p}}^{2}+{y_{\rm p}}^{2}\right)+\varphi\right\}\Bigg],$$
where $(x_{\rm p},y_{\rm p})$ are the Cartesian coordinates on the FZA plane, $\beta$ is a parameter controlling the pitch size of the FZA, and $\varphi$ is the initial phase of the mask. As shown in Fig. 1, an FZA mask is placed in front of a bare image sensor at a distance of $d$, and a point light source on the object illuminates the mask from a distance of $t$ on the axis passing through the mask’s center. The cast shadow on the sensor, $h(x,y;\beta,\varphi )$, is the PSF of the imaging system, which can be geometrically described as:
$$h(x,y;\beta, \varphi)=\frac{1}{2}\Bigg[1+{\rm cos}\left\{\beta({x}^{2}+{y}^{2})+\varphi\right\}\Bigg],$$
where $(x,y)$ are the coordinates on the sensor plane, and,
$$x=\frac{t}{t+d}x_{\rm p}, \; \; y=\frac{t}{t+d}y_{\rm p}.$$
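For illustration, the following Python sketch (our own illustrative code, not part of the published implementation; the grid size, pixel pitch, and $\beta$ value follow the simulation settings of Sec. 4) evaluates the FZA transmittance of Eq. (1). The geometric PSF of Eq. (2) has the same functional form, evaluated in the sensor coordinates related to the mask coordinates by Eq. (3).

```python
import numpy as np

def fza_pattern(n_pix, pitch_mm, beta, phi):
    """Evaluate the FZA intensity transmittance of Eq. (1) on a square grid.

    n_pix    : number of samples per side
    pitch_mm : sample spacing in mm
    beta     : FZA pitch parameter in rad/mm^2
    phi      : initial phase in rad
    """
    c = (np.arange(n_pix) - n_pix / 2) * pitch_mm
    xp, yp = np.meshgrid(c, c)
    return 0.5 * (1.0 + np.cos(beta * (xp ** 2 + yp ** 2) + phi))

# Eq. (2) has the same form, so the geometric PSF is this pattern evaluated in
# the sensor coordinates given by Eq. (3).
mask_0 = fza_pattern(1024, 5.5e-3, beta=25.0, phi=0.0)       # phi = 0
mask_pi = fza_pattern(1024, 5.5e-3, beta=25.0, phi=np.pi)    # phase-shifted FZA
```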

Let the object space $(m,n)$ be a flat plane parallel to the mask at a distance $t$, and let $f(m,n)$ be the intensity distribution of a two-dimensional target on the object plane. Assuming the imaging process is shift-invariant, the captured image on the sensor is the 2D convolution of the PSF with the magnified intensity pattern of the object on the sensor. The captured image $g$ can therefore be written as:

$$g(x,y;\beta, \varphi)= \iint \frac{1}{2}\left[1+{\rm cos}\left\{\beta\Bigg({\left(x-x^{'}\right)}^{2}+{\left(y-y^{'}\right)}^{2}\Bigg)+\varphi\right\}\right] \cdot \frac{d}{t}f\left(\frac{d}{t}x',\frac{d}{t}y'\right){\rm d}x'{\rm d}y',$$
where $\cdot$ denotes multiplication, and
$$x'=\frac{t}{d}m,\; \; y'=\frac{t}{d}n.$$

Substituting $f_{d}(x',y')$ for $\frac {d}{t}f(\frac {d}{t}x',\frac {d}{t}y')$, we obtain the final 2D convolution formula:

$$g(x,y;\beta,\varphi)=h(x,y;\beta,\varphi)*f_{d}(x,y),$$
where $*$ is the 2D convolution operator, $f_{d}(x,y)$ is the magnified intensity of the object on the sensor, and the subscript $d$ indicates the mask-sensor distance. Taking the Fourier transform of Eq. (6), the imaging system can be written in the frequency domain as:
$$G(u,v;\beta,\varphi)=H(u,v;\beta,\varphi)\cdot F_{d}(u,v),$$
where $G$, $H$, and $F_{d}$ are the Fourier transform of $g$, $h$, and $f_{d}$, respectively. $H(u,v;\beta,\varphi )$ is the frequency response of the system, which can be derived as:
$$H(u,v;\beta, \varphi)=\pi \delta(u,v)+\frac{\pi}{2\beta}{\rm sin}\left\{\frac{\pi^{2}}{\beta}(u^{2}+v^{2})-\varphi\right\}.$$

A strong direct current (DC) component and some zero-crossings exist in this function, as shown in Eq. (8), which lose object information and degrade image quality. To solve these problems, [15] proposes a method using Fringe Scanning (FS) [20,21] to generate a smooth and flat frequency response $H_{\rm FS}$ by synthesizing images from multiple FZAs with different phases $(\varphi =0, \frac {1}{2}\pi, \pi, \frac {3}{2}\pi )$ as:

$$\begin{aligned}H_{\rm FS}(u,v;\beta)&=j\left\{H(u,v;\beta,0)-H(u,v;\beta,\pi)\right\} +\left\{H\left(u,v;\beta,\frac{3}{2}\pi\right)-H\left(u,v;\beta,\frac{1}{2}\pi\right)\right\}\\ &={\rm exp}\left(j\frac{\pi^2}{\beta}(u^{2}+v^{2})\right), \end{aligned}$$
so that the sensing process based on the FS in the frequency domain can be described as:
$$G_{\rm FS}(u,v)=H_{\rm FS}(u,v;\beta)\cdot F_{d}(u,v).$$

In such a case, a deconvolution filter $H_{\rm FS}^{\rm inv}$ can be applied for image reconstruction in the FZA lensless camera. The process is:

$$\begin{aligned}\hat{F}_d(u,v)&=G_{\rm FS}(u,v)\cdot H_{\rm FS}^{\rm inv},\\ s.t.\;H_{\rm FS}^{\rm inv}&={\rm exp}\left({-}j\frac{\pi^2}{\beta}(u^{2}+v^{2})\right), \end{aligned}$$
where $\hat {F}_d$ is the reconstructed image in the frequency domain; the inverse Fourier transform $\mathcal {F}^{-1}$ can be applied to it to obtain the reconstructed image $\hat {f}_d$ in the spatial domain:
$$\hat{f}_d(x,y)=\mathcal{F}^{{-}1}\left\{\hat{F}_d(u,v)\right\}.$$
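A minimal Python sketch of the conventional FS pipeline of Eqs. (9)–(12), assuming square, co-registered captures and FFT-based circular convolution (our own illustration, not the authors' implementation), is:

```python
import numpy as np

def fs_reconstruct(g0, g_half_pi, g_pi, g_3half_pi, beta, pitch_mm):
    """Conventional fringe-scanning reconstruction, Eqs. (9)-(12).

    g0 ... g_3half_pi : captured images for mask phases 0, pi/2, pi, 3pi/2
    beta              : FZA parameter in rad/mm^2
    pitch_mm          : sensor pixel pitch in mm
    """
    # complex FS combination; by linearity of the Fourier transform this
    # realizes the synthesized response H_FS of Eq. (9)
    g_fs = 1j * (g0 - g_pi) + (g_3half_pi - g_half_pi)
    G_fs = np.fft.fft2(g_fs)

    # spatial-frequency grids in cycles/mm
    n = g0.shape[0]
    fu = np.fft.fftfreq(n, d=pitch_mm)
    u, v = np.meshgrid(fu, fu)

    # inverse filter of Eq. (11) and inverse transform of Eq. (12)
    H_inv = np.exp(-1j * np.pi ** 2 / beta * (u ** 2 + v ** 2))
    return np.real(np.fft.ifft2(G_fs * H_inv))
```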

The geometrical-optics-based conventional method uses fringe scanning and realizes a simple image reconstruction for the FZA lensless camera; however, it ignores diffraction effects during the sensing process. A more accurate sensing model that accounts for diffraction propagation is needed to achieve a higher spatial resolution in the reconstructed images.

Fig. 1. Geometrical configuration of FZA lens-less camera with a point-source object.

3. Proposed method based on wave optics model

3.1 Imaging model based on wave-optics theory

In a real situation, diffraction occurs when light passes through the mask, especially when the pitch size is reduced for higher spatial resolution. In such a case, the imaging model must be calculated based on wave optics to obtain an accurate result.

In this study, we use the angular spectrum method to calculate the wavefront $U(u,v;\beta,\varphi )$ on the sensor plane after diffraction propagation; here the magnification coefficient is omitted for simplicity, and the calculation is given by:

$$U(u,v;\beta,\varphi) = \mathcal{F}^{{-}1}\left\{\mathcal{F}\left\{{T}(x_{\rm p},y_{\rm p};\beta, \varphi)\right\}\cdot H(u,v)\right\},$$

$H(u,v)$ is the Fresnel-region diffraction term, given by:

$$H(u,v) = e^{j\frac{2\pi}{\lambda}d}\cdot{\rm exp}\left\{{-}j\pi\lambda d (u^{2}+v^{2})\right\},$$
where $\lambda$ is the wavelength and $d$ is the distance between the mask and the sensor. The Fourier transform of the FZA mask is given in Eq. (8), so we obtain:
$$\begin{aligned}\mathcal{F}\left\{U(u,v;\beta,\varphi)\right\} &= \mathcal{F}\left\{{T}(x_{\rm p},y_{\rm p};\beta, \varphi)\right\}\cdot H(u,v)\\ &= {\rm exp}\left(j\frac{2\pi}{\lambda}d\right){\rm exp}\left\{{-}j\pi\lambda d\left(u^{2}+v^{2}\right)\right\} \cdot\Bigg[\pi\delta(u,v)\\ &+ \frac{\pi}{4\beta j}{\rm exp}\left\{j\left(\frac{\pi^{2}}{\beta}\left(u^{2}+v^{2}\right)-\varphi\right)\right\} - \frac{\pi}{4\beta j}{\rm exp}\left\{{-}j\left(\frac{\pi^{2}}{\beta}\left(u^{2}+v^{2}\right)-\varphi\right)\right\} \Bigg]. \end{aligned}$$

Calculating the inverse Fourier transform of Eq. (15), the wavefront on the sensor is modeled as:

$$\begin{aligned}U(u,v;\beta,\varphi) = {\rm exp}\left(j\frac{2\pi}{\lambda}d\right)\cdot\Bigg[\frac{1}{2} &+\frac{{\pi}^{2}}{4\left({\pi}^{2}-\beta\pi\lambda d\right)}{\rm exp}\left(\frac{-j\beta{\pi}^{2}\left(x^2+y^2\right)}{{\pi}^{2}-\beta\pi\lambda d}-j\varphi\right)\\ &+\frac{{\pi}^{2}}{4\left({\pi}^{2}+\beta\pi\lambda d\right)}{\rm exp}\left(\frac{-j\beta{\pi}^{2}\left(x^2+y^2\right)}{{\pi}^{2}-\beta\pi\lambda d}+j\varphi\right)\Bigg]. \end{aligned}$$

The point spread function ${\rm PSF}(x,y;\beta,\varphi )$ can be calculated as:

$$\scalebox{0.88}{$\begin{aligned}{\rm PSF}(x,y;\beta,\varphi) &= U(u,v;\beta,\varphi)U^{*}(u,v;\beta,\varphi)\\ &= \frac{1}{4}+\frac{{\pi}^{4}}{16{T_1}^{2}}+\frac{{\pi}^{4}}{16{T_2}^{2}}\\ &+{\rm cos}\left(\frac{2\pi}{\lambda}d\right) \cdot\Bigg[\frac{{\pi}^{2}}{4T_1}{\rm cos}\left(\frac{-\beta{\pi}^{2}(x^2+y^2)}{4T_1}+\frac{2\pi}{\lambda}d-\varphi\right) +\frac{{\pi}^{2}}{4T_2}{\rm cos}\left(\frac{\beta{\pi}^{2}(x^2+y^2)}{4T_2}+\frac{2\pi}{\lambda}d+\varphi\right)\Bigg]\\ &+{\rm sin}\left(\frac{2\pi}{\lambda}d\right)\Bigg[\frac{{\pi}^{2}}{4T_1}{\rm sin}\left(\frac{-\beta{\pi}^{2}(x^2+y^2)}{4T_1}+\frac{2\pi}{\lambda}d-\varphi\right) +\frac{{\pi}^{2}}{4T_2}{\rm sin}\left(\frac{\beta{\pi}^{2}(x^2+y^2)}{4T_2}+\frac{2\pi}{\lambda}d+\varphi\right)\Bigg]\\ &+\frac{1}{8}{\rm cos}\left(\frac{-\beta{\pi}^{2}(x^2+y^2)}{4T_1}+\frac{2\pi}{\lambda}d-\varphi\right) \cdot{\rm cos}\left(\frac{\beta{\pi}^{2}(x^2+y^2)}{4T_2}+\frac{2\pi}{\lambda}d+\varphi\right)\\ &+\frac{1}{8}{\rm sin}\left(\frac{-\beta{\pi}^{2}(x^2+y^2)}{4T_1}+\frac{2\pi}{\lambda}d-\varphi\right) \cdot{\rm sin}\left(\frac{\beta{\pi}^{2}(x^2+y^2)}{4T_2}+\frac{2\pi}{\lambda}d+\varphi\right), \end{aligned}$}$$
where $T_1 = {\pi }^{2}-\pi \lambda \beta d$ and $T_2 = {\pi }^{2}+\pi \lambda \beta d$, and $\cdot ^{*}$ denotes the complex conjugate. With the approximation ${\frac {\lambda \beta d}{\pi }}\approx 0$ and the angle-sum identity, we have:
$${\rm PSF}(x,y;\beta,\varphi)=\frac{1}{4} +\frac{1}{2}{\rm cos}\left\{\frac{{\beta}^{2}\lambda d}{\pi}\left(x^{2}+y^{2}\right)\right\}{\rm cos}\left\{\beta\left(x^{2}+y^{2}\right)-\varphi\right\} +\frac{1}{4}{\rm cos}^{2}\left\{\beta\left(x^{2}+y^{2}\right)-\varphi\right\}.$$
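A minimal numerical counterpart of Eqs. (13), (14), and (17), assuming a square sampling grid and a single wavelength (our own sketch; the wavelength value in the example is an assumption), propagates the mask transmittance with the angular spectrum method and takes the squared modulus of the resulting wavefront:

```python
import numpy as np

def wave_optics_psf(mask, pitch_mm, wavelength_mm, d_mm):
    """Angular-spectrum propagation of the FZA transmittance (Eqs. (13)-(14))
    followed by the intensity PSF of Eq. (17), |U|^2."""
    n = mask.shape[0]
    fu = np.fft.fftfreq(n, d=pitch_mm)          # cycles/mm
    u, v = np.meshgrid(fu, fu)

    # Fresnel-region transfer function of Eq. (14); the constant factor
    # exp(j*2*pi*d/lambda) is a global phase and vanishes in the intensity
    H = np.exp(-1j * np.pi * wavelength_mm * d_mm * (u ** 2 + v ** 2))

    U = np.fft.ifft2(np.fft.fft2(mask) * H)     # wavefront on the sensor, Eq. (13)
    return np.abs(U) ** 2

# example with assumed values: 5.5 um pixels, lambda = 550 nm, d = 6.5 mm
# psf = wave_optics_psf(mask_0, 5.5e-3, 550e-6, 6.5)
```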

The Optical Transfer Function (OTF) of the single-FZA imaging system is the Fourier transform of Eq. (18):

$$\begin{aligned}{\rm OTF}(u,v;\beta,\varphi) = \frac{3}{4}\pi\delta(u,v) &+\frac{j{\pi}^{2}}{8\beta(\beta\lambda d+\pi)}{\rm exp}\left(\frac{-j{\pi}^{3}}{\beta(\beta\lambda d+\pi)}(u^2+v^2)-j\varphi\right)\\ &-\frac{j{\pi}^{2}}{8\beta(\beta\lambda d+\pi)}{\rm exp}\left(\frac{j{\pi}^{3}}{\beta(\beta\lambda d+\pi)}(u^2+v^2)+j\varphi\right)\\ &+\frac{j{\pi}^{2}}{8\beta(\beta\lambda d-\pi)}{\rm exp}\left(\frac{-j{\pi}^{3}}{\beta(\beta\lambda d-\pi)}(u^2+v^2)+j\varphi\right)\\ &-\frac{j{\pi}^{2}}{8\beta(\beta\lambda d-\pi)}{\rm exp}\left(\frac{j{\pi}^{3}}{\beta(\beta\lambda d-\pi)}(u^2+v^2)-j\varphi\right)\\ &+\frac{j\pi}{32\beta}{\rm exp}\left(\frac{-j{\pi}^{2}}{2\beta}(u^2+v^2)-2j\varphi\right)\\ &-\frac{j\pi}{32\beta}{\rm exp}\left(\frac{j{\pi}^{2}}{2\beta}(u^2+v^2)+2j\varphi\right). \end{aligned}$$

After applying the approximation $\frac {\beta \lambda d}{\pi }\approx 0$ and the angle-sum identity, we obtain:

$$\begin{aligned}{\rm OTF}(u,v;\beta,\varphi)=\frac{3}{4}\pi\delta(u,v) &+\frac{\pi}{2\beta}{\rm sin}\left\{\frac{{\pi}^{2}}{\beta}\left(u^{2}+v^{2}\right)+\varphi\right\}{\rm cos}\left\{\lambda d \pi\left(u^{2}+v^{2}\right)\right\}\\ &+\frac{1}{16}\frac{\pi}{\beta}{\rm sin}\left\{\frac{{\pi}^{2}}{2\beta}\left(u^{2}+v^{2}\right)+2\varphi\right\}. \end{aligned}$$

The derived OTF of the single-FZA imaging system is helpful for our further analysis; we believe this is the first time this formula has been presented in the study of the FZA lensless camera. From Eq. (20), we can see that a strong zero-frequency bias and several zero points exist in the frequency response, which amplify noise when deconvolution reconstruction is applied directly.
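For illustration, the radial profile of Eq. (20) (excluding the delta term at the origin) can be evaluated numerically to locate these zero points; the wavelength and parameter values below are assumptions for the sketch only:

```python
import numpy as np

def single_fza_otf_radial(rho, beta, wavelength_mm, d_mm, phi):
    """Radial profile of the approximated single-FZA OTF of Eq. (20),
    excluding the delta term at the origin; rho in cycles/mm."""
    r2 = rho ** 2
    term1 = (np.pi / (2.0 * beta)) * np.sin(np.pi ** 2 / beta * r2 + phi) \
            * np.cos(np.pi * wavelength_mm * d_mm * r2)
    term2 = (np.pi / (16.0 * beta)) * np.sin(np.pi ** 2 / (2.0 * beta) * r2 + 2.0 * phi)
    return term1 + term2

rho = np.linspace(0.1, 90.0, 4000)                        # cycles/mm
otf = single_fza_otf_radial(rho, beta=25.0, wavelength_mm=550e-6, d_mm=6.5, phi=0.0)
zeros = rho[np.where(np.diff(np.sign(otf)) != 0)[0]]      # sign changes ~ zero points
print(zeros[:5])
```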

3.2 Proposed methods

From Eq. (20), it can be seen that the spatial frequencies at which the OTF becomes zero are associated with $\varphi$, $\lambda$, $d$, and $\beta$, i.e., the initial phase of the FZA, the wavelength, the mask-sensor distance, and the mask’s pitch parameter. Reference [19] uses the FS method to synthesize four captured images with $\varphi =0, \frac {\pi }{2},\pi$, and $\frac {3\pi }{2}$, which eliminates the first and third terms in Eq. (20). We then have the OTF of the FS-based FZA system, $H_{\rm FS}\hbox{-}{\rm diff}$, as:

$$\begin{aligned}H_{\rm FS}\hbox{-}{\rm diff}(u,v;\lambda,d) &= \left\{{\rm OTF}\left(u,v;\beta,\frac{\pi}{2}\right)-{\rm OTF}\left(u,v;\beta,\frac{3}{2}\pi\right)\right\}\\ &+j\left\{{\rm OTF}\left(u,v;\beta,0\right)-{\rm OTF}\left(u,v;\beta,\pi\right)\right\}\\ &= H_{\rm FS}(u,v)\cdot H_{\rm w}(u,v;\lambda,d),\\ s.t.\; &H_{\rm w}(u,v;\lambda,d)={\rm cos}\left\{\pi \lambda d (u^2+v^2)\right\}, \end{aligned}$$
where $H_{\rm FS}(u,v)$ is given in Eq. (9) and $H_{\rm w}(u,v)$ is a diffraction modulator. As shown in Eq. (21), the FS method removes most of the amplitude variation in the OTF except for that caused by $H_{\rm w}(u,v)$, leaving some zero points whose positions are controlled by $\lambda$ and $d$. The former study [19] focuses on the wavelength and synthesizes different channels of one image to compensate for the lost information; however, it can affect the color quality. In this work, we propose an image synthesis super-resolution theory for the FZA lensless camera based on linear least-mean-square-error (LMSE) estimation and realize it through two different methods focusing on $d$ and $\beta$, respectively. Reference [18] also proposes an image synthesis method based on changing mask patterns; however, its objective is to improve depth awareness, whereas ours is to increase the resolution.
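Since $H_{\rm w}$ vanishes where $\pi \lambda d (u^2+v^2) = (k+\frac{1}{2})\pi$ for integer $k \ge 0$, the radii of its zeros follow directly. The short sketch below evaluates them for an assumed wavelength; note that this describes the diffraction modulator alone, while the resolution limit of the full system also depends on the mask parameters.

```python
import numpy as np

def hw_zero_frequencies(wavelength_mm, d_mm, n_zeros=5):
    """Radial frequencies (cycles/mm) at which H_w = cos(pi*lambda*d*(u^2+v^2))
    of Eq. (21) crosses zero: pi*lambda*d*rho^2 = (k + 1/2)*pi."""
    k = np.arange(n_zeros)
    return np.sqrt((k + 0.5) / (wavelength_mm * d_mm))

# zero radii of the diffraction modulator alone, for an assumed 550 nm wavelength
print(hw_zero_frequencies(550e-6, 5.0))   # mask-sensor distance 5.0 mm
print(hw_zero_frequencies(550e-6, 6.5))   # mask-sensor distance 6.5 mm
```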

3.2.1 Method A: Mask-sensor-distance-based image synthesis method

To eliminate the zero points by synthesizing images captured with the FS method, without resorting to the previous color-channel synthesis technique, we focus on the mask-sensor distance $d$. We propose an image synthesis method that captures multiple images at different mask-sensor distances; [22] explains the details of the method, so we give only a brief introduction here:

$$\begin{aligned}G_{\rm FS\_1}(u,v) &= H_{\rm FS}\hbox{-}{\rm diff}(u,v;d_{1})\cdot F_{d1}(u,v)+N_{1}(u,v),\\ G_{\rm FS\_2}(u,v) &= H_{\rm FS}\hbox{-}{\rm diff}(u,v;d_{2})\cdot F_{d2}(u,v)+N_{2}(u,v). \end{aligned}$$

$G_{\rm FS\_1}$ and $G_{\rm FS\_2}$ are two captured images in the frequency domain after FS with the two mask-sensor distances $d_{\rm 1}$ and $d_{\rm 2}$, respectively. $N_{1}$ and $N_{2}$ are the 2D Fourier transforms of the noise components in each imaging system. We use a synthesis method based on LMSE estimation, which is equivalent to the Wiener filter [23,24], and obtain the reconstructed image in the frequency domain as:

$$\hat{F}_{d_{2}} = \textbf{M}\Big[ M_{1}\cdot G_{\rm FS\_1}\cdot H_{\rm FS}^{\rm inv}(d_{1})\Big]+M_{2}\cdot G_{\rm FS\_2}\cdot H_{\rm FS}^{\rm inv}(d_{2}),$$
where the spatial-frequency variables are omitted for simplicity, $\textbf {M}[\cdot ]$ is the resize operation, and $M_{1}$ and $M_{2}$ are the deconvolution filters. By minimizing the ensemble average of $|F_{d2}-\hat {F}_{d_{2}}|^{2}$, we obtain:
$$M_{1}=\frac{H_{\rm w}^{*}(d_1)}{|H_{\rm w}(d_1)|^2+\frac{\eta_1}{\eta_2}|H_{\rm w}(d_2)|^2+\frac{\eta_1}{\eta_f}},$$
$$M_{2}=\frac{H_{\rm w}^{*}(d_2)}{|H_{\rm w}(d_2)|^2+\frac{\eta_2}{\eta_1}|H_{\rm w}(d_1)|^2+\frac{\eta_2}{\eta_f}},$$
where $\eta _f$, $\eta _1$, and $\eta _2$ are the spectral densities of the original image and of the noise in the images captured at distances $d_1$ and $d_2$, respectively. The final image $\hat {f}_{d_{2}}$ is reconstructed by applying the inverse Fourier transform to Eq. (23).
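A minimal sketch of Eqs. (23)–(25), assuming the two FS captures are already resized to a common grid (i.e., the operator $\textbf{M}[\cdot]$ has been applied) and that the spectral-density ratios are approximated by the scalar values used in Sec. 4, is:

```python
import numpy as np

def lmse_synthesis_method_a(G1, G2, Hw1, Hw2, Hfs_inv1, Hfs_inv2,
                            eta1_over_eta2=1.0, eta1_over_etaf=1e-3):
    """LMSE synthesis of two FS captures taken at distances d1 and d2,
    Eqs. (23)-(25); the resize M[.] of Eq. (23) is assumed already applied."""
    eta2_over_eta1 = 1.0 / eta1_over_eta2
    eta2_over_etaf = eta2_over_eta1 * eta1_over_etaf

    M1 = np.conj(Hw1) / (np.abs(Hw1) ** 2
                         + eta1_over_eta2 * np.abs(Hw2) ** 2
                         + eta1_over_etaf)                      # Eq. (24)
    M2 = np.conj(Hw2) / (np.abs(Hw2) ** 2
                         + eta2_over_eta1 * np.abs(Hw1) ** 2
                         + eta2_over_etaf)                      # Eq. (25)

    F_hat = M1 * G1 * Hfs_inv1 + M2 * G2 * Hfs_inv2             # Eq. (23)
    return np.real(np.fft.ifft2(F_hat))
```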

The limitation of this method is the alignment error of the mask-sensor distance. Regarding the sensitivity of the method to deviations of the mask-sensor distances from the designated values, we need to consider the resize process in Eq. (23) and the Wiener filters in Eqs. (24) and (25). The magnification error in the resize process due to errors in $d_{1}$ and $d_{2}$ is small and has no significant influence under our experimental settings. However, if there is an error in the zero-crossing frequency, the phase of the erroneous frequency components is inverted, resulting in artifacts. The errors in $d_{1}$ and $d_{2}$ should therefore be small enough that the zero-crossing frequencies do not change. Calibration of the OTF would be desirable to mitigate the effect of such errors.

3.2.2 Method B: Mask-pattern-based image synthesis method

The third term in Eq. (20) is eliminated when the FS method is used, so the zero-crossing points can then be controlled only by $\lambda$ or $d$. Reference [25] proposes a method that uses the higher harmonics of a binarized FZA to realize super-resolution without synthesizing information obtained by changing $\lambda$ or $d$; however, the weak higher-harmonic signals degrade the signal-to-noise ratio (SNR). In this work, exploiting Eq. (20) without applying the FS method, we propose a new realization of image synthesis based on changing the $\beta$ value, which can easily be done with a spatial light modulator (SLM) without introducing image alignment errors in practice.

The former methods use FS, which eliminates the DC component of the OTF but also removes the third term in Eq. (20). Here we instead use a pair of masks with the same $\beta$ but different $\varphi$ to remove the DC term while keeping the third term, so that the zero points depend on $\beta$:

$$\begin{aligned}H_{K} &= {\rm OTF}\left(\beta_{K};0\pi\right)-{\rm OTF}\left(\beta_{K};\frac{3}{2}\pi\right)\\ &=\frac{\pi}{2\beta_{K}}{\rm cos}\left(\pi\lambda d\left(u^2+v^2\right)\right)\Bigg[ {\rm sin}\left(\frac{\pi^2}{\beta_{K}}\left(u^2+v^2\right)\right)\\ &+{\rm cos}\left(\frac{\pi^2}{\beta_{K}}\left(u^2+v^2\right)\right)\Bigg] +\frac{\pi}{8\beta_{K}}{\rm sin}\left(\frac{\pi^2}{2\beta_{K}}\left(u^2+v^2\right)\right), \end{aligned}$$
where the subscript $K$ indexes the different $\beta$ values. Thus, images captured with mask patterns of different $\beta$ can be synthesized together, and the zero points can be compensated.

The imaging process can be described as:

$$\begin{aligned}G_{1}(u,v;\beta_{1}) &= H_{1}(u,v;\beta_{1})\cdot F(u,v)+N_{1}(u,v),\\ G_{2}(u,v;\beta_{2}) &= H_{2}(u,v;\beta_{2})\cdot F(u,v)+N_{2}(u,v), \end{aligned}$$
where $G(u,v;\beta _{K})$ is the processed capture from the pair of masks with $\beta _{K}$ in the frequency domain, $F$ is the original object in the frequency domain, and $N_{1}$ and $N_{2}$ are the noise in each imaging system.

Here we use the framework of the LMSE estimation introduced in method A to calculate the reconstructed image:

$$\hat{F} = P_{1}\cdot G_{1}(\beta_{1})+P_{2}\cdot G_{2}(\beta_{2}),$$
where $P_{1}$ and $P_{2}$ are the deconvolution filters, as:
$$P_{1}=\frac{H_{1}^{*}(\beta_{1})}{|H_{1}(\beta_{1})|^2+\frac{\eta_1}{\eta_2}|H_{2}(\beta_{2})|^2+\frac{\eta_1}{\eta_f}},$$
$$P_{2}=\frac{H_{2}^{*}(\beta_{2})}{|H_{2}(\beta_{2})|^2+\frac{\eta_2}{\eta_1}|H_{1}(\beta_{1})|^2+\frac{\eta_2}{\eta_f}},$$
where $\eta _f$, $\eta _1$, and $\eta _2$ are the spectral densities of the original image and of the noise in the images captured with the masks of $\beta _1$ and $\beta _2$, respectively. The reconstructed image in the spatial domain is:
$$\hat{f}(x,y)=\mathcal{F}^{{-}1}\left\{\hat{F}(u,v)\right\}.$$

The examples shown above synthesize two images. In practice, the proposed synthesis method can reconstruct the image from a series of captured images using the LMSE estimation, and the SNR improves as more images are used.
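A sketch of this generalization (our own extension of Eqs. (28)–(31) to an arbitrary number of captures, which reduces to the filters of Eqs. (29)–(30) when two images are used) is:

```python
import numpy as np

def lmse_synthesis(G_list, H_list, eta_noise, eta_f):
    """LMSE synthesis of N captures G_k = H_k * F + N_k, generalizing
    Eqs. (28)-(31) from two images to a list of images.

    G_list    : list of frequency-domain captures
    H_list    : list of corresponding frequency responses (e.g. H_K of Eq. (26))
    eta_noise : list of noise spectral densities (scalars for white noise)
    eta_f     : spectral density of the original image (scalar)
    """
    F_hat = np.zeros_like(G_list[0], dtype=complex)
    for k, (Gk, Hk, nk) in enumerate(zip(G_list, H_list, eta_noise)):
        denom = np.abs(Hk) ** 2 + nk / eta_f
        for l, (Hl, nl) in enumerate(zip(H_list, eta_noise)):
            if l != k:
                denom = denom + (nk / nl) * np.abs(Hl) ** 2
        F_hat = F_hat + np.conj(Hk) / denom * Gk     # filters of Eqs. (29)-(30)
    return np.real(np.fft.ifft2(F_hat))              # Eq. (31)
```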

4. Simulation

In the simulation, we design a time-division multi-pattern FZA mask that can change its pattern case by case. The computational camera combines an axial stack of the FZA mask and a color image sensor. The sensor is assumed to be mounted on a linear translation stage so that the mask-sensor distance $d$ can be changed. A two-dimensional object is placed in front of the mask at a distance $t$, and the axis through their centers is perpendicular to the mask plane; the whole system is shown in Fig. 2. The pixel count of the sensor is assumed to be 1024 $\times$ 1024 with a pixel size of 5.5 µm, the size of an FZA is 5.63 mm $\times$ 5.63 mm, and the mask pattern is noise-free.

Fig. 2. Structure of the simulation system.

4.1 Simulation result from method A

Method A is an image synthesis method based on changing mask-sensor distance. In this simulation, we choose two different mask-sensor distances, which are $d_1$ = 5.0 mm and $d_2$ = 6.5 mm. Here we use four masks with the same $\beta$ = 25 ${\rm rad/mm^2}$ but different phases $(\varphi = 0, \frac {\pi }{2}, \pi,$ and $\frac {3\pi }{2})$ to apply the FS technique.

The USAF-1951 resolution chart (Fig. 3(a)) is employed as the 2D test target; the object-mask distance is $t$ = 250 mm and the object size is 180 mm $\times$ 180 mm. The sensing process is simulated based on wave-optics theory, i.e., the convolution of the input image with the incoherent PSF calculated by the wave-propagation simulation (equivalent to Eq. (17)) using Matlab, and additive white Gaussian noise is added so that the Signal-to-Noise Ratio (SNR) equals 30 dB.
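A minimal sketch of this capture step (our own illustration, using FFT-based circular convolution rather than the exact Matlab implementation) is:

```python
import numpy as np

def capture_with_noise(scene, psf, snr_db=30.0, rng=None):
    """Simulate a single capture as the 2D convolution of the scene with the
    incoherent PSF (circular convolution via FFT for simplicity), then add
    white Gaussian noise at the requested SNR."""
    rng = np.random.default_rng() if rng is None else rng
    g = np.real(np.fft.ifft2(np.fft.fft2(scene) * np.fft.fft2(psf)))
    signal_power = np.mean(g ** 2)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    return g + rng.normal(0.0, np.sqrt(noise_power), g.shape)
```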

Fig. 3. (a) Original image; Reconstruction image from (b) proposed method A; Reconstruction images from the conventional method at mask sensor distance (c) 5.0 mm and (d) 6.5 mm. The red box indicates the resolution limitation.

The reconstructed images are shown in Fig. 3, and the red box in each image indicates the visually assessed resolution limit. Here we use $\frac {\eta _1}{\eta _2}$ = 1 and $\frac {\eta _1}{\eta _f}$ = 0.001. Figures 3(c) and (d) are reconstructed by the conventional FS method, which does not consider diffraction, with mask-sensor distances of 5.0 mm and 6.5 mm, respectively. As shown in the images, the resolution limits of Fig. 3(c) and (d) are Group 3/Element 2 and Group 3/Element 3, indicating 4.6 mm and 3.2 mm at the object plane, respectively. After applying the proposed method, the resolution is improved to 1.6 mm, as shown in Fig. 3(b), where Group 4/Element 3 is clear. The numerical calculation shows a two-fold improvement in spatial resolution with the proposed method.

The Modulation Transfer Function (MTF) line profiles of the imaging systems mentioned above are shown in Fig. 4. The blue, orange, and green lines represent the conventional imaging system with mask-sensor distances of 5.0 mm and 6.5 mm and the synthesized system, respectively. The first zero point is considered the resolution limit. From Fig. 4, we can see that the resolution limits $u_{max}$ of the conventional methods are 10.8 cycles/mm and 12.4 cycles/mm on the image sensor plane at mask-sensor distances of 5.0 mm and 6.5 mm, respectively. Considering the magnification $M = t/d$, the resolution limits on the object plane are given by $M / u_{max}$, i.e., 4.63 mm and 3.10 mm, respectively. After applying the proposed method A, the resolution limit increases to about 22.9 cycles/mm, which corresponds to $M/22.9 = 1.68$ mm and shows the effectiveness of the method.
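The conversion from the first MTF zero on the sensor to the object-plane resolution is a one-line calculation; the short sketch below reproduces the values quoted above.

```python
def object_plane_resolution_mm(u_max_cycles_per_mm, t_mm, d_mm):
    """Object-plane resolution from the first MTF zero on the sensor,
    using the magnification M = t/d and L_obj = M / u_max."""
    return (t_mm / d_mm) / u_max_cycles_per_mm

print(object_plane_resolution_mm(10.8, 250.0, 5.0))   # ~4.6 mm, conventional, d = 5.0 mm
print(object_plane_resolution_mm(12.4, 250.0, 6.5))   # ~3.1 mm, conventional, d = 6.5 mm
print(object_plane_resolution_mm(22.9, 250.0, 6.5))   # ~1.7 mm, proposed method A
```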

Fig. 4. Line profile of the MTF of imaging system using conventional method at mask sensor distance 5.0 mm and 6.5 mm; and the MTF from proposed method A. The spatial frequency is measured on sensor plane up to its Nyquist limitation.

The color image "Parrot" (Fig. 5(a)) is also used to test the performance of the proposed method. The parameter settings are the same as in the former simulation, and white Gaussian noise is added to make the SNR of the captured image 30 dB. Figures 5(c) and (d) are from the conventional method at mask-sensor distances of 5.0 mm and 6.5 mm, respectively, and Fig. 5(b) is from method A. By visual evaluation of the zoomed areas shown in the images, we can observe a significant improvement in resolution with the proposed method compared with the conventional method, which confirms the effectiveness of synthesizing images captured at different mask-sensor distances. We also use the Peak Signal-to-Noise Ratio (PSNR) to evaluate the quality of the reconstructed images. The PSNRs of Fig. 5(c) and (d) are 20.3 dB and 20.4 dB, respectively, while the PSNR of the proposed method A is 23.4 dB, which confirms the performance of the proposed method.
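For reference, the PSNR used here is the standard definition; the sketch below assumes images normalized to a unit peak.

```python
import numpy as np

def psnr_db(reference, estimate, peak=1.0):
    """Peak Signal-to-Noise Ratio; images assumed normalized to [0, peak]."""
    mse = np.mean((np.asarray(reference, float) - np.asarray(estimate, float)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)
```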

Fig. 5. (a) Original image; Reconstruction image from (b) proposed method A; Reconstruction images from the conventional method at mask sensor distance (c) 5.0 mm and (d) 6.5 mm. The red box shows the zoom area.

4.2 Simulation result from method B

Method B uses four images captured with masks of different $\beta$ values for synthesis. In this simulation, the mask-sensor distance $d$ is fixed at 6.5 mm and the object-mask distance is still 250 mm. We use two pairs of masks; each pair shares the same $\beta$ with different $\varphi$ $(0$ and $\frac {3\pi }{2})$, and we choose $\beta _{1}$ = 25 ${\rm rad/mm^2}$ and $\beta _{2}$ = 21 ${\rm rad/mm^2}$.

The conventional geometrical-optics-based method, the same as in the previous section, is used for comparison. It uses four masks with the same parameter $\beta _{1}$ but different $\varphi$ $(0, \frac {1}{2}\pi, \pi, \frac {3}{2}\pi )$ for the FS process. It should therefore be noted that the captured images differ between the two methods, while their number is the same. The reconstructed images are shown in Fig. 6, and the red box in each image indicates the resolution limit. Here we use $\frac {\eta _1}{\eta _2}$ = 1 and $\frac {\eta _1}{\eta _f}$ = 0.0001.

Fig. 6. Reconstruction image using (a) conventional method and (b) proposed method B. The red box indicates resolution limitation.

Figure 6(a) is reconstructed by the conventional method and (b) by the proposed method B. As shown in the images, the resolution limit of Fig. 6(a) is Group 3/Element 3, which indicates 3.2 mm at the object plane. After applying the proposed method, the resolution is improved to 0.71 mm, as shown in Fig. 6(b), where Group 5/Element 2 is clear. The numerical simulation shows a significant improvement in spatial resolution with the proposed method.

The MTF line profiles of each imaging system mentioned above are shown in Fig. 7. The blue line shows the MTF of the imaging system with a single FZA ($\beta _1$, $0\pi$), the orange line represents the MTF of the synthesized FZA pair $H_1$ calculated by Eq. (26), and the green line shows the MTF of the proposed method B. The MTF of the conventional method is the same as the case of $d$ = 6.5 mm shown in Fig. 4. The resolution limit ($u_{max}$) of the conventional imaging system is 12.4 cycles/mm at the sensor plane. After applying the proposed method, the information lost at the zero points is compensated, and the resolution limit is extended to 57.5 cycles/mm on the sensor plane. The resolution limits on the object plane, obtained by $M / u_{max}$, are 3.13 mm and 0.67 mm, respectively, which shows the effectiveness of the method.

Fig. 7. Line profile of the MTF of imaging system using single FZA ($\beta _1$, $0\pi$); the MTF from synthesized FZA pair $H_1$; and from the proposed method B. The spatial frequency is measured on sensor plane up to its Nyquist limitation.

Moreover, the color Parrot image is also used as the test target. The 180 mm $\times$ 180 mm target image is placed in front of the mask at a distance of 250 mm, and the mask-sensor distance is 6.5 mm. During the imaging process, white Gaussian noise is added to the captured image to make the SNR equal to 30 dB.

The reconstructed color images are shown in Fig. 8. Figure 8(a) is reconstructed by the conventional method and (b) by the proposed method B. The PSNR of Fig. 8(a) is 20.4 dB; with the proposed method B, the PSNR increases to 23.8 dB, which confirms the contribution of the proposed method. We can also judge the performance from the details shown in the zoomed area: in Fig. 8(b), the patterns around the parrot’s eye can be observed more clearly than in Fig. 8(a), which means the proposed method also works well on color images.

Fig. 8. (a) Reconstruction image using conventional method; (b) Reconstruction image using proposed method B. The red box shows the zoom area of the according picture.

5. Experiment

5.1 Experiment result from method A

In [22], we performed a simple optical experiment using a spatial-division FZA lensless camera and a Siemens star chart shown on a Liquid Crystal Display (LCD) as the test target, and the result confirmed the effectiveness of the method. However, we did not check the performance of method A on natural objects. We also did not present refocused reconstructions, because the spatial-division FZA mask introduces image variations at short mask-object distances, which makes image registration with multiple objects in FS difficult. Post-capture refocusing is a feature of the lensless camera, whereas a lens-based camera needs a lens array or a special pupil mask, through the concept of the light field [26–29], for refocusing, which is complex in computation and structure.

In the optical experiment, a Holoeye LC 2012 Spatial Light Modulator (1024 pixels $\times$ 768 pixels, pixel pitch 36 µm) is used to display the FZA pattern (256 pixels $\times$ 256 pixels, $\beta = 12 {\rm rad/mm^2}$), as shown in Fig. 9(a). Two natural 3D objects are used to check the performance of the proposed method A; Fig. 9(b) shows object A and object B used in the experiment, and Fig. 9(c) shows the diagram of the experimental setup. The selected mask-sensor distance combination should place the first zero-crossing of the synthesized system at as high a frequency as possible in method A. Although the difference between the two distances should be small for this purpose, the S/N decreases when it is too small. Therefore, we experimentally selected the combination of distances based on the balance between the resolution-enhancement limit and the S/N of the final image. In addition, the minimum mask-sensor distance is 7.0 mm due to the physical constraints of the SLM structure and the polarizer. As a thinner camera is desirable, we choose distances $d$ of 7.0 mm and 7.5 mm based on these factors; $t_{1}$ is 130 mm and $t_{2}$ is 170 mm. The image sensor is a CMOSIS CMV4000 (2048 pixels $\times$ 2048 pixels with a pixel pitch of 5.5 µm). For comparison, we consider the Alternating Direction Method of Multipliers (ADMM) with TV regularization [10], which is a popular method for image reconstruction. In this experiment, ADMM-TV is used in the refocusing reconstruction, i.e., the PSFs for two different distances are used to estimate two images focused at different depths. Each ADMM reconstruction runs 500 iterations with tuning parameter $\tau$ = 0.5.

Fig. 9. (a) Spatial Light Modulator (SLM); (b) Object A and Object B; (c) Diagram of experimental setup.

The reconstruction results at focus distances of 130 mm and 170 mm are shown in Fig. 10. The red box indicates the in-focus object and the blue box indicates the out-of-focus object. We can observe a blurred pattern around the edges of the images from the conventional method, as in Fig. 10(a), (b), (c), and (d), which is caused by the mismatch between the geometrical-optics-based forward model and the diffraction propagation. After applying the proposed method A based on wave-optics theory, the blurred pattern is removed and the edges are sharper, as indicated in Fig. 10(e) and (g). However, we acknowledge that noise is somewhat more pronounced with the proposed method A because the mask-sensor distance must be changed manually, and this process introduces calibration artifacts in the image registration. By visual evaluation, the results from ADMM-TV have a quality similar to that of the proposed method; however, the iteration-based method takes much longer in the reconstruction than the proposed method and requires careful choice of tuning parameters.

Fig. 10. Left: Under focused distance 130 mm: (a) and (b) are the results from conventional method at mask sensor distance 7.0 mm and 7.5 mm, (e) and (f) are the results from proposed method A and ADMM-TV; Right: Under focused distance 170 mm: (c) and (d) are the results from conventional method at mask sensor distance 7.0 mm and 7.5 mm, (g) and (h) are the results from proposed method A and ADMM-TV; Red box shows the zoomed area from in-focus object and blue box shows the zoomed area from out-of-focus object.

In addition, by observing the differences between the in-focus and out-of-focus objects, shown in the red and blue boxes, the post-capture refocusing function of the FZA lensless camera is clearly confirmed: when the object is in focus, the reconstructed image is sharp, while it is blurred when out of focus. Although we can still observe a difference between the ADMM-TV images at different focus distances, the blurred out-of-focus image is partially restored because of the TV regularization [30]. This result means the proposed method preserves the fidelity of the recovered images at the chosen focus distance.

For a quantitative analysis of method A in the SLM-based imaging system, we experimented on the 1951 USAF resolution chart (Fig. 11(a)). The same SLM shown in Fig. 9(a) is used to display the FZA patterns ($\beta = 12 {\rm rad/mm^2}$). The experimental setup is shown in Fig. 11(e), where the mask-sensor distances are the same as in the former experiment: $d_{1}$ = 7.0 mm and $d_{2}$ = 7.5 mm. The object-mask distance is set to $t$ = 250 mm. The resolution chart is shown on an LCD monitor (ASUS VA32AQ, pitch size 0.27 mm) at a size of 222 mm $\times$ 227 mm. The reconstruction results are shown in Fig. 11, where (b) is from the proposed method A, and (c) and (d) are from the conventional method with mask-sensor distances of 7.0 mm and 7.5 mm, respectively. The red box indicates the visually assessed resolution limit, which is 3.0 mm and 3.5 mm on the object plane for (c) and (d), respectively. Method A successfully increases the resolution to 1.4 mm on the object plane, as shown in (b). These results validate the theory in real situations.

Fig. 11. (a) Resolution chart; the first row in (b), (c) and (d) is the reconstruction images from the proposed method, the conventional method at mask sensor distance 7.0 mm and 7.5 mm, respectively. Red box indicates the visually assessed resolution limitation, the zoomed area of the red box and the according line profile are in the third and second row of (b), (c) and (d), respectively. The line profiles are drawn after the min-max normalization of the image intensity. (e) is the Experiment platform.

5.2 Experiment result from method B

We also performed experiments with method B to check its effectiveness in practice. First, we use the 1951 USAF resolution chart (Fig. 12(a)), shown on the LCD monitor at a size of 222 mm $\times$ 227 mm as in the former experiment, to quantitatively evaluate the performance. The selected $\beta$ combination should place the first zero-crossing of the synthesized frequency response at as high a frequency as possible in method B. Although the difference between the two $\beta$ values should be small for this purpose, the S/N decreases when it is too small. As in the case of method A, we need to select the combination of $\beta$ based on the balance between the resolution-enhancement limit and the S/N of the final image. In addition, the pixel width of the SLM and the size of the FZA pattern limit the range of $\beta$; therefore, two pairs of masks with $\beta = 12 {\rm rad/mm^2}$ and $\beta = 9 {\rm rad/mm^2}$ are chosen, and the phases of the masks in each pair are 0 and $\frac {3}{2}\pi$. The conventional method for comparison uses four masks of $\beta = 12 {\rm rad/mm^2}$ with different phases and four masks of $\beta = 9 {\rm rad/mm^2}$ with different phases for FS. The experimental settings are the same as in Fig. 11(e): the mask-sensor distance $d$ is 7.5 mm and the object-mask distance $t$ is 250 mm. The experimental results are shown in Fig. 12; (c) and (d) are reconstructed by the conventional method with $\beta$ = 9 and 12, respectively, and Fig. 12(b) is the result of the proposed method B. The red box indicates the visually assessed resolution limit, which is 3.0 mm and 3.5 mm on the object plane for (c) and (d), respectively. With the proposed method, the resolution limit is improved to 1.6 mm on the object plane, as shown in (b). This result confirms the effectiveness of method B.

Fig. 12. (a) Resolution chart; Reconstruction images from (b) proposed method; conventional method using (c) $\beta$=9 and (d) $\beta$=12. Red box indicates the visually assessed resolution limitation. The zoomed area of the red box, and the according line profile are in the third and second rows of (b), (c), and (d), respectively. The line profiles are drawn after the min-max normalization of the image intensity.

We also apply method B to the refocusing reconstruction experiment with natural objects. We use the same object settings shown in Fig. 9 to capture the images, but with only a single mask-sensor distance of $d$ = 7.5 mm. Two pairs of masks with $\beta = 12 {\rm rad/mm^2}$ and $\beta = 9 {\rm rad/mm^2}$ are chosen, and the phases of the masks in each pair are 0 and $\frac {3}{2}\pi$, as before. We reconstruct the images by the conventional method and by method B; the results are shown in Fig. 13.

Fig. 13. Reconstruction images from conventional method at focus distance 130 mm using (a) $\beta$=9; (b) $\beta$=12; at focus distance 170 mm using (d) $\beta$=9; (e) $\beta$=12; images are reconstructed from proposed method B at distance (c) 130 mm and (f) 170 mm. Red box shows the zoomed area from in-focus object and blue box shows the zoomed area from out-of-focus object.

Figures 13(a) and (d) are images reconstructed by the conventional method using the FZA with $\beta$ = 9 at focus distances of 130 mm and 170 mm, respectively. Figures 13(b) and (e) are magnified versions of Fig. 10(b) and (d), and Fig. 13(c) and (f) are images reconstructed by method B at focus distances of 130 mm and 170 mm, respectively. The red boxes show the zoomed area of the in-focus object. By comparison, a significant improvement can be observed: the blurred patterns become sharper with method B. In addition, compared with the results of method A, method B introduces no extra alignment error and avoids the associated artifacts, which is its advantage. The blue boxes show the zoomed areas of the out-of-focus objects; we can easily observe the differences between in-focus and out-of-focus objects when changing the focus distance, which confirms the refocusing reconstruction ability of the proposed method B.

6. Conclusion

This work proposes an LMSE-based image synthesis approach for super-resolution in the FZA lensless camera, which suffers from diffraction effects. Two different realization methods for the image synthesis are presented. The first method synthesizes images captured at different mask-sensor distances. The second method simplifies the realization by changing only the mask pattern, with no other actions, and requires only four images. In the future, we will study the optimal combinations of mask-sensor distances for method A and of mask patterns for method B to make the proposed system more robust. Numerical simulations and optical experiments both confirm the effectiveness of the proposed methods. Numerical analysis shows that the proposed methods can increase the resolution to nearly twice that of the former method based on the geometrical-optics model. To check the practicality of the proposed methods, we performed optical experiments on a 2D resolution chart and on 3D objects, and the results all show the resolution improvement. The introduced methods also work well in refocusing reconstruction, a promising feature of the lensless camera. The proposed approach is also applicable to image synthesis from multiple cameras with fixed mask patterns instead of an SLM-based mask; this requires a synthesis technique that accommodates the differences in viewpoint positions.

Funding

Japan Science and Technology Agency (JST SPRING, Grant Number JPMJSP2106); Tokyo Institute of Technology (TAC-MI scholarship).

Acknowledgments

Portions of this work were presented at the IEEE International Conference on Image Processing in 2021, Paper ID 1231.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. G. Stork and P. R. Gill, “Optical, mathematical, and computational foundations of lensless ultra-miniature diffractive imagers and sensors,” International Journal Advances in Systems and Measurements 7, 201–208 (2014).

2. M. J. DeWeert and B. P. Farm, “Lensless coded-aperture imaging with separable doubly-toeplitz masks,” Opt. Eng. 54(2), 023102 (2015). [CrossRef]  

3. M. S. Asif, A. Ayremlou, A. Sankaranarayanan, A. Veeraraghavan, and R. G. Baraniuk, “Flatcam: Thin, lensless cameras using coded aperture and computation,” IEEE Trans. Comput. Imaging 3(3), 384–397 (2017). [CrossRef]  

4. V. Boominathan, J. K. Adams, M. S. Asif, B. W. Avants, J. T. Robinson, R. G. Baraniuk, A. C. Sankaranarayanan, and A. Veeraraghavan, “Lensless imaging: A computational renaissance,” IEEE Signal Process. Mag. 33(5), 23–35 (2016). [CrossRef]  

5. X. Pan, T. Nakamura, X. Chen, and M. Yamaguchi, “Lensless inference camera: incoherent object recognition through a thin mask with lbp map generation,” Opt. Express 29(7), 9758–9771 (2021). [CrossRef]  

6. X. Pan, X. Chen, T. Nakamura, and M. Yamaguchi, “Incoherent reconstruction-free object recognition with mask-based lensless optics and the transformer,” Opt. Express 29(23), 37962–37978 (2021). [CrossRef]  

7. X. Pan, X. Chen, S. Takeyama, and M. Yamaguchi, “Image reconstruction with transformer for mask-based lensless imaging,” Opt. Lett. 47(7), 1843–1846 (2022). [CrossRef]  

8. N. Antipa, P. Oare, E. Bostan, R. Ng, and L. Waller, “Video from stills: Lensless imaging with rolling shutter,” in IEEE International Conference on Computational Photography (ICCP), (IEEE, 2019), pp. 1–8.

9. T. Nakamura, K. Kagawa, S. Torashima, and M. Yamaguchi, “Super field-of-view lensless camera by coded image sensors,” Sensors 19(6), 1329 (2019). [CrossRef]  

10. K. Monakhova, J. Yurtsever, G. Kuo, N. Antipa, K. Yanny, and L. Waller, “Learned reconstructions for practical mask-based lensless imaging,” Opt. Express 27(20), 28075–28090 (2019). [CrossRef]  

11. J. Wu, H. Zhang, W. Zhang, G. Jin, L. Cao, and G. Barbastathis, “Single-shot lensless imaging with fresnel zone aperture and incoherent illumination,” Light: Sci. Appl. 9(1), 53 (2020). [CrossRef]  

12. J. Wu, L. Cao, and G. Barbastathis, “Dnn-fza camera: a deep learning approach toward broadband fza lensless imaging,” Opt. Lett. 46(1), 130–133 (2021). [CrossRef]  

13. M. S. Asif, “Lensless 3d imaging using mask-based cameras,” in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), (IEEE, 2018), pp. 6498–6502.

14. N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “Diffusercam: lensless single-exposure 3d imaging,” Optica 5(1), 1–9 (2018). [CrossRef]  

15. K. Tajima, T. Shimano, Y. Nakamura, M. Sao, and T. Hoshizawa, “Lensless light-field imaging with multi-phased fresnel zone aperture,” in IEEE International Conference on Computational Photography (ICCP), (2017), pp. 76–82.

16. V. Boominathan, J. K. Adams, J. T. Robinson, and A. Veeraraghavan, “Phlatcam: Designed phase-mask based thin lensless camera,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1618–1629 (2020). [CrossRef]  

17. T. Shimano, Y. Nakamura, K. Tajima, M. Sao, and T. Hoshizawa, “Lensless light-field imaging with fresnel zone aperture: quasi-coherent coding,” Appl. Opt. 57(11), 2841–2850 (2018). [CrossRef]  

18. Y. Hua, S. Nakamura, M. S. Asif, and A. C. Sankaranarayanan, “Sweepcam — depth-aware lensless imaging using programmable masks,” IEEE Trans. Pattern Anal. Mach. Intell. 42(7), 1606–1617 (2020). [CrossRef]  

19. T. Nakamura, T. Watanabe, S. Igarashi, X. Chen, K. Tajima, K. Yamaguchi, T. Shimano, and M. Yamaguchi, “Superresolved image reconstruction in fza lensless camera by color-channel synthesis,” Opt. Express 28(26), 39137–39155 (2020). [CrossRef]  

20. I. Yamaguchi and T. Zhang, “Phase-shifting digital holography,” Opt. Lett. 22(16), 1268–1270 (1997). [CrossRef]  

21. D. Malacara, Optical Shop Testing (Wiley, 1978).

22. X. Chen, T. Nakamura, X. Pan, K. Tajima, K. Yamaguchi, T. Shimano, and M. Yamaguchi, “Resolution improvement in fza lens-less camera by synthesizing images captured with different mask-sensor distances,” in 2021 IEEE International Conference on Image Processing (ICIP), (2021), pp. 2808–2812.

23. M. Yachida, N. Ohyama, and T. Honda, “Image restoration using synthetic image processing method,” Opt. Commun. 74(1-2), 5–9 (1989). [CrossRef]  

24. S. L. Suryani, M. Yamaguchi, N. Ohyama, T. Honda, and K. Tanaka, “Estimation of an image sampled by a ccd sensor array using a color synthetic method,” Opt. Commun. 84(3-4), 133–138 (1991). [CrossRef]  

25. K. Tajima, Y. Nakamura, K. Yamaguchi, and T. Shimano, “Improving resolution of lensless imaging with higher harmonics of fresnel zone aperture,” Opt. Rev. 29(2), 153–158 (2022). [CrossRef]  

26. A. Gershun, “The light field,” J. Math. Phys. 18(1-4), 51–151 (1939). [CrossRef]  

27. R. Ng, M. Levoy, M. Brédif, G. Duval, M. Horowitz, and P. Hanrahan, “Light Field Photography with a Hand-held Plenoptic Camera,” Research Report CSTR 2005-02, Stanford University (2005).

28. M. DaneshPanah, B. Javidi, and E. A. Watson, “Three dimensional imaging with randomly distributed sensors,” Opt. Express 16(9), 6368–6377 (2008). [CrossRef]  

29. X. Pan and S. Komatsu, “Light field reconstruction with randomly shot photographs,” Appl. Opt. 58(23), 6414–6418 (2019). [CrossRef]  

30. W. Zhang, L. Cao, D. J. Brady, H. Zhang, J. Cang, H. Zhang, and G. Jin, “Twin-image-free holography: A compressive sensing approach,” Phys. Rev. Lett. 121(9), 093902 (2018). [CrossRef]  



