Deep learning-assisted light sheet holography

Open Access

Abstract

In a novel approach to layer-based holography, we propose machine learning-assisted light sheet holography, an optimized holography technique that projects a target scene onto sheets of light along longitudinal planes (i.e., planes perpendicular to the plane of the hologram). Using a convolutional neural network in conjunction with a superposition of Bessel beams, we generate high-definition images that can be stacked in parallel onto longitudinal planes with very high fidelity. Our holography system provides high axial resolution and excellent control over the light intensity along the optical path, making it suitable for augmented reality and/or virtual reality applications.

© 2024 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

The goal of computer-generated holography is to produce high-resolution, three-dimensional (3D) holographic images that depict a 3D object or scene through the modulation of a coherent laser source. This process employs computational algorithms to design computer-generated holograms (CGHs), enabling precise control over the phase and amplitude of the light wavefront in real time. This technology, with its potential for real-time reconstruction and manipulation of images, aims to revolutionize fields such as microscopy, display technology, medical imaging, and augmented/virtual reality.

Layer-based holography, a widely used computational approach, conventionally solves holography problems by slicing the 3D scene or object into multiple transverse planes parallel to the holographic display, each representing a specific depth within the scene [1]. The hologram is designed such that the sliced two-dimensional (2D) images are reconstructed at their respective depths; consequently, when these images are viewed collectively, they form a composite that appears as a 3D holographic image.

One significant drawback of this method, which we refer to as transverse layer-based holography, is the occurrence of ghosting and crosstalk between adjacent planes. For instance, an increase in the number of planes inadvertently worsens the crosstalk [2]. To maintain a consistent level of crosstalk across the series of planes, the planes must be spaced further apart as depth increases [3]. Furthermore, given that the number of planes is highly constrained by computational limitations, creating high-definition images along the longitudinal direction remains a challenge beyond the capacity of this approach.

To overcome these challenges, we introduce a neural network-based holography system, specifically devised for generating high-resolution images on longitudinal planes and ideal for applications requiring precise axial control over light intensity. Our longitudinally optimized holography system presents a novel approach to layer-based holography by discretizing the scene into multiple longitudinal planes. Contrary to transverse planes, these longitudinal planes can be uniformly positioned at minimal distances from one another while reducing crosstalk. This strategy provides an enhanced depth of field along with good axial resolution. Holograms of this kind offer potential applications in augmented reality (AR), serving as a novel tool for generating depth cues and complex scenery. For instance, these holograms can be used to produce intricate features on images of objects and surfaces spanning substantial depths, thereby enhancing the visual richness and detail of the AR experience. Moreover, in the field of light sheet microscopy, these longitudinal images can serve as high-resolution custom illumination, a technique that has been demonstrated to improve image resolution and mitigate associated artifacts, thereby enhancing overall image clarity [4,5]. To achieve this, we aim to design CGHs in real time using neural networks in conjunction with a non-iterative semi-analytical method (described later). These CGHs are designed by finding the best possible boundary conditions for the light that most accurately generate a desired optical field or image.

The above problem is in general a multivariate, non-convex, non-linear, inverse problem, which makes it highly challenging to solve using conventional optimization techniques. Although propagation of light in free space is a linear transformation, the optimization approach used to solve the inverse problem is non-linear because: a) often the desired (and measurable) physical quantity of interest is the light intensity; and b) often a phase-only spatial light modulator (SLM) is deployed as the holographic display to implement the required boundary conditions used to generate the desired light field. As such, the experimenter must rely on phase-only modulation techniques.

In the past, researchers have employed a variety of approaches to address the aforementioned inverse problem. One of the most commonly used iterative methods is the Gerchberg-Saxton (GS) technique [6]. In this method, the optical field is propagated back and forth between the boundary and the image planes, where some form of amplitude constraint is applied at each end. This method is most efficient when the propagation of light between the boundary and image planes is an invertible unitary transformation that conserves the total energy. Unfortunately, due to the lack of such a transformation, the GS method cannot be used in longitudinal holography. Another set of techniques used to solve the general holography problem are the gradient-based iterative methods, which also rely on optimization techniques to solve the inverse problem [7,8]. In these approaches, a system function that simulates the intensity profile in space from a given boundary condition is built. The optimal boundary condition is then calculated by minimizing a loss function that evaluates the error between the reconstructed and desired intensity profiles over hundreds of iterations. These methods offer the most accurate results at the cost of slow speed and increased computational complexity at run time. Moreover, although gradient-based methods work well for transverse holography (as long as the number of transverse planes is limited), in the case of 3D and longitudinal holography, where the number of sampling points in the longitudinal direction is very large, gradient-based methods suffer from long computation times and high memory overhead, which makes them non-ideal for real-time applications.

There are also semi-analytical, non-iterative methods such as the point cloud method [9], the look-up table method [10], and the wavefront recording plane method [11], in which the desired optical field is meshed into small elements and each element is simply propagated back onto the boundary plane. The superposition of these fields yields the final CGH. Although these approaches can be fast and extended to real-time applications, they generally suffer from low fidelity and accuracy. More recently, a superposition of Bessel beams [12,13] was used to generate images on the longitudinal plane by incorporating an ensemble of one-dimensional longitudinally structured light, also known as Frozen Waves (FWs) [14]. However, the non-orthogonality of the basis used, and the lack of phase optimization, limited the fidelity and accuracy of the images generated using this novel light sheet holography technique.

In recent years, deep neural networks have been used in combination with semi-analytical non-iterative techniques to synthesize high-definition holographic images while maintaining good computational speed [15–17]. Deep neural networks are customizable and trainable non-linear functions that can enhance the underlying analytical and/or semi-analytical methods by performing phase optimization, while at the same time modeling and compensating for inherent imperfections and aberrations in the optical system.

In this paper, we present several convolutional neural networks developed to work in conjunction with the two-dimensional Frozen Wave method (2D-FW) [12]. Our goal is to create CGHs capable of generating high quality images on single and multiple longitudinal planes. To train our neural networks, we simulate the light intensity profiles from suggested CGHs by our hybrid system and compare them with the desired intensity profiles in order to evaluate the performance of the neural networks. The network’s parameters are optimized by minimizing a loss function that evaluates the error between the desired and simulated images.

This paper is organized as follows: in section 2, we start by reviewing light sheet holography based on 2D-FW method, which we use as the underlying semi-analytical algorithm to generate the CGHs. In this section we also introduce our hybrid system in which convolutional neural networks and the 2D-FW method work together to generate optimal CGHs. In section 3, we demonstrate the capability of our hybrid systems in synthesizing a high quality image on a single longitudinal plane through simulation and experimental results for three different setups. We compare the performance of our hybrid system against two candidate gradient-based methods. Additionally, we train a neural network to simultaneously generate three images on three longitudinal planes showing that our approach can be extended to multi-plane systems and 3D images. Finally, section 4 contains our conclusions and final remarks.

2. Methodology

2.1 Frozen wave expansion

Frozen waves are a class of non-diffracting and self-healing beams that are produced by the superposition of coaxial Bessel beams with different orders and cone angles [14]. The scalar field for FWs of length $L$ is formed from the following superposition:

$$\Psi(\rho,\phi,z,t)=\sum_{\nu}{\sum_{j={-}N}^{N}{e^{{-}i\omega t} C^{\nu j} J_\nu (k_\rho^j \rho)e^{ik_z^j z}e^{i\nu \phi}}},$$
where $\nu$ is the Bessel order, $k_z^j=Q+2\pi j/L$ and $(k_\rho ^j)^2=\omega ^2/c^2-(k_z^j)^2$ are the longitudinal and transverse components of the wave vector, and $Q$ is a design parameter (it represents the center spatial frequency). The total number of modes, $N_b=2N+1$, is determined by the inequality $(k_\rho ^j)^2>0$. For each Bessel order ($\nu$), the brightness of the beam can be modulated to follow a morphological function, $F^\nu (z)$. For this, the expansion coefficients, $C^{\nu j}$, in Eq. (1) are obtained by a partial Fourier transform of the morphological function:
$$C^{\nu j}=\frac{1}{L} \int_0^L{F^\nu(z)e^{{-}i2\pi j z/L}dz}$$
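For concreteness, the partial Fourier transform in Eq. (2) can be evaluated numerically. The following is a minimal Python/NumPy sketch of this step; the function name, the uniform sampling of $z$, and the simple Riemann-sum quadrature are our own assumptions and not part of the original method.

# Minimal sketch of Eq. (2), assuming NumPy: the FW coefficients are the partial
# Fourier coefficients of the morphological function F(z) over [0, L].
import numpy as np

def fw_coefficients(F: np.ndarray, z: np.ndarray, L: float, N: int) -> np.ndarray:
    """Return C[j] = (1/L) * int_0^L F(z) exp(-i 2*pi*j*z/L) dz for j = -N..N."""
    js = np.arange(-N, N + 1)
    kernels = np.exp(-1j * 2.0 * np.pi * np.outer(js, z) / L)   # shape (2N+1, len(z))
    dz = z[1] - z[0]                                            # uniform sampling assumed
    return (F[None, :] * kernels).sum(axis=1) * dz / L          # one coefficient per mode j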

The FW expansion of zeroth order Bessel beams can be approximately extended to two dimensional space to be used for synthesizing two-dimensional intensity profiles on longitudinal planes [12,13]. To achieve this, the desired 2D intensity profile, $\hat {I}(x,z)$, spanning from $(0,0)$ to $(L_x,L)$ in the $x-z$ plane, is discretized into parallel lines along the $z$-direction (see Fig. 1 for the coordinate system used in this paper) and the intensity profile along each line is produced by a zeroth order FW. Considering only $C^{0j}$ in the FW expansion, we define $C^{0j}(x=x_s)$ as the expansion coefficients for the line sampled at $x_s = (s-1)L_x/N_x$, where $s$ is the sampling index ranging from $1$ to $N_x$. The complex field of the 2D-FW can be expressed as:

$$\Psi(x,y,z)=\sum_{s=1}^{N_x}{\sum_{j={-}N}^{N}{C^{0j}(x=x_s)J_0(k_\rho^j\sqrt{y^2+(x-x_s)^2})e^{ik_z^jz}}}$$


Fig. 1. Configurations of the holographic systems (for our various experiments). The holographic images are synthesized on: a) $x-z$ plane at $y = 0$. b) $x-z$ plane at $y=1.6$mm. c) Three $x-z$ planes separated by $1.2$mm distance, oriented perpendicular to the boundary region ($x-y$ plane at $z = 0$).


The coefficients $C^{0j}(x=x_s)$ can be calculated from Eq. (2) by using $F^0(z)=\sqrt {\hat {I}(x=x_s,z)}$ as the morphological function for each $x_s$. We uniformly sample the desired intensity profile along the $x$- and $z$-directions, with $N_x$ and $N_z$ designating the number of sampling points along the $x$- and $z$-directions, respectively.
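As an illustration of this sampling step, the sketch below (a hedged NumPy example; the array shapes and function name are our own assumptions) computes the per-line coefficients from a target intensity image, using the square root of each sampled line as its morphological function.

# Minimal sketch: per-line 2D-FW coefficients, assuming NumPy. Each sampled line x_s
# uses F^0(z) = sqrt(I_hat(x_s, z)) as its morphological function in Eq. (2).
import numpy as np

def fw_coefficients_2d(I_hat: np.ndarray, z: np.ndarray, L: float, N: int) -> np.ndarray:
    """I_hat: target intensity sampled on an (N_x, N_z) grid; returns C[s, j], j = -N..N."""
    F = np.sqrt(I_hat)                                          # morphological function per line
    js = np.arange(-N, N + 1)
    kernels = np.exp(-1j * 2.0 * np.pi * np.outer(js, z) / L)   # (2N+1, N_z)
    dz = z[1] - z[0]                                            # uniform sampling assumed
    return np.einsum('sz,jz->sj', F, kernels) * dz / L          # (N_x, 2N+1)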

Furthermore, multiple such planes can be stacked along the $y$-direction to produce 3D images. However, there will be a certain level of crosstalk between the images, which must be minimized to ensure the fidelity of the reconstructed images. One source of crosstalk is the periodic nature of the FW expansion, which creates copies of the optical field along the propagation direction. These copies can interfere with images on adjacent planes, causing severe crosstalk between the planes. To mitigate this, the images must be padded with zeros along the $z$-direction to remove the copies from the working area in front of the boundary region. The adequate padding length is determined by $Q$, the aperture size, and the image length [13].

In using the above method, it should be noted that the spatial overlap and interference between adjacent FWs introduce an intrinsic error and undesirable artifacts in the generated holographic images. Moreover, the phase of the morphological function $F^0(z)$ in Eq. (2) is a free variable that can be optimized to further enhance the image quality. We can transform the desired amplitude profile to produce a modified $F^0(z)$ with optimal amplitude and phase that mitigates the interference between the FWs. We perform this task by employing a convolutional neural network trained to calculate the optimized $F^0(z)$, which is then fed to the 2D-FW expansion. In the following sections, we demonstrate the design, training, and utilization of such networks to generate high quality holographic images.

2.2 Hybrid holography system

In our proposed digital holography system, the boundary region is positioned at the $x-y$ plane at $z=0$ and the holographic images are synthesized on the $x-z$ planes as shown in Fig. 1. Our convolutional neural network (CNN) receives an $N_x$ by $N_z$ pixel image with $P$ channels, where $P$ is the number of $x-z$ planes and the $k$-th channel corresponds to the target amplitude profile, $A_k(x,z)$, on the $k$-th longitudinal plane located at $y_k$. The network then returns an optimal complex-valued tensor $E(x,z,k)$ with the same dimensions as the input, where the output corresponds to a feasible complex-valued optical field on the image planes. We apply the 2D-FW algorithm to this complex field to obtain the coefficients $C^{0j}(x=x_s)$ in the 2D-FW expansion defined in Eq. (3) for each plane. The complex field $\Psi (x,y,z)$ is computed through the superposition of all Bessel beams as given by the following equation, which constitutes our analytical simulation method:

$$\Psi(x,y,z)=\sum_{k=1}^P{\sum_{s=1}^{N_x}{\sum_{j={-}N}^{N}{C^{0j}(x=x_s)J_0(k_\rho^j\sqrt{(y-y_k)^2+(x-x_s)^2})e^{ik_z^jz}}}}.$$

Subsequently, the complex boundary condition $E_b(x,y)$ is derived from the complex field evaluated at $z=0$, given by:

$$E_b(x,y)=\sum_{k=1}^P{\sum_{s=1}^{N_x}{\sum_{j={-}N}^{N}{C^{0j}(x=x_s)J_0(k_\rho^j\sqrt{(y-y_k)^2+(x-x_s)^2})}}}$$

We next convert the complex field at the boundary to a phase profile using an analytical phase modulation method [18], which is then displayed on a phase-only SLM.
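To illustrate the general idea of complex-to-phase conversion, here is a minimal NumPy sketch of one generic encoding scheme (double-phase, or checkerboard, encoding). This is not the specific pixelated method of Ref. [18] used in this work; the function name and normalization are our own assumptions.

# Minimal sketch of ONE generic complex-to-phase encoding (double-phase / checkerboard
# interleaving), assuming NumPy. It only illustrates how a normalized complex field can
# be mapped to a phase-only mask; the paper uses the pixelated method of Ref. [18].
import numpy as np

def double_phase_mask(E_b: np.ndarray) -> np.ndarray:
    """Encode A*exp(i*phi), with A normalized to [0, 1], as two interleaved phase values."""
    A = np.abs(E_b) / np.abs(E_b).max()        # normalized amplitude
    phi = np.angle(E_b)
    theta1 = phi + np.arccos(A)                # A e^{i phi} = (e^{i theta1} + e^{i theta2}) / 2
    theta2 = phi - np.arccos(A)
    checker = (np.indices(E_b.shape).sum(axis=0) % 2).astype(bool)
    return np.where(checker, theta1, theta2)   # phase mask for a phase-only SLM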

Since the 2D-FW method provides us with a good initial guess, the optimal complex field on the image planes is expected to retain spatial similarity to the target amplitude profile. This spatial correspondence allows us to effectively employ a U-net architecture [19], which leverages skip connections to preserve spatial information throughout the network. Importantly, this spatial relationship not only simplifies the functionality of the network, but also enhances both the generalizability and robustness of our system [20]. Our networks use a U-net architecture with four down-sampling and four up-sampling blocks, as shown in Fig. 2(a). The down-sampling blocks consist of two convolutional layers, each followed by a batch normalization layer and a leaky ReLU activation function (negative slope of 0.2). The up-sampling blocks are implemented using transposed convolutional layers followed by a convolutional layer, a batch normalization layer, and a ReLU activation function. The output convolutional layer produces two images for each $x-z$ plane, corresponding to the real and imaginary parts of the complex-valued field that the network deems optimal, as shown in Fig. 2(b).
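A minimal PyTorch sketch of such a U-net is given below. Only the block structure described above (four down-sampling and four up-sampling blocks, batch normalization, leaky ReLU with slope 0.2, ReLU in the up-sampling path, and a linear output layer producing real and imaginary parts) follows the text; the channel widths, kernel sizes, and class names are illustrative assumptions.

# Minimal sketch of the described U-net, assuming PyTorch; channel widths and kernel
# sizes are illustrative assumptions.
import torch
import torch.nn as nn

class DownBlock(nn.Module):
    """Two conv layers, each followed by batch normalization and LeakyReLU(0.2)."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(c_in, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.LeakyReLU(0.2),
            nn.Conv2d(c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.LeakyReLU(0.2))
        self.pool = nn.MaxPool2d(2)

    def forward(self, x):
        f = self.body(x)
        return f, self.pool(f)                 # skip-connection features and down-sampled output

class UpBlock(nn.Module):
    """Transposed-conv up-sampling followed by a conv layer with batch norm and ReLU."""
    def __init__(self, c_in, c_out):
        super().__init__()
        self.up = nn.ConvTranspose2d(c_in, c_out, 2, stride=2)
        self.body = nn.Sequential(
            nn.Conv2d(2 * c_out, c_out, 3, padding=1), nn.BatchNorm2d(c_out), nn.ReLU())

    def forward(self, x, skip):
        return self.body(torch.cat([self.up(x), skip], dim=1))

class ComplexFieldUNet(nn.Module):
    """Maps P target-amplitude channels to a complex field with P channels (real + imaginary)."""
    def __init__(self, planes=1, base=32):
        super().__init__()
        widths = [base, 2 * base, 4 * base, 8 * base]
        self.downs = nn.ModuleList()
        c = planes
        for w in widths:
            self.downs.append(DownBlock(c, w)); c = w
        self.bottleneck = nn.Sequential(
            nn.Conv2d(c, 2 * c, 3, padding=1), nn.BatchNorm2d(2 * c), nn.ReLU())
        self.ups = nn.ModuleList()
        c = 2 * widths[-1]
        for w in reversed(widths):
            self.ups.append(UpBlock(c, w)); c = w
        self.head = nn.Conv2d(c, 2 * planes, 1)   # linear output: real and imaginary parts

    def forward(self, a):                          # a: (batch, P, N_x, N_z)
        skips, x = [], a
        for d in self.downs:
            f, x = d(x); skips.append(f)
        x = self.bottleneck(x)
        for u, s in zip(self.ups, reversed(skips)):
            x = u(x, s)
        re, im = self.head(x).chunk(2, dim=1)
        return torch.complex(re, im)               # complex field on the image planes

In the single-plane setup of section 3.1, for example, such a network built with planes=1 would map a $960\times 512$ target amplitude to a complex-valued tensor of the same size.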


Fig. 2. Deep neural network-assisted light sheet holography algorithm. a) The U-net architecture. Each convolutional layer in the down-sampling blocks (DS1-4) is followed by a batch normalization layer and a Leaky ReLU activation function. The up-sampling blocks (US1-4) are implemented by transposed convolutional layers followed by a convolutional layer with a ReLU activation function. The output convolutional layer uses a linear function to produce the real and imaginary parts of the optimal complex field. b) The complex field generator is a U-net, which generates a complex valued tensor with the same dimensions as the target amplitude profile. The number of channels, P, is the number of $x-z$ planes in the system. c) The hybrid holography system calculates the optimal FW’s expansion coefficients and the transverse boundary condition. In the training routine, the network’s parameters are optimized such that the simulated amplitude profiles on the longitudinal planes match the target profiles. d) Training datasets: the DIV2K dataset (dataset 1), featuring natural images, and a synthetic dataset composed of images with randomly placed circles (dataset 2).


2.3 Training process

We train our networks in an unsupervised manner, demonstrated in Fig. 2(c), by simulating the amplitude profiles on all target $x-z$ planes from the predicted coefficients and comparing them with the target amplitude profiles. The networks’ parameters are optimized by minimizing a loss function that evaluates the errors between the reconstructed and target images over many samples in the training dataset via a stochastic gradient descent algorithm. The simulation process must be executed rapidly to suit the training process. For example, one could first calculate the complex boundary condition $E_b$ via Eq. (5) and propagate the light in space using the band-limited Angular Spectrum Method (ASM) [21] to reconstruct the image. Despite being accurate, this approach requires a few seconds to perform the task and is therefore too slow to meet the training speed requirements.

To overcome this speed limitation, we have devised two methods, one analytical and the other numerical, which approximately simulate the longitudinal profile during the training session. In both methods, we skip the step of calculating the complex boundary and instead simulate and store the complex field on all existing longitudinal planes, for every mode in the 2D-FW expansion. In the expansion, we are simulating a total of $N_b\times N_x \times P$ Bessel modes, where the simulated complex-valued field for each mode is represented by a tensor of dimensions $(N_x,N_z,P)$. Therefore, the total data required to store these fields is $N_b\times N_x^2\times N_z\times P^2$, which grows quadratically with respect to the number of planes and the number of sampling points along the $x$-direction. For example, for our main experiment in section 3.1, where we generate an image on a single longitudinal plane, this amounts to 125GB of data, which does not fit in current GPU memory. We deploy two computational strategies to reduce the memory requirements for each type of simulation. For the analytical method, we use Eq. (4) to evaluate the complex field on longitudinal planes at the desired $y$-positions. To reduce the memory requirement, we take advantage of the fact that for ideal zeroth order Bessel beams, the magnitude is constant along the $z$-direction and the phase is constant along the $x$-direction. Therefore, we can change the order of summations in Eq. (3) and rewrite the expansion as follows:

$$\Psi(x,y,z)=\sum_{j={-}N}^{N}{e^{ik_z^jz}\sum_{s=1}^{N_x}{C^{0j}(x=x_s) J_0(k_\rho^j\sqrt{y^2+(x-x_s)^2})}}$$

This means we can store 1D amplitude profiles at $z = 0$ and add the $z$-dependent phase later in the simulation routine. This reduces the memory required to store the profiles down to 232 MB for the example above, making this method scalable to higher resolutions. It should be noted that this method yields approximate results, as the finite size of the boundary and the sampling step/pixel size are not considered. In particular, for the approximation to be valid, the choice of the $Q$ factor in the FW expansion should ensure that all Bessel beams can be adequately resolved by the pixelated boundary and, additionally, that they reach the farthest points in the image plane. For the numerical simulation method, on the other hand, we use the band-limited ASM to model the propagation of light in free space. Once again, the longitudinal magnitude profiles are simulated and stored for each individual Bessel beam in the 2D-FW expansion. To reduce memory requirements, the profiles are symmetrically truncated in the $x$-direction around the center of the Bessel beam with little impact on the accuracy. This is because Bessel beams decay away from the beam’s axis, as shown in Fig. 3, which shows a simulated profile for a Bessel beam centered at $x_s = 0.4$mm ($s = 50$) from the edge of the boundary, with mode number $j = -11$.
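The memory-saving factorization of Eq. (6) can be sketched as follows, assuming NumPy/SciPy; only the idea of storing the $x$-profiles of each mode once and applying the $z$-dependent phase during training follows the text, while the names and array shapes are illustrative assumptions.

# Minimal sketch of the analytical simulation in Eq. (6), assuming NumPy/SciPy.
import numpy as np
from scipy.special import j0

def precompute_profiles(x, x_s, k_rho, y=0.0):
    """B[j, s, :] = J0(k_rho[j] * sqrt(y^2 + (x - x_s[s])^2)); stored once before training."""
    d = np.sqrt(y**2 + (x[None, :] - x_s[:, None])**2)          # (N_s, N_x)
    return j0(k_rho[:, None, None] * d[None, :, :])             # (N_b, N_s, N_x)

def simulate_plane(C, B, k_z, z):
    """Psi(x, z) = sum_j e^{i k_z[j] z} sum_s C[s, j] B[j, s, x], per Eq. (6)."""
    line_sum = np.einsum('sj,jsx->jx', C, B)                    # superpose lines for each mode
    phase = np.exp(1j * np.outer(k_z, z))                       # (N_b, N_z) z-dependent phase
    return np.einsum('jx,jz->xz', line_sum, phase)              # complex field on the x-z plane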


Fig. 3. The simulated amplitude profile on the target plane for a Bessel beam with mode number, $j = -11$ and axis coordinate, $x_s = 0.4$mm.


During the training, the longitudinal amplitude profile is calculated by superposing the stored profiles weighted by the expansion coefficients that were obtained from the combination of the network and the 2D-FW expansion.

2.4 Training specification

We have used a combination of two datasets to train our networks. The first dataset comprised 800 natural images from the DIV2K dataset [22]. The second dataset consisted of images with a random number of circles of varying brightness and size. The datasets were augmented by random rotation and flipping operations. The loss function used for training is based on a normalized cross-correlation accuracy that measures the similarity between the target amplitude profile, $\hat {A}(x,z,k)=\sqrt {\hat {I}}$, and the reconstructed amplitude profile, $A(x,z,k) = |\Psi |$, as follows:

$$Acc = \dfrac{\sum{A\hat{A}}}{\sqrt{\sum{A^2}\sum{\hat{A}^2}}} \qquad Loss=1-Acc$$

The loss value is zero when the two images match exactly, and it reaches a maximum of 1 when there is no correlation between the two images. We trained our networks over 20 epochs using the Adam optimizer with a learning rate of $10^{-4}$. Training with the ASM simulation was performed on an A100 40GB Nvidia GPU, and the remaining training was carried out on a V4 15GB Nvidia GPU.
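For reference, a minimal PyTorch sketch of the cross-correlation loss in Eq. (7) is shown below; the tensor layout (batch, planes, $N_x$, $N_z$) and the small stabilizing constant are our own assumptions.

# Minimal sketch of the loss in Eq. (7), assuming PyTorch.
import torch

def ncc_loss(recon_amp: torch.Tensor, target_amp: torch.Tensor) -> torch.Tensor:
    """1 - normalized cross-correlation between reconstructed and target amplitudes."""
    dims = (-2, -1)
    num = (recon_amp * target_amp).sum(dim=dims)
    den = torch.sqrt((recon_amp ** 2).sum(dim=dims) * (target_amp ** 2).sum(dim=dims))
    acc = num / den.clamp_min(1e-12)            # epsilon guards against empty targets
    return (1.0 - acc).mean()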

2.5 Experimental setup

Our holography setup consists of a phase-only spatial light modulator followed by a 4f-lens system with a spatial filter, as shown in Fig. 4. The SLM is located at the $x-y$ plane at the back focal point of the first lens. The SLM modulates the phase of the reference beam, and the 4f system selects the first diffraction order in k-space using a spatial filter (iris) located at the mutual focal point of the two lenses. The second lens then transforms this filtered order back to real space. Essentially, we use this setup to generate the boundary complex field (at the front focal plane of the second lens) using the phase-only mask displayed at the SLM’s plane.


Fig. 4. Longitudinal holography setup, consisting of a reflective phase-only SLM and a 4f-lens system. The combination can generate arbitrary complex field at $z=0$ by modulating the reference beam. To measure the light intensity on the $x-z$ planes, a digital camera mounted on a translation stage captures images while moving along the $z$-direction.


To capture holographic images on longitudinal planes, a camera mounted on a translation stage captures images while moving along the $z$-direction, as shown in Fig. 4. Slices from these images at a certain vertical $y$-position are extracted and stacked together along the $z$-direction to reconstruct the light intensity profile on the $x-z$ plane. Before the measurement, we align the system with the camera using four Bessel beams that propagate along the $z$-direction as reference beams. Additionally, during a calibration run, the positions of the reference beams are recorded as a function of the camera’s position. This data is used to align the slices more accurately in a post-processing step.
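The slice extraction and stacking described above can be summarized by the small NumPy sketch below; the frame layout and index names are assumptions for illustration, and the calibration-based realignment step is omitted.

# Minimal sketch of the post-processing step, assuming NumPy: extract one row at a fixed
# y position from each captured frame and stack the rows along z.
import numpy as np

def stack_slices(frames: np.ndarray, y_index: int) -> np.ndarray:
    """frames: (N_z, H, W) stack of camera images captured along z; returns the x-z profile."""
    rows = frames[:, y_index, :]            # (N_z, W): one horizontal slice per camera position
    return rows.T                           # (W, N_z): intensity profile on the x-z plane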

2.6 Iterative methods

We have developed two iterative methods, GD-FW and GD-ASM, to benchmark the performance of our proposed method. In the GD-FW method, we employ gradient descent (GD) to find the optimal coefficients in the 2D-FW expansion. We simulate the light intensity profile for a given set of coefficients using Eq. (3) and optimize the coefficients by comparing the simulated and target profiles. In the GD-ASM method, given a complex field at the boundary, we employ the band-limited ASM to simulate the light intensity on the longitudinal planes. We consider the complex field as the optimization variable, which we directly optimize using the GD method. We have used the Adam optimizer along with optimal learning rates for both methods.
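To make the GD-FW baseline concrete, the following is a hedged PyTorch sketch of such an optimization loop, reusing the precomputed Bessel profiles from the analytical simulation; the learning rate, iteration count, and real/imaginary parameterization are our own assumptions rather than the exact settings used in this work.

# Minimal sketch of a GD-FW-style loop, assuming PyTorch; B are precomputed Bessel
# profiles of shape (N_b, N_s, N_x), k_z and z are 1D tensors, target_amp is (N_x, N_z).
import torch

def gd_fw(target_amp, B, k_z, z, iters=125, lr=1e-2):
    """Optimize the complex 2D-FW coefficients directly with Adam against the target amplitude."""
    n_b, n_s, _ = B.shape
    Cr = torch.zeros(n_s, n_b, requires_grad=True)              # real part of coefficients
    Ci = torch.zeros(n_s, n_b, requires_grad=True)              # imaginary part
    opt = torch.optim.Adam([Cr, Ci], lr=lr)
    phase = torch.exp(1j * torch.outer(k_z, z))                 # (N_b, N_z) z-dependent phase
    Bc = B.to(torch.cfloat)
    for _ in range(iters):
        C = torch.complex(Cr, Ci)
        line_sum = torch.einsum('sj,jsx->jx', C, Bc)            # superpose lines per mode
        psi = torch.einsum('jx,jz->xz', line_sum, phase)        # field on the x-z plane
        amp = psi.abs()
        acc = (amp * target_amp).sum() / torch.sqrt((amp**2).sum() * (target_amp**2).sum())
        loss = 1.0 - acc                                        # same loss as Eq. (7)
        opt.zero_grad(); loss.backward(); opt.step()
    return torch.complex(Cr, Ci).detach()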

3. Results and discussion

3.1 Generating images on a single plane

We demonstrate the superior performance of our networks for the generation of images on single and multiple longitudinal planes through simulations and experimental measurements. We have trained two convolutional networks, CNN1 and CNN2, using analytical and numerical simulations, respectively, in order to generate a single image on the $x-z$ plane at $y=0$. We have used $Q = 0.99955$ and $L = 0.2$m as the FW expansion’s parameters, which implies 33 valid co-propagating Bessel modes (i.e., $N_b=33$) for each FW. The transverse boundary region is a 7.68mm by 4.8mm rectangular area, discretized into $960\times 600$ pixels with an 8um pixel size, the same as that of the SLM. We have used 512 sampling points along the $z$-direction. Consequently, single-channel $960\times 512$ pixel images (as target amplitude profiles) were fed to both networks as inputs. We have applied our hybrid system, utilizing CNN1 and CNN2, to the target intensity profiles shown in Fig. 5 and have calculated the optimal complex fields at the boundary. The resulting images on the $x-z$ plane were numerically reconstructed using band-limited ASM simulation. We repeated the same process with the baseline 2D-FW method and the two GD methods. Figure 5 shows the reconstructed images as well as the image accuracy obtained for each method for several target images. The image accuracy was measured using our cross-correlation loss function (Eq. (7)) as well as the Peak Signal to Noise Ratio (PSNR). The GD-FW method was run for over 125 iterations, taking one second excluding the time to calculate the FW superposition, while the GD-ASM method was run for three iterations, taking seven seconds.


Fig. 5. Examples of simulated intensity profiles obtained from the baseline 2D-FW method, CNN1 (trained using analytical calculations), CNN2 (trained using ASM simulations), and the two candidate iterative methods, GD-FW and GD-ASM. The image accuracy is measured by cross-correlation accuracy and Peak Signal to Noise Ratio (PSNR) between the reconstructed and target images.


For these sample images, the baseline 2D-FW method took 220 ms to calculate the complex boundary, with most of this time spent on calculating the superposition of all Bessel beams at $z=0$. This time can be significantly reduced with further parallelization using multiple GPUs. The addition of CNN1 and CNN2 adds 80 and 130 ms to this time, bringing the total computation time to 300 and 350 ms, respectively. The images from the baseline 2D-FW method contain checkerboard artifacts and fringe patterns due to the interference between adjacent lines; moreover, patterns with high spatial frequencies are poorly resolved. The networks resolve these issues and generate higher quality, sharper images, while maintaining the same level of speed.

Figure 6(a) compares the accuracy of the reconstructed image for the sample in the first row of Fig. 5 obtained from the baseline 2D-FW method, our two hybrid systems (CNN1 and CNN2), and the two iterative methods as a function of computation time. Our two networks, CNN1 and CNN2, demonstrate superior accuracy within a short computation time compared to the GD-based methods. The accuracy of the GD-FW method tends to plateau after around 1000 iterations, or 10 seconds, as it relies on the analytical simulation method, which carries some inaccuracies. The GD-ASM method, which uses the accurate ASM simulation, produces the most accurate images, but only after a significantly longer computation time.


Fig. 6. Comparing the performance of 2D-FW method, hybrid systems (CNN1 and CNN2), and iterative methods (GD-FW and GD-ASM). a) The cross correlation accuracy graph with respect to computation time for the target image shown in the first row of Fig. 5. GD-ASM is the slowest method but eventually yields the most accurate image as the computation time increases. b) Peak Signal to Noise Ratio (PSNR) of the reconstructed image for samples in the test dataset using various methods. The iterative methods, GD-FW and GD-ASM, were run for roughly 4 and 7 seconds, respectively. Both networks show higher PSNR compared to 2D-FW and iterative methods.


On average, for 50 samples from the test dataset, our two networks (i.e., CNN1 and CNN2) achieve PSNRs of 16.1 and 16, respectively. This is a significant improvement over the baseline 2D-FW method, which has an average PSNR of 14.4.

We have also generated and measured the target intensity profiles experimentally using our holographic setup for the baseline 2D-FW method and our two neural networks. In agreement with the simulation results, our two networks produce more visually appealing images than the baseline 2D-FW method, as shown in Fig. 7. The same artifacts in the 2D-FW method, and the same improvements by our networks, can be observed in the experimentally reconstructed images. In this case, we have chosen $Q$ such that all Bessel modes survive 20cm of propagation, reaching the end of the image. This ensures the accuracy of the analytical simulations and results in comparable performance between our two networks.


Fig. 7. Measured intensity profiles obtained from the baseline 2D-FW method, CNN1 (trained by analytical simulations), and CNN2 (trained by ASM simulations). The images obtained from 2D-FW suffer from checkerboard artifacts due to the interference between sampled lines. Both networks produce more accurate images with minimal artifacts. The measured and simulated images (Fig. 5) share similar characteristics, although the complexity of the imaging procedure limits the accuracy of measurements.


3.2 Generating images over an extended region

For this case, all the system’s parameters were kept the same as before except that we have increased the length of the images in the $z$-direction to 30cm, which is beyond the reach of some of the Bessel beams in the FW expansion (given the finite aperture of the boundary region). Two networks, CNN3 (trained by analytical simulations) and CNN4 (trained by ASM simulations), were trained as before to generate the images. Figure 8 shows the simulation and experimental results obtained from the baseline 2D-FW, CNN3, and CNN4. In this setup, the depth of focus for some of the Bessel modes is smaller than the image length, resulting in inaccuracy in the analytical simulation and, consequently, in the image reconstructed by the baseline 2D-FW and CNN3, especially on the side farther from the boundary (far right side of the figures). Conversely, CNN4 manages to generate an undistorted image, owing to the fact that the numerical simulation remains valid regardless of the chosen parameters in the 2D-FW expansion.


Fig. 8. Simulated and measured intensity profiles obtained from 2D-FW, CNN3 (trained by analytical simulations) and CNN4 (trained by ASM simulations), for the setup where the image is extended beyond the reach of some Bessel modes. Both the 2D-FW method and CNN3 fail to generate accurate profiles beyond the depth of focus of the FWs. In contrast, CNN4, which is optimized based on accurate ASM simulations, is insensitive to the FW’s parameters such as the length and/or Q factor.


3.3 Generating images at shifted plane

Networks trained by the ASM simulation have also been shown to be capable of producing high quality images on $x-z$ planes that are shifted along the $y$-direction. We trained two networks, CNN5 and CNN6 (based on analytical and numerical simulations, respectively), to generate images on the $x-z$ plane at $y=1.6$mm, which is only 0.8mm away from the edge of the boundary region, as shown in Fig. 1(b). In this case, the size of the aperture that generates the Bessel modes is reduced on the side closer to the edge of the boundary. Once again, the analytically simulated field deviates from the exact field after some distance from the boundary. Figure 9 shows the numerically simulated and experimentally measured intensity profiles on the $x-z$ plane at $y=1.6$mm, produced from the boundary condition obtained from the 2D-FW method, CNN5, and CNN6. The images created by the first two methods are distorted on the right side, while the image produced by CNN6 maintains its quality throughout, from left to right.


Fig. 9. Simulated and measured intensity profiles on a longitudinal plane at $y= 1.6$mm using 2D-FW method, CNN5 (trained by analytical simulations), and CNN6 (trained by ASM simulations). 2D-FW method and CNN5 generate distorted images as the FW expansion becomes inaccurate on planes close to the edge of the boundary region. Conversely, the network trained by the ASM simulation maintains high accuracy for the shifted plane.


3.4 Generating images on multiple planes

We have developed our last network, CNN3D, to generate images on three longitudinal planes positioned at $y = 0$ and $y=\pm 1.2$mm. The setup’s geometry and dimensions are illustrated in Fig. 1(c). We pad the images with zeros along the $z$-direction, extending their length to 60 cm, specifically to eliminate copies of the fields generated due to the periodicity of the FW expansion from the working area.

The network receives three images corresponding to the target magnitudes and estimates a feasible complex-valued field (amplitude and phase) on the three planes. Due to memory constraints, we are limited to using only the analytical simulation to train this network. Just as before, we apply the 2D-FW expansion to the complex field for each plane to obtain the expansion coefficients and, consequently, the boundary complex field at the $x-y$ plane at $z=0$, via a weighted superposition of Bessel modes at $z=0$. We have applied our hybrid system to the two target profiles shown in Fig. 10. The addition of the network improves the accuracy of the reconstructed images from $81{\%}$ and $91{\%}$ (obtained from the baseline 2D-FW method) to $90{\%}$ and $94{\%}$ for samples one and two, respectively. Both the numerical simulations and the experimental measurements show that our system, compared to the baseline 2D-FW method, produces sharper images with less crosstalk between neighboring planes.


Fig. 10. Simulated and measured intensity profiles on three longitudinal planes at $y = -1.2$mm, $0$mm, and $1.2$mm obtained from 2D-FW method and CNN3D trained by analytical simulations. The images produced by CNN3D are more accurate and show less cross talk between planes.


The total number of modes, and consequently the time to calculate the complex field at the transverse boundary from a set of coefficients, grows quadratically with the number of planes. For this example, with three planes, this time amounts to approximately 200ms on average. However, this time can be reduced to a level comparable with the single-plane case through parallelization across multiple GPUs. The CNN3D execution adds 100ms to this time, bringing the total calculation time to 300ms on average.

Before closing this section, a few remarks are in order. In this work, we have focused on generating images with the same linear polarization. However, it is possible to further reduce crosstalk between the planes, and therefore increase the image fidelity, by constructing images on adjacent planes with alternating polarizations that are orthogonal to each other [23]. The holograms generating alternately polarized images can be multiplexed on the same SLM [24]. The neural network can be trained in such a way as to account for this modification and improve the accuracy even further. Furthermore, in our system, we have exclusively considered paraxial optical fields, which prevents us from using beams with acute angles. This imposes constraints on the aspect ratio of the holographic images. Our work can be extended by employing non-paraxial fields, which should be modeled using vectorial fields. This would allow us to further reduce the aspect ratio by incorporating fields propagating at acute angles. Alternatively, we can use a 4f-lens system with non-unitary magnification to reduce the aspect ratio to any desired value. For example, using a demagnification factor of two reduces the aspect ratio by $50{\%}$ [25]. The expansion parameter $Q$, which is dictated by the image length and the SLM’s pixel size, directly determines the overall image quality. Reducing this parameter can increase the image resolution at the cost of the maximum allowed image length. A neural network trained by the ASM simulation can simultaneously utilize two or more sets of 2D-FWs with different $Q$ to achieve enhanced resolution in regions closer to the boundary plane while allowing a long image length.

4. Conclusion

We have developed a new algorithm based on convolutional neural networks and the superposition of Bessel beams for the rapid generation of accurate images on longitudinal planes. We have generated high fidelity images on single and multiple longitudinal planes. Our results demonstrate a new way to augment analytical methods for solving holographic problems, such as the FW method, with deep convolutional networks to improve their accuracy and versatility. Through numerical simulations and experimental measurements, we have shown that our networks significantly improve the accuracy of the baseline light sheet holography without compromising the computational speed. Our hybrid system outperforms two candidate gradient-based algorithms in terms of computational speed and memory cost at run time. Generating complex intensity profiles with continuous depth along the optical path is particularly desirable for applications such as light sheet microscopy and optical tweezers. Furthermore, our method promises a novel way to generate 3D images, by discretizing them into 2D images on longitudinal planes, for augmented reality and 3D holographic displays.

Funding

Natural Sciences and Engineering Research Council of Canada.

Disclosures

The authors declare no conflicts of interest.

Data Availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. J.-H. Park, “Recent progress in computer-generated holography for three-dimensional scenes,” J. Inf. Disp. 18(1), 1–12 (2017). [CrossRef]  

2. Z. Wang, T. Chen, Q. Chen, et al., “Reducing crosstalk of a multi-plane holographic display by the time-multiplexing stochastic gradient descent,” Opt. Express 31(5), 7413–7424 (2023). [CrossRef]  

3. G. Makey, Ö. Yavuz, D. K. Kesim, et al., “Breaking crosstalk limits to dynamic holography using orthogonality of high-dimensional random vectors,” Nat. Photonics 13(4), 251–256 (2019). [CrossRef]  

4. F. Wang, Z. Ma, Y. Zhong, et al., “In vivo nir-ii structured-illumination light-sheet microscopy,” Proc. Natl. Acad. Sci. 118(6), e2023888118 (2021). [CrossRef]  

5. P. J. Keller, A. D. Schmidt, A. Santella, et al., “Fast, high-contrast imaging of animal development with scanned light sheet–based structured-illumination microscopy,” Nat. Methods 7(8), 637–642 (2010). [CrossRef]  

6. R. Gerchberg and W. Saxton, “A practical algorithm for the determination of phase from image and diffraction plane pictures,” SPIE milestone series MS 94, 646 (1994).

7. J. Zhang, N. Pégard, J. Zhong, et al., “3d computer-generated holography by non-convex optimization,” Optica 4(10), 1306–1313 (2017). [CrossRef]  

8. P. Chakravarthula, Y. Peng, J. Kollin, et al., “Wirtinger holography for near-eye displays,” ACM Trans. Graph. 38(6), 1–13 (2019). [CrossRef]  

9. R. H.-Y. Chen and T. D. Wilkinson, “Computer generated hologram from point cloud using graphics processor,” Appl. Opt. 48(36), 6841–6850 (2009). [CrossRef]  

10. S.-C. Kim and E.-S. Kim, “Effective generation of digital holograms of three-dimensional objects using a novel look-up table method,” Appl. Opt. 47(19), D55–D62 (2008). [CrossRef]  

11. T. Shimobaba, N. Masuda, and T. Ito, “Simple and fast calculation algorithm for computer-generated hologram with wavefront recording plane,” Opt. Lett. 34(20), 3133–3135 (2009). [CrossRef]  

12. L. A. Ambrosio, “Millimeter-structured nondiffracting surface beams,” J. Opt. Soc. Am. B 36(3), 638–645 (2019). [CrossRef]  

13. A. H. Dorrah, P. Bordoloi, V. S. de Angelis, et al., “Light sheets for continuous-depth holography and three-dimensional volumetric displays,” Nat. Photonics 17(5), 427–434 (2023). [CrossRef]  

14. M. Zamboni-Rached, “Stationary optical wave fields with arbitrary longitudinal shape by superposing equal frequency bessel beams: Frozen waves,” Opt. Express 12(17), 4001–4006 (2004). [CrossRef]  

15. M. H. Eybposh, N. W. Caira, M. Atisa, et al., “Deepcgh: 3d computer-generated holography using deep learning,” Opt. Express 28(18), 26636–26650 (2020). [CrossRef]  

16. Y. Peng, S. Choi, N. Padmanaban, et al., “Neural holography with camera-in-the-loop training,” ACM Trans. Graph. 39(6), 1–14 (2020). [CrossRef]  

17. J. Wu, K. Liu, X. Sui, et al., “High-speed computer-generated holography using an autoencoder-based deep neural network,” Opt. Lett. 46(12), 2908–2911 (2021). [CrossRef]  

18. V. Arrizón, U. Ruiz, R. Carrada, et al., “Pixelated phase computer holograms for the accurate encoding of scalar complex fields,” J. Opt. Soc. Am. A 24(11), 3500–3507 (2007). [CrossRef]  

19. O. Ronneberger, P. Fischer, and T. Brox, “U-net: Convolutional networks for biomedical image segmentation,” (2015).

20. X.-J. Mao, C. Shen, and Y.-B. Yang, “Image restoration using convolutional auto-encoders with symmetric skip connections,” arXiv, arXiv:1606.08921 (2016). [CrossRef]  

21. K. Matsushima and T. Shimobaba, “Band-limited angular spectrum method for numerical simulation of free-space propagation in far and near fields,” Opt. Express 17(22), 19662–19673 (2009). [CrossRef]  

22. E. Agustsson and R. Timofte, “Ntire 2017 challenge on single image super-resolution: Dataset and study,” in Proceedings of the IEEE conference on computer vision and pattern recognition workshops, (2017), pp. 126–135.

23. H. Zhou, B. Sain, Y. Wang, et al., “Polarization-encrypted orbital angular momentum multiplexed metasurface holography,” ACS Nano 14(5), 5553–5559 (2020). [CrossRef]  

24. Y. Aborahama and M. Mojahedi, “Tailor-made 3d vectorial optical fields in simple media,” in CLEO: Science and Innovations, (Optica Publishing Group, 2022), pp. JW3B–46.

25. M. Zamboni-Rached, G. d. A. Lourenço-Vittorino, T. V. de Sousa, et al., “Mathematical description of a frozen wave beam after passing through a pair of convex lenses with different focal distance,” arXiv, arXiv:1907.08202 (2019). [CrossRef]  
