## Abstract

Deconvolution can be used to obtain sharp images or volumes from blurry or encoded measurements in imaging systems. Given knowledge of the system’s point spread function (PSF) over the field of view, a reconstruction algorithm can be used to recover a clear image or volume. Most deconvolution algorithms assume shift-invariance; however, in realistic systems, the PSF varies laterally and axially across the field of view due to aberrations or design. Shift-varying models can be used, but are often slow and computationally intensive. In this work, we propose a deep-learning-based approach that leverages knowledge about the system’s spatially varying PSFs for fast 2D and 3D reconstructions. Our approach, termed MultiWienerNet, uses multiple differentiable Wiener filters paired with a convolutional neural network to incorporate spatial variance. Trained using simulated data and tested on experimental data, our approach offers a $625 {-} 1600 \times$ increase in speed compared to iterative methods with a spatially varying model, and outperforms existing deep-learning-based methods that assume shift invariance.

© 2022 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

Deconvolution is integral to many modern imaging systems. Imperfections in the optics may inadvertently blur the image (e.g., aberrations) and deconvolution can be used to computationally undo some of this blur [1,2]. In microscopy, deconvolution can reduce out-of-focus fluorescence to provide sharper 3D images [3–5]. Alternatively, distributed point spread functions (PSFs) can be intentionally designed into an imaging system to enable new capabilities, such as single-shot 3D [6–10] or hyperspectral imaging [11,12]. In this case, multiplexing optics encode 2D or 3D information by mapping each point in an object space to a distributed pattern on the image sensor, then deconvolution is used to recover the encoded image or volume. In either case, a deconvolution algorithm is needed to recover a clear image or volume from the blurred or encoded measurement.

A variety of algorithms have been used for deconvolution over the years. Classical methods range from closed-form approaches such as Wiener filtering to iterative optimization approaches, such as Richardson–Lucy and the fast iterative shrinkage-thresholding algorithm (FISTA). Many methods incorporate handpicked priors, such as total variation (TV) and native sparsity, to improve image quality. These approaches often assume that the system is shift invariant, meaning that all parts of the image have the same blur kernel. Shift invariance allows the forward model to be efficiently expressed as a convolution between the PSF and the object. However, most imaging systems will have a blur that varies across the field of view (FoV); that is, they have spatially varying PSFs, usually due to field-varying aberrations. This motivates the use of spatially varying deconvolution, for which several methods have been proposed [6,7,13–17]. Unfortunately, many of these algorithms are prohibitively slow and computationally intensive, making them unsuitable for real-time image reconstruction. Furthermore, these methods can suffer from poor image quality, especially for highly multiplexed imaging systems that have PSFs with large spatial extent, or for poorly chosen priors. Recently, deep-learning-based deconvolution methods have been demonstrated to improve both the image quality and reconstruction speed, providing a promising improvement over iterative approaches [18–21]. However, to date, these methods rely on a shift-invariant PSF approximation and do not generalize well to optical systems with field-varying aberrations.
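To make the shift-invariant case concrete, the forward model is a single FFT-based convolution, and its Wiener inverse is one Fourier filtering step. A toy sketch (illustrative only, not this Letter's method), with a Gaussian PSF and a noise regularization parameter `reg` standing in for the noise-to-signal power ratio:

```python
import numpy as np

def wiener_deconvolve(measurement, psf, reg=1e-3):
    """Classical shift-invariant Wiener deconvolution.

    Assumes the same blur kernel everywhere, so the forward model is a
    single convolution b = h * v, computed efficiently with FFTs.
    reg approximates the noise-to-signal power ratio.
    """
    H = np.fft.fft2(np.fft.ifftshift(psf), s=measurement.shape)
    B = np.fft.fft2(measurement)
    # Wiener filter: conj(H) / (|H|^2 + reg)
    V_hat = np.conj(H) * B / (np.abs(H) ** 2 + reg)
    return np.real(np.fft.ifft2(V_hat))

# Toy example: blur a point source with a Gaussian PSF, then deconvolve.
n = 64
yy, xx = np.mgrid[-n // 2:n // 2, -n // 2:n // 2]
psf = np.exp(-(xx ** 2 + yy ** 2) / (2 * 2.0 ** 2))
psf /= psf.sum()
obj = np.zeros((n, n))
obj[40, 20] = 1.0
meas = np.real(np.fft.ifft2(np.fft.fft2(obj)
                            * np.fft.fft2(np.fft.ifftshift(psf))))
recon = wiener_deconvolve(meas, psf, reg=1e-6)
peak = np.unravel_index(recon.argmax(), recon.shape)
print(tuple(map(int, peak)))  # recovers the point's location, (40, 20)
```

When the true PSF varies across the field, applying this single-filter inverse with one fixed `psf` is exactly the approximation that breaks down away from the field point where that PSF was measured.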

In this work, we propose a new deep-learning-based approach for fast, spatially varying deconvolution. Our network, termed MultiWienerNet, consists of multiple learnable Wiener deconvolutions followed by a refinement convolutional neural network (CNN). The Wiener deconvolution layer performs multiple Fourier-space deconvolutions, each with a different PSF from a particular field point, yielding several intermediate images that have sharp features in different regions of the image. These intermediate images are then fed into the refinement CNN, which fuses and refines them to create the final sharp deconvolved image. The learnable Wiener deconvolution filters are initialized with PSFs captured at several locations in the FoV, but are then allowed to update throughout training to learn the best filters and noise regularization parameters. This allows us to incorporate knowledge of the field-varying aberrations into the network, providing a physically informed initialization that is further refined throughout training. The end result is a fast spatially varying deconvolution that is $625 {-} 1600 \times$ faster than the baseline iterative method (spatially varying FISTA [6]), enabling real-time image reconstruction. In addition, incorporating the field-varying PSFs allows our network to achieve better image quality near the edges of the FoV than existing deep-learning-based methods that assume shift invariance.

Our approach consists of the following steps: 1) Generate a simulated training dataset by applying measured PSFs to images from open-source microscopy datasets, 2) Initialize and train the MultiWienerNet using the simulated data, and 3) Use the trained network for fast shift-varying deconvolutions, where the input to the network is a single measurement. To demonstrate, we choose the challenging example of single-shot 3D microscopy with Miniscope3D [6] as our test case. Miniscope3D uses a phase mask that consists of a random array of multifocal microlenses to encode 3D information in a 2D image. The system maps each object point in the FoV to a unique pseudorandom pattern on the sensor, then decodes the captured images by solving a sparsity-constrained inverse problem. We selected this system both for its spatially and depth-varying PSFs, as shown in Fig. 1, and for its high degree of multiplexing, which creates a particularly challenging deconvolution problem. We demonstrate our approach on both 2D deconvolution, where the goal is to recover a 2D image from a 2D measurement, and 3D deconvolution, where the goal is to recover a 3D volume from a single 2D measurement.

To generate simulated datasets, we first need a forward model that can faithfully relate how a 3D object is mapped to a 2D measurement in our microscope system, taking into account the effects of spatially varying blur introduced by the system. To establish this forward model, the volumetric object intensity is treated as a 3D grid of voxels, ${\bf v}[x,y,z]$. Each voxel produces a PSF, ${\bf h}[x^\prime ,y^\prime ;x,y,z]$, on the camera sensor, where $[x^\prime ,y^\prime]$ are image space indices. Since the object voxels are mutually incoherent, the measurement can be expressed as a linear combination of the PSFs from each voxel in the object:

$${\bf b}[x^\prime, y^\prime] = \sum_{x,y,z} {\bf v}[x,y,z]\, {\bf h}[x^\prime, y^\prime; x, y, z].$$

Storing and applying a distinct PSF for every voxel is computationally prohibitive, so a low-rank approximation of the spatially varying PSFs [22] is used to make this forward model tractable.

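The superposition model described above, in which each voxel contributes its own PSF to the measurement, can be sketched directly as a brute-force loop (shapes and names are illustrative; storing a full PSF per voxel is impractical at real scales, which is what motivates a low-rank approximation):

```python
import numpy as np

def spatially_varying_forward(v, psf_bank):
    """Brute-force superposition forward model: the measurement is the
    sum of each voxel's intensity times that voxel's own PSF.

    v        : (X, Y, Z) voxel intensities
    psf_bank : (X, Y, Z, Xp, Yp) sensor PSF for every voxel
               (illustrative only; infeasible to store at full scale)
    Returns the (Xp, Yp) simulated measurement.
    """
    X, Y, Z = v.shape
    b = np.zeros(psf_bank.shape[3:])
    for x in range(X):
        for y in range(Y):
            for z in range(Z):
                b += v[x, y, z] * psf_bank[x, y, z]
    return b

# Sanity check on a tiny example: the loop is a tensor contraction
# over the three voxel axes.
rng = np.random.default_rng(0)
v = rng.random((3, 4, 2))
psf_bank = rng.random((3, 4, 2, 5, 6))
assert np.allclose(spatially_varying_forward(v, psf_bank),
                   np.tensordot(v, psf_bank, axes=3))
```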
Using this forward model, we can simulate measurements from our microscope to use in training datasets. We run images from online microscopy datasets [23–26] through the low-rank forward model, generating pairs of ground truth volumes/images and simulated measurements. Given a good system forward model (see Supplement 1 for PSF calibration details), it is possible to generate any number of image pairs, which we can use to train our MultiWienerNet. We generate both 2D and 3D training datasets; the 2D dataset contains 2D target objects with dimensions ($x,y,z$) of (336,480,1), representing a FoV of $700 \times 1000\;\unicode{x00B5} {\text{m}^2}$, while the 3D dataset contains 3D target objects with dimensions ($x,y,z$) of (336,480,32), representing a FoV of $700 \times 1000 \times 320\;\unicode{x00B5} {\text{m}^3}$. We generate 5,000 2D and 15,000 3D training images, with an 80/20 training/testing split.
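A low-rank forward model of the kind the text refers to replaces the per-voxel PSF bank with a small number of weight maps and convolution kernels per depth, so each measurement is a sum of weighted convolutions computable with FFTs. A minimal sketch under that assumed weighted-convolution structure (variable names and rank are illustrative):

```python
import numpy as np
from numpy.fft import fft2, ifft2, ifftshift

def low_rank_forward(v, weights, kernels):
    """Rank-R spatially varying forward model (illustrative structure):
    for each depth z and rank component r, the object slice is
    multiplied by a weight map, then convolved with a kernel:

        b = sum_z sum_r (v[:, :, z] * weights[r, :, :, z]) conv kernels[r, :, :, z]

    v       : (X, Y, Z) object volume
    weights : (R, X, Y, Z) per-component weight maps
    kernels : (R, X, Y, Z) per-component convolution kernels (centered)
    Returns the (X, Y) simulated measurement.
    """
    X, Y, Z = v.shape
    R = weights.shape[0]
    b = np.zeros((X, Y))
    for z in range(Z):
        for r in range(R):
            slice_r = v[:, :, z] * weights[r, :, :, z]
            K = fft2(ifftshift(kernels[r, :, :, z]))  # circular convolution
            b += np.real(ifft2(fft2(slice_r) * K))
    return b
```

The cost per measurement is $O(RZ \cdot XY \log XY)$ instead of one full-size PSF per voxel, which is what makes large-scale dataset simulation feasible.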

Our network consists of two components: a differentiable Wiener deconvolution layer and a refinement CNN, as shown in Fig. 2. Wiener deconvolution is a fast and simple approach used for linear shift-invariant systems given a known PSF and noise level. It consists of a single Fourier filtering step, which can be efficiently computed using FFTs. However, when the assumption of shift invariance does not hold, Wiener deconvolution degrades image quality in the areas of the image where the actual PSF differs from the assumed one. Hence, instead of performing Wiener deconvolution with a single PSF [21], we approximate the behavior of our spatially varying system using $M$ PSFs taken from different field points. Our Wiener deconvolution layer thus performs $M$ Wiener deconvolutions, resulting in $M$ intermediate deconvolved images, as shown in Fig. 2. Each will have sharp features in a different region of the image, corresponding to the area in the FoV from which the PSF was taken. These $M$ intermediate images are then fed into the refinement CNN, which combines and refines them to produce the final image/volume.

Mathematically, our differentiable Wiener deconvolution layer can be described as

$$\hat{\bf v}_i = \mathcal{F}^{-1}\!\left\{\frac{\overline{\mathcal{F}\{{\bf w}_i\}}\,\mathcal{F}\{{\bf b}\}}{|\mathcal{F}\{{\bf w}_i\}|^2 + \lambda_i}\right\},\quad i = 1,\ldots,M,$$

where ${\bf b}$ is the measurement, ${\bf w}_i$ is the $i$th learnable filter (initialized with the PSF from the $i$th field point), $\lambda_i$ is the corresponding learnable noise regularization parameter, $\mathcal{F}$ denotes the 2D discrete Fourier transform, and the overbar denotes complex conjugation. Both ${\bf w}_i$ and $\lambda_i$ are updated during training.
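As a rough sketch of this layer in plain NumPy (names illustrative; in the actual network the filters and regularization parameters would be trainable tensors in a deep-learning framework):

```python
import numpy as np

def multi_wiener_layer(b, filters, lambdas):
    """Multi-filter Wiener deconvolution layer (sketch, not trainable).

    b       : (X, Y) measurement
    filters : (M, X, Y) filters, initialized with PSFs from M field points
    lambdas : (M,) per-filter noise regularization parameters
    Returns an (M, X, Y) stack of intermediate deconvolved images, each
    sharp near its own field point, to be fused by a refinement CNN.
    """
    B = np.fft.fft2(b)
    out = []
    for w, lam in zip(filters, lambdas):
        W = np.fft.fft2(np.fft.ifftshift(w))
        # Wiener filter with this field point's PSF and regularizer
        out.append(np.real(np.fft.ifft2(np.conj(W) * B
                                        / (np.abs(W) ** 2 + lam))))
    return np.stack(out)
```

Because every step is differentiable (FFTs, complex multiplies, division by a positive quantity), gradients flow back to both `filters` and `lambdas`, which is what lets the physically initialized filters refine during training.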

After training, we test our model on 1,000 images in a held-out test set from the online datasets, and on experimental data from the Miniscope3D setup. We compare our results against an iterative spatially varying FISTA, a U-Net, and the U-Net with a single Wiener deconvolution [21]. Our results show that the MultiWienerNet achieves more than a $625 \times$ speedup in 2D reconstruction and a $1600 \times$ speedup in 3D reconstruction compared to FISTA, while also providing better PSNR and image quality, especially toward the edges of the FoV, where off-axis aberrations dominate. Our network also outperforms existing deep-learning approaches in reconstruction quality while being only slightly slower than them. In addition, despite being trained solely on simulated data, the MultiWienerNet generalizes well to the experimental data. Though FISTA achieves slightly higher resolution near the center of the FoV [Fig. 3(a)], the MultiWienerNet performs better overall.

In summary, we propose a new network architecture to perform fast deconvolution for microscopes with spatially varying PSFs. Given knowledge of the system’s spatially varying PSFs, our proposed network is trained in simulation, fusing known system parameters in the form of multiple differentiable Wiener deconvolutions with a CNN refinement step. After training, our network provides a $625 {-} 1600 \times$ speedup over existing spatially varying deconvolution algorithms and improved reconstruction quality, especially at the edges of the FoV. The code is open source and can be applied to other imaging systems with spatially varying aberrations.

## Funding

National Science Foundation (DGE 1752814, 1617794); National Institutes of Health (1R21EY027597- 01); Gordon and Betty Moore Foundation (GBMF4562); Defense Advanced Research Projects Agency (N66001-17-C4015).

## Acknowledgment

The authors thank Amit Kohli for providing simulated point spread functions.

## Disclosures

The authors declare no conflicts of interest.

## Data availability

Data available upon request. Supporting code can be found in Supplement 1.

## Supplemental document

See Supplement 1 for supporting content.

## REFERENCES

**1. **J.-B. Sibarita, “Deconvolution Microscopy,” in *Microscopy Techniques* (Springer, 2005), Vol. 95.

**2. **D. Sage, L. Donati, F. Soulez, D. Fortun, G. Schmit, A. Seitz, R. Guiet, C. Vonesch, and M. Unser, Methods **115**, 28 (2017).

**3. **J. G. McNally, T. Karpova, J. Cooper, and J. A. Conchello, Methods **19**, 373 (1999).

**4. **D. S. Biggs, Curr. Protoc. Cytom. **52**, 12 (2010).

**5. **P. Sarder and A. Nehorai, IEEE Signal Process. Mag. **23**, 32 (2006).

**6. **K. Yanny, N. Antipa, W. Liberti, S. Dehaeck, K. Monakhova, F. L. Liu, K. Shen, R. Ng, and L. Waller, Light Sci. Appl. **9**, 171 (2020).

**7. **G. Kuo, F. L. Liu, I. Grossrubatscher, R. Ng, and L. Waller, Opt. Express **28**, 8384 (2020).

**8. **F. L. Liu, G. Kuo, N. Antipa, K. Yanny, and L. Waller, Opt. Express **28**, 28969 (2020).

**9. **N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, Optica **5**, 1 (2018).

**10. **M. S. Asif, A. Ayremlou, A. Sankaranarayanan, A. Veeraraghavan, and R. G. Baraniuk, IEEE Trans. Comput. Imaging **3**, 384 (2016).

**11. **K. Monakhova, K. Yanny, N. Aggarwal, and L. Waller, Optica **7**, 1298 (2020).

**12. **D. S. Jeon, S.-H. Baek, S. Yi, Q. Fu, X. Dun, W. Heidrich, and M. H. Kim, ACM Trans. Graph. **38**, 1 (2019).

**13. **M. Arigovindan, J. Shaevitz, J. McGowan, J. W. Sedat, and D. A. Agard, Opt. Express **18**, 6461 (2010).

**14. **N. Patwary and C. Preza, Biomed. Opt. Express **6**, 3826 (2015).

**15. **E. Maalouf, B. Colicchio, and A. Dieterlen, J. Opt. Soc. Am. A **28**, 1864 (2011).

**16. **S. Ben Hadj, L. Blanc-Féraud, and G. Aubert, SIAM J. Imaging Sci. **7**, 2196 (2014).

**17. **L. Denis, E. Thiébaut, F. Soulez, J.-M. Becker, and R. Mourya, Int. J. Comput. Vis. **115**, 253 (2015).

**18. **K. Monakhova, J. Yurtsever, G. Kuo, N. Antipa, K. Yanny, and L. Waller, Opt. Express **27**, 28075 (2019).

**19. **F. Sureau, A. Lechat, and J.-L. Starck, Astron. Astrophys. **641**, A67 (2020).

**20. **O. Ronneberger, P. Fischer, and T. Brox, “U-Net: convolutional networks for biomedical image segmentation,” arXiv:1505.04597 (2015).

**21. **S. S. Khan, V. Sundar, V. Boominathan, A. Veeraraghavan, and K. Mitra, “FlatNet: towards photorealistic scene reconstruction from lensless measurements,” IEEE Trans. Pattern Anal. Mach. Intell. (to be published).

**22. **R. C. Flicker and F. J. Rigaut, J. Opt. Soc. Am. A **22**, 504 (2005).

**23. **C. F. Koyuncu, R. Cetin-Atalay, and C. Gunduz-Demir, Cytometry Part A **93**, 1019 (2018).

**24. **S. Arslan, T. Ersahin, R. Cetin-Atalay, and C. Gunduz-Demir, IEEE Trans. Med. Imaging **32**, 1121 (2013).

**25. **V. Ljosa, K. L. Sokolnicki, and A. E. Carpenter, Nat. Methods **9**, 637 (2012).

**26. **Y. Zhang, Y. Zhu, E. Nichols, Q. Wang, S. Zhang, C. Smith, and S. Howard, in *Proc. CVPR* (2019).