## Abstract

We present a learning-based Shack-Hartmann wavefront sensor (SHWS) that achieves high-order aberration detection without image segmentation or centroid positioning. The Zernike coefficient amplitudes of aberrations measured from biological samples are used as a reference and expanded to generate the training datasets. With a single SHWS pattern as input, up to the 120^{th} Zernike mode can be predicted within 10.9 ms with 95.56% model accuracy on a personal computer. Statistical experimental results show that, compared with the traditional modal-based SHWS, the root mean squared error in phase residuals of this method is reduced by ∼40.54% and the Strehl ratio of the point spread functions is improved by ∼27.31%. The aberration detection performance of this method is also validated on a mouse brain slice of 300 µm thickness, where the median improvement in peak-to-background ratio is ∼30% to 40% compared with the traditional SHWS. With its high detection accuracy, simple processing, fast prediction speed and good compatibility, this work offers a potential approach to improving the wavefront sensing ability of the SHWS, which could be combined with an existing adaptive optics system and further applied in biological applications.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## Corrections

Lejia Hu, Shuwen Hu, Wei Gong, and Ke Si, "Learning-based Shack-Hartmann wavefront sensor for high-order aberration detection: erratum," Opt. Express **28**, 32132-32132 (2020)

https://opg.optica.org/oe/abstract.cfm?uri=oe-28-21-32132

## 1. Introduction

During biomedical imaging, optical aberrations are induced by the optical inhomogeneities of the biological tissue as well as the refractive index mismatch between the tissue, water and coverslip. The induced aberrations dramatically degrade the imaging performance of microscopes as the imaging depth increases [1]. Using a transparent or optically benign model organism, such as zebrafish, can yield better imaging results. However, complex aberrations still exist even in such nominally transparent samples [2]. As for *ex vivo* biological tissues, optical clearing methods can reduce the light scattering in thick tissues [3], but the aberrations that reduce the imaging quality remain.

To improve the imaging quality, adaptive optics (AO) is introduced to measure and correct the aberrations accordingly [4]. As an important direct wavefront sensing device in AO, the Shack-Hartmann wavefront sensor (SHWS) has been widely applied over the past few decades in biomedical imaging [1], astronomical imaging [5], human eye aberration measurement [6], high-energy laser beam quality testing [7], and optical tweezers [8]. In recent years, the SHWS has been reported to offer efficient wavefront sensing in several kinds of microscopy, such as wide-field microscopy [9], confocal microscopy [10,11], two-photon microscopy [2,12], structured illumination microscopy [13] and light-sheet microscopy [14].

The SHWS usually consists of a micro-lens array and a camera. The sub-apertures of the micro-lens array sample the corresponding parts of the wavefront on the pupil plane and focus the light onto the camera. The focal spots deviate from their ideal positions according to the wavefront slope over each sub-aperture. During the wavefront measurement, the centroid position of each spot is determined and its displacements in the *x* and *y* directions are calculated. With these values, the wavefront can be reconstructed through zonal or modal approaches [15,16]. Consequently, the accuracy of centroid positioning directly affects the detection accuracy of the SHWS [17]. Moreover, these multi-step sensing processes limit the speed of the SHWS to some extent. To reduce the latency of the SHWS, a field-programmable gate array (FPGA) based approach [18] and a graphics processing unit (GPU) based approach [19] have been proposed for high-speed wavefront sensing.

Recently, machine learning has been used to implement image-based wavefront estimation [20–23]. In [20], Paine *et al.* applied a convolutional neural network (CNN, a type of artificial neural network) to predict an initial estimate of the Zernike coefficients of the wavefront from point spread function (PSF) images, although the tip and tilt terms were estimated through centroiding algorithms. In [21], Jin *et al.* proposed a machine learning based approach to estimate the first 10 Zernike coefficients of the wavefront aberration from distorted PSFs and demonstrated the aberration correction capability of their method by focusing through scattering media. However, the tip-tilt of the distorted PSF must be corrected before the PSF is sent into the trained network, which may reduce the efficiency of wavefront sensing during real-time imaging. Nishizaki *et al.* presented a deep learning wavefront sensing method that estimates the first 32 Zernike coefficients of the wavefront from a single intensity image and verified its feasibility with overexposed, defocused, or scattered images [22]. More recently, Zhang *et al.* applied a trained CNN to recover a doughnut-shaped focus, with which the 4^{th}–15^{th} Zernike coefficients could be estimated at high speed [23]. Although these methods provide new approaches for high-speed wavefront sensing, a photon detector, such as a camera, is still required to record the PSF of the system [21,23]. When implementing real-time *in vivo* deep tissue imaging, however, it is impossible to place a detector inside the tissue for high-speed PSF acquisition. In addition, owing to the structural complexity of biological tissues, there are non-negligible high-order aberrations that need to be corrected. Therefore, the number of predicted Zernike modes still needs to be extended for biological applications, such as deep tissue optical stimulation or imaging.

As for SHWS, neural networks are also utilized to enhance its performance [24–26]. In [24], Guo *et al.* applied the back-propagation artificial neural networks to estimate the Zernike coefficients using centroid displacements from SHWS and compared the wavefront reconstruction results with traditional approaches. Barwick proposed an astigmatic hybrid wavefront sensor with neural network post-processing to improve the wavefront reconstruction accuracy [25]. Li *et al.* applied artificial neural networks to calculate the centroids of SHWS in extreme situations, which improves the robustness of SHWS [26]. However, these works still require centroid positioning and displacement calculation to compute the Zernike mode coefficients from SHWS patterns.

In this paper, we propose a learning-based SHWS to achieve high-order aberration detection. This method is compatible with common SHWS systems and does not require centroid positioning when estimating the Zernike mode coefficients. A CNN-based model is established to estimate the first 120 Zernike modes with large coefficient amplitudes at high prediction speed. During the prediction, only one SHWS pattern is needed and no image segmentation is required. The wavefront estimation performance of this method, including the prediction speed, the root mean squared (RMS) error in phase residuals and the Strehl ratio of the PSFs, is experimentally compared with that of the traditional modal-based wavefront sensing approach. A mouse brain slice of 300 µm thickness is used to validate the aberration detection performance of this method on real biological tissue. Statistical results on the improvement in peak-to-background ratio (PBR) of the foci corrected by the two approaches are presented.

## 2. Methods

#### 2.1 Optical system for LSHWS

The experimental optical system setup for the learning-based Shack-Hartmann wavefront sensor (LSHWS) is illustrated in Fig. 1. A collimated and expanded continuous-wave light beam with a wavelength of 632.8 nm from a He-Ne laser (SPL-HN250P, SPL Photonics) serves as the light source. A linear polarizer is placed before the beam splitter (BS) to ensure that the polarization direction of the light is consistent with the requirement of the spatial light modulator (SLM, PLUTO-NIR-011-A, Holoeye Photonics). The light reflected by the BS is cast onto the SLM and the transmitted part of the light is blocked by the beam blocker. The incident light on the SLM is phase-modulated and then reflected to a pair of relay lenses. The field stop (FS1) placed at the focal plane of the lenses is used to block the unwanted diffraction orders from the SLM. The diameter of FS1 is 800 µm and the corresponding cut-off spatial frequency is 6.32 cycles/mm. The surface of the SLM is conjugated to the rear pupil plane of the objective lens (OBJ1, ELWD 20X/0.45, Nikon), so the phase loaded on the SLM is stationary at the rear pupil. The light is focused by OBJ1 and then collected by another objective lens (OBJ2, Plan Apo λ 20X/0.75, Nikon), which shares the same focal plane as OBJ1. For PSF quality monitoring and wavefront sensing, a relay lens is used to deliver the transmitted light to a beam splitter plate (BSP). Fifty percent of the light is reflected by the BSP and collected by a complementary metal oxide semiconductor (CMOS) camera (DMK 23UV024, The Imaging Source). The rest of the light is relayed to the micro-lens array (64–483, Edmund Optics) of the SHWS and focused onto the camera (Zyla 4.2, Andor). Before the experiments, the SLM is calibrated based on the interferometric method [27]. During the experiments, 16×16 micro-lenses (∼208 effective lenses) are used to generate the spot pattern. The field stop (FS2) serves as a low-pass filter on the phase, which can improve the performance of the SHWS by reducing the high spatial frequency components of the aberrations [28]. The diameter of FS2 is 400 µm and the corresponding cut-off spatial frequency is 1.58 cycles/mm.

The spot pattern recorded by the SHWS is fed into the trained CNN model, and the predicted Zernike coefficients are used to reconstruct the wavefront. To evaluate the aberration detection capability of the LSHWS, the phase residual (the aberration minus the predicted wavefront) is loaded onto the SLM. Then the PSF on the CMOS camera and the spot pattern on the SHWS are recorded. In addition, the system is also compatible with the traditional SHWS (TSHWS), so the wavefront compensation results of the two methods, such as the RMS error in phase residuals and the Strehl ratio of the PSFs, can be compared.

It should be mentioned that the objective used (OBJ2) is corrected for a 0.17 mm coverslip. Therefore, while capturing the datasets, a glass coverslip is placed between OBJ1 and OBJ2 for better imaging results.

#### 2.2 Datasets for model training and testing

The datasets for model training and testing consist of two parts: the randomly assigned 2^{nd}–120^{th} Zernike coefficient arrays and the corresponding spot patterns from the SHWS. The first Zernike mode, ‘piston’, is not included because it represents the mean value of the wavefront across the pupil and has no effect on the PSF. Each Zernike coefficient array generates a phase pattern through the Zernike polynomials, and the phase patterns are loaded on the SLM one by one to distort the wavefront across the pupil. For each phase pattern, the corresponding spot pattern on the SHWS is recorded by the camera. The coefficient amplitude ranges of the 2^{nd} to 120^{th} Zernike modes used for dataset generation are listed in Table 1.

It is worth noting that these coefficient amplitude ranges are based on data measured from actual biological samples (mouse oocyte, mouse blastocyst, the nematode *C. elegans* and mouse brain) and then expanded to generate the datasets [12,29]. The coefficient amplitude ranges used here are larger than those reported in previous works [21,23], and these large aberrations also challenge the performance of wavefront sensing approaches. If the aberrations of the measured sample exceed the coefficient ranges of the generated datasets, the aberrations can be measured by phase stepping interferometry to calculate the Zernike coefficient amplitudes, and the training datasets can then be updated [29].
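The dataset generation step described above (draw a random coefficient array, then turn it into a phase pattern through the Zernike polynomials) can be sketched as follows. This is only an illustration under stated assumptions: the text does not say which single-index ordering or normalization is used, so Noll ordering with unit-RMS normalization is assumed here, and the uniform amplitude limit of 0.5 (taken to be in radians) is a hypothetical placeholder for the per-mode ranges of Table 1.

```python
import numpy as np
from math import factorial

def noll_to_nm(j):
    """Map a single (Noll) mode index j >= 1 to radial order n and azimuthal order m."""
    n, rest = 0, j - 1
    while rest > n:
        n += 1
        rest -= n
    m = (-1) ** j * ((n % 2) + 2 * ((rest + (n + 1) % 2) // 2))
    return n, m

def zernike(j, rho, theta):
    """Noll-normalized Zernike polynomial Z_j evaluated on polar coordinates."""
    n, m = noll_to_nm(j)
    am = abs(m)
    radial = np.zeros_like(rho, dtype=float)
    for k in range((n - am) // 2 + 1):
        c = (-1) ** k * factorial(n - k) / (
            factorial(k) * factorial((n + am) // 2 - k) * factorial((n - am) // 2 - k))
        radial += c * rho ** (n - 2 * k)
    if m == 0:
        return np.sqrt(n + 1) * radial
    angular = np.cos(am * theta) if m > 0 else np.sin(am * theta)
    return np.sqrt(2 * (n + 1)) * radial * angular

def phase_from_coefficients(coeffs, size=64):
    """Sum a_j * Z_j for j = 2..(len(coeffs) + 1) on a unit-circle pupil (piston skipped)."""
    y, x = np.mgrid[-1:1:size * 1j, -1:1:size * 1j]
    rho, theta = np.hypot(x, y), np.arctan2(y, x)
    pupil = rho <= 1.0
    phase = np.zeros((size, size))
    for j, a in enumerate(coeffs, start=2):
        phase += a * zernike(j, rho, theta)
    return np.where(pupil, phase, 0.0), pupil

rng = np.random.default_rng(0)
amplitude_limit = np.full(119, 0.5)   # hypothetical; the real ranges are those of Table 1
coeffs = rng.uniform(-amplitude_limit, amplitude_limit)   # one 2nd-120th coefficient array
phase, pupil = phase_from_coefficients(coeffs)            # pattern that would go to the SLM
```

In an actual acquisition loop, each such phase pattern would be displayed on the SLM and paired with the spot pattern recorded by the SHWS camera.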

In this paper, 40,960 training datasets, 8,192 validation datasets and 2,048 testing datasets were generated for model training and testing. The coefficient arrays in these three groups are all different from each other. The spot pattern acquisition was performed on a personal computer (PC, Intel Core i7 4770K 3.50 GHz, Kingston 16GB, NVIDIA GeForce GTX 970) with the optical system illustrated in Fig. 1. To enhance the robustness of the model, around 8% of the SHWS patterns in the training datasets were replaced by patterns with the same coefficient arrays but recorded under different exposures.

#### 2.3 Neural network architecture

The CNN architecture for LSHWS training is illustrated in Fig. 2(a). The network contains five convolutional layers and three fully connected layers, which are indicated in orange and purple, respectively. This network is mainly based on AlexNet, which was initially used for ImageNet classification [30]. Although several newer and deeper networks have been proposed to implement image-based wavefront sensing [20,22], AlexNet is still capable of Zernike coefficient regression [21]. Its smaller number of parameters shortens the training and prediction processes, which offers a good compromise between simplicity and performance for high-speed wavefront sensing.

The spot patterns from the SHWS (1232×1232 pixels) were down sampled to 256×256 pixels. It should be mentioned that an SHWS pattern with a large pixel number could reduce the centroid positioning error of the TSHWS. As for the LSHWS, if a camera with fewer pixels is used in the SHWS, the down sampling process can be omitted. The first convolutional layer filters the input pattern with 32 kernels of size 5×5. The output of this layer is fed to the second convolutional layer and filtered with 32 kernels of size 5×5. The next three convolutional layers filter the output of the preceding layers with 64 kernels of size 3×3. The results from the fifth convolutional layer are flattened and sent to the fully connected layers, which have 512, 512 and 119 neurons, respectively. The last fully connected layer outputs 119 parameters, which are the predicted Zernike mode coefficients (excluding ‘piston’). The first, second and fifth convolutional layers include max-pooling operations (size 3×3), which reduce the amount of calculation [31]. A Rectified Linear Unit (ReLU) activation function is added to each of the first five layers to introduce nonlinearity between the layers of the neural network, which also reduces the training time of the model [32].
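To get a feel for the size of this layer stack, the feature-map sizes and the number of trainable parameters can be traced by hand. The sketch below does this in plain Python rather than in Keras. Several details are assumptions that the text does not state: a single-channel 256×256 input, stride-1 convolutions without padding, and non-overlapping 3×3 max-pooling applied after the convolution. Different padding or stride choices would change the numbers.

```python
# (number of kernels, kernel size, followed by 3x3 max-pooling?)
CONV_LAYERS = [(32, 5, True), (32, 5, True), (64, 3, False), (64, 3, False), (64, 3, True)]
DENSE_LAYERS = [512, 512, 119]   # the last layer outputs the 2nd-120th coefficients

def conv_out(size, kernel):
    """Output width of a stride-1 convolution without padding (assumed)."""
    return size - kernel + 1

def pool_out(size, window=3):
    """Output width of non-overlapping max-pooling (assumed stride = window)."""
    return size // window

def trace_network(input_size=256, channels=1):
    """Return final feature-map width, flattened length and trainable parameter count."""
    size, params = input_size, 0
    for kernels, k, pooled in CONV_LAYERS:
        params += kernels * (k * k * channels + 1)   # weights + one bias per kernel
        size = conv_out(size, k)
        if pooled:
            size = pool_out(size)
        channels = kernels
    flattened = size * size * channels
    features = flattened
    for units in DENSE_LAYERS:
        params += units * (features + 1)             # weights + one bias per neuron
        features = units
    return size, flattened, params

final_size, flattened, total_params = trace_network()
print(final_size, flattened, total_params)
```

Under these assumptions most of the parameters sit in the first fully connected layer, which is typical of AlexNet-style networks.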

#### 2.4 The processes of LSHWS method

The LSHWS method proposed here has two stages, model training and model testing, as shown in Fig. 2(b). During training, a series of spot patterns with the corresponding target Zernike coefficients is sent into the network. The mean squared error (MSE) between the predicted and the target Zernike coefficients is chosen as the loss function. In each training epoch, the MSE is minimized by iterative back-propagation through adaptive moment estimation (ADAM) based optimization with an adaptive learning rate, which updates the weights and biases of the network [33]. During testing, one or more spot patterns can be sent into the model, and the predicted Zernike coefficients can be further transformed into a wavefront distribution through the Zernike polynomials. Figure 2(c) compares the Zernike mode amplitudes of the aberration and the predicted wavefront in Fig. 2(b). Both the mode amplitudes predicted by the model and the corresponding wavefront are close to the target.
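The loss function and the reconstruction step described in this section can be written compactly. The symbols are introduced here for illustration only and do not appear in the original text: $a_j$ is the target coefficient of the $j$-th Zernike mode, $\hat{a}_j$ is the coefficient predicted by the network, $Z_j$ is the $j$-th Zernike polynomial, and $\hat{\phi}$ is the reconstructed wavefront. For a single spot pattern, assuming the error is averaged over the 119 output modes:

$$
\mathcal{L}_{\mathrm{MSE}} = \frac{1}{119}\sum_{j=2}^{120}\left(\hat{a}_j - a_j\right)^2,
\qquad
\hat{\phi}(\rho,\theta) = \sum_{j=2}^{120}\hat{a}_j\, Z_j(\rho,\theta)
$$

During training, this loss would additionally be averaged over the patterns in each batch.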

#### 2.5 Details of network implementation

The network model for LSHWS is implemented using the Keras framework (with TensorFlow backend, GPU version 1.12.0) based on Python 3.6.8. The model was trained on a desktop workstation (Intel Core i9-7920X CPU @ 2.90 GHz, Kingston 128GB, NVIDIA RTX 2080 Ti) and tested on the PC described in Section 2.2. Once the network is trained, it can be transferred to other computers for wavefront prediction. The training process took about 59 minutes for 50 epochs with a batch size of 32. The accuracy of the trained model is 95.56% and the MSE of the 2^{nd}–120^{th} Zernike coefficients is ∼0.0246. When testing the model on the PC, the wavefront prediction time is 10.9 ms per pattern. If a higher-performance computer is used, the prediction speed of LSHWS could be further improved. In addition, the accuracy of the model could be improved if more training datasets are generated to cover the sample space of Zernike coefficients.

## 3. Results

#### 3.1 Experimental demonstration

In this paper, a learning-based SHWS named LSHWS is proposed to achieve high-order aberration detection. Unlike the traditional SHWS, our method does not require image segmentation, centroid positioning or centroid displacement calculation. To compare the wavefront detection performance of this method with that of the traditional SHWS, we reproduced the modal-based SHWS following the previous work [34]. Here, the wavefront detection time of the modal-based SHWS (traditional SHWS) is 17.6 ms per pattern; the 10.9 ms required by LSHWS is ∼38.1% shorter.
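The ∼38.1% figure corresponds to the reduction in per-pattern processing time relative to the traditional method, which the short calculation below makes explicit. The variable names are illustrative. The equivalent speed-up factor and frame rate are derived quantities that the paper does not report.

```python
t_tshws_ms = 17.6   # modal-based (traditional) SHWS, per pattern
t_lshws_ms = 10.9   # learning-based SHWS, per pattern

# Fraction of the TSHWS processing time saved by LSHWS
time_saved = (t_tshws_ms - t_lshws_ms) / t_tshws_ms

# The same comparison expressed as a throughput ratio and a maximum sensing rate
speedup = t_tshws_ms / t_lshws_ms
lshws_rate_hz = 1000.0 / t_lshws_ms

print(f"time saved: {time_saved:.1%}, speed-up: {speedup:.2f}x, rate: {lshws_rate_hz:.1f} Hz")
```

Here `time_saved` evaluates to about 0.381, matching the quoted percentage.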

Figure 3 gives one comparison result of wavefront sensing capability of TSHWS and LSHWS. All the SHWS patterns were captured under the same exposure. Before the experiments, the ideal SHWS pattern without aberration was recorded as a centroid reference for wavefront detection through TSHWS. A set of Zernike coefficients randomly chosen from the testing datasets was used to form the aberration phase-mask. The aberration phase-mask was loaded on the SLM, which distorted the spots on the SHWS, as shown in Fig. 3(a). The subgraph ‘I’ in Fig. 3(e) shows the used aberration phase-mask.

To calculate the aberration through TSHWS, the sub-apertures of this distorted SHWS pattern were segmented according to the ideal pattern. Then the centroid position of each spot was determined to calculate its displacements in *x* and *y* directions. Finally, the local derivative of each sub-aperture could be computed through modal based approaches. The subgraph ‘II’ in Fig. 3(e) shows the wavefront detected through TSHWS. The phase residual after compensation was loaded on the SLM and the SHWS pattern after compensating is presented in Fig. 3(b). When implementing wavefront sensing through LSHWS, the distorted SHWS pattern was first down sampled to 256×256 size and then sent into the trained model. The wavefront was reconstructed with the predicted coefficients, as shown in subgraph ‘III’ of Fig. 3(e). The phase residual after compensation was also loaded on the SLM and the SHWS pattern after compensating is shown in Fig. 3(c). From the macroscopic comparison of SHWS patterns, we can find that the distribution of the spot arrays after compensating with the wavefront measured by LSHWS is more uniform and neater.
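The segmentation and centroiding steps that the traditional pipeline needs, and that LSHWS skips, can be illustrated as follows. This is a generic centre-of-mass sketch on a regular grid of square sub-apertures, not necessarily the algorithm of Ref. [34]. The function names and the synthetic 4×4 test pattern are hypothetical.

```python
import numpy as np

def centroid(cell):
    """Intensity-weighted centre of mass (row, col) of one sub-aperture image."""
    total = cell.sum()
    rows, cols = np.indices(cell.shape)
    return (rows * cell).sum() / total, (cols * cell).sum() / total

def spot_displacements(pattern, reference, n_lenslets):
    """Segment both patterns into n x n cells and return per-cell (dy, dx) in pixels."""
    step = pattern.shape[0] // n_lenslets
    shifts = np.zeros((n_lenslets, n_lenslets, 2))
    for i in range(n_lenslets):
        for j in range(n_lenslets):
            window = (slice(i * step, (i + 1) * step), slice(j * step, (j + 1) * step))
            cy, cx = centroid(pattern[window])
            ry, rx = centroid(reference[window])
            shifts[i, j] = (cy - ry, cx - rx)
    return shifts

# Synthetic check: one bright pixel per 8x8 cell, shifted by 1 row and 2 columns
reference = np.zeros((32, 32))
reference[4::8, 4::8] = 1.0
distorted = np.roll(reference, (1, 2), axis=(0, 1))
shifts = spot_displacements(distorted, reference, n_lenslets=4)
```

Each displacement, multiplied by the pixel size and divided by the micro-lens focal length, gives the local wavefront slope. A modal reconstruction would then fit the Zernike coefficients to these slopes, typically by least squares.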

To further compare the SHWS patterns before and after compensation, we chose four spots and compared their details. The spots were numbered and boxed with white dotted rectangles, as shown in Fig. 3(a). Figure 3(d) lists the numbered spots before and after compensation. The spots after compensating with the wavefront detected by LSHWS are clearly tighter, and the displacements of the centroids are well corrected. We plotted the intensity distributions of the spots to compare the compensation results. The direction of the profiles is indicated with green arrows in Fig. 3(a). The profiles show that the spots with LSHWS based compensation have higher intensities and more uniform distributions. These comparisons indicate that the wavefront detected by LSHWS is closer to the original aberration.

It is necessary to evaluate the LSHWS based wavefront detection with statistical results. Here, we quantitatively evaluated the wavefront detection performance of LSHWS by calculating the RMS error in phase residuals and the Strehl ratio of the PSFs, which are widely used to assess the performance of an adaptive optical correction system.

Figure 4 shows the comparison results of two test groups, which gives a more intuitive comparison. In Figs. 4(a) and 4(b), the upper subgraphs are the quiver plots of the SHWS spots before and after compensation, and the lower subgraphs are the corresponding PSFs. The aberrations and the phase residuals are inset in the PSF subgraphs. The PSFs are captured with the same exposure. The comparison shows that the wavefront detected by LSHWS better corrects the displacements of the SHWS spots, and better focusing can be observed after the corrections. In terms of phase residuals, the LSHWS based approach reduces the phase residuals more effectively than the TSHWS based wavefront detection, which makes an enhancement of the system's Strehl ratio possible. The RMS errors of the phase residuals of TSHWS and LSHWS in Fig. 4(a) are 0.1873 λ and 0.1280 λ, respectively, and those in Fig. 4(b) are 0.1822 λ and 0.1187 λ, respectively. Figures 4(c) and 4(d) provide the subtracted Zernike mode amplitudes to compare the differences between the two wavefront sensing approaches (LSHWS minus aberration vs. TSHWS minus aberration). The differences in the coefficients measured by LSHWS are clearly lower than those measured by TSHWS, especially for modes with large amplitudes or higher orders. The central intensity profiles of the PSFs shown in Figs. 4(a) and 4(b) are presented in Figs. 4(e) and 4(f), respectively. The peak intensities of the PSFs with LSHWS based wavefront compensation are higher than those with the TSHWS based approach, which means that the wavefronts detected by LSHWS offer a better compensation result than those detected by TSHWS. To further investigate the wavefront sensing accuracy of these two methods, 100 testing datasets were used to calculate the RMS error of the phase residuals in the range of [−π, π] after compensation.

The RMS errors of the phase residuals after compensating with the two approaches are shown in Fig. 4(g). The mean RMS errors of the TSHWS based method and the LSHWS based method are 0.1882λ (∼0.0183λ standard deviation) and 0.1119λ (∼0.0163λ standard deviation), respectively. The mean RMS error of the phase residuals from the LSHWS based method is 40.54% lower than that from the TSHWS based method, which means that the wavefront detected by LSHWS is more suitable for reducing the phase residuals when compensating high-order aberrations. Relative to the mean, the standard deviation of the RMS error is larger for the LSHWS based method than for the TSHWS based method, which may be because the training datasets do not cover the whole sample space (Zernike amplitude range). Once the training datasets are enriched, the phase residuals and standard deviations can be further reduced.
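A minimal sketch of this metric is given below, assuming the residual is the aberration minus the detected wavefront, wrapped to [−π, π] and evaluated over the pupil only, with the RMS expressed in waves (rad divided by 2π). Whether the authors removed the mean (piston) of the residual before taking the RMS is not stated, so it is left in here. All function names are illustrative.

```python
import numpy as np

def wrap_to_pi(phase):
    """Wrap phase values (rad) into the interval (-pi, pi]."""
    return np.angle(np.exp(1j * phase))

def rms_residual_in_waves(aberration, estimate, pupil):
    """RMS of the wrapped residual inside the pupil, in units of the wavelength."""
    residual = wrap_to_pi(aberration - estimate)[pupil]
    return np.sqrt(np.mean(residual ** 2)) / (2 * np.pi)

def relative_reduction(reference, value):
    """Fractional reduction of `value` with respect to `reference`."""
    return (reference - value) / reference

# Synthetic check: an estimate that is off by a uniform 0.2*pi rad, i.e. 0.1 waves
y, x = np.mgrid[-1:1:64j, -1:1:64j]
pupil = np.hypot(x, y) <= 1.0
aberration = 3.0 * (x ** 2 + y ** 2)
estimate = aberration - 0.2 * np.pi
demo_rms = rms_residual_in_waves(aberration, estimate, pupil)

# Mean RMS errors reported above for TSHWS and LSHWS (in waves)
reduction = relative_reduction(0.1882, 0.1119)
```

With the reported means, `reduction` evaluates to about 0.4054, which is the 40.54% quoted in the text.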

To further investigate the wavefront compensation capability of LSHWS, we calculated the correlation coefficient of the SHWS patterns and the Strehl ratio of PSFs before and after correction. Twenty sets of test data were used for statistical analysis, as shown in Table 2. The mean values and standard deviations of the correlation coefficient and Strehl ratio are given. While calculating the correlation coefficients of the SHWS patterns, the ideal SHWS pattern without aberrations was used as a reference. From the listed data we can find that the correlation coefficient of LSHWS corrected patterns is ∼5.13% higher than that corrected with TSHWS. It is interesting to find that the correlation coefficient from LSHWS corrected patterns is close to the model accuracy (95.56%), which indirectly indicates that the accuracy of SHWS based wavefront detection depends mainly on the calculation of centroid displacements. The higher the correlation coefficient between the SHWS patterns and the ideal pattern, the higher the accuracy of the wavefront detection method.

In this paper, the Strehl ratio of PSFs is defined as the ratio of the maximum intensity of the PSF after wavefront compensation to that of the ideal PSF of the optical system. The Strehl ratios of distorted PSFs are also calculated for comparison. From Table 2 we can find that the mean Strehl ratio with LSHWS based method is ∼27.31% higher than that with TSHWS based method, which means the aberrations of the system can be well corrected by LSHWS detected wavefront.
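The two metrics of Table 2 can be sketched as below. The Strehl ratio follows the definition just given (ratio of peak intensities, which presumes equal exposure for both PSFs). For the correlation coefficient, the Pearson coefficient between pixel values is assumed, since the text does not name a specific definition. The FFT-based PSF model is only a synthetic stand-in for the PSFs recorded by the CMOS camera, and all function names are illustrative.

```python
import numpy as np

def strehl_ratio(psf, ideal_psf):
    """Peak intensity of the measured PSF divided by that of the ideal PSF."""
    return psf.max() / ideal_psf.max()

def pattern_correlation(pattern, reference):
    """Pearson correlation coefficient between two images (assumed definition)."""
    return np.corrcoef(pattern.ravel(), reference.ravel())[0, 1]

def psf_from_phase(phase, pupil, pad=4):
    """Simulated focal-plane intensity for a pupil phase (rad), via a zero-padded FFT."""
    field = pupil * np.exp(1j * phase)
    n = pupil.shape[0] * pad
    return np.abs(np.fft.fftshift(np.fft.fft2(field, s=(n, n)))) ** 2

# Synthetic demonstration with a defocus-like residual phase
y, x = np.mgrid[-1:1:32j, -1:1:32j]
pupil = (np.hypot(x, y) <= 1.0).astype(float)
ideal_psf = psf_from_phase(np.zeros_like(pupil), pupil)
aberrated_psf = psf_from_phase(2.0 * (x ** 2 + y ** 2), pupil)

demo_strehl = strehl_ratio(aberrated_psf, ideal_psf)
demo_corr = pattern_correlation(aberrated_psf, ideal_psf)
```

In the experiment, `ideal_psf` would be the PSF recorded without any aberration on the SLM, and the reference for the correlation would be the ideal SHWS pattern.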

#### 3.2 Experimental validation on real biological tissue

Once the neural network model for LSHWS was trained, it is necessary to validate its aberration detection performance on real biological tissue. As mentioned in section 2.2, the model is trained for high-order aberration detection and the Zernike mode coefficients in training datasets are referenced and expanded from real biological samples, such as mouse brain.

To better demonstrate the aberration detection performance of these two approaches (TSHWS and LSHWS), we prepared a mouse brain slice with 300 µm thickness (sandwiched by two coverslips). The protocol for mouse brain slice preparation has been described in our previous work [21].

Figure 5 provides the experimental validation of aberration detection on the cortex of a real mouse brain slice (300 µm thickness) with TSHWS and LSHWS. The SHWS patterns (above) and the corresponding PSFs (below) before and after compensation are presented in Figs. 5(a)–5(c).

To better compare the compensation results, the intensity profiles of the PSFs are provided in Fig. 5(d). The curves are taken along the *x* direction through the pixel with the maximum intensity in each PSF. The Zernike mode coefficients detected by the two approaches in Fig. 5(b) are shown in Fig. 5(e). The corresponding detected wavefronts are shown in the insets.

Due to light absorption, it is not straightforward to compare the Strehl ratios of the PSFs in this situation. As an alternative, we calculated the PBRs of the corrected foci. Here, the PBR is defined as the ratio of the peak intensity (the intensity of the pixel with the maximum value) to the average background intensity outside the focal volume (one Airy disk). The PBRs of the corrected foci in Fig. 5(a) are 24.30 (TSHWS) and 27.85 (LSHWS), those in Fig. 5(b) are 17.40 (TSHWS) and 23.39 (LSHWS), and those in Fig. 5(c) are 13.69 (TSHWS) and 22.21 (LSHWS). The corresponding improvements in PBR of LSHWS over TSHWS are 14.61%, 34.43% and 62.24%, respectively.
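The PBR definition and the quoted improvements can be checked with a few lines. In this sketch the Airy disk is assumed to be centred on the brightest pixel, and its radius in pixels (`airy_radius_px`) is a hypothetical parameter that depends on the imaging system. The PBR pairs are the values reported above.

```python
import numpy as np

def peak_to_background(image, airy_radius_px):
    """Peak intensity divided by the mean intensity outside one Airy disk."""
    peak = np.unravel_index(np.argmax(image), image.shape)
    rows, cols = np.indices(image.shape)
    outside = np.hypot(rows - peak[0], cols - peak[1]) > airy_radius_px
    return image[peak] / image[outside].mean()

def improvement_percent(pbr_tshws, pbr_lshws):
    """Relative PBR gain of LSHWS over TSHWS, in percent."""
    return (pbr_lshws / pbr_tshws - 1.0) * 100.0

# Synthetic focus: flat background of 1 with a single peak of 20
image = np.ones((21, 21))
image[10, 10] = 20.0
demo_pbr = peak_to_background(image, airy_radius_px=2)

# (TSHWS, LSHWS) PBRs reported for Figs. 5(a)-5(c)
reported = [(24.30, 27.85), (17.40, 23.39), (13.69, 22.21)]
gains = [improvement_percent(t, l) for t, l in reported]
```

The entries of `gains` reproduce the 14.61%, 34.43% and 62.24% improvements stated in the text to within rounding.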

The comparison shows that the distorted PSFs re-converge to the center, which means that the wavefronts detected by both TSHWS and LSHWS contribute to the aberration compensation, with the wavefront detected by LSHWS performing better in aberration correction. There are still some slight displacements in the corrected PSFs after compensating with the two approaches, which could be rapidly corrected by additional tip-tilt corrections [20,21]. The coefficients detected by the two approaches have similar amplitudes in some low-order modes, but most of the amplitudes in the high-order modes are different. Although the network was trained within the set Zernike coefficient ranges, it is capable of detecting modes with amplitudes larger than those set for the training datasets, such as the 4^{th} and 47^{th} Zernike modes in Fig. 5(e).

Due to the optical inhomogeneities of the biological tissue, when aberration detection is implemented on the cortex of a mouse brain slice, the aberration correction results show different improvements with LSHWS and TSHWS. To better compare the aberration detection ability of the two approaches, we calculated the PBRs of the corrected foci from thirty measurements and analyzed the improvement in PBR of LSHWS over TSHWS; the results are illustrated in Fig. 6. Figure 6(a) shows that the PBRs of the foci corrected with the two methods differ slightly in each group and that the foci corrected with LSHWS have a higher PBR than those corrected with TSHWS. Figure 6(b) presents the histogram of the improvement in PBR of LSHWS over TSHWS. The improvement in PBR varies from 4.59% to 77.68% and the median improvement lies within the range of 30% to 40%. These results indicate that the aberration detected by LSHWS gives a better correction performance on the cortex of the mouse brain and that the PBR of the corrected foci can be well improved.

## 4. Discussions and conclusion

The SHWS is an important tool for measuring optical aberrations in real time. However, when detecting the aberrations introduced by turbid media, such as biological tissues, the high-order aberrations distort the spot patterns on the SHWS, which degrades the accuracy of centroid positioning and limits the reliability of high-order aberration detection. In this paper, we propose a learning-based SHWS to achieve detection of large high-order aberrations. By analyzing the input SHWS pattern, the trained network can predict the Zernike mode coefficients up to the 120^{th} mode without image segmentation or centroid positioning.

Compared with the traditional modal-based SHWS, LSHWS offers a higher prediction speed for real-time wavefront measurement. The experimental results show that the average time LSHWS takes to predict the first 120 Zernike mode coefficients is 10.9 ms on a personal computer, which is ∼38.1% shorter than the time required by TSHWS (17.6 ms). The increased detection speed could reduce the latency of an AO system and improve the imaging performance.

With a full SHWS pattern as input, LSHWS can learn features from the distorted spots, such as the displacements and the intensity distributions of the spots. These features enable LSHWS to detect high-order aberrations with higher accuracy. The experimental results on wavefront detection indicate that the wavefronts detected by LSHWS have a lower RMS error in the phase residuals, which is 40.54% lower than that with the modal-based SHWS. The Strehl ratio of the PSFs after compensation with LSHWS is improved by ∼27.31% compared to the traditional method. The experimental validation on the mouse brain cortex proves that the aberration detected by LSHWS gives a better correction performance, and the statistical results indicate that the median improvement in PBR with LSHWS is ∼30% to 40% compared with TSHWS. With the reduced detection error and the improved Strehl ratio or PBR, LSHWS could enhance the imaging quality, for example the signal-to-background ratio or the image sharpness.

Compared with the reported PSF image based wavefront sensing approaches, LSHWS does not need to detect the PSF inside or outside the sample, which makes it more suitable for scanning microscopes in biological imaging applications. Furthermore, LSHWS is compatible with common SHWS systems, so it could be conveniently combined with an existing AO system.

In this paper, the network for LSHWS was trained with high-order aberration datasets. If only lower-order aberrations are of concern, as long as the Zernike coefficients of an unknown aberration are within or close to the training ranges, the well-trained network can still offer a good prediction. TSHWS and LSHWS perform comparably in this situation. Moreover, when the Zernike mode coefficients of the detected aberration are close to the ranges set for the training datasets, LSHWS can still offer acceptable prediction results in real biological tissue. If the coefficients of the detected aberration exceed the ranges set for the training datasets, the performance of LSHWS might be degraded and further training is required. In this case, the training datasets should be enriched and a deeper neural network could be considered to further improve the performance of the LSHWS.

In summary, LSHWS offers a simplified and faster approach to high-order aberration detection. With the help of a neural network, the wavefront sensing ability of SHWS is improved. LSHWS is compatible with existing AO systems and could be further applied to optical stimulation or imaging in biological applications.

## Funding

National Natural Science Foundation of China (31571110, 61735016, 81771877); Natural Science Foundation of Zhejiang Province (LZ17F050001); Zhejiang Lab (2018EB0ZX01); Fundamental Research Funds for the Central Universities.

## Acknowledgments

We thank Yiye Zhang and Biwei Zhang for their helpful discussions on the implementation of neural networks and Younong Li and Xinpei Zhu for their help on biological sample preparation.

## Disclosures

The authors declare no conflicts of interest.

## References

**1.** N. Ji, “Adaptive optical fluorescence microscopy,” Nat. Methods **14**(4), 374–380 (2017).

**2.** K. Wang, D. E. Milkie, A. Saxena, P. Engerer, T. Misgeld, M. E. Bronner, J. Mumm, and E. Betzig, “Rapid adaptive optical recovery of optimal resolution over large volumes,” Nat. Methods **11**(6), 625–628 (2014).

**3.** X. Zhu, L. Huang, Y. Zheng, Y. Song, Q. Xu, J. Wang, K. Si, S. Duan, and W. Gong, “Ultrafast optical clearing method for three-dimensional imaging with cellular resolution,” Proc. Natl. Acad. Sci. U. S. A. **116**(23), 11480–11489 (2019).

**4.** M. J. Booth, “Adaptive optics in microscopy,” Philos. Trans. R. Soc., A **365**(1861), 2829–2843 (2007).

**5.** J. W. Hardy, *Adaptive optics for astronomical telescopes* (Oxford University, 1998), Vol. 16.

**6.** J. Z. Liang, D. R. Williams, and D. T. Miller, “Supernormal vision and high-resolution retinal imaging through adaptive optics,” J. Opt. Soc. Am. A **14**(11), 2884–2892 (1997).

**7.** B. C. Platt and R. Shack, “History and principles of Shack-Hartmann wavefront sensing,” J Refract Surg **17**(5), S573–S577 (2001).

**8.** P. J. Rodrigo, R. L. Eriksen, V. R. Daria, and J. Gluckstad, “Shack-Hartmann multiple-beam optical tweezers,” Opt. Express **11**(3), 208–214 (2003).

**9.** O. Azucena, J. Crest, S. Kotadia, W. Sullivan, X. D. Tao, M. Reinig, D. Gavel, S. Olivier, and J. Kubby, “Adaptive optics wide-field microscopy using direct wavefront sensing,” Opt. Lett. **36**(6), 825–827 (2011).

**10.** X. D. Tao, B. Fernandez, O. Azucena, M. Fu, D. Garcia, Y. Zuo, D. C. Chen, and J. Kubby, “Adaptive optics confocal microscopy using direct wavefront sensing,” Opt. Lett. **36**(7), 1062–1064 (2011).

**11.** X. D. Tao, O. Azucena, M. Fu, Y. Zuo, D. C. Chen, and J. Kubby, “Adaptive optics microscopy with direct wavefront sensing using fluorescent protein guide stars,” Opt. Lett. **36**(17), 3389–3391 (2011).

**12.** K. Wang, W. Sun, C. T. Richie, B. K. Harvey, E. Betzig, and N. Ji, “Direct wavefront sensing for high-resolution in vivo imaging in scattering tissue,” Nat. Commun. **6**(1), 7276 (2015).

**13.** Q. Li, M. Reinig, D. Kamiyama, B. Huang, X. Tao, A. Bardales, and J. Kubby, “Woofer–tweeter adaptive optical structured illumination microscopy,” Photonics Res. **5**(4), 329–334 (2017).

**14.** T. L. Liu, S. Upadhyayula, D. E. Milkie, V. Singh, K. Wang, I. A. Swinburne, K. R. Mosaliganti, Z. M. Collins, T. W. Hiscock, J. Shea, A. Q. Kohrman, T. N. Medwig, D. Dambournet, R. Forster, B. Cunniff, Y. Ruan, H. Yashiro, S. Scholpp, E. M. Meyerowitz, D. Hockemeyer, D. G. Drubin, B. L. Martin, D. Q. Matus, M. Koyama, S. G. Megason, T. Kirchhausen, and E. Betzig, “Observing the cell in its native state: Imaging subcellular dynamics in multicellular organisms,” Science **360**(6386), eaaq1392 (2018).

**15.** D. L. Fried, “Least-square fitting a wave-front distortion estimate to an array of phase-difference measurements,” J. Opt. Soc. Am. **67**(3), 370–375 (1977).

**16.** R. Cubalchini, “Modal wave-front estimation from phase derivative measurements,” J. Opt. Soc. Am. **69**(7), 972–977 (1979).

**17.** D. R. Neal, J. Copland, and D. Neal, “Shack-Hartmann wavefront sensor precision and accuracy,” in *Advanced Characterization Techniques for Optical, Semiconductor, and Data Storage Components*, A. Duparre and B. Singh, eds. (Spie-Int Soc Optical Engineering, Bellingham, 2002), pp. 148–160.

**18.** M. Thier, R. Paris, T. Thurner, and G. Schitter, “Low-Latency Shack-Hartmann Wavefront Sensor Based on an Industrial Smart Camera,” IEEE Trans. Instrum. Meas. **62**(5), 1241–1249 (2013).

**19.** J. Mompean, J. L. Aragon, P. M. Prieto, and P. Artal, “GPU-based processing of Hartmann-Shack images for accurate and high-speed ocular wavefront sensing,” Futur. Gener. Comp. Syst. **91**, 177–190 (2019).

**20.** S. W. Paine and J. R. Fienup, “Machine learning for improved image-based wavefront sensing,” Opt. Lett. **43**(6), 1235–1238 (2018).

**21.** Y. Jin, Y. Zhang, L. Hu, H. Huang, Q. Xu, X. Zhu, L. Huang, Y. Zheng, H. L. Shen, W. Gong, and K. Si, “Machine learning guided rapid focusing with sensor-less aberration corrections,” Opt. Express **26**(23), 30162–30171 (2018).

**22.** Y. Nishizaki, M. Valdivia, R. Horisaki, K. Kitaguchi, M. Saito, J. Tanida, and E. Vera, “Deep learning wavefront sensing,” Opt. Express **27**(1), 240–251 (2019).

**23.** Y. Zhang, C. Wu, Y. Song, K. Si, Y. Zheng, L. Hu, J. Chen, L. Tang, and W. Gong, “Machine learning based adaptive optics for doughnut-shaped beam,” Opt. Express **27**(12), 16871–16881 (2019).

**24.** H. Guo, N. Korablinova, Q. S. Ren, and J. Bille, “Wavefront reconstruction with artificial neural networks,” Opt. Express **14**(14), 6456–6462 (2006).

**25.** S. Barwick, “Detecting higher-order wavefront errors with an astigmatic hybrid wavefront sensor,” Opt. Lett. **34**(11), 1690–1692 (2009).

**26.** Z. Q. Li and X. Y. Li, “Centroid computation for Shack-Hartmann wavefront sensor in extreme situations based on artificial neural networks,” Opt. Express **26**(24), 31675–31692 (2018).

**27.** J. L. Fuentes, E. J. Fernández, P. M. Prieto, and P. Artal, “Interferometric method for phase calibration in liquid crystal spatial light modulators using a self-generated diffraction-grating,” Opt. Express **24**(13), 14159–14171 (2016).

**28.** M. Shaw, K. O’Holleran, and C. Paterson, “Investigation of the confocal wavefront sensor and its application to biological microscopy,” Opt. Express **21**(16), 19353–19362 (2013).

**29.** M. Schwertner, M. J. Booth, M. A. A. Neil, and T. Wilson, “Measurement of specimen-induced aberrations of biological samples using phase stepping interferometry,” J. Microsc. **213**(1), 11–19 (2004).

**30.** A. Krizhevsky, I. Sutskever, and G. E. Hinton, “ImageNet Classification with Deep Convolutional Neural Networks,” in International Conference on Neural Information Processing Systems, (Curran Associates, Inc., 2012), 1097–1105.

**31.** B. Graham, “Fractional max-pooling,” https://arxiv.org/abs/1412.6071.

**32.** K. Hara, D. Saito, and H. Shouno, “Analysis of function of rectified linear unit used in deep learning,” in 2015 International Joint Conference on Neural Networks (IJCNN), (IEEE, 2015), 1–8.

**33.** D. Kingma and J. Ba, “Adam: A Method for Stochastic Optimization,” https://arxiv.org/abs/1412.6980.

**34.** J. Antonello, “Optimisation-based wavefront sensorless adaptive optics for microscopy,” (Ph. D. thesis (Delft University of Technology), 2014).