High-resolution on-chip spatial heterodyne Fourier transform spectrometer based on artificial neural network and PCSBL reconstruction algorithm

Xiaojing Long; Zhuili Huang; Ye Tian; Jihe Du; Yufei Liu

doi:10.1364/OE.500758

1. Introduction

The interaction between light and matter provides the basis for investigating the properties of specific molecules and compounds, making optical spectroscopy widely used in multiple fields, such as materials characterization, environmental monitoring, satellite remote sensing, and chemical sensing [1–3]. While conventional benchtop spectrometers provide high resolution and broad operating bandwidths, their bulkiness and large-scale optical components limit their applicability for in situ analysis. Therefore, miniaturized on-chip optical spectrometers have emerged as a more suitable option for integrated, portable devices like smartphones and unmanned vehicles [2]. Moreover, the complementary metal oxide semiconductor (CMOS) platform’s compatibility and integration advantages could effectively reduce the fabrication difficulty and cost of on-chip spectrometers [4].

In recent decades, various approaches have been proposed to achieve miniaturized spectrometers. One of the most straightforward techniques is to spatially separate the spectral content of the incident spectrum at different wavelengths using dispersive elements like gratings, followed by recording the output spectral signals with a photodetector array [5–8]. Dispersive spectrometers could offer high spectral resolution and broad operating bandwidth through increased grating channels or optical path length. However, this approach suffers from drawbacks such as increased device footprint, insertion loss, manufacturing cost, decreased signal-to-noise ratio (SNR), and reduced optical throughput [2,9,10]. Alternatively, another technique employs narrowband filters, such as Fabry-Perot (FP) filters [11,12], linear variable filters (LVF) [13], meta-surface filters [14], micro-ring resonators (MRRs) [9,15], etc., to selectively transmit light at specific wavelengths. This technique avoids the need for long optical path lengths, making it advantageous for system miniaturization [2]. Nevertheless, similar to dispersive spectrometers, the improvement in resolution is achieved by increasing the number of channels, leading to a decrease in SNR and optical throughput. In the past decades, with the advancement in computational techniques, a new type of on-chip spectrometer, named the reconstructive spectrometer, has emerged. Reconstructive spectrometers can approximately reconstruct the incident spectrum with the assistance of a spectral retrieval algorithm [2]. This is achieved by sampling the incident signal with well-tailored optical elements (such as stratified waveguides [10], disordered photonic crystals [16,17], and spiral waveguides [18]) or well-tailored detectors (such as quantum dot filters [19], nanowires [20–22], and black phosphorus [23]) to obtain the unique response spectrum of each wavelength. Consequently, recovering the unknown incident spectrum reduces to solving a system of linear equations. Reconstructive spectrometers exhibit good robustness and tolerance for fabrication errors, but the spectral resolution is determined by the orthogonality of each channel’s response spectrum, posing challenges in design [10,24]. In addition, the materials used in some of these reconstructive spectrometer structures may not be compatible with CMOS or suitable for high-throughput wafer-level manufacturing techniques [25].

In contrast to the aforementioned types of spectrometers, the Fourier transform (FT) spectrometer has attracted widespread attention due to its structural simplicity, high SNR, large optical throughput, high resolution, and robustness against fabrication imperfections [26,27]. Advanced on-chip FT spectrometers can be categorized into three varieties according to how the optical path length (OPL) difference is modulated: the active scanning FT spectrometer, the stationary wave integrated Fourier transform (SWIFT) spectrometer, and the spatial heterodyne Fourier transform (SHFT) spectrometer [1]. The active scanning FT spectrometer employs either the thermo-optic effect to modulate the optical properties of the waveguide [28] or movable parts [29] to achieve different OPL differences. However, the enhancement in spectral resolution and operating bandwidth of active scanning spectrometers comes at the expense of extended measurement time and high power consumption. In the SWIFT spectrometer, photodetectors directly record the standing wave fringes in the waveguide, without the need for moving parts. However, the spectral resolution is limited by the sampling interval, necessitating electro-optic modulation [30,31]. In contrast, the SHFT spectrometer requires no moving parts, thermo-optic modulation, or electro-optic modulation. It is therefore more appealing owing to its stability, fast response, and ease of packaging.

Theoretically, the FT spectrometer can reconstruct the incident spectrum by applying the Fourier transform to the interference pattern obtained from the output of the interferometer(s). However, in practice, obtaining interference patterns for all OPL differences ranging from 0 to infinity is not feasible, which limits the spectral reconstruction accuracy. To enhance the performance of the FT spectrometer, a spectral retrieval algorithm is therefore necessary. The spectral retrieval algorithm based on the pseudo-inverse is widely used in numerous studies [1,27,32,33]. Furthermore, other algorithms have also been developed to enhance the reconstruction performance, including the convex (CVX) optimization algorithm [10], generalized cross-validation (GCV) [24], conditional generative adversarial network (cGAN) [34], compressive sensing (CS) [35], and the ‘elastic-D1’ regularized regression method [36,37]. However, for those retrieval approaches, either the reconstruction result is badly impacted by measurement noise, or a large amount of experimental data is required for model training.

In our previous work [38,39], we explored spectral retrieval approaches that accurately reconstruct the incident spectrum in the presence of measurement noise. In contrast to commonly used methods such as the pseudo-inverse method and the CVX algorithm, the pattern-coupled sparse Bayesian learning (PCSBL) algorithm that we used could reconstruct a single-peaked incident spectrum with a 0.5-nm full width at half maximum (FWHM) and a triple-peaked incident spectrum with a 3-nm interval. However, the reconstruction results exhibit a slight wavelength shift due to noise corruption. In addition, we also investigated the combination of the CVX algorithm and a tailored convolutional neural network (Unet) for reconstructing the incident spectrum with measurement noise. The simulation results demonstrate that, with the assistance of Unet, an array of 83 Mach-Zehnder interferometers (MZIs) can be utilized to reconstruct a narrow-peak incident signal with 1.38-nm FWHM. Although a large amount of simulation data is used for training the model to achieve improved reconstruction results, this work demonstrates the potential of artificial neural networks (ANN) in mitigating the impact of noise on spectral reconstruction. Consequently, in this work we adopt a spectral retrieval algorithm that combines PCSBL [39] with ANN to achieve accurate spectral reconstruction in the presence of measurement noise.

In this paper, we proposed and fabricated an integrated SHFT spectrometer on the SOI platform, employing a compact array of 16 MZIs with linearly increasing OPL differences. The experimental results show that, by leveraging the combined strengths of the PCSBL algorithm and ANN, our SHFT spectrometer demonstrates the capability to reconstruct narrowband incident signals with an FWHM of 0.5 nm and triple-peaked incident signals separated by a 3-nm interval across the wavelength range of 1530-1565 nm in the presence of measurement noise.

2. Operation principle

Figure 1 schematically illustrates the proposed on-chip SHFT spectrometer, which consists of a 16-channel optical power splitter and an MZI array. The optical power splitter consists of 15 cascaded 1 × 2 multimode interference (MMI) couplers, where the j-th stage employs $2^{j-1}$ MMIs to divide the light into $2^j$ channels. Thus, with a four-stage cascade, a total of $\sum_{j=1}^{4} 2^{j-1} = 15$ MMIs are used to divide the incident light into 16 channels with minimal power imbalance. The MZI array consists of 16 MZIs with linearly increasing path length differences, enabling the generation of wavelength-dependent spatial interference patterns at the output ports. In this device structure, the minimum path length difference is $\Delta {L_1} = 0$ µm, and the maximum path length difference is $\Delta {L_{16}} = 383.25$ µm. Fiber-to-chip coupling is achieved by grating couplers designed specifically for TE polarization. The waveguide cross-section selected for the SOI platform is 500 nm × 220 nm (width × height) to support the propagation of the TE mode. Moreover, both the buried oxide layer and the silica cladding layer have a thickness of 2 µm, resulting in a group index of ${n_g} = 4.05$ and an effective index of ${n_{eff}} = 2.45$.
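The splitter and delay-ladder arithmetic above can be checked with a few lines of Python. This is only a sketch: the uniform step between adjacent MZIs follows from the stated "linearly increasing" path length differences, while the last quantity, `fourier_limit_nm`, is an assumption added here (the commonly quoted estimate $\lambda_0^2/(n_g \Delta L_{max})$ evaluated at an assumed $\lambda_0$ = 1550 nm) and is not a figure reported in the text.

```python
# Splitter tree and delay ladder of the 16-MZI array (values from the text).
n_stages = 4
mmis_per_stage = [2 ** (j - 1) for j in range(1, n_stages + 1)]  # [1, 2, 4, 8]
n_mmis = sum(mmis_per_stage)      # total number of 1 x 2 MMIs
n_channels = 2 ** n_stages        # number of splitter outputs / MZIs

dL_max_um = 383.25
# Linearly increasing path length differences: dL_1 = 0 ... dL_16 = 383.25 um.
step_um = dL_max_um / (n_channels - 1)
dL_um = [i * step_um for i in range(n_channels)]

# Assumption (not from the text): textbook Fourier-limit estimate
# lambda0^2 / (n_g * dL_max), evaluated at lambda0 = 1550 nm.
n_g = 4.05
lambda0_um = 1.55
fourier_limit_nm = 1e3 * lambda0_um ** 2 / (n_g * dL_max_um)

print(n_mmis, n_channels, round(step_um, 2), round(fourier_limit_nm, 2))
```

Under these assumptions the step between adjacent MZIs is 25.55 µm, and the estimate comes out near 1.5 nm, which gives a rough sense of why a retrieval algorithm is needed to resolve 0.5-nm features.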

Fig. 1. The schematic diagram of the proposed on-chip SHFT spectrometer consisting of a 16-channel power splitter and an MZI array.

The spectral reconstruction approach is based on the following principles. Suppose the transmission function of the i-th MZI in the array is ${F_{i,\lambda }}$. For a monochromatic incident signal ${X_\lambda }$, the output power of the MZI ${P_{i,\lambda }}$ can be written as:

(1)$${F_{i,\lambda }} \times {X_\lambda } = {P_{i,\lambda }}$$

Therefore, for a given monochromatic input spectrum, the output ports of the MZI array exhibit a distinctive spatial power distribution. In the case of a polychromatic incident signal, it can be represented as a combination of multiple monochromatic signals. As a consequence, the output power ${P_i}$ at the i-th output port for the polychromatic incident signal $X(\lambda )$ can be expressed by its integral with the corresponding transmission function ${F_i}(\lambda )$:

(2)$$\int_{{\lambda _{min}}}^{{\lambda _{max}}} {F_i}(\lambda )X(\lambda )\,d\lambda = {P_i}$$

where ${\lambda _{min}}$ and ${\lambda _{max}}$ denote the minimum and maximum operating wavelength, respectively. The integral implies a continuous variation in wavelength. In practice, however, the spectrum can only be sampled at a finite number of wavelengths, so a proper discretization of the wavelength is required. After discretization, we can rewrite Eq. (2) as:

(3)$${F_{n \times m}}{X_{m \times 1}} = {P_{n \times 1}}$$

where m and n denote the number of sampling points of the optical spectrum and the number of output ports, respectively. ${X_{m \times 1}}$ and ${P_{n \times 1}}$ are column vectors representing an arbitrary polychromatic incident spectrum and detected output optical power of the MZI array, respectively. ${F_{n \times m}}$ is the transmission matrix of the MZI array, where each row represents the transmission of an individual MZI, and each column represents the spatial distribution of power values at the output ports for a specific monochromatic signal.
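As a concrete illustration of Eq. (3), the sketch below builds an idealized transmission matrix and applies it to a monochromatic input. It assumes lossless, perfectly balanced MZIs with a wavelength-independent effective index, so that $F_{i,k} = \frac{1}{2}[1 + \cos(2\pi n_{eff}\Delta L_i/\lambda_k)]$; the device in this paper uses an experimentally measured matrix instead. The function name `ideal_transmission_matrix` and the 0.1-nm grid are illustrative choices, not taken from the text.

```python
import numpy as np

def ideal_transmission_matrix(wavelengths_nm, dL_um, n_eff=2.45):
    """Idealized MZI array: F[i, k] = 0.5 * (1 + cos(2*pi*n_eff*dL_i / lambda_k)).

    Assumes lossless, balanced MZIs and a constant effective index.
    """
    lam_um = np.asarray(wavelengths_nm, dtype=float) * 1e-3
    dL = np.asarray(dL_um, dtype=float)[:, None]
    phase = 2.0 * np.pi * n_eff * dL / lam_um[None, :]
    return 0.5 * (1.0 + np.cos(phase))

wavelengths_nm = np.linspace(1530.0, 1565.0, 351)   # m = 351 samples (illustrative)
dL_um = np.linspace(0.0, 383.25, 16)                # n = 16 MZIs
F = ideal_transmission_matrix(wavelengths_nm, dL_um)

X = np.zeros(wavelengths_nm.size)
X[175] = 1.0          # monochromatic line at 1547.5 nm
P = F @ X             # Eq. (3): the n output powers
```

For a monochromatic input, `P` is simply the corresponding column of `F`, which is the "distinctive spatial power distribution" described above.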

Based on Eq. (3), mathematically, we could reconstruct an arbitrary incident spectrum X by multiplying the detected optical power P by the inverse of the transmission matrix ${F^{ - 1}}$, i.e., $X = {F^{ - 1}}P$. However, this inversion is ill-conditioned, so experimental noise severely corrupts the result [40,41]. Therefore, the spectral retrieval approach needs to take into account the impact of noise. Denoting the measurement noise as ω, we can rewrite Eq. (3) as:

(4)$${F_{n \times m}}{X_{m \times 1}} + \omega = {P_{n \times 1}}$$
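The sensitivity of a direct inversion can be illustrated numerically. The sketch below is not the authors' experiment: it reuses an idealized 16 × 351 matrix (lossless MZIs, constant $n_{eff}$), adds Gaussian noise with an assumed standard deviation of 0.01 according to Eq. (4), and applies the Moore-Penrose pseudo-inverse via `numpy.linalg.pinv`. All numerical settings are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Idealized 16 x 351 transmission matrix (assumption: lossless MZIs, constant n_eff).
wavelengths_um = np.linspace(1.530, 1.565, 351)
dL_um = np.linspace(0.0, 383.25, 16)
F = 0.5 * (1.0 + np.cos(2.0 * np.pi * 2.45 * dL_um[:, None] / wavelengths_um[None, :]))

X_true = np.zeros(351)
X_true[175] = 1.0                          # narrow line near 1547.5 nm
omega = rng.normal(0.0, 0.01, size=16)     # assumed Gaussian noise, std 0.01
P = F @ X_true + omega                     # Eq. (4)

X_pinv = np.linalg.pinv(F) @ P             # minimum-norm least-squares estimate
rel_err = np.linalg.norm(X_pinv - X_true) / np.linalg.norm(X_true)
cond_F = np.linalg.cond(F)

print(f"cond(F) = {cond_F:.3g}, relative error = {rel_err:.3g}")
```

With only 16 measurements for 351 unknowns, the pseudo-inverse returns the minimum-norm solution, which spreads energy over the whole band instead of recovering the narrow line. This is the kind of failure that motivates a noise-aware, sparsity-promoting estimator.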

The spectral retrieval process involves two main steps: first, the incident spectrum is pre-reconstructed using the PCSBL algorithm; then, the ANN is applied to improve the accuracy of the final reconstructed spectrum. In PCSBL [42,43], the sparsity patterns of adjacent elements within a signal are coupled with each other. By introducing suitable hyperparameters, the non-zero elements and the zero elements are each encouraged to cluster together, leading to the formation of a block structure without prior information. The hyperparameters (${\gamma _i}$) are estimated based on the maximum a posteriori criterion. Subsequently, the incident spectrum X can be estimated as:

(5)$$X = {({{F^T}F + {\lambda^{ - 1}}\mathrm{\Lambda }} )^{ - 1}}{F^T}P$$

where ${\lambda ^{ - 1}}$ denotes the variance of the measurement noise (this $\lambda$ is a noise parameter, not the wavelength), and ${F^T}$ represents the transpose of the transmission matrix F. The matrix $\mathrm{\Lambda }$ is diagonal, with j-th diagonal element ${d_{jj}} = {\gamma _j} + \beta {\gamma _{j - 1}} + \beta {\gamma _{j + 1}}$, in which $\beta $ is a parameter that indicates the correlation between adjacent signal elements.
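A minimal sketch of how Eq. (5) could be evaluated for fixed hyperparameters is given below. It covers only this single estimation step; the iterative hyperparameter estimation of PCSBL is not shown. The function names `coupled_diagonal` and `pcsbl_posterior_mean`, the boundary convention $\gamma_0 = \gamma_{m+1} = 0$, and the small random demo matrix are assumptions introduced for illustration.

```python
import numpy as np

def coupled_diagonal(gamma, beta):
    """Diagonal of Lambda: d_jj = gamma_j + beta*gamma_{j-1} + beta*gamma_{j+1}."""
    gamma = np.asarray(gamma, dtype=float)
    padded = np.concatenate(([0.0], gamma, [0.0]))  # assumption: gamma_0 = gamma_{m+1} = 0
    return padded[1:-1] + beta * padded[:-2] + beta * padded[2:]

def pcsbl_posterior_mean(F, P, gamma, beta, noise_var):
    """Evaluate Eq. (5) for fixed gamma; noise_var plays the role of lambda^{-1}."""
    A = F.T @ F + noise_var * np.diag(coupled_diagonal(gamma, beta))
    return np.linalg.solve(A, F.T @ P)   # solve instead of forming the inverse

# Tiny demo with a placeholder matrix (not measured data).
rng = np.random.default_rng(1)
F_demo = rng.random((4, 6))
X_demo = np.array([0.0, 0.0, 1.0, 0.8, 0.0, 0.0])
P_demo = F_demo @ X_demo
X_hat = pcsbl_posterior_mean(F_demo, P_demo, np.ones(6), beta=1.0, noise_var=1e-2)
```

Using `numpy.linalg.solve` rather than an explicit matrix inverse is a numerical design choice; it evaluates the same expression as Eq. (5).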

Figure 2 shows the flow chart of our proposed spectral reconstruction approach. In the simulation, a series of arbitrary incident signals, denoted as ${X_{sim}}$, was generated. Then, according to Eq. (4), by multiplying ${X_{sim}}$ with the transmission matrix F and adding a random noise vector $\omega $, we obtained the corresponding output optical power values, referred to as ${P_{sim}}$. The transmission matrix F was obtained through experimental measurements, which are described in detail in Section 4. Subsequently, we obtained the pre-reconstructed incident spectrum ${X_{pre\_rec}}$ according to Eq. (5). ${X_{pre\_rec}}$ and ${X_{sim}}$ are not identical because of the noise, which leads to reconstruction errors such as central wavelength shift and peak power shift. The ANN is then applied to mitigate the noise effect and enhance the accuracy of the reconstructed spectra.

Fig. 2. The flow chart of the proposed spectral reconstruction approach.

Generalizability and accuracy are two criteria to evaluate the performance of neural networks, but these two criteria usually need to be traded off, i.e., good generalizability leads to lower accuracy, and high accuracy means poor generalizability. Hence, we utilized a classification network first to categorize the pre-reconstructed spectra, based on the spectral shapes. Following the classification step, reconstruction neural networks were employed to calibrate the spectra based on their corresponding spectral shapes. Based on our laboratory experimental conditions, we classified a total of five types of spectral shapes according to the number of peaks and the FWHM. These categories included: type 1: single-peaked spectra with a 0.5-nm FWHM, type 2: single-peaked spectra with a 1-nm FWHM, type 3: single-peaked spectra with a 2-nm FWHM, type 4: double-peaked spectra with a 0.5-nm FWHM, and type 5: triple-peaked spectra with a 0.5-nm FWHM.

3. Artificial neural network training

3.1 Training data preparation

Typically, training a neural network requires a huge amount of training data. To avoid extensive experimentation and data collection, we leveraged simulations to generate the training data. To ensure that the simulated training data closely resembled the experimental data, we used the following procedure. To begin with, five different shapes of input spectra were generated in Matlab; all five types of spectra were available in experiments. We generated a total of 3200 pairs of simulation data covering the five types of spectral shapes mentioned above with different central wavelengths, and the random noise added to each pair of data was also different. The composition of the training data is presented in detail in Table 1. Figure 3 shows representative incident spectra used to generate the ANN training dataset. Subsequently, the output optical power values, ${P_{sim}}$, corresponding to each simulated input spectrum, ${X_{sim}}$, were obtained according to Eq. (4). In this step, the transmission matrix F was acquired through experimental measurements; the procedure is described in Section 4. Note that Eq. (4) contains a noise vector. Since the measurement noise is random, the noise vector added each time was also randomly generated. Following the generation of ${P_{sim}}$, the PCSBL algorithm was employed to obtain the pre-reconstructed spectra ${X_{pre\_rec}}$. This completed the collection of the data required for training the ANN.
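The data-generation step can be sketched as follows. The text does not state the line shape or the noise statistics, so Gaussian peaks and zero-mean Gaussian noise are assumptions here; the helper names `gaussian_spectrum` and `simulate_pair`, the grid, the noise level, and the random placeholder matrix standing in for the measured F are likewise illustrative.

```python
import numpy as np

def gaussian_spectrum(wavelengths_nm, centers_nm, fwhm_nm):
    """Sum of unit-height Gaussian peaks (assumed line shape) with the given FWHM."""
    sigma = fwhm_nm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    w = np.asarray(wavelengths_nm, dtype=float)[:, None]
    c = np.asarray(centers_nm, dtype=float)[None, :]
    return np.exp(-0.5 * ((w - c) / sigma) ** 2).sum(axis=1)

def simulate_pair(F, wavelengths_nm, centers_nm, fwhm_nm, noise_std, rng):
    """Return (X_sim, P_sim) following Eq. (4) with freshly drawn noise."""
    X_sim = gaussian_spectrum(wavelengths_nm, centers_nm, fwhm_nm)
    omega = rng.normal(0.0, noise_std, size=F.shape[0])   # assumed Gaussian noise
    return X_sim, F @ X_sim + omega

rng = np.random.default_rng(0)
wavelengths_nm = np.linspace(1530.0, 1565.0, 351)
F_placeholder = rng.random((16, wavelengths_nm.size))     # stands in for the measured F

# Type 5: triple-peaked spectrum, 0.5-nm FWHM, 3-nm interval.
X_sim, P_sim = simulate_pair(F_placeholder, wavelengths_nm,
                             [1547.0, 1550.0, 1553.0], 0.5, 0.01, rng)
```

Each `P_sim` would then be passed through the PCSBL step to produce the corresponding `X_pre_rec`, giving one training pair per call.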

Fig. 3. Representation of (a) type 1: single-peaked spectrum with 0.5-nm FWHM, (b) type 2: single-peaked spectrum with 1-nm FWHM, (c) type 3: single-peaked spectrum with 2-nm FWHM, (d) type 4: double-peaked spectrum with 0.5-nm FWHM and 3-nm interval, and (e) type 5: triple-peaked spectrum with 0.5-nm FWHM and 3-nm interval.

Table 1. The composition of the simulation training data

3.2 Artificial neural network training

The performance of the ANN is affected by several factors, including the size of the network and the initial weights [44,45]. To ensure accurate reconstruction of incident spectra, we investigated the number of hidden layers and nodes for each ANN employed in our reconstruction algorithm. As mentioned in Section 2, a classification neural network was utilized to categorize the pre-reconstructed spectra, and five reconstruction neural networks were employed to calibrate and de-noise the spectra for each type. Consequently, we trained one classification ANN and five reconstruction ANNs to accomplish the spectral retrieval objective.

Before training the classification ANN, we represented the different shapes of the incident spectra using five-dimensional vectors: [1 0 0 0 0], [0 1 0 0 0], [0 0 1 0 0], [0 0 0 1 0], and [0 0 0 0 1]. These vectors correspond to the five types of spectra described above. The training algorithm was Levenberg-Marquardt, and the activation functions for each hidden layer and the output layer were set as tansig and purelin, respectively. Out of the 3200 pairs of simulation data, 3100 pairs were employed for training the classification ANN, and 100 pairs were employed for testing. Ultimately, a four-layer fully connected neural network was utilized to classify the spectra.
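The label encoding described above is a standard one-hot scheme. The short sketch below shows the encoding and one plausible way to decode a network output back to a spectrum type; taking the index of the largest output is an assumption, since the text does not state the decision rule, and the helper names are illustrative.

```python
import numpy as np

SPECTRUM_TYPES = (
    "single peak, 0.5-nm FWHM",
    "single peak, 1-nm FWHM",
    "single peak, 2-nm FWHM",
    "double peak, 0.5-nm FWHM",
    "triple peak, 0.5-nm FWHM",
)

def one_hot(type_index, n_types=5):
    """Encode spectrum type 1..n_types as a one-hot target vector."""
    target = np.zeros(n_types)
    target[type_index - 1] = 1.0
    return target

def decode(network_output):
    """Assumed decision rule: pick the type whose output is largest."""
    return int(np.argmax(network_output)) + 1

predicted_type = decode([0.10, 0.05, 0.90, -0.02, 0.00])
print(predicted_type, SPECTRUM_TYPES[predicted_type - 1])
```

Because the output layer is linear (purelin), the raw outputs are not probabilities, which is why a simple arg-max is used in this sketch.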

Figure 4 shows the confusion matrices for spectral classification. The numbers 1, 2, 3, 4, and 5 correspond to type 1-5, respectively, as mentioned above. The target class and the output class refer to the ground truth and the prediction of the neural network, respectively. The diagonal of the confusion matrices represents the count of correctly classified samples for each class. The results from the confusion matrices demonstrate that the accuracy of the entire training dataset is 100%.

Fig. 4. Confusion matrices of (a) training dataset and (b) test dataset.

De-noising and calibrating the pre-reconstructed spectra are regression problems, which were addressed using back-propagation fully connected neural networks. The ANNs were trained using the Levenberg-Marquardt algorithm, and the activation functions of each hidden layer and the output layer were tansig and purelin, respectively. The loss function is the mean squared error (MSE):

(6)$${L_{mse}} = \frac{1}{M}\sum\limits_{i = 1}^M {{{({y_i} - y_i^{\prime})}^2}}$$

where $y$ is the original incident spectrum, $y^{\prime}$ is the output of the neural network, and M represents the number of samples.
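For readers less familiar with the Matlab naming, tansig is the hyperbolic tangent and purelin is the identity. A minimal NumPy sketch of such a network's forward pass and of the loss in Eq. (6) is shown below. The weights are random and untrained, Levenberg-Marquardt training is not shown, and the input/output width of 8 is an arbitrary placeholder because the text does not state the layer dimensions.

```python
import numpy as np

def init_mlp(layer_sizes, rng):
    """Random (untrained) weights and zero biases for a fully connected network."""
    weights = [rng.normal(0.0, 0.1, size=(n_out, n_in))
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
    biases = [np.zeros(n_out) for n_out in layer_sizes[1:]]
    return weights, biases

def mlp_forward(x, weights, biases):
    """Hidden layers use tanh (tansig); the output layer is linear (purelin)."""
    a = np.asarray(x, dtype=float)
    for W, b in zip(weights[:-1], biases[:-1]):
        a = np.tanh(W @ a + b)
    return weights[-1] @ a + biases[-1]

def mse(y, y_pred):
    """Eq. (6): mean of the squared differences."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y - y_pred) ** 2))

rng = np.random.default_rng(0)
weights, biases = init_mlp([8, 10, 10, 8], rng)   # two hidden layers of 10 nodes
x_pre_rec = rng.random(8)                         # placeholder pre-reconstructed spectrum
y_out = mlp_forward(x_pre_rec, weights, biases)
loss = mse(x_pre_rec, y_out)
```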

To optimize the reconstruction results, we trained a specific neural network for each type of spectrum. The training data were obtained from the previously generated simulation data and were randomly divided into training and test datasets in the ratio of 90:10. For a neural network, more nodes and layers require more computation time and training data. Therefore, we prefer fully connected neural networks with a simpler structure for the spectral reconstruction. To determine the optimal network architecture, we tried six different network sizes, and each network structure was trained five times. Subsequently, the performance of the neural networks was evaluated using the test dataset, and the average MSE for each network was determined, as shown in Fig. 5. The numbers 1, 2, 3, 4, 5, and 6 represent different neural network sizes: two hidden layers with nodes of [5,5], two hidden layers with nodes of [8,8], two hidden layers with nodes of [10,10], two hidden layers with nodes of [10,8], three hidden layers with nodes of [5,5,5], and three hidden layers with nodes of [5,5,3]. The purple bar marks the size with the smallest MSE among these ANNs. Based on the results in Fig. 5, we determined the optimal size of the de-noising and calibration ANN for each type of spectrum, as shown in Table 2. After the training of the ANNs was completed in simulation, the trained networks were saved for use in spectral reconstruction in the experiment.
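The selection procedure described above (average the test MSE over repeated trainings, then keep the smallest) can be sketched as follows. The candidate sizes come from the text; the function name `select_architecture` is illustrative, and the MSE values are random placeholders used only to exercise the code. They are not results from the paper.

```python
import numpy as np

# Hidden-layer sizes of the six candidate networks listed in the text.
CANDIDATES = [(5, 5), (8, 8), (10, 10), (10, 8), (5, 5, 5), (5, 5, 3)]

def select_architecture(test_mse_runs):
    """Average the test MSE over repeated trainings and return the best size.

    test_mse_runs maps a hidden-layer tuple to the list of test MSEs
    obtained from its repeated training runs.
    """
    averages = {arch: float(np.mean(runs)) for arch, runs in test_mse_runs.items()}
    best = min(averages, key=averages.get)
    return best, averages

# Placeholder numbers only (NOT from the paper): five runs per candidate.
rng = np.random.default_rng(0)
placeholder_runs = {arch: rng.uniform(1e-5, 1e-3, size=5).tolist() for arch in CANDIDATES}
best_arch, avg_mse = select_architecture(placeholder_runs)
```

In practice this would be repeated once per spectrum type, yielding one selected size per type as in Table 2.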

Fig. 5. The average MSE values for each size of the de-noising and calibration ANN for (a) single-peaked spectra with 0.5-nm FWHM, (b) single-peaked spectra with 1-nm FWHM, (c) single-peaked spectra with 2-nm FWHM, (d) double-peaked spectra with 0.5-nm FWHM and 3-nm interval, and (e) triple-peaked spectra with 0.5-nm FWHM and 3-nm interval.

Table 2. The size of the de-noising and calibration ANN for each type of spectra

4. Experimental setup and results

The designed high-resolution FT spectrometer was fabricated using a commercial silicon photonics foundry process (Institute of Microsystems, China). Figure 6(a) displays the microscope image of the complete device. The device was subsequently packaged at SJTU-Pinghu Institute of Intelligent Optoelectronics with a standard FC/PC fiber connector interface. This packaging approach allows for easy integration and deployment, making the spectrometer a “plug-and-play” device. The photo of the fully packaged on-chip SHFT spectrometer is shown in Fig. 6(b).

Fig. 6. (a) The microscope image of the designed device, and (b) the fully packaged “plug-and-play” on-chip SHFT spectrometer with standard FC/PC fiber connector interface.

The experimental setup is illustrated in Fig. 7. To start with, a high-resolution wavelength-scanning measurement system, consisting of a tunable laser, a polarization controller, and a power meter, was employed to measure the transmission spectrum of the designed SHFT spectrometer, which was then used as the transmission matrix in our spectral retrieval algorithm. The wavelength-scanning measurement system is shown in Fig. 7(a). The tunable laser (Keysight, 81606A) was set to emit narrow-linewidth light swept from 1500 nm to 1630 nm with a step size of 0.02 nm. The power meter (Keysight, 81636B) has a sensitivity of −80 dBm. Additionally, a polarization controller was employed to set the input polarization to TE. Figure 7(c) shows the measured transmission spectrum for part of the output channels of the designed spectrometer. According to the results, the SHFT spectrometer chip exhibits a flat transmission characteristic in the range of 1500 nm to 1600 nm, which indicates that our spectrometer has a wide operating bandwidth of 100 nm.

Fig. 7. (a) The experimental setup for characterizing the transmission performance, in which the tunable laser provided narrow incident peaks with uniform power and the output optical power was measured by a power meter, (b) the experimental setup for spectrum reconstruction, (c) the transmission spectra for some of the output channels, and (d) the normalized transmission matrix from 1530 nm to 1565 nm.

To characterize the performance of the SHFT spectrometer, various types of incident spectra were input into the spectrometer and then reconstructed using the spectral retrieval approach described in Section 2. The experimental setup, shown in Fig. 7(b), involved a broadband amplified spontaneous emission (ASE) source (Connet, VENUS ASE-1550) and a programmable optical filter (Finisar, WaveShaper 4000A). The ASE source provided a wide and flat spectrum, which was shaped by the WaveShaper to generate incident spectra with different FWHMs and central wavelengths. Optical power meters were employed to measure the optical powers at the output ports. Light was routed between these instruments and the designed SHFT spectrometer through standard single-mode fiber (SSMF). The spectral reconstruction algorithm consists of two parts: the PCSBL algorithm used for pre-reconstruction and the trained ANNs used for de-noising and central wavelength calibration. As a reference, the incident spectrum was directly recorded by a commercial optical spectrum analyzer (OSA, Yokogawa AQ6370D). As mentioned above, the designed on-chip SHFT spectrometer has a 100-nm wide operating bandwidth. However, the ASE source and the programmable optical filter in the laboratory work only in the C-band. Therefore, in the subsequent experiments, we mainly focus on the spectral reconstruction performance of the designed spectrometer in the wavelength range of 1530-1565 nm. Figure 7(d) illustrates the normalized transmission matrix of the SHFT spectrometer from 1530 nm to 1565 nm, in which the x-axis and y-axis represent the channel number and the wavelength, respectively.
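The bookkeeping for restricting the swept data to the C-band can be sketched as follows. The sweep limits and step come from the text; the random array is a placeholder for the measured 16-channel data, and both the normalization to the global maximum and the use of the full 0.02-nm grid in the reconstruction are assumptions, since the text does not specify either.

```python
import numpy as np

step_nm = 0.02
n_sweep = int(round((1630.0 - 1500.0) / step_nm)) + 1    # sweep points, 1500-1630 nm
wavelengths_nm = 1500.0 + step_nm * np.arange(n_sweep)

tol = 1e-6   # guards against floating-point rounding at the band edges
c_band = (wavelengths_nm >= 1530.0 - tol) & (wavelengths_nm <= 1565.0 + tol)

# Placeholder standing in for the measured 16-channel transmission data.
rng = np.random.default_rng(0)
F_full = rng.random((16, n_sweep))

F_cband = F_full[:, c_band]          # keep only 1530-1565 nm
F_norm = F_cband / F_cband.max()     # assumption: normalize to the global maximum

print(n_sweep, int(c_band.sum()), F_norm.shape)
```

At the native step this gives 6501 sweep points, of which 1751 fall inside 1530-1565 nm.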

Figure 8 demonstrates the capability of the designed SHFT spectrometer to reconstruct single-peaked incident spectra with the assistance of the proposed spectral reconstruction algorithm. Figure 8(a) shows the reconstruction results of a single-peaked incident spectrum with 1-nm FWHM and a central wavelength of 1550.24 nm obtained with different reconstruction approaches, and Fig. 8(b) is a partial enlargement of the results in Fig. 8(a). The gray solid line with circles represents the original incident spectrum. At first, we attempted to reconstruct the incident spectrum using the pseudo-inverse method, that is, by directly multiplying the measured optical power vector by the pseudo-inverse of the transmission matrix; the result is represented by the blue solid line in Fig. 8(a). Evidently, the pseudo-inverse method fails to accurately retrieve the incident spectrum due to unavoidable experimental noise. Thus, a new spectral retrieval algorithm that combines the PCSBL algorithm and ANN was proposed. Since the experimental noise is taken into account in the PCSBL algorithm, a preliminary spectral reconstruction can be achieved using the PCSBL algorithm alone, represented by the purple solid line. The PCSBL algorithm alone already yields a reconstruction with less error, but there is a slight shift in the central wavelength (0.06 nm). The reconstruction result obtained using the proposed spectral retrieval algorithm is represented by the orange solid line. With the assistance of the trained ANNs, the noise in the reconstruction result is removed and the central wavelength shift is also reduced. The MSEs of the reconstructed results for the pseudo-inverse method, the PCSBL method, and the combined PCSBL and ANN method are 0.02, 7.58e-3, and 5.98e-5, respectively. Furthermore, Fig. 8(c) and (d) illustrate the reconstructed results of single-peaked incident spectra with 1-nm FWHM and central wavelengths of 1545.72 nm and 1554.80 nm, respectively. The MSEs of the pre-reconstruction results using the PCSBL method are 6.78e-3 and 5.72e-3, respectively, while the MSEs of the final reconstruction results obtained using the combined PCSBL and ANN reconstruction method proposed in this paper are 1.84e-4 and 3.09e-4, respectively. These results demonstrate that our proposed spectral reconstruction algorithm obtains a more accurate reconstruction of the unknown incident spectrum.

Fig. 8. Reconstruction results of (a-b) single-peaked spectrum with 1-nm FWHM and central wavelength of 1550.24 nm using different reconstruction approaches, (c) single-peaked spectrum with 1-nm FWHM and central wavelength of 1545.72 nm, and (d) single-peaked spectrum with 1-nm FWHM and central wavelength of 1554.80 nm.

We also carried out reconstruction experiments for single-peaked incident signals with FWHMs of 2 nm and 0.5 nm. The results are shown in Fig. 9. The gray solid line with circles indicates the reference incident spectrum recorded by the commercial spectrometer. The purple solid lines display the pre-reconstructed results using only the PCSBL algorithm, with MSEs of 2.75e-2, 2.23e-2, and 6.26e-3, respectively. The orange solid lines indicate the final reconstructed spectra using the proposed reconstruction method, with MSEs of 3.78e-5, 3.77e-5, and 1.93e-4, respectively. In addition, we experimentally reconstructed two other kinds of incident spectra, namely a double-peaked spectrum with 0.5-nm FWHM and 3-nm interval, and a triple-peaked spectrum with 0.5-nm FWHM and 3-nm interval, as shown in Fig. 10. The pre-reconstructed results using the PCSBL algorithm are represented by purple solid lines, with MSEs of 3.38e-2 and 1.17e-2, respectively. The final reconstructed results using the proposed reconstruction algorithm are represented by orange solid lines, with MSEs of 1.75e-5 and 1.50e-6, respectively. To sum up, with the assistance of the ANN, the reconstructed spectra are less corrupted by measurement noise, and both the wavelength shift and the relative light intensity are calibrated. The MSE of the spectrum reconstruction is at least one order of magnitude lower than that obtained with the PCSBL algorithm alone.
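The "at least one order of magnitude" statement can be checked directly from the MSE values quoted for Figs. 8 to 10. The short calculation below uses only numbers from the text; the assignment of the three Fig. 9 values to panels (a) to (c) follows the order in which they are listed and is assumed here.

```python
# MSE values quoted in the text: (PCSBL only, PCSBL + ANN).
reported_mse = {
    "Fig. 8(a)": (7.58e-3, 5.98e-5),
    "Fig. 8(c)": (6.78e-3, 1.84e-4),
    "Fig. 8(d)": (5.72e-3, 3.09e-4),
    "Fig. 9(a)": (2.75e-2, 3.78e-5),
    "Fig. 9(b)": (2.23e-2, 3.77e-5),
    "Fig. 9(c)": (6.26e-3, 1.93e-4),
    "Fig. 10(a)": (3.38e-2, 1.75e-5),
    "Fig. 10(b)": (1.17e-2, 1.50e-6),
}

# Improvement factor = MSE with PCSBL only / MSE with PCSBL + ANN.
ratios = {fig: pcsbl / combined for fig, (pcsbl, combined) in reported_mse.items()}
worst_case = min(ratios, key=ratios.get)
min_ratio = ratios[worst_case]

for fig, ratio in ratios.items():
    print(f"{fig}: {ratio:.1f}x lower MSE")
```

The smallest improvement is for the spectrum of Fig. 8(d), a factor of about 18.5, so every reported case improves by more than a factor of ten.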

Fig. 9. Reconstruction results of single-peaked spectrum (a) with 2-nm FWHM and central wavelength of 1553.62 nm, (b) with 0.5-nm FWHM and central wavelength of 1542.26 nm, and (c) with 0.5-nm FWHM and central wavelength of 1551.54 nm.

Fig. 10. Reconstruction results of (a) double-peaked spectrum with 0.5-nm FWHM and 3-nm interval, and (b) triple-peaked spectrum with 0.5-nm FWHM and 3-nm interval.

5. Conclusion

In this manuscript, a compact high-resolution on-chip SHFT spectrometer that comprises a 16-channel power splitter and an MZI array of 16 MZIs is proposed. The power splitter is used to evenly distribute the incident spectrum into 16 MZIs. The arm length differences of each MZI are linearly increased to obtain linearly increasing OPL differences. Grating couplers are employed for efficient fiber-to-chip coupling. The waveguide cross-section of 500 nm × 220 nm (width × height) has been considered for the SOI platform. The total footprint of our spectrometer is about 1.64 mm². To achieve high spectral resolution and enhance the reconstruction accuracy, we proposed a spectral retrieval algorithm that combines the PCSBL algorithm and ANN, in which the PCSBL algorithm is utilized for pre-reconstruction of the incident spectrum, and the ANN is applied to improve the accuracy of the final reconstructed spectrum.

We examined the operating bandwidth and spectral resolution of our on-chip spectrometer in experiments. The experimental results show that our device has a wide operating bandwidth from 1500 nm to 1600 nm. Moreover, with the assistance of our spectral retrieval algorithm, we successfully reconstructed narrowband signals with 0.5-nm FWHM and a triple-peaked spectrum with a 3-nm peak separation despite unavoidable experimental noise. We also compared the reconstruction results with those obtained using the pseudo-inverse method and using only the PCSBL method. With the assistance of the trained ANNs, the MSE of the spectrum reconstruction is at least one order of magnitude lower than that obtained with the PCSBL algorithm alone. In addition, the central wavelength shift and the relative light intensity are both well calibrated, further improving the accuracy of the spectral reconstruction result.

Funding

National Natural Science Foundation of China (61927818, 62005030); Science Fund for Distinguished Young Scholars of Chongqing Municipality (cstc2021jcyj-jqX0014); Chongqing Municipal Science and Technology Bureau (cstc2021jcyj-msxmX0818, No. CXQT20001).

Acknowledgments

This research was supported by the NSFC (61927818 and 62005030), the Chongqing Science Fund for Distinguished Young Scholars (cstc2021jcyj-jqX0014), the Natural Science Foundation Project of Chongqing (cstc2021jcyj-msxmX0818), the Chongqing Innovative Research Groups (No. CXQT20001), and the Shandong Taishan Industry Leading Talent Program. The authors acknowledge the laboratory equipment and technical support from Chongqing United Microelectronic Center Co., Ltd. and the Analysis and Testing Center of Chongqing University.

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. D. González-Andrade, T. T. D. Dinh, S. Guerber, N. Vulliet, S. Cremer, S. Monfray, E. Cassan, D. Marris-Morini, F. Boeuf, P. Cheben, L. Vivien, A. V. Velasco, and C. Alonso-Ramos, “Broadband Fourier-transform silicon nitride spectrometer with wide-area multiaperture input,” Opt. Lett. 46(16), 4021–4024 (2021). [CrossRef]

2. Z. Yang, T. Albrow-Owen, W. Cai, and T. Hasan, “Miniaturization of optical spectrometers,” Science 371(6528), eabe0722 (2021). [CrossRef]

3. C. Zeng, D. J. King, M. Richardson, and B. Shan, “Fusion of Multispectral Imagery and Spectrometer Data in UAV Remote Sensing,” Remote Sens. 9(7), 696 (2017). [CrossRef]

4. Q. Qiao, X. Liu, Z. Ren, B. Dong, J. Xia, H. Sun, C. Lee, and G. Zhou, “MEMS-Enabled On-Chip Computational Mid-Infrared Spectrometer Using Silicon Photonics,” ACS Photonics 9(7), 2367–2377 (2022). [CrossRef]

5. P. Cheben, J. H. Schmid, A. Delâge, A. Densmore, S. Janz, B. Lamontagne, J. Lapointe, E. Post, P. Waldron, and D.-X. Xu, “A high-resolution silicon-on-insulator arrayed waveguide grating microspectrometer with sub-micrometer aperture waveguides,” Opt. Express 15(5), 2299–2306 (2007). [CrossRef]

6. E. Ryckeboer, A. Gassenq, M. Muneeb, N. Hattasan, S. Pathak, L. Cerutti, J. B. Rodriguez, E. Tournié, W. Bogaerts, R. Baets, and G. Roelkens, “Silicon-on-insulator spectrometers with integrated GaInAsSb photodiodes for wide-band spectroscopy from 1510 to 2300 nm,” Opt. Express 21(5), 6101–6108 (2013). [CrossRef]

7. M. Muneeb, A. Vasiliev, A. Ruocco, A. Malik, H. Chen, M. Nedeljkovic, J. S. Penades, L. Cerutti, J. B. Rodriguez, G. Z. Mashanovich, M. K. Smit, E. Tournié, and G. Roelkens, “III-V-on-silicon integrated micro-spectrometer for the 3 µm wavelength range,” Opt. Express 24(9), 9465–9472 (2016). [CrossRef]

8. J. Zou, X. Ma, X. Xia, J. Hu, C. Wang, M. Zhang, T. Lang, and J.-J. He, “High Resolution and Ultra-Compact On-Chip Spectrometer Using Bidirectional Edge-Input Arrayed Waveguide Grating,” J. Lightwave Technol. 38(16), 4447–4453 (2020). [CrossRef]

9. X. Chen, X. Gan, Y. Zhu, and J. Zhang, “On-chip micro-ring resonator array spectrum detection system based on convex optimization algorithm,” Nanophotonics 12(4), 715–724 (2023). [CrossRef]

10. A. Li and Y. Fainman, “On-chip spectrometers using stratified waveguide filters,” Nat. Commun. 12(1), 2704 (2021). [CrossRef]

11. Y. Horie, A. Arbabi, S. Han, and A. Faraon, “High resolution on-chip optical filter array based on double subwavelength grating reflectors,” Opt. Express 23(23), 29848–29854 (2015). [CrossRef]

12. Y. Horie, A. Arbabi, E. Arbabi, S. M. Kamali, and A. Faraon, “Wide bandwidth and high resolution planar filter array based on DBR-metasurface-DBR structures,” Opt. Express 24(11), 11677–11682 (2016). [CrossRef]

13. A. Emadi, H. Wu, G. de Graaf, and R. Wolffenbuttel, “Design and implementation of a sub-nm resolution microspectrometer based on a Linear-Variable Optical Filter,” Opt. Express 20(1), 489–507 (2012). [CrossRef]

14. B. J. Russell, J. J. Cadusch, J. Meng, D. Wen, and K. B. Crozier, “Mid-infrared spectral reconstruction with dielectric metasurfaces and dictionary learning,” Opt. Lett. 47(10), 2490–2493 (2022). [CrossRef]

15. L. Zhang, M. Zhang, T. Chen, D. Liu, S. Hong, and D. Dai, “Ultrahigh-resolution on-chip spectrometer with silicon photonic resonators,” Opto-Electron. Adv. 5(7), 210100 (2022). [CrossRef]

16. B. Redding, S. F. Liew, R. Sarma, and H. Cao, “Compact spectrometer based on a disordered photonic chip,” Nat. Photonics 7(9), 746–751 (2013). [CrossRef]

17. T. Liu and A. Fiore, “Designing open channels in random scattering media for on-chip spectrometers,” Optica 7(8), 934–939 (2020). [CrossRef]

18. B. Redding, S. F. Liew, Y. Bromberg, R. Sarma, and H. Cao, “Evanescently coupled multimode spiral spectrometer,” Optica 3(9), 956–962 (2016). [CrossRef]

19. J. Bao and M. G. Bawendi, “A colloidal quantum dot spectrometer,” Nature 523(7558), 67–70 (2015). [CrossRef]

20. Z. Yang, T. Albrow-Owen, H. Cui, J. Alexander-Webber, F. Gu, X. Wang, T.-C. Wu, M. Zhuge, C. Williams, P. Wang, A. V. Zayats, W. Cai, L. Dai, S. Hofmann, M. Overend, L. Tong, Q. Yang, Z. Sun, and T. Hasan, “Single-nanowire spectrometers,” Science 365(6457), 1017–1020 (2019). [CrossRef]

21. “Single-Detector Spectrometer Using a Superconducting Nanowire,” Nano Lett. (2021).

22. J. Meng, J. J. Cadusch, and K. B. Crozier, “Detector-Only Spectrometer Based on Structurally Colored Silicon Nanowires and a Reconstruction Algorithm,” Nano Lett. 20(1), 320–328 (2020). [CrossRef]

23. S. Yuan, D. Naveh, K. Watanabe, T. Taniguchi, and F. Xia, “A wavelength-scale black phosphorus spectrometer,” Nat. Photonics 15(8), 601–607 (2021). [CrossRef]

24. J. Zhang, Z. Cheng, J. Dong, and X. Zhang, “Cascaded nanobeam spectrometer with high resolution and scalability,” Optica 9(5), 517–521 (2022). [CrossRef]

25. J. J. Cadusch, J. Meng, B. Craig, and K. B. Crozier, “Silicon microspectrometer chip based on nanostructured fishnet photodetectors with tailored responsivities and machine learning,” Optica 6(9), 1171–1177 (2019). [CrossRef]

26. A. Li and Y. Fainman, “Integrated Silicon Fourier Transform Spectrometer with Broad Bandwidth and Ultra-High Resolution,” Laser Photonics Rev. 15(4), 2000358 (2021). [CrossRef]

27. T. T. D. Dinh, X. L. Roux, N. Koompai, D. Melati, M. Montesinos-Ballester, D. González-Andrade, P. Cheben, A. V. Velasco, E. Cassan, D. Marris-Morini, L. Vivien, and C. Alonso-Ramos, “Mid-infrared Fourier-transform spectrometer based on metamaterial lateral cladding suspended silicon waveguides,” Opt. Lett. 47(4), 810–813 (2022). [CrossRef]

28. M. C. M. M. Souza, A. Grieco, N. C. Frateschi, and Y. Fainman, “Fourier transform spectrometer on silicon with thermo-optic non-linearity and dispersion correction,” Nat. Commun. 9(1), 665 (2018). [CrossRef]

29. M. Erfan, Y. M. Sabry, M. Sakr, B. Mortada, M. Medhat, and D. Khalil, “On-Chip Micro–Electro–Mechanical System Fourier Transform Infrared (MEMS FT-IR) Spectrometer-Based Gas Sensing,” (2016), https://journals.sagepub.com/doi/10.1177/0003702816638295.

30. J. Loridat, S. Heidmann, F. Thomas, G. Ulliac, N. Courjal, A. Morand, and G. Martin, “All Integrated Lithium Niobate Standing Wave Fourier Transform Electro-Optic Spectrometer,” J. Lightwave Technol. 36(20), 4900–4907 (2018). [CrossRef]

31. D. Pohl, M. Reig Escalé, M. Madi, F. Kaufmann, P. Brotzer, A. Sergeyev, B. Guldimann, P. Giaccari, E. Alberti, U. Meier, and R. Grange, “An integrated broadband spectrometer on thin-film lithium niobate,” Nat. Photonics 14(1), 24–29 (2020). [CrossRef]

32. Q. Liu, J. M. Ramirez, V. Vakarin, X. L. Roux, C. Alonso-Ramos, J. Frigerio, A. Ballabio, E. T. Simola, D. Bouville, L. Vivien, G. Isella, and D. Marris-Morini, “Integrated broadband dual-polarization Ge-rich SiGe mid-infrared Fourier-transform spectrometer,” Opt. Lett. 43(20), 5021–5024 (2018). [CrossRef]

33. H. Wang, Z. Lin, Q. Li, and W. Shi, “On-chip Fourier transform spectrometers by dual-polarized detection,” Opt. Lett. 44(11), 2923–2926 (2019). [CrossRef]

34. H. Wang, Y. Bao, J. Tang, Q. Li, W. Shi, and X. Ma, “On-chip monolithic Fourier transform spectrometers assisted by cGAN spectral prediction,” Opt. Lett. 46(17), 4288–4291 (2021). [CrossRef]

35. H. Podmore, A. Scott, P. Cheben, A. V. Velasco, J. H. Schmid, M. Vachon, and R. Lee, “Demonstration of a compressive-sensing Fourier-transform on-chip spectrometer,” Opt. Lett. 42(7), 1440–1443 (2017). [CrossRef]

36. D. M. Kita, B. Miranda, D. Favela, D. Bono, J. Michon, H. Lin, T. Gu, and J. Hu, “High-performance and scalable on-chip digital Fourier transform spectroscopy,” Nat. Commun. 9(1), 4405 (2018). [CrossRef]

37. J. Du, H. Zhang, X. Wang, W. Xu, L. Lu, J. Chen, and L. Zhou, “High-resolution on-chip Fourier transform spectrometer based on cascaded optical switches,” Opt. Lett. 47(2), 218–221 (2022). [CrossRef]

38. H. Luo, Z. Huang, C. Xu, A. P. T. Lau, and C. Yu, “Design and spectral reconstruction assisted by intelligent algorithms for high-resolution Fourier transform spectrometer,” in 2021 30th Wireless and Optical Communications Conference (WOCC) (2021), pp. 153–156.

39. X. Long, Z. Huang, H. Luo, C. Yu, H. Zhao, and Y. Liu, “High-resolution on-chip Fourier Transform Spectrometer Based on MZI Array and PCSBL Reconstruction Algorithm,” in 2022 Asia Communications and Photonics Conference (ACP) (2022), pp. 1427–1430.

40. B. Redding, S. M. Popoff, and H. Cao, “All-fiber spectrometer based on speckle pattern reconstruction,” Opt. Express 21(5), 6584–6600 (2013). [CrossRef]

41. A. Mondal and K. Debnath, “Design of Resolution-Tunable Neural Network-Based Integrated Reconstructive Spectrometer,” IEEE Sens. J. 22(3), 2630–2636 (2022). [CrossRef]

42. M. E. Tipping, “Sparse Bayesian learning and the relevance vector machine,” J. Mach. Learn. Res. 1, 211–244 (2001).

43. J. Fang, Y. Shen, H. Li, and P. Wang, “Pattern-Coupled Sparse Bayesian Learning for Recovery of Block-Sparse Signals,” IEEE Trans. Signal Process. 63(2), 360–372 (2015). [CrossRef]

44. E. B. Baum and D. Haussler, “What Size Net Gives Valid Generalization?” Neural Comput. 1(1), 151–160 (1989). [CrossRef]

45. A. Atiya and C. Ji, “How initial conditions affect generalization performance in large networks,” IEEE Trans. Neural Netw. 8(2), 448–451 (1997). [CrossRef]

Type	Spectrum shape	FWHM	Quantity
Type 1	Single-peaked	0.5 nm	1000
Type 2	Single-peaked	1 nm	600
Type 3	Single-peaked	2 nm	500
Type 4	Double-peaked	0.5 nm	600
Type 5	Triple-peaked	0.5 nm	500
Total			3200
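To make the composition of the training set in the table above concrete, the sketch below generates one synthetic spectrum per type and checks the total sample count. The Gaussian line shape, the 0.1-nm wavelength grid, the random placement of peaks, and the 3-nm default peak spacing are assumptions made for illustration; only the peak counts, FWHM values, and quantities are taken from the table.

```python
import math
import random

# (number of peaks, FWHM in nm) -> quantity, as listed in the table
DATASET_PLAN = {
    (1, 0.5): 1000,
    (1, 1.0): 600,
    (1, 2.0): 500,
    (2, 0.5): 600,
    (3, 0.5): 500,
}

def gaussian(wl, center, fwhm):
    """Unit-height Gaussian line with the given full width at half maximum."""
    return math.exp(-4.0 * math.log(2.0) * ((wl - center) / fwhm) ** 2)

def make_spectrum(wavelengths, n_peaks, fwhm, spacing_nm=3.0, rng=random):
    """Sum of n_peaks equally spaced Gaussian lines at a random position.

    spacing_nm is an assumed, hypothetical peak separation.
    """
    span = (n_peaks - 1) * spacing_nm
    margin = 2.0 * fwhm  # keep the peaks away from the band edges
    first = rng.uniform(wavelengths[0] + margin, wavelengths[-1] - margin - span)
    centers = [first + k * spacing_nm for k in range(n_peaks)]
    return [sum(gaussian(wl, c, fwhm) for c in centers) for wl in wavelengths]

wavelengths = [1500.0 + 0.1 * k for k in range(1001)]  # 1500 nm to 1600 nm
rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One example spectrum per type; a full dataset would repeat this per quantity
examples = {key: make_spectrum(wavelengths, key[0], key[1], rng=rng)
            for key in DATASET_PLAN}
total = sum(DATASET_PLAN.values())
```

Repeating `make_spectrum` as many times as each entry of `DATASET_PLAN` specifies would produce a set with the same type proportions as the table.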

Spectrum shape	FWHM	Number of hidden layers	Number of nodes in each hidden layer	Average MSE of test dataset
Single-peaked	0.5 nm	2	10,10	6.69e-5
Single-peaked	1 nm	3	5,5,5	2.10e-5
Single-peaked	2 nm	2	5,5	3.30e-5
Double-peaked	0.5 nm	2	5,5	2.21e-5
Triple-peaked	0.5 nm	3	5,5,3	4.62e-6

Optics Express