
Deep photonic reservoir computing recurrent network

Open Access

Abstract

Deep neural networks usually process information through multiple hidden layers. However, most hardware reservoir computing recurrent networks have only one hidden reservoir layer, which significantly limits their capability of solving complex practical tasks. Here we show a deep photonic reservoir computing (PRC) architecture constructed by cascading injection-locked semiconductor lasers. In particular, the connections between successive hidden layers are all optical, without any optical-electrical conversion or analog-digital conversion. A proof-of-concept PRC consisting of 4 hidden layers and a total of 320 interconnected neurons (80 neurons per layer) is demonstrated experimentally. The deep PRC is applied to the real-world problem of signal equalization in an optical fiber communication system. It is found that the deep PRC exhibits a strong capability to compensate for the nonlinear impairment of optical fibers.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. INTRODUCTION

Deep neural networks with multiple hidden layers have been substantially advancing the development of artificial intelligence. In comparison with electronic computing based on the von Neumann architecture, optical computing can boost energy efficiency and reduce computation latency [1–3]. In recent years, a large variety of optical computing architectures have been proposed, most of which focus on the linear multiply-accumulate operation [4–8]. Together with nonlinear activation functions in the digital domain, optical convolutional neural networks and multilayer perceptrons have been extensively demonstrated. In contrast to these feedforward neural networks, recurrent neural networks (RNNs) have an inherent memory effect and are favorable for solving time-dependent tasks such as natural language processing and temporal signal processing [9]. Reservoir computing (RC) is such a kind of RNN, but with fixed weights both in the input layer and in the hidden reservoir layers [10,11]. Only the weights in the readout layer require training, which leads to a simple training algorithm and a fast training speed. Various hardware RCs have been widely investigated, among which optoelectronic and memristor-based RCs have attracted a lot of interest [12–18]. However, most hardware RCs have only one hidden reservoir layer, which substantially limits their capability of dealing with complex real-world problems. A comprehensive theoretical analysis by Gallicchio et al. pointed out that a deep hierarchy of RCs introduces multiple time scales and frequency components, and thereby boosts the richness of the neuron dynamics and the diversity of the representation [19,20]. Several paradigms for combining multiple reservoirs have been compared theoretically in the literature, and it was found that unidirectional coupling of the hidden reservoirs is beneficial for improving RC performance [21,22]. Indeed, the deep configuration raises both the linear and the nonlinear memory capacities of RCs [23,24]. Interestingly, Penkovsky et al. showed that a deep RC with time-delay loops is equivalent to a deep convolutional neural network [25]. In experiment, Nakajima et al. constructed a deep RC based on a Mach–Zehnder modulator associated with an optoelectronic feedback loop [26]. However, there is only one piece of hardware, which is reused in each hidden layer. The interconnection between successive layers therefore requires optical-electrical conversion (OEC), analog-digital conversion (ADC), and the inverse conversions. These conversion processes consume considerable power and introduce substantial latency, which significantly counteracts the merits of optical computing. Lupo et al. recently proposed a two-layer RC based on two groups of frequency combs produced by phase modulation of light [27]. The interconnection between the two layers is implemented in the electrical domain through OEC. Nevertheless, the scalability of the RC depth is limited by its tradeoff with the number of neurons, which defines the RC width.

This work presents a deep PRC based on cascading injection-locked semiconductor lasers. The hidden-layer interconnections are fully optical, without the need for any OEC or ADC. A deep PRC recurrent network comprising 4 hidden layers and 320 interconnected neurons is successfully demonstrated in experiment. In particular, the PRC depth is highly scalable, without any power or coherence limitation. The deep PRC is applied to the real-world signal equalization task of an optical fiber communication system. It is shown that the deep PRC has a strong ability to mitigate the Kerr nonlinearity of optical fibers, and hence to improve the signal quality at the optical receiver.


Fig. 1. (a) Schematic architecture of the deep PRC. (b) Experimental setup of the deep PRC. AWG, arbitrary waveform generator; OSC, oscilloscope; PD, photodiode; EDFA, erbium-doped fiber amplifier; Ch, channel. The hidden layers are interconnected by the optical injection. The optical feedback loops provide virtual neurons.


2. DEEP PRC ARCHITECTURE AND EXPERIMENTAL SETUP

Figure 1(a) illustrates the architecture of the deep PRC network. A single-mode master laser unidirectionally injects into the slave laser (Laser 1) in the first hidden layer of the reservoir. The optical injection is operated in the stable regime, which is bounded by the Hopf bifurcation and the saddle-node bifurcation [28,29]. Part of the light of Laser 1 goes to the second layer of the reservoir and locks Laser 2 through optical injection. Similarly, Laser 2 locks Laser 3 in the third layer, and subsequently Laser 3 locks Laser 4 in the fourth layer. As a result, the lasing frequencies of all four slave lasers are locked to that of the master laser. Furthermore, the phases of all the slave lasers are synchronized with the master laser as well. In each hidden layer, the slave laser is subject to an optical feedback loop, which produces a large number of virtual neurons through the nonlinear laser dynamics [14,30]. The optical feedback is also operated in the stable regime, which is separated from the unstable regime by a critical feedback level [28,31]. In the input layer, the input signal is multiplied by a random mask, and this pre-processed signal is superimposed onto the carrier wave of the master laser through an optical modulator. The masking process plays a crucial role in the PRC system. On one hand, the mask varies much more rapidly than the characteristic time (inverse of the resonance frequency) of the slave lasers. Consequently, all the slave lasers operate in a transient state rather than the steady state, owing to the injection of this fast-varying masked signal. Maintaining the system in the transient state is one of the fundamental operation principles of time-delay RCs [30,32]. On the other hand, the mask interval defines the temporal interval of the virtual neurons. The neuron number in each hidden layer is given by the clock cycle divided by the neuron interval. In the readout layer, the neuron states of all four hidden layers are recorded simultaneously. The target value is obtained as the weighted sum of all the neuron states, and the weights are trained using the ridge regression algorithm [32].

Based on this deep PRC scheme, Fig. 1(b) shows the corresponding experimental setup. A tunable external cavity laser (Santec TSL-710) serves as the master laser, and its output power is amplified by an erbium-doped fiber amplifier (EDFA). The polarization of the light is aligned with the Mach–Zehnder intensity modulator (EOSPACE, 40 GHz bandwidth) through a polarization controller. The input signal is multiplied by a binary random mask consisting of {0, 1}. This pre-processed signal is generated by an arbitrary waveform generator (AWG, Keysight 8195A, 25 GHz bandwidth), which drives the optical modulator. The polarization of the modulated light is realigned with that of the slave laser in the first hidden layer. The four slave lasers in the hidden layers are commercial Fabry–Perot (FP) lasers with multiple longitudinal modes. In each layer, the optical feedback loop is formed by an optical circulator and two 90:10 couplers. The feedback strength is adjusted through an optical attenuator. At the output of each hidden layer (except the fourth layer), 70% of the light is unidirectionally injected into the subsequent layer to lock the slave laser. The unidirectional optical injection is guaranteed by the optical circulator, which also plays the role of an optical isolator. Between the second and the third layers, the laser power is amplified by another EDFA. The neuron states of all four layers are detected by broadband photodiodes (PD) and recorded simultaneously on the four channels (Ch) of a high-speed digital oscilloscope (OSC, Keysight DSAZ594A, 59 GHz bandwidth). The optical spectrum is measured by an optical spectrum analyzer (Yokogawa) with a resolution of 0.02 nm. In the experiment, the time interval of the neurons in each hidden layer is fixed at $\theta = 0.05\;\mathrm{ns}$, which is determined by the 20 GHz modulation rate of the optical modulator. The number of neurons in each layer is set at $N = 80$, resulting in a total of 320 neurons in the deep PRC network. Consequently, the clock cycle of the PRC system is $T_c = 4.0\;\mathrm{ns}$ ($T_c = \theta \times N$). The sampling rates of the AWG and the OSC are 60 GSa/s and 80 GSa/s, respectively.
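To make the time-multiplexing and readout steps concrete, the following minimal sketch mimics the masking, the cascaded reservoir layers, and the ridge-regression readout with $N = 80$ neurons per layer. The laser dynamics are replaced by a generic leaky nonlinear node, and all function and parameter names are illustrative assumptions rather than the authors' implementation.

```python
import numpy as np

# Sketch of the masking and ridge-regression readout described above.
# The laser/feedback dynamics are replaced by a toy nonlinearity; names
# and values are illustrative only.

rng = np.random.default_rng(0)

N = 80            # virtual neurons per hidden layer
theta = 0.05e-9   # neuron (mask) interval, 0.05 ns
T_c = N * theta   # clock cycle, 4.0 ns

def mask_input(u, mask):
    """Hold each input sample for one clock cycle and multiply by the mask."""
    # u: (num_samples,), mask: (N,) binary {0, 1}
    return np.repeat(u[:, None], N, axis=1) * mask[None, :]

def toy_reservoir(masked, leak=0.3):
    """Stand-in for one reservoir layer: a leaky nonlinear node with feedback."""
    states = np.zeros_like(masked)
    x = 0.0
    for k in range(masked.shape[0]):
        for n in range(N):
            x = (1 - leak) * x + leak * np.tanh(masked[k, n] + 0.5 * x)
            states[k, n] = x
    return states

def train_readout(states, target, ridge=1e-4):
    """Ridge regression: w = (S^T S + lambda I)^-1 S^T y."""
    S = np.hstack([states, np.ones((states.shape[0], 1))])  # bias column
    w = np.linalg.solve(S.T @ S + ridge * np.eye(S.shape[1]), S.T @ target)
    return w

# Example: 4 cascaded layers, each fed by the previous layer's states
u = rng.uniform(-1, 1, 2000)
mask = rng.integers(0, 2, N).astype(float)
layer_in = mask_input(u, mask)
all_states = []
for _ in range(4):
    layer_states = toy_reservoir(layer_in)
    all_states.append(layer_states)
    layer_in = layer_states            # unidirectional coupling to the next layer
X = np.hstack(all_states)              # readout sees all 320 neuron states
w = train_readout(X, u)                # train weights on a chosen target
```

In the actual system, the reservoir step corresponds to the physical response of each injection-locked laser with optical feedback, and only the readout weights are computed digitally.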

3. EXPERIMENTAL RESULTS

In the experiment, all four FP lasers in the hidden layers exhibit an identical lasing threshold of $I_{\rm th} = 8.0\;\mathrm{mA}$. The resonance frequency of the lasers is around 4.7 GHz, leading to a characteristic time of 0.21 ns [29]. Prior to discussing the operation conditions, we summarize the basic operation principles derived from our previous work on the single-layer PRC [24,29,33]: (I) a large pump current of the slave laser, (II) a high injection ratio, and (III) a detuning frequency close to the Hopf bifurcation are helpful for improving the PRC performance; (IV) the PRC performance is insensitive to the feedback ratio, as long as the feedback is below the critical feedback level; and (V) the optimal ratio of the delay time to the clock cycle is task dependent. Following these five operation principles, Table 1 lists the detailed operation parameters of the experimental setup. It is important to note that the operation parameters are not quantitatively optimized. The delay times of the four optical feedback loops are fixed in the range of 63.0 to 68.5 ns. The delay times are thus more than 15 times longer than the clock cycle (4.0 ns) of the computing system, unlike the common synchronous case. Our recent work has shown that this asynchronous architecture, associated with a short neuron interval, helps to improve the PRC performance [29,33], owing to the rich neuron interconnections [34]. The feedback ratio is defined as the power ratio of the reflected light to the emitted light, and it is set around $-30\;\mathrm{dB}$ for all four layers. The critical feedback level of the lasers is about $-19.3\;\mathrm{dB}$, and hence the optical feedback is operated in the stable regime. The injection ratio is defined as the ratio of the power injected from the laser in the previous layer to the emission power of the laser in the subsequent layer. As shown in Table 1, the injection ratios of the layers vary from about 2.0 up to 4.0. In addition, the detuning frequency is defined as the lasing frequency difference between the two lasers. All the detuning frequencies in Table 1 are set within the stable locking regime. The injection-locking conditions determine the interconnection weights between successive layers, which act as hyper-parameters of the system. These hyper-parameters could be optimized through a Bayesian optimization algorithm in future work [35]. In contrast, the connection weights can be trained numerically when the interconnection between layers is implemented in the digital domain [26].

Figure 2 shows the optical spectra of the FP lasers in all four reservoir layers. The lasers exhibit multiple longitudinal modes, and the lasing peaks are around 1550.98, 1542.63, 1548.91, and 1540.86 nm, respectively. Meanwhile, the free spectral ranges are 154.6, 154.8, 172.7, and 171.9 GHz, respectively. The wavelength of the master laser is set at 1546.5 nm, and the output power after the EDFA amplification is 36 mW. When applying optical injection from the master laser, only the mode of each slave laser closest to the injection wavelength is locked. All side modes are suppressed, and the suppression ratio is more than 50 dB. This is because the optical injection reduces the gain of the laser medium, and only the mode subject to optical injection reaches the lasing threshold [36].


Table 1. Operation Conditions of the Deep PRC Network


Fig. 2. Optical spectra of the four FP lasers with (red curves) and without (black curves) optical injection.


The performance of the deep PRC is tested on the real-world task of nonlinear signal equalization in an optical fiber communication system. The signal in an optical fiber is usually distorted by the linear chromatic dispersion and the nonlinear Kerr effect during transmission [37]. The linear distortion is usually mitigated by the feedforward equalizer (FFE) in the digital signal processing (DSP) module of optical receivers [38,39]. The FFE linearly combines the current symbol and its neighbors as $\hat{y}_n = \sum_{j=-K}^{K} w_j y_{n-j}$, where $y_n$ is the received symbol at the $n$th step, $w_j$ is the weight, $2K+1$ is the tap number, and $\hat{y}_n$ is the recovered symbol. On the other hand, the Kerr nonlinearity can be compensated by solving the nonlinear Schrödinger equation [37]. However, common solution algorithms such as digital back propagation are too complex for DSP implementation [39,40]. An alternative solution is to deploy deep neural networks to compensate for the fiber nonlinearity, which can reduce the computational complexity [40–42]. In particular, several reports have experimentally demonstrated that shallow PRCs are capable of compensating for the linear impairment of optical fibers in place of FFEs [33,43–46]. Here we show that the deep PRC has a strong capability to mitigate the nonlinear impairment of optical fibers. The nonlinear Schrödinger equation describing the propagation of light in an optical fiber reads as [37]


Fig. 3. Example of signal sequences (a) at the transmitter and (b) at the receiver. The launch power is 4.0 mW. (c) Measured performance of the PRCs with different depth. The error bar stands for the standard deviation of the measurement. The dashed line indicates the FEC threshold.


$$\frac{\partial E}{\partial z} + \frac{\alpha}{2}E + j\frac{\beta_2}{2}\frac{\partial^2 E}{\partial t^2} = j\gamma |E|^2 E, \tag{1}$$
where $E(z,t)$ is the slowly varying envelope of the electric field at position $z$ and time $t$, $\alpha$ is the attenuation constant (0.2 dB/km), $\beta_2$ is the fiber dispersion coefficient ($-21.4\;\mathrm{ps^2/km}$), and $\gamma$ is the fiber nonlinearity coefficient ($1.2\;\mathrm{(W\cdot km)^{-1}}$) [47]. The symbol sequence sent at the transmitter is a non-return-to-zero (NRZ) signal with a modulation rate of 25 Gbps. The transmission distance in the optical fiber is set at 50 km. The symbol sequence received at the receiver is simulated by solving Eq. (1). The training set comprises 35,000 random symbols of {0, 1}, and the testing set comprises 15,000 random symbols. Each symbol consists of 8 samples, and the tap number of the equalizer is set at 21. In the experiment, each measurement is repeated 4 times, and the mean bit error rate (BER) and the standard deviation are collected.
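For reference, the received sequence can be generated by integrating Eq. (1) with a standard split-step Fourier method, as in the minimal sketch below; the fiber parameters are those quoted above, while the sampling grid, rectangular NRZ pulse shaping, and square-law detection are simplifying assumptions rather than the exact simulation used in the paper.

```python
import numpy as np

# Minimal split-step Fourier integration of Eq. (1) with the fiber
# parameters quoted above; pulse shaping and detection are simplified.

def ssfm(E, dt, length_km, steps=500,
         alpha_db_km=0.2, beta2=-21.4e-27, gamma=1.2e-3):
    """Propagate the field envelope E (in sqrt(W)) through the fiber.

    beta2 is given in s^2/m (-21.4 ps^2/km) and gamma in 1/(W*m).
    """
    alpha = alpha_db_km * np.log(10) / 10 * 1e-3            # dB/km -> 1/m (power)
    dz = length_km * 1e3 / steps
    w = 2 * np.pi * np.fft.fftfreq(E.size, dt)              # angular frequency grid
    lin = np.exp((-alpha / 2 + 1j * beta2 / 2 * w**2) * dz)  # loss + dispersion step
    for _ in range(steps):
        E = np.fft.ifft(np.fft.fft(E) * lin)                # linear step
        E = E * np.exp(1j * gamma * np.abs(E)**2 * dz)      # Kerr nonlinearity step
    return E

# 25 Gbps NRZ, 8 samples per symbol, 50 km transmission
rng = np.random.default_rng(1)
bits = rng.integers(0, 2, 50_000)                           # {0, 1} symbols
sps, rb = 8, 25e9
dt = 1 / (rb * sps)
launch_power = 0.1                                          # e.g., 100 mW
tx = np.sqrt(launch_power) * np.repeat(bits, sps).astype(complex)
rx = np.abs(ssfm(tx, dt, length_km=50))**2                  # detected intensity
```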

Figure 3(a) shows an example of the simulated NRZ signal sequence sent at the transmitter. After a transmission distance of 50 km, the signal at the receiver side in Fig. 3(b) is substantially distorted. Generally, increasing the launch power enhances the nonlinear effect, and the signal distortion becomes more severe [37]. The task is to reproduce the signal at the transmitter from the distorted one at the receiver. When applying the shallow 1-layer PRC (squares) to equalize the received signal in Fig. 3(c), the BER first decreases from $5.0 \times 10^{-3}$ at 1.0 mW down to a minimum value of $3.4 \times 10^{-3}$ at 100 mW. Above 100 mW, the BER increases nonlinearly with the launch power. The BERs for launch powers ranging from 80 to 120 mW are below the hard-decision forward error correction (FEC) threshold ($3.8 \times 10^{-3}$, dashed line) [48]. This behavior arises because the PRC inherently possesses both a linear and a nonlinear memory effect, which are commonly quantified by the linear memory capacity (MC) and the nonlinear MC, respectively [24,32]. For low launch powers (see 1.0 mW), the Kerr nonlinearity of the optical fiber is negligible, and the signal distortion is mainly induced by the linear chromatic dispersion. Therefore, the impairment compensation only requires the linear memory effect of the PRC, while the nonlinear memory effect plays a negative role. When the launch power is increased (see 20–100 mW), the Kerr nonlinearity appears, and hence the nonlinear memory effect of the PRC becomes beneficial for mitigating this nonlinear distortion. The BER reaches its minimum value when the inherent nonlinear memory of the PRC well compensates for the Kerr nonlinearity of the optical fiber (see 100 mW). On the other hand, the BER increases when the nonlinear memory capacity is not high enough to compensate for the strong fiber nonlinearity (see 120–200 mW). Therefore, the nonlinear equalization capability of the PRC is limited by its maximum nonlinear memory effect. For the deep PRC with two reservoir layers (up-triangles), the BER reduces from $4.4 \times 10^{-3}$ at 1.0 mW down to a minimum of $1.5 \times 10^{-3}$ at 120 mW. The BERs for launch powers ranging from 20 to 160 mW are below the FEC threshold. The PRC performance further improves when the PRC depth is raised to three (circles). The corresponding BER declines from $4.2 \times 10^{-3}$ at 1.0 mW down to a minimum of $1.0 \times 10^{-3}$ at 120 mW. Interestingly, the BERs of the 3-layer PRC are better than those of the 2-layer PRC for all the studied launch powers, ranging from 1.0 mW up to 200 mW. However, the performance of the 4-layer PRC (down-triangles) is similar to or slightly inferior to that of the 3-layer PRC. We remark that the 4-layer PRC suffers from overfitting, which can be overcome by using more training samples; in that case, the 4-layer PRC should perform similarly to or slightly better than the 3-layer one. This suggests that the PRC performance saturates at a depth of three. In comparison with the shallow PRC, all three deep PRCs exhibit better performance across the whole launch power range. The above observation agrees with the well-established finding that deeper neural networks perform better than shallower ones, until the performance saturates [9]. In particular, the BERs are significantly reduced in the power range of 80 to 160 mW. Therefore, unlike the shallow PRC, the deep PRCs have a very strong capability to mitigate the nonlinearity of optical fibers and hence to improve the signal quality. This nonlinear compensation ability can be attributed to the strengthened nonlinear memory effect, which is discussed in Section 4.

Figure 4 explores the contribution of each reservoir layer in the 3-layer PRC. For this evaluation, only one of the three reservoirs is used for the signal equalization, so the neuron number is 80 instead of 240, both for training and for testing. It is found that the performance of the second-layer reservoir is generally better than that of the first layer. In particular, the minimum BER of the second-layer reservoir, achieved at 120 mW, is $2.0 \times 10^{-3}$. This value further declines to $1.4 \times 10^{-3}$ for the third-layer reservoir, which is 2.6 times smaller than the first-layer case ($3.7 \times 10^{-3}$). The different performance of the three reservoir layers suggests that the neuron dynamics differ from one layer to another. Generally, the neuron states in the deeper layers are richer than those in the shallower ones, which leads to better equalization performance. It is worthwhile to point out that this behavior differs from that of the parallel PRC, where several reservoirs are connected in parallel instead of in series. Our recent work has demonstrated that the neuron states in each reservoir of a parallel wavelength-multiplexing PRC are similar to each other [33]. Owing to the different neuron dynamics of each layer, the nonlinear memory effect of the deep PRC is improved, and thereby the equalization performance of the 3-layer PRC in Fig. 3 is enhanced. In comparison, the FFE in the DSP of optical receivers only compensates for the linear chromatic dispersion of optical fibers [38]. Figure 4 shows that the BER of the FFE (squares) increases nonlinearly from $5.2 \times 10^{-3}$ at 1.0 mW up to $3.8 \times 10^{-2}$ at 200 mW. At the launch power of 120 mW, the BER of the FFE ($7.7 \times 10^{-3}$) is 7.7 times larger than that of the 3-layer PRC ($1.0 \times 10^{-3}$). This comparison proves that the deep PRC can indeed compensate for the strong nonlinear impairment of optical fibers. For low launch powers (1 mW), however, the BER of the 3-layer PRC is only slightly better than that of the FFE, which suggests that the deep PRC has a similar capability of compensating for the chromatic dispersion as the FFE.
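As a baseline for the comparison above, a $2K+1 = 21$ tap FFE can be sketched as follows; the least-squares training and the hard-decision threshold are assumptions, since the paper only specifies the FFE formula and the tap number.

```python
import numpy as np

# Sketch of the (2K+1)-tap FFE baseline: the taps w are fitted by least
# squares so that the linear combination of the received symbol and its
# neighbors approximates the transmitted symbol.

def ffe_train(rx, tx, K=10):
    n = np.arange(K, rx.size - K)
    # column j+K holds the received symbol y_{n-j}, for j = -K..K
    X = np.stack([rx[n - j] for j in range(-K, K + 1)], axis=1)
    w, *_ = np.linalg.lstsq(X, tx[n], rcond=None)
    return w

def ffe_equalize(rx, w, K=10):
    n = np.arange(K, rx.size - K)
    X = np.stack([rx[n - j] for j in range(-K, K + 1)], axis=1)
    return (X @ w > 0.5).astype(int)        # hard decision for NRZ {0, 1}

# Hypothetical usage with symbol-rate samples rx_train/rx_test and the
# corresponding transmitted bits (K symbols are trimmed at each edge):
# w = ffe_train(rx_train, bits_train, K=10)                  # 21 taps
# ber = np.mean(ffe_equalize(rx_test, w) != bits_test[10:-10])
```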


Fig. 4. Performance comparison between the 3-layer PRC (dots) and the FFE (squares). The open symbols represent the BERs of each reservoir layer. The dashed line indicates the FEC threshold. The error bar stands for the standard deviation of the measurement. The plot of “1st layer” is the same as the plot of “1-layer PRC” in Fig. 3.


4. DISCUSSION

The experimental results in Figs. 3 and 4 show that raising the PRC depth substantially improves the nonlinearity compensation at high launch powers, whereas at low launch powers the deep PRC shows linearity compensation similar to that of the shallow one. In order to understand this behavior, we analyze both the linear MC (LMC) and the nonlinear MCs of the deep PRC. The LMC measures the capability of the PRC to reproduce the past input signal [49], and is quantified by [50,51]

$${\rm MC}_L = \sum_{i=1}^{\infty} \frac{\langle u(k-i)\,y(k)\rangle^2}{\sigma^2[u(k)]\,\sigma^2[y(k)]}, \tag{2}$$
where the input signal $u(k)$ is a random sequence uniformly distributed in the range of [$-1$, 1], and $y(k)$ is the corresponding output of the PRC at step $k$. The evaluation aims to reproduce the input signal $u(k-i)$, shifted $i$ steps backward, from the output $y(k)$. $\sigma^2$ represents the variance, and $\langle\,\cdot\,\rangle$ stands for the average. In addition, the nonlinear MC characterizes the capability of reproducing high-order Legendre polynomials of the input signal, which is defined as
$${\rm MC}_{NL} = \sum_{i=1}^{\infty} \frac{\langle p(k-i)\,y(k)\rangle^2}{\sigma^2[p(k)]\,\sigma^2[y(k)]}, \tag{3}$$
where the polynomial is $p(k) = [3u^2(k) - 1]/2$ for the quadratic MC (QMC) and $p(k) = [5u^3(k) - 3u(k)]/2$ for the cubic MC (CMC). The evaluation aims to reproduce the polynomial $p(k-i)$ from the output $y(k)$. In addition to the LMC, QMC, and CMC, the PRC also has higher-order and cross memory effects, which are not considered in this work [50,51]. In the analysis of the MCs, 4000 samples are used for training and 1000 samples for testing, both in the experiment and in the simulation.
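In practice, each MC can be evaluated by training a separate linear readout for every delay $i$ and every target, and summing the squared correlation terms, as in the sketch below. The covariance form of the numerator (equivalent to Eqs. (2) and (3) for zero-mean signals), the truncation of the delay sum, and the in-sample evaluation are simplifying assumptions.

```python
import numpy as np

# Sketch of the memory-capacity evaluation of Eqs. (2) and (3): a ridge
# readout is trained per delay i and per target function, and the
# squared correlation terms are summed over the delays.

def memory_capacity(states, u, poly=lambda x: x, max_delay=50, ridge=1e-6):
    """states: (T, N) neuron states recorded while driving with input u: (T,)."""
    T, _ = states.shape
    S = np.hstack([states, np.ones((T, 1))])          # add a bias column
    mc = 0.0
    for i in range(1, max_delay + 1):
        target = poly(u[:T - i])                      # u(k - i) or p(k - i)
        X = S[i:]                                     # neuron states at step k
        w = np.linalg.solve(X.T @ X + ridge * np.eye(X.shape[1]), X.T @ target)
        y = X @ w                                     # PRC output y(k)
        cov2 = np.mean((target - target.mean()) * (y - y.mean())) ** 2
        mc += cov2 / (np.var(target) * np.var(y) + 1e-15)
    return mc

# LMC, QMC (Legendre P2), and CMC (Legendre P3) of a recorded state matrix:
# lmc = memory_capacity(states, u)
# qmc = memory_capacity(states, u, lambda x: (3 * x**2 - 1) / 2)
# cmc = memory_capacity(states, u, lambda x: (5 * x**3 - 3 * x) / 2)
```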

The measurement in Fig. 5(a) shows that both the linear MC and the nonlinear MCs of the PRC rise with increasing depth. The LMC increases from 4.4 for the 1-layer PRC up to 8.1 for the 4-layer PRC. However, the deep PRC in Fig. 3 only slightly reduces the BER at low launch powers, which suggests that the LMC of the shallow PRC is already high enough to compensate for the linear chromatic dispersion. Meanwhile, the nonlinear QMC increases from 2.5 to 5.2, and the CMC increases from 0.72 to 2.0. The enhanced nonlinear MCs can be attributed to the rich neuron states in the deep reservoir layers, as shown in Fig. 4. Consequently, the deep PRC exhibits a strong capability to mitigate the nonlinearity of optical fibers. In addition, all three MCs almost saturate at a depth of three, resulting in the performance saturation observed in Fig. 3. It is worthwhile to point out that all three MCs substantially degrade when there is no optical feedback in the PRC; such a system is known as an extreme learning machine [46], and a detailed discussion is given in Supplement 1.


Fig. 5. (a) Measured and (b) simulated memory capacities versus the depth of the PRC.


In order to verify the effect of depth on the MCs measured in experiment, we simulate the MCs using a deep PRC model comprising four hidden reservoir layers. The model assumes that all four slave lasers are identical so as to simplify the simulation. The carrier dynamics, the photon dynamics, and the phase of the electric field of the laser are taken into account through the framework of rate equations [24]. Both the optical feedback and the optical injection are described by the classical Lang-Kobayashi model [52,53]. The main simulation parameters are listed in Table 2; the detailed deep PRC model and the other simulation parameters can be found in [24]. The simulation in Fig. 5(b) verifies that all the MCs indeed go up with increasing depth of the PRC, and the evolution trend is similar to that observed in experiment. In addition, all the simulated MCs almost saturate at a depth of three, in agreement with the measurement in Fig. 5(a). However, the simulated MCs are larger than the measured ones. At a depth of three, the simulated LMC, QMC, and CMC are 2.1 times, 2.3 times, and 3.5 times larger than the measured ones, respectively. This quantitative difference is likely due to random noise, which exists in the experiment but is not included in the simulation model. Indeed, the input signal for the MC evaluation is randomly distributed in the range of [$-1$, 1], and the measurement is therefore sensitive to noise sources.
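For illustration, a single reservoir layer of such a model (one slave laser with delayed optical feedback and external optical injection in the Lang-Kobayashi framework) can be integrated as sketched below. The parameter values and the simple Euler scheme are illustrative assumptions; they are not the values of Table 2, and the full cascaded four-laser model follows [24].

```python
import numpy as np

# Minimal Lang-Kobayashi sketch of a single reservoir layer: one slave
# laser with delayed optical feedback and external optical injection.
# All parameter values are illustrative only (NOT the values of Table 2).

alpha_H = 3.0          # linewidth enhancement factor
g0 = 1.5e-8            # differential gain (1/ps per carrier)
N0 = 1.5e8             # transparency carrier number
eps = 5e-7             # gain compression factor
tau_p = 2.0            # photon lifetime (ps)
tau_n = 2000.0         # carrier lifetime (ps)
J = 1.4e5              # pump term (carriers/ps), above threshold
kappa_f = 0.005        # feedback rate (1/ps), stable regime assumed
kappa_inj = 0.05       # injection rate (1/ps)
tau_d = 65_000.0       # feedback delay, about 65 ns (in ps)
d_omega = 0.0          # angular frequency detuning of the injected field

dt = 1.0               # Euler time step (ps)
steps = 200_000
d_steps = int(tau_d / dt)

E = np.full(steps, 300.0 + 0j)     # field; the first tau_d acts as constant history
N = np.full(steps, 1.8e8)          # carrier number

def injected_field(t_ps):
    """Masked field from the previous layer; held constant in this sketch."""
    return 100.0 * np.exp(1j * d_omega * t_ps)

for k in range(d_steps, steps - 1):
    G = g0 * (N[k] - N0) / (1 + eps * abs(E[k]) ** 2)          # optical gain
    dE = (0.5 * (1 + 1j * alpha_H) * (G - 1 / tau_p) * E[k]    # gain minus loss
          + kappa_f * E[k - d_steps]                           # delayed feedback
          + kappa_inj * injected_field(k * dt))                # optical injection
    dN = J - N[k] / tau_n - G * abs(E[k]) ** 2                 # carrier dynamics
    E[k + 1] = E[k] + dt * dE
    N[k + 1] = N[k] + dt * dN

output_power = np.abs(E) ** 2      # virtual neuron states are sampled from this
```

In the full model, the masked input modulates the field injected into the first layer, and the output field of each layer drives the injection of the next one.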


Table 2. Main Simulation Parameters of the Deep PRC

5. CONCLUSION

In summary, we have experimentally demonstrated a deep PRC architecture based on cascaded injection-locked semiconductor lasers. The connections between successive reservoir layers are all optical, without requiring any OEC or ADC. In addition, the scheme is highly scalable because the laser in each layer provides optical power. The deep PRC with a depth of four is used to solve the real-world problem of nonlinear signal equalization in an optical fiber communication system. It is found that the deep PRC exhibits a strong capability to compensate for the Kerr nonlinearity of optical fibers and hence to improve the quality of the received signal. In comparison with the linear FFE, the deep PRC reduces the BER of the transmission link by as much as a factor of 7.7. In comparison with the shallow PRC, the improved performance of the deep PRC is owing to the rich neuron dynamics of the deep reservoir layers, which in turn boost the nonlinear memory effect. It is remarked that the operation conditions in this work are not optimized, and hence the conclusions are expected to generalize to other conditions, whether optimized or not. Future work will optimize the operation parameters, including the injection ratio, the detuning frequency, the feedback ratio, and the feedback delay time, so as to further enhance the deep PRC performance.

Funding

Natural Science Foundation of Shanghai (20ZR1436500); National Natural Science Foundation of China (62025503).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

Supplemental document

See Supplement 1 for supporting content.

REFERENCES

1. M. A. Nahmias, T. F. de Lima, A. N. Tait, et al., “Photonic multiply-accumulate operations for neural networks,” IEEE J. Quantum Electron. 26, 7701518 (2020). [CrossRef]  

2. Z. Chen and M. Segev, “Highlighting photonics: looking into the next decade,” eLight 1, 2 (2021). [CrossRef]  

3. C. Huang, V. J. Sorger, M. Miscuglio, et al., “Prospects and applications of photonic neural networks,” Adv. Phys. X 7, 1981155 (2022). [CrossRef]  

4. Y. Shen, N. Harris, S. Skirlo, et al., “Deep learning with coherent nanophotonic circuits,” Nat. Photonics 11, 441–446 (2017). [CrossRef]  

5. X. Lin, Y. Rivenson, N. Yardimci, et al., “All-optical machine learning using diffractive deep neural networks,” Science 361, 1004–1008 (2018). [CrossRef]  

6. X. Xu, M. Tan, B. Corcoran, et al., “11 TOPS photonic convolutional accelerator for optical neural networks,” Nature 589, 44–51 (2021). [CrossRef]  

7. J. Feldmann, N. Youngblood, M. Karpov, et al., “Parallel convolutional processing using an integrated photonic tensor core,” Nature 589, 52–58 (2021). [CrossRef]  

8. E. Goi, X. Chen, Q. Zhang, et al., “Nanoprinted high-neuron-density optical linear perceptrons performing near-infrared inference on a CMOS chip,” Light Sci. Appl. 10, 40 (2021). [CrossRef]  

9. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT, 2016).

10. W. Maass, T. Natschlager, and H. Markram, “Real-time computing without stable states: a new framework for neural computation based on Perturbations,” Neural Comput. 14, 2531–2560 (2002). [CrossRef]  

11. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304, 78–80 (2004). [CrossRef]  

12. K. Vandoorne, P. Mechet, T. V. Vaerenbergh, et al., “Experimental demonstration of reservoir computing on a silicon photonics chip,” Nat. Commun. 5, 3541 (2014). [CrossRef]  

13. M. Nakajima, K. Tanaka, and T. Hashimoto, “Scalable reservoir computing on coherent linear photonic processor,” Commun. Phys. 4, 20 (2021). [CrossRef]  

14. D. Brunner, M. C. Soriano, C. R. Mirasso, et al., “Parallel photonic information processing at gigabyte per second data rates using transient states,” Nat. Commun. 4, 1364 (2013). [CrossRef]  

15. J. Moon, W. Ma, J. H. Shin, et al., “Temporal data classification and forecasting using a memristor-based reservoir computing system,” Nat. Electron. 2, 480–487 (2019). [CrossRef]  

16. Y. Zhong, J. Tang, X. Li, et al., “A memristor-based analogue reservoir computing system for real-time and power-efficient signal processing,” Nat. Electron. 5, 672–681 (2022). [CrossRef]  

17. K. Liu, T. Zhang, B. Dang, et al., “An optoelectronic synapse based on α-In2Se3 with controllable temporal dynamics for multimode and multiscale reservoir computing,” Nat. Electron. 5, 761–773 (2022). [CrossRef]  

18. G. Tanaka, T. Yamane, J. B. Héroux, et al., “Recent advances in physical reservoir computing: a review,” Neural Netw. 115, 100–123 (2019). [CrossRef]  

19. C. Gallicchio, A. Micheli, and L. Pedrelli, “Deep reservoir computing: a critical experimental analysis,” Neurocomputing 268, 87–99 (2017). [CrossRef]  

20. C. Gallicchio, A. Micheli, and L. Pedrelli, “Design of deep echo state networks,” Neural Netw. 108, 33–47 (2018). [CrossRef]  

21. M. Freiberger, S. Sackesyn, C. Ma, et al., “Improving time series recognition and prediction with networks and ensembles of passive photonic reservoirs,” IEEE J. Sel. Top. Quantum Electron. 26, 7700611 (2020). [CrossRef]  

22. H. Hasegawa, K. Kanno, and A. Uchida, “Parallel and deep reservoir computing using semiconductor lasers with optical feedback,” Nanophotonics 12, 869–881 (2023). [CrossRef]  

23. M. Goldmann, F. Koster, K. Lüdge, et al., “Deep time-delay reservoir computing: Dynamics and memory capacity,” Chaos 30, 093124 (2020). [CrossRef]  

24. B. D. Lin, Y. W. Shen, J. Y. Tang, et al., “Deep time-delay reservoir computing with cascading injection-locked lasers,” IEEE J. Sel. Top. Quantum Electron. 29, 7600408 (2023). [CrossRef]  

25. B. Penkovsky, X. Porte, M. Jacquot, et al., “Coupled nonlinear delay systems as deep convolutional neural networks,” Phys. Rev. Lett. 123, 054101 (2019). [CrossRef]  

26. M. Nakajima, K. Inoue, K. Tanaka, et al., “Physical deep learning with biologically inspired training method: gradient-free approach for physical hardware,” Nat. Commun. 13, 7847 (2022). [CrossRef]  

27. A. Lupo, E. Picco, M. Zajnulina, et al., “Deep photonic reservoir computer based on frequency multiplexing with fully analog connection between layers,” Optica 10, 1478–1485 (2023). [CrossRef]  

28. J. Ohtsubo, Semiconductor Lasers: Stability, Instability and Chaos (Springer, 2017).

29. J. Y. Tang, B. D. Lin, Y. W. Shen, et al., “Asynchronous photonic time-delay reservoir computing,” Opt. Express 31, 2456–2466 (2023). [CrossRef]  

30. L. Appeltant, M. C. Soriano, G. Van der Sande, et al., “Information processing using a single dynamical node as complex system,” Nat. Commun. 2, 468 (2011). [CrossRef]  

31. Y. Deng, Z. F. Fan, B. B. Zhao, et al., “Mid-infrared hyperchaos of interband cascade lasers,” Light Sci. Appl. 11, 7 (2022). [CrossRef]  

32. D. Brunner, M. C. Soriano, and G. Van der Sande, Photonic Reservoir Computing: Optical Recurrent Neural Networks (De Gruyter, 2019).

33. R. Q. Li, Y. W. Shen, B. D. Lin, et al., “Scalable wavelength-multiplexing photonic reservoir computing,” APL Mach. Learn. 1, 036105 (2023). [CrossRef]  

34. T. Hülser, F. Köster, L. Jaurigue, et al., “Role of delay-times in delay-based photonic reservoir computing,” Opt. Mater. Express 12, 1214–1231 (2022). [CrossRef]  

35. K. Harkhoe and G. V. d. Sande, “Delay-based reservoir computing using multimode semiconductor lasers: exploiting the rich carrier dynamics,” IEEE J. Sel. Top. Quantum Electron. 25, 1502909 (2019). [CrossRef]  

36. C. Wang, K. Schires, M. Osiński, et al., “Thermally insensitive determination of the linewidth broadening factor in nanostructured semiconductor lasers using optical injection locking,” Sci. Rep. 6, 27825 (2016). [CrossRef]  

37. G. P. Agrawal, Nonlinear Fiber Optics (Springer, 2000).

38. L. Huang, Y. Xu, W. Jiang, et al., “Performance and complexity analysis of conventional and deep learning equalizers for the high-speed IMDD PON,” J. Lightwave Technol. 40, 4528–4538 (2022). [CrossRef]  

39. P. J. Freire, Y. Osadchuk, B. Spinnler, et al., “Performance versus complexity study of neural network equalizers in coherent optical systems,” J. Lightwave Technol. 39, 6085–6096 (2021). [CrossRef]  

40. Q. Fan, G. Zhou, T. Gui, et al., “Advancing theoretical understanding and practical performance of signal processing for nonlinear optical communications through machine learning,” Nat. Commun. 11, 3694 (2020). [CrossRef]  

41. S. Zhang, F. Yaman, K. Nakamura, et al., “Field and lab experimental demonstration of nonlinear impairment compensation using neural networks,” Nat. Commun. 10, 3033 (2019). [CrossRef]  

42. C. Huang, S. Fujisawa, T. F. De lima, et al., “A silicon photonic–electronic neural network for fibre nonlinearity compensation,” Nat. Electron. 4, 837–844 (2021). [CrossRef]  

43. A. Argyris, J. Bueno, and I. Fischer, “Photonic machine learning implementation for signal recovery in optical communications,” Sci. Rep. 8, 8487 (2018). [CrossRef]  

44. J. Vatin, D. Rontani, and M. Sciamanna, “Experimental realization of dual task processing with a photonic reservoir computer,” APL Photon. 5, 086105 (2020). [CrossRef]  

45. S. Ranzini, R. Dischler, F. Da Ros, et al., “Experimental investigation of optoelectronic receiver with reservoir computing in short reach optical fiber communications,” J. Lightwave Technol. 39, 2460–2467 (2021). [CrossRef]  

46. I. Estebanez, S. Li, J. Schwind, et al., “56 GBaud PAM-4 100 km transmission system with photonic processing schemes,” J. Lightwave Technol. 40, 55–62 (2022). [CrossRef]  

47. K. Hammani, B. Wetzel, B. Kibler, et al., “Spectral dynamics of modulation instability described using Akhmediev breather theory,” Opt. Lett. 36, 2140–2142 (2011). [CrossRef]  

48. M. Bi, J. Yu, X. Miao, et al., “Machine learning classifier based on FE-KNN enabled high-capacity PAM-4 and NRZ transmission with 10-G class optics,” Opt. Express 27, 25802–25813 (2019). [CrossRef]  

49. H. Jaeger, “Short term memory in echo state networks,” GMD tech. report 152 (2002).

50. J. Dambre, D. Verstraeten, B. Schrauwen, et al., “Information processing capacity of dynamical systems,” Sci. Rep. 2, 514 (2012). [CrossRef]  

51. M. Inubushi and K. Yoshimura, “Reservoir computing beyond memory-nonlinearity trade-off,” Sci. Rep. 7, 10199 (2017). [CrossRef]  

52. R. Lang and K. Kobayashi, “External optical feedback effects on semiconductor injection laser properties,” IEEE J. Quantum Electron. 16, 347–355 (1980). [CrossRef]  

53. R. Lang, “Injection locking properties of a semiconductor laser,” IEEE J. Quantum Electron. 18, 976–983 (1982). [CrossRef]  

Supplementary Material

Supplement 1: PRC comparison with ELM
