Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

All-optical reservoir computing

Open Access Open Access

Abstract

Reservoir Computing is a novel computing paradigm that uses a nonlinear recurrent dynamical system to carry out information processing. Recent electronic and optoelectronic Reservoir Computers based on an architecture with a single nonlinear node and a delay loop have shown performance on standardized tasks comparable to state-of-the-art digital implementations. Here we report an all-optical implementation of a Reservoir Computer, made of off-the-shelf components for optical telecommunications. It uses the saturation of a semiconductor optical amplifier as nonlinearity. The present work shows that, within the Reservoir Computing paradigm, all-optical computing with state-of-the-art performance is possible.

©2012 Optical Society of America

1. Introduction

Reservoir Computing is a novel neural network computing paradigm introduced at the onset of the 21st century [115]. It is very well suited to process time dependent inputs, and provides state-of-the-art performance on tasks such as speech recognition, time series prediction, etc. (see [11,15] for reviews).

A Reservoir Computer consists of a high-dimensional nonlinear dynamical system, the reservoir, driven by a time dependent input, and an output layer in which the output signal is calculated. The time dependent output of the reservoir is given by a linear combination of the instantaneous internal states of the reservoir. The weights of this linear combination are the only parameters of the system which are trained to match the reservoir output to a target sequence. In addition, a few parameters of the dynamical system are usually tunable, to adjust how close the system is to the edge of stability and to ensure adequate coupling between the internal variables. The simplicity and flexibility of the concept of Reservoir Computing makes it eminently suitable for experimental implementation. Indeed, given the loose constraints which a system must satisfy in order to constitute a suitable reservoir, a variety of experimental realizations have so far been built. These include using a bucket of water [16], an analogue VLSI chip [17], and a delay system with a single nonlinear node. The latter architecture was introduced in an experimental architecture in [18] and studied independently from a theoretical point of view in [19,20]. Because of its simplicity, this architecture has proven very powerful for experiments. It enabled, using an electronic implementation, the first experimental Reservoir Computer with performance comparable to state-of-the-art digital implementations [21], as well as the first optoelectronic implementation of Reservoir Computing [18,22,23]. For a further general discussion, we refer to the supplementary material of [22] in which a roadmap for building experimental reservoirs is presented.

The realization of all-optical Reservoir Computers is attractive. All-optical computation has proven a tantalizing but frustrating goal since decades. On the one hand, optics has great promise for computation, such as inherent speed and parallelism. On the other hand, the difficulty of realizing optical nonlinearities and of building complex optical architectures has hindered these efforts. Reservoir Computing seems a promising new avenue. Indeed, in Reservoir Computing one can directly use the available nonlinearities for computation, rather than first adapting them so as to implement specific operations (e.g. logical gates such AND, OR, etc…) which are then combined into more complex architectures. This implies that Reservoir Computing is highly flexible, a desirable feature when, as in optics, implementing nonlinearities is difficult.

The possibility of an all-optical implementation of Reservoir Computing based on an array of Semiconductor Optical Amplifiers (SOA) on a chip has been proposed through numerical simulations in [24,25]. Here we report the first all-optical experimental implementation of a Reservoir Computer. The architecture we use is based on a fiber optics delay loop with a single nonlinear node and off-line training [2123]. The nonlinearity is provided by the saturation gain effect in a SOA.

The experiment reported here constitutes a significant step towards the possible development of analogue ultrafast all-optical computers. However, independently of the ultimate realization of this goal, the present work is also of fundamental interest as it illustrates the versatility and robustness of the delay loop architecture. Indeed, at the conceptual level our experiment differs from previous ones both in the type of nonlinearity used, and in the presence of significant amounts of noise that arise due to the spontaneous emission from the SOA.

In the next section of this paper we outline the operating principle of the Reservoir Computer and detail the choice we made for its all-optical implementation. We then describe in Sec. 3 how to operate the reservoir to achieve the best performances. And finally, in Sec. 4 we present the results obtained on a variety of benchmark tasks in order to evaluate these performances.

2. The photonic hardware implementation

2.1. Principles of reservoir computing

On a conceptual level, a Reservoir Computer can be thought of as a collection of internal nodes, whose states xi evolve in discrete timenaccording to a nonlinear recurrent map of the form

xi(n)=FNL(j=1NαAijxi(n1)+βmiu(n)),i=1,2,...,N
whereNis the number of nodes,FNLis a nonlinear function, u(n)is the input signal,mis the input mask vector, andAis the interconnection matrix. The feedback gainαand the input gainβare parameters that are used to tune the dynamics of the reservoir in order to find its best working point. The input maskmienriches the dynamics of the reservoir by distributing the same input to different nodes with different weightsmi.

The aim of a Reservoir Computer is to carry out computation on the input signal u(n). The result of the computation is an outputy^(n). It is given by a linear combination of the node states

y^(n)=iWixi(n)

The output weights Wiare calculated to minimize the distance between the actual outputy^(n)and the desired outputy(n). This is typically done by considering a “training” sequence for which the input and desired output are known. In many cases the distance between the actual and desired output is measured using the Normalized Mean Square Error (NMSE), given by

NMSE=(yy^)2n(yy)2n

Note that the input and output can both have more than one dimension, in which case the input maskmiand readout weightsWibecome respectively the matricesmikandWik, as will be considered in Sec. 4.5 and 4.6.

While in most digital implementations the connection matrixAis randomly generated, there are in fact very few requirements on its structure, the main one being that its spectral radius multiplied by the feedback gainαshould be less than 1 in order to avoid instabilities. It has been shown [19,20] that even very simple interconnection matricesAcan lead to excellent results. We can therefore choose an architecture where each node state xi only depends on the adjacent node statexi1. Such an interconnection matrix can be naturally implemented in the time-multiplexed sequential reservoir configuration described in the next section.

2.2. Configuration with a delay loop and a single nonlinear node

The Reservoir Computer implementation using a single nonlinear node and delayed feedback has been described in detail in [2123]. Its main advantage is experimental simplicity, since it requires very few components.

To implement this architecture, we need to go from the discrete timenused in Eq. (1) to continuous time. To this end each input u(n)undergoes a sample and hold procedure, during which it is held for a time T. Each interval of lengthTis subdivided intoNintervals of lengthθ. The i-th interval of durationθ(i=1,...,N)is associated with the input valuemiu(n), wheremiis the corresponding value of the input mask. It is these valuesmiu(n)which are sequentially used to drive the reservoir.

We denote byTthe round-trip time of the delay loop. Two approaches have been proposed to implement the single nonlinear node and delayed feedback scheme. In the first, used in [21,23], one takesT=T. In this case, to enrich the dynamics, it is necessary to include in the delay loop a lowpass filter with time constant typically of order a few timesθ. This internal time-scale couples successive node states to each other.

In the second approach, used in [22], the nonlinearity is instantaneous, but one desynchronizes the input from the round-trip time, for instance by takingT=(N+1)θ. This introduces a coupling from each internal variablexito the adjacent onexi1. It is this second approach that is used in the present experiment.

2.3. Experimental implementation

Our experimental setup is depicted in Fig. 1 . The all-optical reservoir is implemented using off-the-shelf fiber components operating in the telecommunication C-band. It uses an incoherent light source so that interference effects can be neglected and the internal variablesxi(n)correspond to the light intensity inside the loop. The nonlinear feedback loop consists of a fiber spool (1.6 km of single mode fiber), an SOA (Covega C-Band SOA1117) and a tunable optical attenuator (Agilent 81571A). The latter determines the feedback strengthαthat appears in the Eq. (1). In addition the delay loop contains an isolator to avoid counter-propagating Amplified Spontaneous Emission (ASE) within the SOA and a band pass filter (commercial Coarse Wavelength Divison Multiplexer centered on 1550nm) that removes a large part of the spontaneous emission noise from the SOA. The noise figure of this amplifier has been measured on the photodiode to take values between 25 dB and 45 dB, depending on the input power and the pump current.

 figure: Fig. 1

Fig. 1 Schematic of the experimental set-up of the all-optical reservoir. The reservoir consists of off-the-shelf fiber optics components operating in the telecommunication C-band. It is based on nonlinear all-optical loop operating in incoherent regime. Optical components are in red whereas electronic components are in green. The all-optical loop is driven by the input optical signal. A Superluminescent Light Emitting Diode (SLED) generates broadband white light. An electronic signal corresponding to the time dependent input multiplied by the input mask is generated by the Arbitrary Waveform Generator (AWG). This electronic signal drives an integrated Lithium Niobate Mach-Zehnder intensity modulator (MZ) thereby producing a time dependent input optical signal whose intensity is adjusted with a variable attenuator. The input optical signal is injected into the cavity by means of a 50/50 fiber coupler. The cavity itself consists of an isolator, a Semiconductor Optical Amplifier (SOA), a variable optical attenuator, and a fiber spool that acts as delay line. The cavity operates below the lasing threshold. A 90/10 fiber coupler is used to send 10% of the cavity intensity to a readout photodiode and then to a digitizer. Coarse Wavelength Division Multiplexers (CWDM, boxes with waves) are used to select a wavelength band near the maximum of emission of the SLED and the maximum gain of the SOA.

Download Full Size | PDF

The round-trip time of the loop isT=7.9437µs. WithN=50internal variables and T=(N+1)θ, the input time-scale isT=7.7880µs, andθ=155.76ns.

Nonlinearity in a Reservoir Computer is essential to fulfill complex tasks. In our system the nonlinearity is provided by the saturation of the optical gain in the SOA. In Fig. 2 we show the measured power amplification characteristics of the SOA for various injection currents. These characteristics constitute the nonlinear functionFNLof our reservoir. As can be seen in Fig. 2, the injection current controls the shape of this nonlinear function and therefore constitutes an adjustable parameter in our system. For a given task, the injection current is varied to get the best performance. For most tasks we have found that operating in a relatively linear regime, corresponding to a low input current, provides best performance. Note that in our experimental conditions the amplifier is always used far from the complete saturation regime.

 figure: Fig. 2

Fig. 2 Incoherent light amplification characteristics of the SOA. Left hand: output power versus input power. The SOA gain is almost linear at low input powers and bends over at input powers above the saturation level. The overall output power increases with increasing injection current. Due to spontaneous emission, the output power is non zero even at zero input power. From bottom to top: injection current is a) 160 mA, b) 200 mA, c) 240 mA, d) 300 mA, e) 340 mA, f) 400 mA, and g) 520 mA. (The hysteresis effect is due to the high-pass filter of the modulator’s driver used for the record). Right hand: normalized input-output relation. The data is the same as in the left panel, but the input and output powers are normalized to lie in the interval [0,1]. It is this function that characterizes the nonlinearity of the reservoir, which can be tuned by changing the pump current.

Download Full Size | PDF

To drive the reservoir we use an incoherent light source (40 nm broadband Superluminescent Light Emitting Diode (SLED, Denselight DL-CS5254A) with peak emission at 1560 nm). The time-varying inputs are created by means of an Arbitrary Waveform Generator (AWG, National Instrument model PXI-5422) that drives a Mach-Zehnder Modulator (MZM, Photline model MXAN-LN-10). The intensity injected into the reservoir is chosen by means of a variable optical attenuator (Agilent, model 81571A). Two practically identical Coarse Wavelength Division Multiplexers (CWDM) with pass band [1540-1558.5nm] are added at the output of the SLED and the SOA. These filters improve the signal-to-noise ratio (SNR) by matching the bandwidth of the source to that of the cavity and removing a large portion of the ASE noise generated by the SOA.

A fraction (10%) of the circulating light intensity is extracted from the loop via a fiber coupler in order to record the evolution of the node states by means of a photodiode (TTI TIA-525 120 MHz bandwidth), and a digitizer (NIPXI-5124 Digitizer), whose output is stored in a desktop computer used for post-processing. The input and output are sampled by the AWG and digitizer at 200MSamples per second.

In summary, the present experiment is largely inspired by our earlier work [22]. In particular the AWG and digitizer, as well as most of the numerical code used to control the system, are identical. For this reason it is instructive, as we do in the following, to compare the two implementations. In the present all-optical implementation, the two variable optical attenuators allow independent adjustment of the feedback gainαand the input gainβ. An important difference with respect to our previous work [22] is the shape of the nonlinearity (saturable gain versus sine nonlinearity). In the present case, the nonlinearity can be adjusted to some extent by modifying the injection current of the SOA, see Fig. 2. Another important difference is the spontaneous emission noise of the SOA that affects the SNR. An example of response of the all-optical reservoir to a simple input is given in Fig. 3 , illustrating the fading memory of the reservoir and its nonlinear response.

 figure: Fig. 3

Fig. 3 Example response of the all-optical reservoir. The amount of light injected into the reservoir is kept constant (normalized value 0.5). At time t = 0, the input signal (blue curve) is momentarily set to 1, then to zero, before returning to its initial value of 0.5. The response of the reservoir is recorded in green. The nonlinearity of the reservoir due to saturation of the SOA is clear from the initial response (positive response is less than negative response). The fading memory of the all-optical system is also clear. Experimental parameters are: injection current 187mA, feedback gain −1.6 dB, and average input power rescaled to its value at the entrance of the SOA 350.8µW (−4.55dBm).

Download Full Size | PDF

3. Operation mode of the all-optical reservoir

3.1. Signal pre-processing

Because our implementation is based on light intensity coded information, signals need to be adapted so that they can be represented by a positive scalar regardless of the task under study. To do so, the masked input sequence, i.e. the productmiu(n)of the input mask and the input signal, is renormalized to lie within the interval [0,1]. It is then converted into a voltage signal that drives the Mach-Zehnder Modulator (MZM) in such a way that the intensity of the light after the modulator ismiu(n)P wherePis the maximum power available. This is achieved by applying to the MZM the arcsine of the normalized productmiu(n)and multiplying the result by a factor chosen such that the voltage signal spans the interval [0, Vπ]. This procedure exploits the maximum dynamical range of the MZM while removing the nonlinearity that is potentially present due to the nonlinear transfer function of the MZM. This allows us to compare the performance of the all-optical reservoir with the previous experiments [2123]. Note that if we do not carry out this pre-compensation, the performance of the reservoir can improve on certain tasks due to the added nonlinearity provided by the transfer function of the MZM.

3.2. Signal post-processing

In our experiment the reservoir training and testing are realized on the basis of the values recorded by the digitizer and are thus not performed in real time. The entire recorded intensity time trace is first rescaled to lie in the interval [-1, + 1]. Then each internal state is measured by averaging the normalized intensity around the middle of the corresponding time window over a duration ofθ/2. By doing so, we discard the leading and trailing edges of each time interval where abrupt jumps in the light intensity take place. This amounts to eliminating most of the influence of the photodetector response time on the state values.

For each experiment the desired or “target” output sequencey(n)corresponding to a given input sequence u(n)is defined according to the task under consideration. The reservoir training then consists in applying a least mean square algorithm that calculates the set of weight valuesWithat minimizes the mismatch between the desired output sequencey(n)and the actual output sequencey^(n)as calculated from Eq. (2). In order to make the reservoir more robust against overfitting ridge regularization is used.

Once the training is performed, the weight valuesWiare kept constant so that the reservoir can be tested with the remaining of the recorded normalized intensity time trace. Note that besides the training and testing sequences, the recorded output intensity time trace also contains a “warm-up” sequence that is not used in the training phase. Its role is to ensure that the reservoir has no memory of the steady state from which it systematically starts in the various experiments. Its length is typically 2 to 4 times the number of nodes in the reservoir (typically 200-400 samples). The total length of the warm up, training, and testing sequences is usually around ten thousand samples.

4. Results

4.1. Operating point

We investigated several benchmark tasks that have been used previously in the Reservoir Computing community. In the following paragraphs we describe the different tasks and give the corresponding scores obtained with our all-optical Reservoir Computer. We also quote the operating point used to get best performance on the task. The operating point is specified by three parameters.

The injection current of the SOA determines the shape of the nonlinearity, see Fig. 2. It also determines the noise figure of the SOA due to its ASE. For most tasks we have found that a small injection current, corresponding to a rather linear regime, gives best performance. The value of the injection current used is quoted for all tasks. In most cases it is 187mA.

The feedback gainαis estimated by measuring the small signal gain of the SOA and taking into account the other losses in the cavity. The largest feedback gain, corresponding to minimum loss in the cavity, is −1.6dB.

The input gainβis quantified by taking the powermiu(n)Pat the output of the MZM, averaging over the mask valuesmiand the input values u(n), and then propagating the resulting average power (taking into account losses in the variable attenuator and coupler) to the input of the SOA. The input power, brought back to the input of the SOA, is measured in dBm and µW, which allows comparison with the transfer function of the SOA, as reported in Fig. 2.

4.2. Numerical simulations

We have developed a simple numerical model of the experiment, based on integrating numerically the recurrence

xi(n)={FNL(αxi1(n1)+βmiu(n))+νi(n)2iNFNL(αxN+i1(n2)+βmiu(n))+νi(n)i=1
where FNL is the measured response of the SOA given in Fig. 2, and νi(n)is the noise added to node variable xi at time n (the νi(n)are independent indentically distributed Gaussian variables with zero mean).This numerical model takes into account the measured transfer functions of all the components. It does not take into account any bandpass effects of the electronics (either at the input or readout). The ASE noise added by the SOA is modeled by adding the noise terms νi(n). The aim of this numerical model was to establish whether the architecture used (single nonlinear node and delay loop) combined with the form of the nonlinearity (saturable nonlinearity) introduced fundamental limitations to our experiment. Except if relevant, we do not quote the predictions of these numerical simulations.

4.3. Memory capacities

The memory capacities characterize in a simple way how a reservoir processes information. The inputs u(n)are taken to be independent identically distributed random variables. In the present work the u(n)are drawn from the uniform distribution over the interval [-1,1].

The aim is to reconstruct a functiony[u]of the previous inputs. The quality of the reconstruction is measured by the capacityC[y]=1NMSE[y], whereNMSEis the normalized mean square error, so that a perfect reconstruction of the input gives a capacity of 1, while a completely uncorrelated reconstruction gives a capacity of 0. Generally we consider a set of functionsyk, and the total capacity for this set:C=kC[yk].

In the case of the linear memory capacity, introduced by Jaeger in 2001 [2], the task is to reconstruct the inputksteps in the past. That isyk(n)=u(nk). Upon summing the capacities over all delaysk, one obtains the linear memory capacity.

The nonlinear capacities introduced in [26] are extensions of the linear capacity to the case where the function to reconstruct is a nonlinear function of the past inputs. Here we consider only second order polynomials of the past input. Namely we consider the second order Legendre polynomial of the inputksteps in the pastyk(n)=3u2(nk)1 and the product of two inputskandksteps in the pastykk(n)=u(nk)u(nk). Upon summing the capacities overk(in the case of the second order Legendre polynomials) and overk,k(in the case of products) one obtains the quadratic memory capacity and the cross memory capacity, respectively.

A key result proven in [26] is that the sum of the linear memory capacity, the quadratic memory capacity, and the cross memory capacity is less or equal to the number of internal variablesN(in our caseN=50). This generalizes the result of [2] where it is shown that this inequality holds in the case of the linear memory capacity. Note that in order to saturate the bound one should in general add all higher order polynomials, see [26] for the conditions under which this bound is saturated. In other words, while the linear memory capacity quantifies how well the reservoir remembers past inputs, the quadratic and cross memory capacities quantify how well the reservoir can compute second order polynomials of the past inputs.

Table 1 presents the maximum value obtained for each of the memory capacities of our all-optical reservoir for a pump current of 187mA, and compares them to the same quantities for our previous optoelectronic experiment [22], while keeping the numberN=50of internal variable constant. The total memory capacity mentioned in the last line is the sum of the linear, quadratic, and cross memory capacities. To compute all the values reported we considered all the contributions up to k = 100, even if no significant contribution is expected to be found for k>N.

Tables Icon

Table 1. Linear, quadratic, cross and total memory capacities of the system with 50 internal variables and a pump current of 187 mA compared to memory capacities of an optoelectronic reservoir with the same number of internal variables. Each quantity is reported for the optimal choice of feedback gainαand the input gainβ. The last line gives the sum of the first three quantities, once again maximized overαandβ.

Figure 4 shows how these quantities change in terms of the feedback gainαand the input gainβ.

 figure: Fig. 4

Fig. 4 Linear, quadratic, and cross memory capacities for the all-optical reservoir, as a function of the feedback gain feedback gainαand the input gainβfor an injection current of 187mA. Note that the color scale is different for each capacity. The + signs locate the optimum working points for the channel equalization task at the different signal-to-noise ratios. The x signs locate the optimum working points for the radar task at low sea state over the 10 delays of prediction (some delays have identical optimal working points). This shows that the capacities are correlated to, but do not explain completely, the optimal working points.

Download Full Size | PDF

Compared with the capacities of the optoelectronic architecture, the memory capacities of the all-optical reservoir are lower. In particular, we were unable to reach a large cross memory capacity or a total memory capacity close to its optimal value, i.e., the number of nodes in the reservoir. This is attributed to the ASE noise of the SOA that is inherent to our all-optical configuration and decreases the SNR.

4.4. Channel equalization task

We now apply the all-optical reservoir to a task related to telecommunication applications, first used in the context of Reservoir Computing by Jaeger [4]. It consists of channel equalization in a wireless communication link undergoing multi-path symbol interference as well as nonlinear distortion. The transmitted symbolsdare drawn randomly from the set {3,1,1,3}and undergo the following channel transformation.

q(n)=0.08d(n+2)0.12d(n+1)+d(n)+0.18d(n1)0.1d(n2)+0.091d(n3)0.05d(n4)+0.04d(n5)+0.03d(n6)+0.01d(n7)u(n)=q(n)+0.036q2(n)0.011q3(n)+noise

White noise is added to obtain a Signal-to-Noise Ratio (SNR) ranging from 12 dB to 32 dB. The detected signal u(n)obtained at the end of the communication channel constitutes the reservoir input, while the target output is the original sequenced(n).

This task is evaluated through the calculation of the Symbol Error Rate (SER), representing the percentage of misclassified symbols. The results we obtained for this task are presented in Fig. 5 , together with the results obtained for the same task with our optoelectronic reservoir [22], and the ones obtained by running the numerical simulations described in section 4.2 of this paper. This figure reveals that the all-optical reservoir performs as well for a signal-to-noise ratio ranging from 12 to 16 dB, while for larger SNR the performance is slightly degraded with respect to the optoelectronic implementation. Note, however, that even in this high SNR range the measured SER is still one order of magnitude better than the bilinear filtering technique presented in [28]. It should also be noted that there is a good agreement between the results of the optical experiment and the ones obtained with the numerical simulations, using a distribution of the internal noise νi(n) with a standard deviation of 1·10−3.

 figure: Fig. 5

Fig. 5 Performance of the all-optical reservoir on the channel equalization task with 50 nodes at 187mA pump current (Optical RC), with a simulation of the same system (Simulation), and with an optoelectronic reservoir of 50 nodes (OptoElectronic RC). Error bars are statistical.

Download Full Size | PDF

The experiment is repeated 10 times over 10 different data sets, each time the test sequence is 6000 time steps. The results show the mean and standard deviation of the SER over the 10 data sets.

To obtain these results the pump current was set to 187mA. The feedback gain and input gain where optimized for each value of the SNR. The optimal values are in the range −2.35dB to −3.6dB forαand 27.9µW (−15.55dBm) to 55.6µW (−12.55dBm) forβ.

4.5. The radar task

The radar task is a prediction task. Starting from the radar signal backscattered from the ocean surface (collected by the McMaster University IPIX radar [29]), the goal is to predict this signal one to ten time steps in the future. To measure the quality of the prediction, one compares it to the actual data one to ten time steps later. The experiment is conducted for two data sets, termed the low and high sea states. The low sea state corresponds to an average wave height of 0.8 meters (max 1.3 m); the high sea state corresponds to an average wave height of 1.8 meters (max 2.9 m). The inputs, and by consequence the outputs, of the reservoir are two-dimensional, corresponding to the in-phase and in-quadrature outputs (respectively,IandQ) of the radar demodulator. For every prediction delay, the first thousand samples of the sea clutter data are used to train our 50 node reservoir, and the next thousand samples are used as test set. The quality of the prediction is measured by the NMSE.

The results presented in Fig. 6 , are the best NMSEs for prediction delays ranging from one to ten. Our performances are similar to those of [19] where the authors have reported, for a reservoir of 80 nodes, a NMSE equal to 1.15∙10−3 for one-step predictions and equal to 3.01∙10−2 for five-step predictions for low sea state.

 figure: Fig. 6

Fig. 6 Predicting radar signal with a reservoir of 50 nodes at a pump current of 187mA (Optical RC) and with the optoelectronic reservoir of 50 nodes reported in [22] (OptoElectronic RC).

Download Full Size | PDF

The pump current was set to 187mA. The feedback gainαand input gainβwhere optimized for each value of the prediction delay and for each sea state. The optimal values are in the range −2.6dB to −6.6dB forαand 55.6µW (−12.55dBm) to 221.3µW (−6.55dBm) forβ.

4.6. Isolated spoken digit recognition

The isolated spoken digit recognition is a task introduced in [30] in the context of Reservoir Computing. It consists of the classification of audio sequences, each one containing one digit spoken by one female speaker. The data set, a subset of the National Institute of Standards and Technology (NIST) TI-46 corpus [31], has 500 sequences: 5 female speakers pronouncing 10 times every digit from 0 to 9.

For each sequence, the audio signal is sampled at 12.5 kHz, preprocessed using the Lyon cochlear ear model [32] and then sampled at regular intervals. The input signal u(n)consists of an 86-dimensional state vector with up to 130 samplings for a single sequence. For this task we used a reservoir of 200 nodes for this task (N=200). The input mask is therefore a 200*86 matrixmij. Its elements are either −0.1 or + 0.1, with equal probability. The masked input driving the reservoir is given by the N-dimensional vectorjmiju(n).

The output layer consists of 10 linear classifiersy^k(n),k=0,..,9, each paired to one digit. For each classifier the desired output is equal to 1 if the corresponding digit is being sent to the reservoir, and to −1 otherwise. For each sequence, the outputs of all the classifiers are averaged over the sequence length; using a winner-take-all approach, the highest averaged classifier is then set to 1, and all other classifiers are set to −1. The metric used to evaluate the recognition is the Word Error Rate (WER), i.e. the rate of digits incorrectly classified.

For this task, since the data set contains only 500 sequences, the evaluation follows a standard cross-validation process. Five subsets of 100 sequences each are randomly chosen; the reservoir is trained over 4 subsets and tested over the last one. The length of 100 sequences for the test subset ensures that each test subset contains all the digits. This procedure is repeated 5 times rotating the subsets, so that each subset is used once for the test. The results reported here are the average WERs and corresponding standard deviations over the 5 test subsets.

The pump current was set to 270mA. The feedback gainαwas set to −7.49 dB and the input gain β was set to 55.6µW (−12.55dBm).

The best WER that we can obtain with our all-optical reservoir is 3% (s.d. 1.2%). This is significantly worse than the 0.4% (s.d. 0.55%) WER reported in [22] for our optoelectronic reservoir. Discrete-time simulations of the same task on our reservoir give perfect classification (0 WER). The difference between the simulations and the experiment is attributed to the ASE noise which is not included in the simulations.

5. Conclusion

We presented what is, to the best of our knowledge, the first all-optical implementation of a Reservoir Computer. The system design is based on a simple optical delayed feedback loop combined to the nonlinearity of an optical amplifier, which from a practical viewpoint only requires standard off-the-shelf optical components. We tested our all-optical reservoir on several benchmark tasks previously used in the literature to evaluate reservoir performances. We showed that, despite its simplicity, our system exhibits state-of-the-art capacities for several tasks. This outcome is promising as it reveals the potential of optics in the field of artificial intelligence.

However, compared to the opto-electronic reservoir reported in [22], the performance on tasks, such as the memory capacities, the non linear channel equalization, isolated spoken digit recognition, is somewhat degraded. Our numerical simulations show that this is not due to the form of the nonlinearity which is different in the two experiements, but can be attributed to the noise induced in the present system which arises from the ASE of the SOA. This highlights a difficulty inherent to all analog computation that will have to be addressed in future experimental realizations of reservoir computing.

The configuration studied here is sequential. The reservoir speed is at present limited by the bandwidth of the AWG and digitizer, but it can already process inputs at a rate of 7.8 μs/symbol, regardless of the number of nodes used, even if more time is then needed for the postprocessing. It has moreover the merit of showing that standard optical components with a nonlinear response can be exploited efficiently to perform complex computational tasks. Looking further in the future, it may be possible to exploit this flexibility together with the parallelism of optics in order to perform complex tasks at ultrahigh speed. One possibility, for example, is to encode the state of different nodes at different frequencies, thus implementing reservoirs of higher dimensionality; another is to move to free space optics, and to encode the states of different nodes in the spatial modulation of a light beam. Either way, we hope that the present work will contribute to a renewed development of all-optical computing.

Acknowledgments

The authors thank Joni Dambre, Benjamin Schrauwen and Peter Bienstman for helpful discussions. They acknowledge financial support by Interuniversity Attraction Poles program of the Belgian Science Policy Office, under grant IAP P7-35 “photonics@be” and by the Fonds de la Recherche Scientifique FRS-FNRS.

References and links

1. H. Jaeger, “The ‘echo state’ approach to analysing and training recurrent neural networks,” Technical Report GMD Report 148 (German National Research Center for Information Technology, 2001).

2. H. Jaeger, “Short term memory in echo state networks,” GMD Report 152 (German National Research Institute for Computer Science, 2001).

3. W. Maass, T. Natschläger, and H. Markram, “Real-time computing without stable states: a new framework for neural computation based on perturbations,” Neural Comput. 14(11), 2531–2560 (2002). [CrossRef]   [PubMed]  

4. H. Jaeger and H. Haas, “Harnessing nonlinearity: predicting chaotic systems and saving energy in wireless communication,” Science 304(5667), 78–80 (2004). [CrossRef]   [PubMed]  

5. J. J. Steil, “Backpropagation-decorrelation: online recurrent learning with O(N) complexity,” in Proceedings of IEEE International Joint Conference on Neural Networks (IEEE, 2004), pp. 843–848.

6. R. Legenstein and W. Maass, “What makes a dynamical system computationally powerful?” in New Directions in Statistical Signal Processing: From Systems to Brain (MIT Press, 2005), pp. 127–154.

7. D. Verstraeten, B. Schrauwen, M. D’Haene, and D. Stroobandt, “An experimental unification of reservoir computing methods,” Neural Netw. 20(3), 391–403 (2007). [CrossRef]   [PubMed]  

8. W. Maass, P. Joshi, and E. D. Sontag, “Computational aspects of feedback in neural circuits,” PLOS Comput. Biol. 3(1), e165 (2007). [CrossRef]   [PubMed]  

9. H. Jaeger, M. Lukosevicius, D. Popovici, and U. Siewert, “Optimization and applications of echo state networks with leaky-integrator neurons,” Neural Netw. 20(3), 335–352 (2007). [CrossRef]   [PubMed]  

10. D. V. Buonomano and W. Maass, “State-dependent computations: spatiotemporal processing in cortical networks,” Nat. Rev. Neurosci. 10(2), 113–125 (2009). [CrossRef]   [PubMed]  

11. M. Lukoševičius and H. Jaeger, “Reservoir computing approaches to recurrent neural network training,” Comput. Sci. Rev. 3(3), 127–149 (2009). [CrossRef]  

12. B. Hammer, B. Schrauwen, and J. J. Steil, “Recent advances in efficient learning of recurrent networks”, in Proceedings of the European Symposium on Artificial Neural Networks (2009), pp. 213–216.

13. F. Triefenbach, A. Jalalvand, B. Schrauwen, and J. Martens, “Phoneme recognition with large hierarchical reservoirs,” Proceedings of Adv.Neural Inf. Processing Syst. 23, 1–9 (2010).

14. F. Wyffels and B. Schrauwen, “A comparative study of reservoir computing strategies for monthly time series prediction,” Neurocomputing 73(10–12), 1958–1964 (2010). [CrossRef]  

15. M. Lukoševičius, H. Jaeger, and B. Schrauwen, “Reservoir computing trends,” KI - Künstliche Intelligenz, (2012), pp. 1–7.

16. C. Fernando and S. Sojakka, “Pattern recognition in a bucket,” in Proceedings of the 7th European Conference on Artificial Life2801, W. Banzhaf, J. Ziegler, T. Christaller, P. Dittrich, and J. Kim, eds. (2003), Vol. 2801, pp. 588–597.

17. F. Schürmann, K. Meier, and J. Schemmel, “Edge of chaos computation in mixed-mode vlsi - a hard liquid,” in Proceedings of Advances in Neural Information Processing Systems, L. K. Saul, Y. Weiss, and Léon, eds. (MIT Press, 2005).

18. Y. Paquot, B. Schrauwen, J. Dambre, M. Haelterman, and S. Massar, “Reservoir computing: a photonic neural network for information processing,” Proc. SPIE 7728, 77280B, 77280B-12 (2010). [CrossRef]  

19. A. Rodan and P. Tino, “Simple deterministically constructed recurrent neural networks,” Intelligent Data Engineering and Automated Learning (IDEAL, 2010), pp. 267–274.

20. A. Rodan and P. Tino, “Minimum complexity echo state network,” IEEE Trans. Neural Netw. 22(1), 131–144 (2011). [CrossRef]   [PubMed]  

21. L. Appeltant, M. C. Soriano, G. Van der Sande, J. Danckaert, S. Massar, J. Dambre, B. Schrauwen, C. R. Mirasso, and I. Fischer, “Information processing using a single dynamical node as complex system”, Nat. Commun. 2, 468 (2011). http://www.nature.com/ncomms/journal/v2/n9/full/ncomms1476.html

22. Y. Paquot, F. Duport, A. Smerieri, J. Dambre, B. Schrauwen, M. Haelterman, and S. Massar, “Optoelectronic reservoir computing,” Sci. Rep. 2, 287 (2012). http://www.nature.com/srep/2012/120227/srep00287/full/srep00287.html

23. L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, “Photonic information processing beyond turing: an optoelectronic implementation of reservoir computing,” Opt. Express 20(3), 3241–3249 (2012), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-20-3-3241. [CrossRef]   [PubMed]  

24. K. Vandoorne, W. Dierckx, B. Schrauwen, D. Verstraeten, R. Baets, P. Bienstman, and J. Van Campenhout, “Toward optical signal processing using photonic reservoir computing,” Opt. Express 16(15), 11182–11192 (2008). [CrossRef]   [PubMed]  

25. K. Vandoorne, J. Dambre, D. Verstraeten, B. Schrauwen, and P. Bienstman, “Parallel reservoir computing using optical amplifiers,” IEEE Trans. Neural Netw. 22(9), 1469–1481 (2011), http://www.opticsinfobase.org/oe/abstract.cfm?URI=oe-16-15-11182. [CrossRef]   [PubMed]  

26. J. Dambre, D. Verstraeten, B. Schrauwen, and S. Massar, “Information processing capacity of dynamical systems,” Sci. Rep. 2, Article number: 514, (2012).

27. H. Jaeger, “Adaptive nonlinear system identification with echo state networks,” Adv. Neural Inf. Process. Syst. 8, 593–600 (2002).

28. V. J. Mathews, “Adaptive algorithms for bilinear filtering,” Proc. SPIE 2296(1), 317–327 (1994). [CrossRef]  

29. http://soma.ece.mcmaster.ca/ipix/dartmouth/datasets.html

30. D. Verstraeten, B. Schrauwen, and D. Stroobandt, “Isolated word recognition using a liquid state machine,” in Proceedings of the 13th European Symposium on Artifcial Neural Networks (ESANN) (2005), pp. 435–440.

31. Texas Instruments-Developed 46-Word Speaker-Dependent Isolated Word Corpus (TI46), September 1991, NIST Speech Disc 7–1.1 (1 disc) (1991).

32. R. Lyon, “A computational model of filtering, detection, and compression in the cochlea,” in Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing (1982), pp. 1282–1285.

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (6)

Fig. 1
Fig. 1 Schematic of the experimental set-up of the all-optical reservoir. The reservoir consists of off-the-shelf fiber optics components operating in the telecommunication C-band. It is based on nonlinear all-optical loop operating in incoherent regime. Optical components are in red whereas electronic components are in green. The all-optical loop is driven by the input optical signal. A Superluminescent Light Emitting Diode (SLED) generates broadband white light. An electronic signal corresponding to the time dependent input multiplied by the input mask is generated by the Arbitrary Waveform Generator (AWG). This electronic signal drives an integrated Lithium Niobate Mach-Zehnder intensity modulator (MZ) thereby producing a time dependent input optical signal whose intensity is adjusted with a variable attenuator. The input optical signal is injected into the cavity by means of a 50/50 fiber coupler. The cavity itself consists of an isolator, a Semiconductor Optical Amplifier (SOA), a variable optical attenuator, and a fiber spool that acts as delay line. The cavity operates below the lasing threshold. A 90/10 fiber coupler is used to send 10% of the cavity intensity to a readout photodiode and then to a digitizer. Coarse Wavelength Division Multiplexers (CWDM, boxes with waves) are used to select a wavelength band near the maximum of emission of the SLED and the maximum gain of the SOA.
Fig. 2
Fig. 2 Incoherent light amplification characteristics of the SOA. Left hand: output power versus input power. The SOA gain is almost linear at low input powers and bends over at input powers above the saturation level. The overall output power increases with increasing injection current. Due to spontaneous emission, the output power is non zero even at zero input power. From bottom to top: injection current is a) 160 mA, b) 200 mA, c) 240 mA, d) 300 mA, e) 340 mA, f) 400 mA, and g) 520 mA. (The hysteresis effect is due to the high-pass filter of the modulator’s driver used for the record). Right hand: normalized input-output relation. The data is the same as in the left panel, but the input and output powers are normalized to lie in the interval [0,1]. It is this function that characterizes the nonlinearity of the reservoir, which can be tuned by changing the pump current.
Fig. 3
Fig. 3 Example response of the all-optical reservoir. The amount of light injected into the reservoir is kept constant (normalized value 0.5). At time t = 0, the input signal (blue curve) is momentarily set to 1, then to zero, before returning to its initial value of 0.5. The response of the reservoir is recorded in green. The nonlinearity of the reservoir due to saturation of the SOA is clear from the initial response (positive response is less than negative response). The fading memory of the all-optical system is also clear. Experimental parameters are: injection current 187mA, feedback gain −1.6 dB, and average input power rescaled to its value at the entrance of the SOA 350.8µW (−4.55dBm).
Fig. 4
Fig. 4 Linear, quadratic, and cross memory capacities for the all-optical reservoir, as a function of the feedback gain feedback gain α and the input gain β for an injection current of 187mA. Note that the color scale is different for each capacity. The + signs locate the optimum working points for the channel equalization task at the different signal-to-noise ratios. The x signs locate the optimum working points for the radar task at low sea state over the 10 delays of prediction (some delays have identical optimal working points). This shows that the capacities are correlated to, but do not explain completely, the optimal working points.
Fig. 5
Fig. 5 Performance of the all-optical reservoir on the channel equalization task with 50 nodes at 187mA pump current (Optical RC), with a simulation of the same system (Simulation), and with an optoelectronic reservoir of 50 nodes (OptoElectronic RC). Error bars are statistical.
Fig. 6
Fig. 6 Predicting radar signal with a reservoir of 50 nodes at a pump current of 187mA (Optical RC) and with the optoelectronic reservoir of 50 nodes reported in [22] (OptoElectronic RC).

Tables (1)

Tables Icon

Table 1 Linear, quadratic, cross and total memory capacities of the system with 50 internal variables and a pump current of 187 mA compared to memory capacities of an optoelectronic reservoir with the same number of internal variables. Each quantity is reported for the optimal choice of feedback gain α and the input gain β . The last line gives the sum of the first three quantities, once again maximized over α and β .

Equations (5)

Equations on this page are rendered with MathJax. Learn more.

x i ( n )= F NL ( j=1 N α A ij x i ( n1 ) +β m i u( n ) ), i=1,2,...,N
y ^ ( n )= i W i x i ( n )
NMSE= ( y y ^ ) 2 n ( y y ) 2 n
x i ( n )={ F NL ( α x i1 ( n1 )+β m i u( n ) ) + ν i (n) 2iN F NL ( α x N+i1 ( n2 )+β m i u( n ) ) + ν i (n) i=1
q( n )=0.08d( n+2 )0.12d( n+1 )+d( n )+0.18d( n1 )0.1d( n2 )+0.091d( n3 ) 0.05d( n4 )+0.04d( n5 )+0.03d( n6 )+0.01d( n7 ) u( n )=q( n )+0.036 q 2 ( n )0.011 q 3 ( n )+noise
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.