Optica Publishing Group

Laser network decision making by lag synchronization of chaos in a ring configuration

Open Access

Abstract

Photonic technologies are promising for solving complex tasks in artificial intelligence. In this paper, we numerically investigate decision making for solving the multi-armed bandit problem using lag synchronization of chaos in a ring laser-network configuration. We construct a laser network consisting of unidirectionally coupled semiconductor lasers, whereby spontaneous exchange of the leader-laggard relationship in the lag synchronization of chaos is observed. We succeed in solving the multi-armed bandit problems with three slot machines using lag synchronization of chaos by controlling the coupling strengths among the three lasers. Furthermore, we investigate the scalability of the proposed decision-making principle by increasing the number of slot machines and lasers. This study suggests a new direction in laser network-based decision making for future photonic intelligent functions.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Photonic technologies are promising key components of artificial intelligence. The concept of photonic accelerators has been introduced [1], for which a photonic processor is located at the front ends of digital computers and performs certain functions with faster processing speed, less power consumption, and novel functionalities. Examples of photonic accelerators are photonic artificial neural networks [2], photonic reservoir computing [3,4], optical pass gate logic [5], photonic decision making [6], compressed sampling [7], and coherent Ising machine [8]. Photonic reservoir computing has been reported as a new type of simplified recurrent neural network using supervised learning. Photonic decision making is a technique used to solve the multi-armed bandit (MAB) problem [9], which is a fundamental problem of reinforcement learning [10].

The MAB problem in reinforcement learning is an important problem in artificial intelligence applications, such as computer gaming [11,12] and learning for robot arms [13]. In this problem, a player repeatedly selects one of several slot machines and searches for the best slot machine, i.e., the one that maximizes the total reward. The problem involves two kinds of action: exploration, where the player tries the slot machines to examine which one yields higher rewards, and exploitation, where the player selects the slot machine with the highest hit probability as estimated during exploration. Too little exploration risks settling on an inferior slot machine, whereas too much exploration wastes plays on slot machines already known to be poor. This tradeoff is called the exploration-exploitation dilemma [10].

Recently, photonic implementations for solving the MAB problem have been reported [14–19]. One of the photonic implementations has been reported using fast chaotic laser outputs over a GHz range [14–16]. The two-armed bandit problem, which is a bandit problem with two slot machines, is considered in these studies. A player selects one of the slot machines by comparing the chaotic laser outputs with a threshold value controlled by a tug-of-war algorithm [20–22]. In addition, scalable decision making has been proposed by introducing time-division multiplexing of chaotic temporal waveforms [16]. A hierarchical structure is used for the assignment of several slot machines, and the player selects one of the two slot machines at each stage for comparison. More recently, decision making using dual-channel chaotic outputs has also been reported [17]. In these previous reports, chaotic temporal waveforms of laser outputs have been utilized as correlated random numbers, and software-based implementation is mainly performed on an external computer.

Physical dynamics such as chaos synchronization could be directly used for decision making. Indeed, in our previous work, we have solved the two-armed bandit problem numerically and experimentally using lag synchronization of chaos in two mutually coupled semiconductor lasers [18]. Low-frequency fluctuations (LFFs) have been observed in mutually coupled semiconductor lasers [23], which consist of abrupt power dropouts and the subsequent gradual power recovery. More interestingly, for decision making, one laser synchronizes with the other laser with a coupling delay time, known as the leader-laggard relationship in lag synchronization of chaos [23–25]. The laser oscillating in advance is called the leader, and the other laser is called the laggard. Furthermore, spontaneous exchange of the leader-laggard relationship has been reported [25]; that is, the leader and laggard are autonomously and irregularly exchanged. The trajectories of the two laser dynamics have been analyzed on the attractors in the phase space, and the leader is exchanged during dropouts because of partial injection locking. One laser frequency becomes higher than the frequency of the other laser, and this relationship is switched in time during dropouts. This spontaneous exchange can be controlled via the coupling strengths and initial optical frequencies. Therefore, decision making has been successfully achieved by selecting one of the slot machines corresponding to the leader laser while controlling the detuning of the coupling strengths based on the result of slot machine selection [18].

However, the previous study [18] considered only the MAB problem with two slot machines, using two coupled semiconductor lasers. Principles and methods for solving bandit problems with more than two slot machines using the physical dynamics of chaotically synchronized lasers have not yet been established. In fact, one of the fundamental advantages of using the interactions among lasers is their networking capability. In the literature, synchronization states in the laser network have been examined using an adjacency matrix [26,27]. Furthermore, laser networks that exhibit zero-lag synchronization have also been reported [26], in which all lasers are synchronized without a time delay. In addition, cluster synchronization has been reported [26–30], where some of the coupled lasers are synchronized without a time delay. From these considerations, investigating the ability of a laser network with multiple coupled lasers is one of the most important elements for laser-based decision making.

In this study, we introduce a scheme for solving the MAB problem with more than two slot machines using unidirectionally coupled semiconductor lasers in a ring network configuration. Lag synchronization of chaos in unidirectionally coupled lasers of a ring configuration has been reported [31]. However, our focus is to investigate a decision-making function where autonomous adaptation should be accomplished in initially unknown external environments, in addition to the synchronization phenomena. Such aspects have not been explored and reported in the existing literature related to laser networks.

In Sec. 2, we propose a ring laser-network consisting of three unidirectionally coupled semiconductor lasers where we observe the leader-laggard relationship. In Sec. 3, we propose and investigate a method for solving the three-armed bandit problem (i.e., the bandit problem with three slot machines) using lag synchronization of chaos in a numerical simulation. In Sec. 4, we investigate the scalability of the proposed decision-making system by increasing the number of slot machines and lasers considered in the system. Finally, the conclusion of this study is presented in Sec. 5.

2. Numerical model for a unidirectionally coupled semiconductor laser network

First, we consider the MAB problem with three slot machines possessing unknown hit probabilities. We introduce a laser network using three semiconductor lasers [31], which are denoted as Laser 1, Laser 2, and Laser 3. Figure 1 shows the ring laser-network and our numerical model of it. The three semiconductor lasers are unidirectionally coupled with a coupling delay time $\tau $, which is set to the same value for every pair of adjacent lasers in the ring. The coupling strength from Laser $n - 1$ (the adjacent laser on the input side) to Laser $n$ can be set independently for each laser and is denoted by ${\kappa _n}$ ($n = 1,\; 2,$ and $3$; Laser 0 corresponds to Laser 3). In Fig. 1(a), ${\kappa _1}$, ${\kappa _2}$, and ${\kappa _3}$ are the coupling strengths from Laser 3 to Laser 1, Laser 1 to Laser 2, and Laser 2 to Laser 3, respectively. In this configuration, we show lag synchronization of chaos among the three lasers, where the temporal waveforms are synchronized with the coupling delay time $\tau $. A theoretical analysis of this coupled network configuration using an adjacency matrix is described in the Appendix.

Fig. 1. (a) Schematic diagram of the ring laser-network and (b) the corresponding numerical model of the three unidirectionally coupled semiconductor lasers. Att, variable optical attenuator; BS, beam splitter; ISO, optical isolator; M, mirror; ${\kappa _n}$, coupling strength from laser $n - 1$ to laser n (Laser 0 corresponds to Laser 3); $\tau $, coupling delay time.

The numerical model of the semiconductor laser network is described by the Lang-Kobayashi equations [31–33] as follows:

$$\frac{{d{E_n}(t )}}{{dt}} = \frac{{1 + i\alpha }}{2}\left[ {\frac{{{G_N}({{N_n}(t )- {N_0}} )}}{{1 + \varepsilon {{|{{E_n}(t )} |}^2}}} - \frac{1}{{{\tau_p}}}} \right]{E_n}(t )+ {\kappa _n}{E_{n - 1}}({t - \tau } )\exp [{i{\theta_n}(t )} ], \tag{1}$$
$$\frac{{d{N_n}(t )}}{{dt}} = J - \frac{{{N_n}(t )}}{{{\tau _s}}} - \frac{{{G_N}({{N_n}(t )- {N_0}} )}}{{1 + \varepsilon {{|{{E_n}(t )} |}^2}}}{|{{E_n}(t )} |^2}, \tag{2}$$
$${\theta _n}(t )= ({{\omega_{n - 1}} - {\omega_n}} )t - {\omega _{n - 1}}\tau , \tag{3}$$
where ${E_n}(t )$ and ${N_n}(t )$ are the complex electric-field amplitude and carrier density of the semiconductor laser, respectively. The subscripts n and $n - 1$ represent Laser $n\; $and Laser $n - 1$, respectively, where $n = 1,\; 2,$ and $3$ (it is assumed that Laser 0 corresponds to Laser 3). In this study, we set $\tau $ = 5.0 ns and ${\kappa _n}$ = 40 $\textrm{n}{\textrm{s}^{ - 1}}$, and assume a symmetric $\tau $. We can compensate for the asymmetry of the coupling delay times among the lasers by introducing an average coupling delay time and a time shift for each temporal waveform (see the Appendix for details). The other parameter values used in the simulation are summarized in Table 1. We introduce a small detuning of the initial optical frequencies (10 MHz) between Lasers 1 and 2 and between Lasers 3 and 1 to incorporate the injection-locking effect between coupled lasers in the numerical model.
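As a rough illustration of how Eqs. (1)-(3) can be stepped in time, the following sketch integrates the three-laser ring with a forward-Euler scheme. It is not the authors' simulation code. The laser parameter values are assumed, typical-looking numbers (Table 1 is not reproduced here), the integration scheme, step size, seed fields, and the ordering of the 10 MHz detunings are also assumptions, and only $\tau $ = 5 ns, ${\kappa _n}$ = 40 $\textrm{n}{\textrm{s}^{ - 1}}$, and the detuning magnitude are taken from the text.

```python
import cmath
import math

# ASSUMED, typical-looking values; the paper's Table 1 is not reproduced here.
ALPHA = 3.0        # linewidth enhancement factor
G_N = 8.4e-13      # gain coefficient [m^3 s^-1]
N_0 = 1.4e24       # carrier density at transparency [m^-3]
EPS = 2.0e-23      # gain saturation coefficient [m^3]
TAU_P = 1.927e-12  # photon lifetime [s]
TAU_S = 2.04e-9    # carrier lifetime [s]
N_TH = N_0 + 1.0 / (G_N * TAU_P)  # threshold carrier density [m^-3]
J = 1.1 * N_TH / TAU_S            # injection current, 1.1x threshold (assumed)
F_CARRIER = 1.935e14              # optical frequency [Hz], about 1550 nm (assumed)
TAU = 5.0e-9                      # coupling delay time (value from the text)
# 10 MHz detuning from the text; the ordering Laser 3 < Laser 1 < Laser 2 is assumed.
OMEGA = [2.0 * math.pi * (F_CARRIER + df) for df in (0.0, 10.0e6, -10.0e6)]


def simulate_ring(kappa_per_ns=(40.0, 40.0, 40.0), t_end_ns=20.0, dt_ps=0.5):
    """Forward-Euler integration of Eqs. (1)-(3) for three lasers in a ring.

    Returns a list of three intensity waveforms I_n(t) = |E_n(t)|^2.
    """
    dt = dt_ps * 1e-12
    kappa = [k * 1e9 for k in kappa_per_ns]  # ns^-1 -> s^-1
    delay = int(round(TAU / dt))             # coupling delay in samples
    steps = int(round(t_end_ns * 1e-9 / dt))
    # Small, slightly different seed fields fill the initial delay history.
    field = [[complex(1.0e8 * (i + 1), 0.0)] * (delay + 1) for i in range(3)]
    carrier = [N_TH] * 3
    for step in range(steps):
        t = step * dt
        new_field, new_carrier = [], []
        for i in range(3):
            prev = (i - 1) % 3                    # Laser n-1 (Laser 0 is Laser 3)
            e_now = field[i][-1]
            e_delayed = field[prev][-1 - delay]   # E_{n-1}(t - tau)
            intensity = abs(e_now) ** 2
            gain = G_N * (carrier[i] - N_0) / (1.0 + EPS * intensity)
            theta = (OMEGA[prev] - OMEGA[i]) * t - OMEGA[prev] * TAU  # Eq. (3)
            de = (0.5 * (1.0 + 1j * ALPHA) * (gain - 1.0 / TAU_P) * e_now
                  + kappa[i] * e_delayed * cmath.exp(1j * theta))     # Eq. (1)
            dn = J - carrier[i] / TAU_S - gain * intensity            # Eq. (2)
            new_field.append(e_now + dt * de)
            new_carrier.append(carrier[i] + dt * dn)
        for i in range(3):
            field[i].append(new_field[i])
        carrier = new_carrier
    # Drop the seeded history and return intensities only.
    return [[abs(e) ** 2 for e in field[i][delay + 1:]] for i in range(3)]
```

Forward Euler is used here only to keep the sketch short; a higher-order scheme with a smaller step would be the safer choice for quantitative work, and no claim is made that this sketch reproduces the waveforms in the figures.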

Table 1. Parameter values used in the numerical simulation of the laser network.

For simplicity, we do not include spontaneous emission noise in our numerical simulations. Lag synchronization of chaos can be observed even in the presence of noise in experiments [2325]. We speculate that a small amount of noise may slightly change the statistical properties of the spontaneous exchange of the leader-laggard relationship in the laser network; however, the overall characteristics may be preserved.

Figure 2(a) shows the temporal waveforms of the intensities ${I_n}(t )$ of Lasers 1, 2, and 3. The laser intensity is given by ${I_n}(t )= {|{{E_n}(t )} |^2}$. The temporal waveforms of all lasers appear similar to each other. Figure 2(b) shows the temporal waveforms after a low-pass filter with a cut-off frequency of 60 MHz removes the high-frequency oscillations. In Fig. 2(b), all laser outputs show sudden power dropouts and the subsequent gradual recovery, which are characteristic of the LFF dynamics, and all laser outputs appear to be synchronized with each other. Figure 2(c) shows an enlarged view of the low-pass-filtered temporal waveforms of the three lasers. A time lag equal to the coupling delay time $\tau $ of 5 ns can be observed between the low-pass-filtered temporal waveforms. For instance, in Fig. 2(c), Laser 1 shows a dropout first, and Laser 2 synchronizes with Laser 1 with a lag of $\tau $. Similarly, Laser 3 synchronizes with Laser 2 with a lag of $\tau $.

Fig. 2. (a) Temporal waveforms for the unidirectionally coupled ring laser-network. (b) Temporal waveforms filtered from original temporal waveforms in (a) using a low-pass filter with a cut-off frequency of 60 MHz. (c) Enlarged view of (b).

From the low-pass-filtered temporal waveforms, it is possible to determine the leader laser at the occurrence of the power dropout. However, it is difficult to determine the leader in the region of gradual power recovery. Therefore, we introduce a short-term cross-correlation value. In the three-laser network, we calculate three short-term cross-correlation values from the original temporal waveforms of the laser outputs. The short-term cross-correlation value for Laser n is given by:

$$C_n(t) = \frac{\left\langle \left[ I_n(t) - \bar{I}_n \right]\left[ I_{n-1}(t - \tau) - \bar{I}_{n-1} \right] \right\rangle_\tau}{\sigma_n \sigma_{n-1}} \tag{4}$$
where ${\bar{I}_n}$ and ${\sigma _n}$ are the mean and standard deviation of the temporal waveforms of the laser intensity, respectively, and $\langle \cdot \rangle_\tau$ represents the short-term average over the period $\tau $. ${C_n}(t )$ is the cross-correlation value assuming that Laser n synchronizes with Laser $n - 1$ with the coupling delay time $\tau $, where Laser 0 corresponds to Laser 3 ($n = 1,\; 2,$ and $3$).
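The short-term cross-correlation can be sketched on sampled waveforms as follows. This is a minimal illustration, not the authors' code: the function name and its arguments are hypothetical, and the mean and standard deviation are assumed to be taken over the same short window as the average, which the text does not state explicitly.

```python
import math


def short_term_cross_correlation(i_n, i_prev, end, delay, window):
    """Correlate I_n(t) with I_{n-1}(t - tau) over the last `window` samples.

    i_n, i_prev: sampled intensities of Laser n and Laser n-1.
    end: index of the most recent sample included in the window.
    delay: coupling delay tau expressed in samples.
    ASSUMPTION: means and standard deviations are taken over the window itself.
    """
    start = end - window + 1
    if start - delay < 0:
        raise ValueError("not enough samples before the window for the delay")
    a = i_n[start:end + 1]
    b = i_prev[start - delay:end + 1 - delay]
    mean_a = sum(a) / window
    mean_b = sum(b) / window
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / window
    sd_a = math.sqrt(sum((x - mean_a) ** 2 for x in a) / window)
    sd_b = math.sqrt(sum((y - mean_b) ** 2 for y in b) / window)
    if sd_a == 0.0 or sd_b == 0.0:
        return 0.0  # flat window: correlation undefined, reported as 0 here
    return cov / (sd_a * sd_b)
```

If Laser n reproduces the waveform of Laser $n - 1$ exactly with the lag $\tau $, the function returns a value close to 1; a weakly driven laser gives a smaller value.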

The leader laser can be determined by comparing the three short-term cross-correlation values; the laser corresponding to the minimum short-term cross-correlation value is the leader. This method of determining the leader laser is different from our previous method in [18]. For example, if ${C_1}$ is the minimum value, the temporal waveforms of Lasers 3 and 1 are less correlated than those of Lasers 1 and 2 and of Lasers 2 and 3. Laser 1 is then only weakly driven by Laser 3, as if the coupling strength ${\kappa _1}$ were reduced, and the ring effectively acts as a unidirectional chain from Laser 1 to Laser 3 via Laser 2. Therefore, we consider that Laser 1 is the leader if ${C_1}$ is the minimum value.

Figure 3(a) shows the three short-term cross-correlation values from the original temporal waveforms of the laser intensity shown in Fig. 2(a). The short-term cross-correlation values show sudden dropout and the subsequent gradual recovery, similar to the low-pass-filtered intensities in Fig. 2(b). Figure 3(b) shows an enlarged view of the temporal waveforms of the low-pass-filtered intensities (upper, similar to Fig. 2(c)) and the short-term cross-correlation values (lower). For the low-pass-filtered intensities, Laser 1 shows the power dropout first, indicating that Laser 1 is the leader at 100 ns. For the short-term cross-correlation values, the minimum cross-correlation value at 100 ns is ${C_1}$. Therefore, the leader laser can be determined from the minimum cross-correlation value. Moreover, the leader switches from Laser 1 to Laser 2, from Laser 2 to Laser 3, and from Laser 3 to Laser 1, after the power dropout.

Fig. 3. (a) Short-term cross-correlation values calculated from Fig. 2(a). (b) Comparison between the low-pass-filtered temporal waveforms (Fig. 2(c)) and the short-term cross-correlation values.

The leader changes with time among the three lasers, and we can determine the time evolution of the leader laser by tracking the three short-term cross-correlation values. Here, we introduce the leader probability ${L_n}$, defined as the ratio of the duration for which Laser n is the leader to the total duration:

$${L_n} = \frac{{{T_n}}}{{{T_{\textrm{total}}}}} \tag{5}$$
where ${T_n}$ is the duration for which Laser n is the leader and ${T_{\textrm{total}}}$ is the total duration (the sum of ${T_n}$). In this study, ${T_{\textrm{total}}}$ is set as 10000 ns. The leader duration is determined by comparing the short-term cross-correlation values among the three lasers. First, we determine the leader laser from the minimum cross-correlation value at time ${t_1}$. Then, we shift the time to ${t_2}$ (${t_2} = {t_1} + \mathrm{\Delta }t,\; \mathrm{\Delta }t = 20\; \textrm{ps}$) and calculate the cross-correlation values to determine the new leader laser. We repeat this procedure and accumulate the times for which each laser is the leader to measure the leader duration. The window size of the short-term cross-correlation is set to 5.0 ns (i.e., the coupling delay time).
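The accumulation procedure described above can be sketched as follows. The function name is hypothetical, the input is assumed to be a sequence of cross-correlation tuples already evaluated at uniform time steps $\mathrm{\Delta }t$, and ties are assumed to go to the lowest laser index, which the text does not specify.

```python
def leader_probabilities(corr_series):
    """Estimate L_n = T_n / T_total from cross-correlation values.

    corr_series: list of tuples (C_1, ..., C_N), one tuple per time step.
    Because the time steps are uniform, the ratio of durations equals the
    ratio of step counts.
    """
    n_lasers = len(corr_series[0])
    counts = [0] * n_lasers
    for values in corr_series:
        # The leader is the laser with the minimum cross-correlation value
        # (ASSUMPTION: ties go to the lowest index).
        leader = min(range(n_lasers), key=lambda k: values[k])
        counts[leader] += 1
    total = len(corr_series)
    return [count / total for count in counts]


# Four time steps: Laser 1 leads twice, Lasers 2 and 3 lead once each.
series = [(0.1, 0.9, 0.8), (0.7, 0.2, 0.9), (0.1, 0.5, 0.6), (0.9, 0.8, 0.3)]
print(leader_probabilities(series))  # [0.5, 0.25, 0.25]
```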

We investigate how the leader probabilities ${L_n}$ depend on the three coupling strengths ${\kappa _1},\; {\kappa _2},$ and ${\kappa _3}$. Figure 4 shows ${L_n}$ for Lasers 1, 2, and 3 as one of the coupling strengths ${\kappa _1},\; {\kappa _2},$ or ${\kappa _3}$ is changed. The other coupling strengths in each figure are fixed at 40 $\textrm{n}{\textrm{s}^{ - 1}}$, which is indicated by the vertical dotted lines in Fig. 4. When all the coupling strengths are set to the same value of 40 $\textrm{n}{\textrm{s}^{ - 1}}$, the leader probabilities are almost the same. Strictly speaking, Laser 2 has the highest leader probability owing to its highest initial optical frequency. When one coupling strength is reduced, the laser whose index matches that of the minimum coupling strength becomes the leader. For example, Laser 1 becomes the leader as ${\kappa _1}$ is reduced, as shown in Fig. 4(a), because the weak coupling of ${\kappa _1}$ (between Laser 3 and Laser 1) is equivalent to the unidirectional coupling configuration from Laser 1 to Laser 3 via Laser 2. The leader probability is also affected by the coupling direction. For example, when ${\kappa _1}$ is reduced in Fig. 4(a), Laser 1 shows the highest value of ${L_n}$ (i.e., it is most probably the leader) and Laser 2 shows the second-highest value of ${L_n}$. In contrast, when ${\kappa _1}$ increases, Laser 3 becomes the leader. However, the leader probability for Laser 3 does not approach 1 for a large ${\kappa _1}$, whereas the leader probability of Laser 1 approaches 1 for a small ${\kappa _1}$. Therefore, the leader probability of Laser n can be controlled, and driven toward 1, by reducing the coupling strength ${\kappa _n}$.

Fig. 4. Leader probabilities as one of the coupling strengths ${\kappa _n}$ is changed. The other coupling strengths are fixed at 40 $\textrm{n}{\textrm{s}^{ - 1}}$, represented by the vertical dotted lines. (a) ${\kappa _1}$ is changed, (b) ${\kappa _2}$ is changed, and (c) ${\kappa _3}$ is changed.

3. Decision making using a ring network with three semiconductor lasers

In this section, we numerically investigate the process of decision making for solving the MAB problem using a ring network with unidirectionally coupled semiconductor lasers. For this purpose, we consider the three-armed bandit problem, where three slot machines ${S_n}$ return “hit” with the probability ${P_n}$, which we refer to as hit probability hereafter. Figure 5(a) schematically illustrates the decision-making system for solving the three-armed bandit problem using three unidirectionally coupled semiconductor lasers. Each slot machine corresponds to each semiconductor laser, i.e., the slot machine ${S_n}$ corresponds to laser n (n = 1, 2, and 3).

Fig. 5. Schematic diagram of solving the three-armed bandit problem using a ring laser network with three unidirectionally coupled semiconductor lasers. ${S_n}$ is slot machine n, ${\kappa _n}$ is the coupling strength from Laser $n - 1$ to Laser n (Laser 0 corresponds to Laser 3), and $\tau $ is the coupling delay time. Circle n corresponds to Laser n.

The decision-making process using a ring laser network is described as follows. First, we observe the leader-laggard relationship among the three lasers by measuring their temporal waveforms and correlation values. We select the slot machine corresponding to the leader laser. For example, if Laser 1 is the leader, we select slot machine S1. Then, we control the coupling strengths based on the results of the slot machine selection.

Herein, we introduce a new algorithm to solve the MAB problem with more than two slot machines [21]. This algorithm is a generalization of the tug-of-war method for solving the MAB problem with a large number of slot machines. In fact, this algorithm can be considered equivalent to the algorithm used for the two-armed bandit problem in [18] (see the Appendix for details).

We define the evaluation values ${X_n}(t )$ and ${Q_n}(t )$ for slot machine ${S_n}$ using the tug-of-war theory as follows [21]:

$${X_n}(t )= {Q_n}(t )- \frac{1}{{N - 1}}\mathop \sum \limits_{j \ne n} {Q_j}(t ) \tag{6}$$
$${Q_n}(t )= 2{H_n}(t )- ({{{\bar{P}}_{\textrm{1st}}} + {{\bar{P}}_{\textrm{2nd}}}} ){U_n}(t ) \tag{7}$$
$${\bar{P}_n}(t )= \frac{{{H_n}(t )}}{{{U_n}(t )}} \tag{8}$$
where ${U_n}(t )$ and ${H_n}(t )$ are the total number of plays and hits for slot machine ${S_n}$, respectively, and N is the number of slot machines. The estimated hit probability ${\bar{P}_n}(t )$ is defined by Eq. (8), and the highest and second-highest estimated hit probabilities (${\bar{P}_{\textrm{1st}}}\; $and ${\bar{P}_{\textrm{2nd}}}$) are used to determine ${Q_n}(t )\; $in Eq. (7).
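The evaluation values can be computed directly from the play and hit counts. The sketch below is illustrative only: the function name is hypothetical, and the estimated hit probability of a slot machine that has not yet been played is assumed to be 0, a case the text does not cover.

```python
def tug_of_war_values(hits, plays):
    """Return (X, Q, P_est) for all slot machines from Eqs. (6)-(8).

    hits[n], plays[n]: total hits H_n and total plays U_n of slot machine n.
    ASSUMPTION: an unplayed machine (U_n = 0) gets an estimate of 0.
    """
    n = len(plays)
    p_est = [h / u if u > 0 else 0.0 for h, u in zip(hits, plays)]  # Eq. (8)
    p_first, p_second = sorted(p_est, reverse=True)[:2]
    q = [2 * h - (p_first + p_second) * u for h, u in zip(hits, plays)]  # Eq. (7)
    q_total = sum(q)
    # Eq. (6): subtract the mean Q of the other N - 1 machines.
    x = [q_n - (q_total - q_n) / (n - 1) for q_n in q]
    return x, q, p_est
```

As a worked example, with ten plays per machine and 4, 6, and 8 hits, the estimates are 0.4, 0.6, and 0.8, so ${\bar{P}_{\textrm{1st}}} + {\bar{P}_{\textrm{2nd}}} = 1.4$, $Q = ( - 6,\; - 2,\;2)$, and $X = ( - 6,\;0,\;6)$. The values ${X_n}$ always sum to zero, which is the "tug-of-war" constraint: a gain for one machine is balanced by losses for the others.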

The algorithm for changing the coupling strengths in the laser network is given by ${X_n}(t )$ as follows:

$${\kappa _n} = \left\{ {\begin{array}{ll} {{\kappa_{\textrm{min}}}{\;\; }}&{({{\kappa_{\textrm{ini}}} - k{X_n}(t )< {\kappa_{\textrm{min}}}} )}\\ {{\kappa_{\textrm{ini}}} - k{X_n}(t )}&{({{\kappa_{\textrm{min}}} \le {\kappa_{\textrm{ini}}} - k{X_n}(t )\le {\kappa_{\textrm{max}}}} )}\\ {{\kappa_{\textrm{max}}}}&{({{\kappa_{\textrm{max}}} < {\kappa_{\textrm{ini}}} - k{X_n}(t )} )} \end{array}} \right. \tag{9}$$
where ${\kappa _{\textrm{min}}}$, ${\kappa _{\textrm{max}}}$, and ${\kappa _{\textrm{ini}}}$ are the minimum, maximum, and initial coupling strengths, respectively. k is the step width. The coupling strengths ${\kappa _n}$ are changed step-by-step between the minimum and maximum coupling strengths.
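The piecewise rule in Eq. (9) amounts to clipping ${\kappa _{\textrm{ini}}} - k{X_n}(t )$ to the interval $[{\kappa _{\textrm{min}}},\;{\kappa _{\textrm{max}}}]$. A minimal sketch follows; the function name is hypothetical, and the default values are placeholders chosen for illustration, not the values of Table 2.

```python
def update_coupling(x_n, kappa_ini=40.0, kappa_min=10.0, kappa_max=70.0, k=1.0):
    """Eq. (9): coupling strength kappa_n for a given evaluation value X_n.

    All default parameter values are HYPOTHETICAL placeholders (units: ns^-1
    for the coupling strengths); they are not the values of Table 2.
    """
    return min(kappa_max, max(kappa_min, kappa_ini - k * x_n))


print(update_coupling(6.0))     # 34.0: positive X_n lowers kappa_n
print(update_coupling(100.0))   # 10.0: clipped at kappa_min
print(update_coupling(-100.0))  # 70.0: clipped at kappa_max
```

A larger ${X_n}$ lowers ${\kappa _n}$, which, according to the leader probabilities in Fig. 4, makes Laser n more likely to become the leader.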

If the selected slot machine ${S_n}$ returns “hit,” the coupling strength ${\kappa _n}$ is decreased. In contrast, if the selected slot machine returns “miss,” ${\kappa _n}$ is increased. We repeat this procedure of selecting one of the slot machines corresponding to the leader and controlling the coupling strength based on the result of the slot machine selection. The parameter values are summarized in Table 2. We set the hit probabilities as ${P_1} = 0.4$, ${P_2} = 0.6$, and ${P_3} = 0.8$.
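The control flow of this procedure (select the machine of the leader, play it, update the evaluation values, update the coupling strengths) can be sketched as a loop. The laser network itself is not simulated here: as a loudly labeled stand-in, the leader is drawn at random with a weight that grows as ${\kappa _n}$ decreases. This surrogate, the `temperature` parameter, and all default parameter values are assumptions made for illustration, so the sketch says nothing about the performance of the actual laser-based system.

```python
import math
import random


def run_bandit(hit_probs, plays=500, kappa_ini=40.0, kappa_min=10.0,
               kappa_max=70.0, k=1.0, temperature=5.0, seed=0):
    """Toy decision-making loop; returns (selections, final coupling strengths).

    STAND-IN: instead of simulating lag synchronization of chaos, the leader
    is sampled with weight exp(-kappa_n / temperature). All parameter
    defaults are hypothetical.
    """
    rng = random.Random(seed)
    n = len(hit_probs)
    hits, used = [0] * n, [0] * n
    kappa = [kappa_ini] * n
    selections = []
    for _ in range(plays):
        # Surrogate leader selection: smaller kappa_n -> more likely leader.
        k_low = min(kappa)
        weights = [math.exp(-(kp - k_low) / temperature) for kp in kappa]
        leader = rng.choices(range(n), weights=weights)[0]
        selections.append(leader)
        # Play the slot machine assigned to the leader.
        used[leader] += 1
        if rng.random() < hit_probs[leader]:
            hits[leader] += 1
        # Evaluation values, Eqs. (6)-(8); unplayed machines get estimate 0.
        p_est = [h / u if u > 0 else 0.0 for h, u in zip(hits, used)]
        top_two = sum(sorted(p_est, reverse=True)[:2])
        q = [2 * h - top_two * u for h, u in zip(hits, used)]
        x = [q_n - (sum(q) - q_n) / (n - 1) for q_n in q]
        # Coupling-strength update, Eq. (9).
        kappa = [min(kappa_max, max(kappa_min, kappa_ini - k * x_n))
                 for x_n in x]
    return selections, kappa


selections, kappa = run_bandit([0.4, 0.6, 0.8])
print(selections[-10:], kappa)
```

Indices in `selections` are zero-based, so index 2 corresponds to slot machine ${S_3}$.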

Table 2. Parameter values for decision making.

In this algorithm, we assume that the hit probabilities remain unchanged over time. When we consider the situation where the hit probabilities change during decision making, we need to introduce the memory effect [15] in the evaluation of ${Q_n}(t )$ in Eq. (7).

Figure 6 shows an example of the process of decision making for solving the three-armed bandit problem. Figure 6(a) shows the short-term cross-correlation values; their ordering switches when power dropouts occur in the LFF dynamics. Figure 6(b) shows the results of the slot machine selection, where the three slot machines are selected irregularly for small numbers of plays (exploration). As the number of plays increases, slot machine 3, which has the highest hit probability (${P_3} = 0.8$), is eventually always selected (exploitation), which is the correct decision. Figure 6(b) therefore displays both the exploration and exploitation processes. Figure 6(c) shows the estimated hit probabilities ${\bar{P}_n}$; the estimate for slot machine 3 becomes the largest after exploration. Because slot machine 3 is selected more frequently after exploration, its estimated hit probability ${\bar{P}_3}$ approaches the original hit probability ${P_3}$ (dotted line), whereas the estimates for slot machines 1 and 2 do not approach their original values. Figure 6(d) shows that the coupling strength ${\kappa _3}$ becomes the smallest after exploration. Consequently, the short-term cross-correlation value ${C_3}$ becomes the smallest in Fig. 6(a), Laser 3 always becomes the leader, and only slot machine 3 is selected. These results show that correct decision making is achieved.

Fig. 6. Decision making results. (a) Short-term cross-correlation values ${C_n}$, (b) slot machine selection ${S_n}$, (c) estimated hit probabilities ${\bar{P}_n}$ (${P_n}$ is the original hit probability, indicated by the dotted lines), and (d) coupling strengths ${\kappa _n}$ as a function of the number of plays.

We evaluate the decision-making performance using the correct decision rate (CDR), which is expressed as follows [6]:

$$CDR(t )= \frac{1}{m}\mathop \sum \limits_{i = 1}^m C({i,t} ) \tag{10}$$
where $C({i,t} )$ yields 1 if the selected slot machine has the highest hit probability at the $t$-th play in the $i$-th cycle, or 0 otherwise. m is the number of total cycles (one cycle is defined as 500 plays). The slot machine with the highest hit probability is selected when CDR is 1. A faster convergence of CDR to 1 indicates a better performance or quick adaptation to initially uncertain reward environments.
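The CDR is an average over cycles at each play index, which can be sketched as follows. The function name and the nested-list layout of the input are hypothetical choices made for illustration.

```python
def correct_decision_rate(selections, best):
    """CDR(t): fraction of cycles that select the best machine at play t.

    selections[i][t]: index of the machine selected at play t of cycle i.
    best: index of the slot machine with the highest hit probability.
    """
    m = len(selections)
    n_plays = len(selections[0])
    return [sum(1 for cycle in selections if cycle[t] == best) / m
            for t in range(n_plays)]


# Four cycles of three plays each; machine 2 is the best one.
cycles = [[0, 2, 2], [1, 2, 2], [2, 1, 2], [2, 2, 2]]
print(correct_decision_rate(cycles, best=2))  # [0.5, 0.75, 1.0]
```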

Figure 7 shows the CDR curve as the number of plays increases. We test 500 plays and 100 cycles ($m$ = 100). In Fig. 7, CDR increases gradually with the number of plays and converges to 1 after 250 plays. The convergence of CDR is slower than in previous results for the two-armed bandit problem [18] because the number of slot machines increases from two to three; accordingly, decision making is more difficult than in the case studied in [18]. These results confirm that correct decision making is successfully achieved for the three-armed bandit problem using the laser network.

Fig. 7. Evolution of correct decision rate (CDR) as the number of plays increases for the three-armed bandit problem.

In this study, we use a physics-based method that exploits the spontaneous switching of the leader laser in lag synchronization of chaos in the laser network to explore the best slot machine. We consider that the use of physical dynamics for decision making could be more effective, and our work may stimulate the development of new algorithms for decision making. For example, the plateaus of the curves for the leader probabilities in Fig. 4 slow down decision making and can be interpreted as more careful consideration before making a final decision (i.e., “hesitation” in the final decision). This physical characteristic could be effective for solving more complex problems in which the hit probabilities of the best and second-best slot machines differ only slightly.

In a real experimental implementation, a transient time is required for lag synchronization of chaos to be established after the coupling strengths are changed. This transient lasts for several periods of the coupling delay time, i.e., on the order of tens of nanoseconds. The bandwidth of the variable optical attenuators, the acquisition time of the temporal waveforms, and the feedback control time will also affect the speed of decision making in an experimental implementation. Moreover, the detuning of the laser parameter values and the asymmetry of the coupling strengths and delay times need to be compensated for correct decision making. The mismatch of the laser parameter values may affect the leader probabilities and could be compensated by adjusting the injection currents of the lasers.

4. Scalability analysis

4.1 Four-laser network

We next examine the decision-making performance when the number of slot machines increases further, starting with four slot machines. Figure 8 shows a schematic diagram of the four-armed bandit problem, where four semiconductor lasers are coupled unidirectionally in a ring network. The principle of decision making is the same as in the three-armed bandit case discussed in Sec. 3; when Laser n is the leader, we select slot machine ${S_n}$ ($n = 1, \ldots ,\; 4$).

Fig. 8. Schematics for solving the four-armed bandit problem using a ring network with four unidirectionally coupled semiconductor lasers. ${S_n}$ is slot machine n and ${\kappa _n}$ is the coupling strength from Laser $n - 1$ to Laser n (Laser 0 corresponds to Laser 4). Circle n corresponds to Laser n.

First, we investigate decision making for solving the four-armed bandit problem by changing the hit probabilities. One of the hit probabilities is changed and the others are fixed under the following condition: ${P_1} < {P_2} < {P_3} < {P_4}$. Initially, the highest hit probability ${P_4}$ is changed and hit probabilities ${P_1},{P_2}$ and ${P_3}$ are fixed as 0.1, 0.2, and 0.3, respectively. Figure 9(a) shows CDR as the number of plays increases. The CDR curves differ for different values of ${P_4}$: CDR approaches 1 very quickly for ${P_4} = 0.9$, whereas it converges to 1 slowly for ${P_4} = 0.4$. This difference reflects the difficulty of the MAB problem: slot machine 4 is easier to distinguish from the slot machine with the second-highest hit probability (${P_3} = 0.3$) when ${P_4} = 0.9$ than when ${P_4} = 0.4$.

Fig. 9. Evolution of correct decision rate (CDR) as the number of plays increases for different hit probabilities. (a) The highest hit probability ${P_4}$ is changed (${P_1}$ = 0.1, ${P_2}$ = 0.2, and ${P_3}$ = 0.3). (b) The third-highest hit probability ${P_2}$ is changed (${P_1}$ = 0.1, ${P_3}$ = 0.8, and ${P_4}$ = 0.9).

Figure 9(b) shows the case where the third-highest hit probability ${P_2}$ is changed and hit probabilities ${P_1}$, ${P_3}$, and ${P_4}$ are fixed as 0.1, 0.8, and 0.9, respectively. The CDR curves are similar and no significant difference is observed. We found that the change in the third-highest hit probability does not affect the overall decision-making performance. Based on the results presented in Figs. 9(a) and 9(b), we speculate that the difference between the highest and second-highest hit probabilities substantially affects the decision-making performance. This is consistent with the fact that the highest and second-highest hit probabilities are used in the tug-of-war algorithm, as indicated in Eq. (7).

Next, we change the hit-probability assignment of the slot machines in the four-armed-bandit problem. We prepare four slot machines ${S_1},\; \; {S_2},\; {S_3},$ and ${S_4}$ under the following hit probability condition: ${P_A} < {P_B} < {P_C} < {P_D}$. In this method, some assignments of the four slot machines to the four lasers are equivalent when a circular permutation is considered. Therefore, it is sufficient to examine six permutations, as summarized in Table 3.
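The count of six follows because rotating an assignment around the ring does not change it, so one label can be pinned to Laser 1 and the remaining three permuted, giving $3! = 6$ arrangements. The sketch below enumerates them; the function name is hypothetical, and the order of the output is not claimed to match the order used in Table 3.

```python
from itertools import permutations


def ring_assignments(labels):
    """All assignments of labels to ring positions, up to rotation.

    Pinning the first label to position 1 removes the rotational duplicates.
    """
    first, rest = labels[0], tuple(labels[1:])
    return [(first,) + p for p in permutations(rest)]


for assignment in ring_assignments(("A", "B", "C", "D")):
    print(assignment)
```

Reflections are deliberately not merged: because the coupling is unidirectional, an assignment and its mirror image are physically different.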

Table 3. Six different assignments of the hit probabilities of the four slot machines when circular permutation is considered.

Figure 10 shows the CDR for two examples of the six assignments. The hit probabilities in Figs. 10(a) and 10(b) are set as $\{{{P_A},{P_B},{P_C},{P_D}} \}= \{{0.1,\; 0.2,\; 0.3,\; 0.4} \}$ and $\{{{P_A},{P_B},{P_C},{P_D}} \}= \{{0.1,\; 0.7,\; 0.8,\; 0.9} \}$, respectively. The six color-coded curves correspond to the six hit-probability assignments of the four slot machines, as shown in Table 3. The differences among the six curves in Fig. 10 are due to the unidirectional coupling of the four semiconductor lasers. The couplings among the lasers become asymmetric as the decision-making process is repeated. Then, the probabilities of becoming the leader lasers are also changed asymmetrically.

Fig. 10. Evaluation of correct decision rate (CDR) for the position dependency of the slot machines at (a) $\{{{P_A},{P_B},{P_C},{P_D}\; } \}= \{{0.1,0.2,0.3,0.4} \}$ and (b) $\{{{P_A},{P_B},{P_C},{P_D}\; } \}= \{{0.1,0.7,0.8,0.9} \}$. The color-coded curves correspond to the six hit-probability assignments of the four slot machines, as shown in Table 3.

Based on the results shown in Fig. 10(a), the CDR curves for the six assignments can be categorized into three groups (red and orange curves, cyan and green curves, and blue and gray curves). In the red and orange group, the CDR curves converge quickly to unity. In this case, the slot machine with the highest hit probability is located to the right of that with the second-highest hit probability (0.3 and 0.4 in Fig. 10(a)). Physically, the slot machine with the highest hit probability is located at the output side of the slot machine with the second-highest hit probability. However, in the blue and gray group, the CDR curves slowly converge to unity. In this case, the slot machine with the highest hit probability is located to the left of that with the second-highest hit probability (0.4 and 0.3 in Fig. 10(a)). Physically, the slot machine with the highest hit probability is located at the input side of the slot machine with the second-highest hit probability.

We speculate that these differences result from the asymmetry that the unidirectional coupling introduces into the exploration of the slot machine with the highest hit probability. In the former case, suppose that the slot machine with the second-highest hit probability is selected (a wrong decision, because it is not the highest) and a reward (hit) is obtained. The coupling strength between the slot machines with the highest and second-highest hit probabilities then remains large, so the chance of selecting the slot machine with the highest hit probability is still high. Conversely, in the latter case, if the slot machine with the second-highest hit probability is selected and a reward is obtained, the coupling strength between the slot machines with the highest and second-highest hit probabilities is reduced, so the chance of selecting the slot machine with the highest hit probability becomes small. This asymmetry in the slot machine assignment results in the different CDR curves in Fig. 10(a). A similar tendency is observed for $\{{P_A},{P_B},{P_C},{P_D}\} = \{0.1, 0.7, 0.8, 0.9\}$ in Fig. 10(b), although the differences among the CDR curves are smaller.

4.2 General cases

In this subsection, we investigate the scalability of decision making as the number of slot machines increases. We prepare N lasers in the ring network, equal to the number of slot machines N. Figure 11 schematically illustrates a decision-making system for solving the MAB problem with N slot machines using a ring laser-network. N semiconductor lasers are prepared and coupled unidirectionally to construct the ring network. Each laser is assigned to one slot machine for decision making.

Fig. 11. Schematics for scalable decision making for solving the MAB problem with N slot machines using a ring network with N unidirectionally coupled semiconductor lasers. ${S_n}$ is slot machine n and ${\kappa _n}$ is the coupling strength from Laser $n - 1$ to Laser n (Laser 0 corresponds to Laser $N$). Circle n corresponds to Laser n.

We evaluate the CDR when the number of slot machines and lasers increases. We test the MAB problem with up to seven slot machines. The hit probability of one of the slot machines is set as 0.6, and those of the other slot machines are set as 0.4. Figure 12(a) shows the CDR for different numbers of slot machines (from three to seven) as the number of plays increases. The CDR curves converge to 1 in all cases, although the convergence becomes slower as the number of slot machines increases. Figure 12(a) thus shows that decision making for solving the MAB problem is achieved with up to seven slot machines.

Fig. 12. (a) Correct decision rate (CDR) for the MAB problem with different numbers of slot machines (from three to seven). (b) Scalability of decision making. The number of plays at CDR = 0.95 is plotted as a function of the number of slot machines N.

We investigate the convergence speed of the CDR curves for different numbers of slot machines. We measure the number of plays ${M_{play}}$ at which the CDR reaches 0.95 for the first time (the orange dotted line in Fig. 12(a)) and use ${M_{play}}$ as a measure of the convergence speed. Figure 12(b) shows ${M_{play}}$ as the number of slot machines N increases, which indicates the scalability of decision making using our method. The relationship between ${M_{play}}$ and N can be represented by the power law ${M_{play}} = 19.4\,{N^{1.85}}$. Therefore, the scaling law is of the order $O({N^{1.85}})$, which is close to ${N^2}$. This scaling exponent is larger than those of previously reported methods [16,34]. We speculate that the large exponent may result from the unidirectional coupling among the lasers. Introducing mutual coupling in the network may reduce the scaling exponent, and hence enable faster decision making, provided that lag synchronization of chaos is maintained; this is an interesting topic for future research. In addition, we used only up to seven slot machines because of limited computational resources. Increasing the number of slot machines further, which we plan as future work, would lead to a more convincing conclusion on scalability.
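To give a feel for the fitted relationship, the following sketch evaluates ${M_{play}} = 19.4\,{N^{1.85}}$ for the tested range of N and compares it with ${N^2}$. The function name is hypothetical, and the printed values come from the fitted formula only; they are not the simulated data points of Fig. 12(b), and extrapolation beyond N = 7 is untested.

```python
def plays_to_converge(n_slots, prefactor=19.4, exponent=1.85):
    """Fitted number of plays M_play at which the CDR first reaches 0.95."""
    return prefactor * n_slots ** exponent


for n in range(3, 8):
    m_fit = plays_to_converge(n)
    # The ratio M_play / N^2 decreases slowly because the exponent is below 2.
    print(f"N = {n}: fitted M_play ~ {m_fit:.0f}, M_play / N^2 = {m_fit / n**2:.2f}")
```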

Furthermore, it would be interesting to investigate other coupling configurations that are suitable for decision making, such as a ring configuration with bypass links that forms a small-world network [35]. We believe that a large number of lasers could be implemented on a chip in a photonic integrated circuit [36] or in VCSEL arrays [37] using recent photonic technologies.

5. Conclusions

We numerically investigated photonic decision making for solving the MAB problem using a ring laser-network. We constructed a ring network using multiple unidirectionally coupled semiconductor lasers. We observed lag synchronization of chaos and the leader-laggard relationship using the short-term cross-correlation values of the temporal waveforms of the laser outputs. We proposed a method for solving the MAB problem using the lag synchronization of chaos in the laser network. The slot machine corresponding to the leader laser is selected by evaluating the short-term cross correlations, and the coupling strengths are changed based on the result of the slot machine selection. For the three-armed bandit problem, the CDR converges to unity, indicating that correct decision making is performed. In the case of four slot machines, we found that the assignment of the slot machines with the highest and second-highest hit probabilities affects the decision-making performance. We also investigated the scalability of the decision making, which was successfully achieved for solving the MAB problem with up to seven slot machines. We clarified the scaling law $O({N^{1.85}})$ between the CDR convergence speed and the number of slot machines N.

This is the first demonstration of scalable decision making using the synchronization of chaos in a laser network to the best of our knowledge. This method, which involves the use of a laser network, could be extended to solve more advanced problems, known as competitive MAB problems [38,39], where more than one player selects the slot machines with the highest hit probability, and the selection of the same slot machine (competition) reduces the total reward. We believe that our method using synchronization of chaos in a laser network represents a promising resource for decision making as a photonic accelerator [1].

6. Appendix

6.1 Theoretical analysis of the coupled network configuration using the adjacency matrix

Generating lag synchronization of chaos with a time delay is essential in the decision-making scheme proposed in this study. Here, we explain the theoretical background for constructing a laser network in which zero-lag synchronization of chaos (i.e., synchronization without time delay) is avoided. We assume that the lasers are coupled with delay time $\tau$ and that all the coupling strengths and delay times are identical in the network. It has been reported that the existence of zero-lag synchronization in a network can be determined from the adjacency matrix of the network for the coupling terms [26,27].

First, the adjacency matrix A of the laser network is defined, which describes unidirectional couplings among the lasers, as follows:

$$A = \left\{ {\begin{array}{l} {{a_{i,j}} = 1{\; \; \; \; \; \; \; \; }({\textrm{with coupling}} )}\\ {{a_{i,j}} = 0{\; \; \; \; \; \; \; \; }({\textrm{without coupling}} )} \end{array}} \right.$$
where ${a_{i,j}}$ is the element at the i-th row and j-th column of adjacency matrix A. Unidirectional coupling from laser i to laser j exists for ${a_{i,j}} = 1$, and no coupling from laser i to laser j exists for ${a_{i,j}} = 0$. Next, the sign function of a matrix Y is introduced:
$$Z = {\textrm{sign}}(Y )= \left\{ {\begin{array}{ll} {{z_{i,j}} = 1}&{({{y_{i,j}} > 0} )}\\ {{z_{i,j}} = 0}&{({{y_{i,j}} = 0} )}\\ {{z_{i,j}} ={-} 1}&{({{y_{i,j}} < 0} )} \end{array}} \right.$$
where ${\textrm{sign}}(Y)$ is the element-wise sign function: each element ${z_{i,j}}$ of Z is determined by the sign of the element ${y_{i,j}}$ at the i-th row and j-th column of matrix Y.

To identify zero-lag synchronization, the m-th power ${A^m}$ of the adjacency matrix is calculated until the following equation is satisfied:

$${\textrm{sign}}({{A^m}} )= {\textrm{sign}}({{A^{m - n}}} ){\; \; \; \; \; \; \; \; }({m > n > 0} )$$
where m and n are positive integers. The laser network shows $n$-cluster synchronization (i.e., n different synchronization states are observed). In particular, zero-lag synchronization is observed for $n = 1$ (i.e., all the synchronization states are identical). More importantly, zero-lag synchronization is not observed if n equals the number of lasers N in the network.

We provide an example of a laser network consisting of three unidirectionally coupled semiconductor lasers, as shown in Fig. 1. The adjacency matrix A for this three-laser network is described as follows:

$$A = \left( {\begin{array}{ccc} 0&1&0\\ 0&0&1\\ 1&0&0 \end{array}} \right)\; $$

The elements of matrix A are ${a_{i,j}} = 1$ when unidirectional coupling exists from Laser i to Laser j, and ${a_{i,j}} = 0$ otherwise. In Fig. 1, the unidirectional coupling exists from Laser 1 to 2, from Laser 2 to 3, and from Laser 3 to 1.

The m-th power ${A^m}$ of the adjacency matrix in Eq. (14) is calculated, and Eq. (13) is satisfied as follows:

$${\textrm{sign}}({{A^4}} )= \left( {\begin{array}{ccc} 0&1&0\\ 0&0&1\\ 1&0&0 \end{array}} \right) = {\textrm{sign}}(A )$$

Here, $n = 3$ (and $m = 4$) is obtained, and n equals the number of lasers N = 3. Therefore, it is confirmed that zero-lag synchronization is not observed in the configuration shown in Fig. 1.
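The search for m and n in Eq. (13) can be sketched in a few lines of plain Python. The function names and the `max_power` cutoff are hypothetical. The sketch also assumes that all matrix entries are non-negative, so that the sign pattern of a power can be obtained from a Boolean matrix product instead of the full integer power.

```python
def bool_matmul(a, b):
    """Sign pattern of the product of two non-negative matrices given as 0/1 lists."""
    size = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(size)))
             for j in range(size)] for i in range(size)]


def find_cluster_number(adjacency, max_power=50):
    """Return the first (m, n) with sign(A^m) == sign(A^(m-n)) and m > n > 0."""
    powers = [adjacency]  # powers[p - 1] holds sign(A^p)
    for m in range(2, max_power + 1):
        powers.append(bool_matmul(powers[-1], adjacency))
        for n in range(1, m):
            if powers[m - 1] == powers[m - n - 1]:
                return m, n
    return None  # no repetition found up to max_power


# Adjacency matrix of the three-laser ring in Eq. (14).
ring3 = [[0, 1, 0],
         [0, 0, 1],
         [1, 0, 0]]
m, n = find_cluster_number(ring3)
print(m, n)
```

For the three-laser ring this search stops at $m = 4$ and $n = 3$, in agreement with Eq. (15).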

6.2 Compensation of asymmetry of coupling delay times

Asymmetry of the coupling delay times may exist in an experimental realization. It can be compensated by introducing time shifts among the measured temporal waveforms. For the asymmetric coupling delay time ${\tau _n}$ between Laser $n - 1$ and Laser n, we introduce the average coupling delay time ${\tau _{\textrm{ave}}}$ and the time shift ${D_n}$ for the temporal waveform of Laser n ($n \ge 2$, ${D_1} = 0$) as follows:

$${\tau _{\textrm{ave}}} = \frac{1}{N}\mathop \sum \limits_{n = 1}^N {\tau _n}$$
$${D_n} = {\tau _{\textrm{ave}}} - {\tau _n} + \mathop \sum \limits_{j = 1}^{n - 1} {D_j}\; \; ({n \ge 2} )$$

After shifting each temporal waveform by ${D_n}$, we can calculate the short-term cross-correlation values using Eq. (4), where $\tau$ is replaced by ${\tau _{\textrm{ave}}}$.
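As a minimal sketch, Eqs. (16) and (17) can be evaluated literally as written. The function name and the example delay values are hypothetical and serve only to illustrate the recursion; they are not parameters used in the paper.

```python
def time_shifts(delays):
    """Average delay and time shifts D_n following Eqs. (16) and (17).

    delays[n - 1] is the coupling delay tau_n between Laser n-1 and Laser n.
    Returns (tau_ave, [D_1, ..., D_N]) with D_1 = 0.
    """
    tau_ave = sum(delays) / len(delays)  # Eq. (16)
    shifts = [0.0]  # D_1 = 0
    for tau_n in delays[1:]:
        # Eq. (17): D_n = tau_ave - tau_n + sum of the earlier shifts D_1..D_(n-1)
        shifts.append(tau_ave - tau_n + sum(shifts))
    return tau_ave, shifts


# Hypothetical, slightly mismatched delays (arbitrary time units).
tau_ave, shifts = time_shifts([10.0, 12.0, 11.0])
print(tau_ave, shifts)
```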

6.3 Equivalence of algorithms

The decision-making algorithm used in this study is universal for the MAB problem with a large number of slot machines. This algorithm is equivalent to that used in the case of the two-armed bandit problem in [18]. We show that ${X_1}(t )$ (Eq. (6) in this study) is equivalent to $TA(t )$ (Eq. (12) in [18]) as follows.

For the two-armed bandit problem, we assume the number of slot machines $N = 2$. The evaluation value ${X_1}(t )$ for Laser 1 is described in Eqs. (6) and (7) as follows:

$${X_1}(t )= {Q_1}(t )- {Q_2}(t )= 2{H_1}(t )- ({{{\bar{P}}_1} + {{\bar{P}}_2}} ){U_1}(t )- ({2{H_2}(t )- ({{{\bar{P}}_1} + {{\bar{P}}_2}} ){U_2}(t )} )$$

We introduce the shifts for hit and miss, $\mathrm{\Delta } = 2 - ({{{\bar{P}}_1} + {{\bar{P}}_2}} )$ and $\mathrm{\Omega } = {{\bar{P}}_1} + {{\bar{P}}_2}$, respectively, used in [18] ($\mathrm{\Delta } + \mathrm{\Omega } = 2$):

$$\begin{aligned} {X_1}(t )& = ({\mathrm{\Delta } + \mathrm{\Omega }} ){H_1}(t )- \mathrm{\Omega }{U_1}(t )- ({({\mathrm{\Delta } + \mathrm{\Omega }} ){H_2}(t )- \mathrm{\Omega }{U_2}(t )} )\\ & = \mathrm{\Delta }{H_1}(t )- \mathrm{\Omega }({{U_1}(t )- {H_1}(t )} )- ({\mathrm{\Delta }{H_2}(t )- \mathrm{\Omega }({{U_2}(t )- {H_2}(t )} )} )\end{aligned}$$

We introduce the number of misses ${B_n}(t )= {U_n}(t )- {H_n}(t )$ as follows:

$${X_1}(t )= \mathrm{\Delta }{H_1}(t )- \mathrm{\Omega }{B_1}(t )- ({\mathrm{\Delta }{H_2}(t )- \mathrm{\Omega }{B_2}(t )} )$$

We introduce the result of hit ${\hat{H}_n}(t )$ at play t (${\hat{H}_n}(t )= 1$ for hit at play t, ${\hat{H}_n}(t )= 0$ otherwise), and the result of miss ${\hat{B}_n}(t )$ at play t (${\hat{B}_n}(t )= 1$ for miss at play t, ${\hat{B}_n}(t )= 0$ otherwise). We rewrite Eq. (20) with ${\hat{H}_n}(t )$, ${\hat{B}_n}(t ),$ and the past result ${X_1}({t - 1} )$ as follows:

$$\begin{aligned} {X_1}(t )& = \mathrm{\Delta }{{\hat{H}}_1}(t )- \mathrm{\Omega }{{\hat{B}}_1}(t )- ({\mathrm{\Delta }{{\hat{H}}_2}(t )- \mathrm{\Omega }{{\hat{B}}_2}(t )} )+ ({\mathrm{\Delta }{H_1}({t - 1} )- \mathrm{\Omega }{B_1}({t - 1} )} )- ({\mathrm{\Delta }{H_2}({t - 1} )- \mathrm{\Omega }{B_2}({t - 1} )} )\\ & = \mathrm{\Delta }{{\hat{H}}_1}(t )- \mathrm{\Omega }{{\hat{B}}_1}(t )- ({\mathrm{\Delta }{{\hat{H}}_2}(t )- \mathrm{\Omega }{{\hat{B}}_2}(t )} )+ {X_1}({t - 1} )\end{aligned}$$

The term $\mathrm{\Delta }{\hat{H}_1}(t )- \mathrm{\Omega }{\hat{B}_1}(t )- ({\mathrm{\Delta }{{\hat{H}}_2}(t )- \mathrm{\Omega }{{\hat{B}}_2}(t )} )$ is equivalent to $X(t )$ in [18] (see also Table 2 in [18]). We replace ${X_1}(t )$ with $TA(t )$ as follows:

$$TA(t )= X(t )+ TA({t - 1} )$$

This equation is equivalent to Eq. (12) in [18] for $a = 1$.
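The equivalence between the cumulative form in Eq. (18) and the recursive form in Eq. (22) can be checked numerically with a short sketch. The function names, the random selection of slot machines, and the hit probability of 0.5 are hypothetical. The sketch also assumes that $\mathrm{\Omega }$ (and hence $\mathrm{\Delta }$) is held fixed over all plays, which is the condition under which the step from Eq. (20) to Eq. (21) holds term by term.

```python
import random


def cumulative_x1(h1, u1, h2, u2, omega):
    """Eq. (18): X_1 from cumulative hits H_n and selections U_n."""
    return (2 * h1 - omega * u1) - (2 * h2 - omega * u2)


def max_deviation(n_plays=1000, omega=1.3, seed=0):
    """Largest gap between Eq. (18) and the running sum TA(t) of Eq. (22)."""
    rng = random.Random(seed)
    delta = 2 - omega  # Delta + Omega = 2
    hits, plays = [0, 0], [0, 0]
    ta = 0.0
    worst = 0.0
    for _ in range(n_plays):
        arm = rng.randrange(2)      # hypothetical random selection
        hit = rng.random() < 0.5    # hypothetical hit probability
        plays[arm] += 1
        hits[arm] += int(hit)
        # Per-play term X(t): +Delta for a hit and -Omega for a miss on
        # machine 1; the signs are reversed for machine 2.
        step = delta if hit else -omega
        ta += step if arm == 0 else -step
        x1 = cumulative_x1(hits[0], plays[0], hits[1], plays[1], omega)
        worst = max(worst, abs(x1 - ta))
    return worst


print(max_deviation())
```

Under the fixed-$\mathrm{\Omega }$ assumption, the deviation stays at the level of floating-point rounding.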

Funding

Japan Society for the Promotion of Science (JP17H01277, JP19H00868, JP20K15185); Core Research for Evolutional Science and Technology (JPMJCR17N2); Telecommunications Advancement Foundation.

Disclosures

The authors declare no conflicts of interest.

References

1. K. Kitayama, M. Notomi, M. Naruse, K. Inoue, S. Kawakami, and A. Uchida, “Novel frontier of photonics for data processing—Photonic accelerator,” APL Photonics 4(9), 090901 (2019).

2. Y. Shen, N. C. Harris, S. Skirlo, M. Prabhu, T. B. Jones, M. Hochberg, X. Sun, S. Zhao, H. Larochelle, D. Englund, and M. Soljačić, “Deep learning with coherent nanophotonic circuits,” Nat. Photonics 11(7), 441–446 (2017).

3. L. Larger, M. C. Soriano, D. Brunner, L. Appeltant, J. M. Gutierrez, L. Pesquera, C. R. Mirasso, and I. Fischer, “Photonic information processing beyond Turing: an optoelectronic implementation of reservoir computing,” Opt. Express 20(3), 3241–3249 (2012).

4. G. Van der Sande, D. Brunner, and M. C. Soriano, “Advances in photonic reservoir computing,” Nanophotonics 6(3), 561–576 (2017).

5. T. Ishihara, A. Shinya, K. Inoue, K. Nozaki, and M. Notomi, “An integrated nanophotonic parallel adder,” J. Emerg. Technol. Comput. Syst. 14(2), 1–20 (2018).

6. M. Naruse, M. Berthel, A. Drezet, S. Huant, M. Aono, H. Hori, and S.-J. Kim, “Single-photon decision maker,” Sci. Rep. 5(1), 13253 (2015).

7. M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. 25(2), 83–91 (2008).

8. T. Inagaki, Y. Haribara, K. Igarashi, T. Sonobe, S. Tamate, T. Honjo, A. Marandi, P. L. McMahon, T. Umeki, K. Enbutsu, O. Tadanaga, H. Takenouchi, K. Aihara, K. Kawarabayashi, K. Inoue, S. Utsunomiya, and H. Takesue, “A coherent Ising machine for 2000-node optimization problems,” Science 354(6312), 603–606 (2016).

9. H. Robbins, “Some aspects of the sequential design of experiments,” Bull. Am. Math. Soc. 58(5), 527–536 (1952).

10. R. S. Sutton and A. G. Barto, Reinforcement learning (MIT Press, 1998).

11. D. Silver, A. Huang, C. J. Maddison, A. Guez, L. Sifre, G. Van Den Driessche, J. Schrittwieser, I. Antonoglou, V. Panneershelvam, M. Lanctot, S. Dieleman, D. Grewe, J. Nham, N. Kalchbrenner, I. Sutskever, T. Lillicrap, M. Leach, K. Kavukcuoglu, T. Graepel, and D. Hassabis, “Mastering the game of Go with deep neural networks and tree search,” Nature 529(7587), 484–489 (2016).

12. D. Silver, J. Schrittwieser, K. Simonyan, I. Antonoglou, A. Huang, A. Guez, T. Hubert, L. Baker, M. Lai, A. Bolton, Y. Chen, T. Lillicrap, F. Hui, L. Sifre, G. Van Den Driessche, T. Graepel, and D. Hassabis, “Mastering the game of Go without human knowledge,” Nature 550(7676), 354–359 (2017).

13. O. B. Kroemer, R. Detry, J. Piater, and J. Peters, “Combining active learning and reactive control for robot grasping,” Rob. Auton. Syst. 58(9), 1105–1116 (2010).

14. M. Naruse, Y. Terashima, A. Uchida, and S.-J. Kim, “Ultrafast photonic reinforcement learning based on laser chaos,” Sci. Rep. 7(1), 8772 (2017).

15. T. Mihana, Y. Terashima, M. Naruse, S.-J. Kim, and A. Uchida, “Memory effect on adaptive decision making with a chaotic semiconductor laser,” Complexity 2018, 4318127 (2018).

16. M. Naruse, T. Mihana, H. Hori, H. Saigo, K. Okamura, M. Hasegawa, and A. Uchida, “Scalable photonic reinforcement learning by time-division multiplexing of laser chaos,” Sci. Rep. 8(1), 10890 (2018).

17. Y. Ma, S. Xiang, X. Guo, Z. Song, A. Wen, and Y. Hao, “Time-delay signature concealment of chaos and ultrafast decision making in mutually coupled semiconductor lasers with a phase-modulated Sagnac loop,” Opt. Express 28(2), 1665–1678 (2020).

18. T. Mihana, Y. Mitsui, M. Takabayashi, K. Kanno, S. Sunada, M. Naruse, and A. Uchida, “Decision making for the multi-armed bandit problem using lag synchronization of chaos in mutually coupled semiconductor lasers,” Opt. Express 27(19), 26989–27008 (2019).

19. T. Mihana, K. Kanno, M. Naruse, and A. Uchida, “Laser network for lag synchronization of chaos and scalable decision making,” Proceedings of 2019 International Symposium on Nonlinear Theory and Its Applications (NOLTA2019) 1, 481–484 (2019).

20. S.-J. Kim, M. Aono, and M. Hara, “Tug-of-war model for the two-bandit problem: Nonlocally-correlated parallel exploration via resource conservation,” BioSystems 101(1), 29–36 (2010).

21. S.-J. Kim and M. Aono, “Amoeba-inspired algorithm for cognitive medium access,” Nonlinear Theory and Its Appl. IEICE 5(2), 198–209 (2014).

22. S.-J. Kim, M. Aono, and E. Nameda, “Efficient decision-making by volume-conserving physical object,” New J. Phys. 17(8), 083023 (2015).

23. T. Heil, I. Fischer, W. Elsässer, J. Mulet, and C. R. Mirasso, “Chaos synchronization and spontaneous symmetry breaking in symmetrically delay-coupled semiconductor lasers,” Phys. Rev. Lett. 86(5), 795–798 (2001).

24. E. A. Rogers-Dakin, J. García-Ojalvo, D. J. Deshazer, and R. Roy, “Synchronization and symmetry breaking in mutually coupled fiber lasers,” Phys. Rev. E 73(4), 045201 (2006).

25. K. Kanno, T. Hida, A. Uchida, and M. Bunsen, “Spontaneous exchange of leader-laggard relationship in mutually coupled synchronized semiconductor lasers,” Phys. Rev. E 95(5), 052212 (2017).

26. M. Nixon, M. Fridman, E. Ronen, A. A. Friesem, N. Davidson, and I. Kanter, “Synchronized cluster formation in coupled laser network,” Phys. Rev. Lett. 106(22), 223901 (2011).

27. M. Nixon, M. Fridman, E. Ronen, A. A. Friesem, N. Davidson, and I. Kanter, “Controlling synchronization in large laser networks,” Phys. Rev. Lett. 108(21), 214101 (2012).

28. J. Ohtsubo, R. Ozawa, and M. Nanbu, “Synchrony of small nonlinear networks in chaotic semiconductor lasers,” Jpn. J. Appl. Phys. 54(7), 072702 (2015).

29. A. Argyris, M. Bourmpos, and D. Syvridis, “Experimental synchrony of semiconductor lasers in coupled networks,” Opt. Express 24(5), 5600–5614 (2016).

30. J. D. Hart, D. C. Schmadel, T. E. Murphy, and R. Roy, “Experiments with arbitrary networks in time-multiplexed delay systems,” Chaos 27(12), 121103 (2017).

31. J. M. Buldú, M. C. Torrent, and J. García-Ojalvo, “Synchronization in semiconductor laser rings,” J. Lightwave Technol. 25(6), 1549–1554 (2007).

32. R. Lang and K. Kobayashi, “External optical feedback effects on semiconductor injection laser properties,” IEEE J. Quantum Electron. 16(3), 347–355 (1980).

33. P. L. Buono and J. A. Collera, “Symmetry-breaking bifurcations in rings of delay-coupled semiconductor lasers,” SIAM J. Appl. Dyn. Syst. 14(4), 1868–1898 (2015).

34. T. Niiyama, G. Furuhata, A. Uchida, M. Naruse, and S. Sunada, “Lotka-Volterra competition mechanism embedded in a decision-making method,” J. Phys. Soc. Jpn. 89(1), 014801 (2020).

35. D. J. Watts and S. H. Strogatz, “Collective dynamics of ‘small-world’ networks,” Nature 393(6684), 440–442 (1998).

36. K. Takano, C. Sugano, M. Inubushi, K. Yoshimura, S. Sunada, K. Kanno, and A. Uchida, “Compact reservoir computing with a photonic integrated circuit,” Opt. Express 26(22), 29424–29439 (2018).

37. T. Heuser, M. Pflüger, I. Fischer, J. A. Lott, D. Brunner, and S. Reitzenstein, “Developing a photonic hardware platform for brain-inspired computing based on 5×5 VCSEL arrays,” JPhys Photonics 2(4), 044002 (2020).

38. L. Lai, H. E. Gamal, H. Jiang, and H. V. Poor, “Cognitive medium access: Exploration, exploitation, and competition,” IEEE Trans. Mob. Comput. 10(2), 239–253 (2011).

39. N. Chauvet, D. Jegouso, B. Boulanger, H. Saigo, K. Okamura, H. Hori, A. Drezet, S. Huant, G. Bachelier, and M. Naruse, “Entangled-photon decision maker,” Sci. Rep. 9(1), 12229 (2019).
