Advanced root mean square propagation with the warm-up algorithm for fiber coupling

Ziqiang Li; Ziting Pan; Yuting Li; Xu Yang; Chao Geng; Xinyang Li

doi:10.1364/OE.494734

1. Introduction

In Free-Space Optical Communication (FSOC), it is necessary to couple light into optical fibers for subsequent amplification and demodulation. Therefore, improving the optical fiber coupling efficiency is crucial to enhancing communication distance and reducing communication bit error rates. For the coupling scenario of FSOC, a coupling method that can cope with dynamic errors, such as atmospheric turbulence and mechanical vibrations of carrier platforms, is required [1–3]. A similar situation exists in the self-referencing interferometer wavefront sensors for measuring “deep turbulence” [4,5], where stable coupling is necessary to obtain high-contrast interferograms. Improving the efficiency and stability of fiber coupling is an important optical challenge, alongside excellent optical design and fiber manufacturing processes [6–8]. The development of effective control algorithms to mitigate the impact of atmospheric turbulence and mechanical vibration has been a significant area of research in the optics community [9–12].

Generally, adaptive optics is an effective method to compensate for wavefront aberration. For example, the fast-steering mirror (FSM) is widely used to steer the beam for higher coupling efficiency [13]. Another coupling device, called the Adaptive Fiber Coupler (AFC), is becoming more popular due to its smaller inertia [14–18]. Apart from the coupling device, accurately modeling the coupling process and tracking the light spot without light-splitting is challenging. As a result, fiber coupling is often converted into a model-free optimization problem.

Various approaches have been proposed to perform fiber coupling, including hill climbing [19], the nutation method [10,12], and the widely-known Stochastic Parallel Gradient Descent (SPGD) algorithm [16,20–23]. The nutation method gradually overlaps the spot and the fiber tip by searching the entire focal plane, which has a large effective range. However, the search process is lengthy and requires many steps to converge, resulting in relatively poor tracking ability when dealing with moving spots.

The SPGD algorithm, historically derived from the early stages of artificial neural network development [20,24], has found widespread applications in diverse fields such as wavefront-sensorless adaptive optics [21,25,26] and coherent beam combining [20,21]. It also converges faster in fiber coupling [27], provided that the spot and the fiber tip have a good initial position. With the enormous success of deep learning, many scholars have recently attempted to apply newly developed optimization algorithms from deep learning to various areas, including wavefront-sensorless adaptive optics [28–30], coherent or incoherent combining [31–35], and fiber coupling [13,36], to enhance the original SPGD algorithm, with a focus on improving the convergence speed.

We firmly believe that leveraging the latest advancements from various scientific domains and applying them in a cross-disciplinary manner serves as a valuable source of innovation. However, fiber coupling has its unique characteristics, and the control algorithms cannot be simply transplanted to fiber coupling systems. While most previous studies focused on the static convergence speed of the algorithm, in reality, the spot is always in motion and may experience sudden large disturbances in the fiber coupling scene. Therefore, it is not enough to only improve the convergence speed after the first closed loop. It is necessary to consider both the convergence performance of the algorithm and its ability to continuously suppress dynamic disturbances after convergence. Additionally, fiber coupling requires a large convergence range, which is different from deep learning and coherent combining that can obtain the gradient regardless of the initial state. Gradients cannot be obtained in fiber coupling if there is a large initial deviation between the fiber tip and the spot. Moreover, the gradient in fiber coupling cannot be accurately obtained as in deep learning but is estimated by perturbations. Thus, improving the effective range of the algorithm while maintaining the convergence speed of the algorithm and its stability after convergence is crucial. In light of these characteristics of fiber coupling, we propose an improved algorithm based on RMSprop, called the Advanced Root Mean Square Propagation with Warm-up (ARW) algorithm. This algorithm can effectively improve the convergence speed of the fiber coupling system to deal with sudden disturbances while enhancing the effective range of the algorithm. This paper presents a detailed comparison and verification of the SPGD and ARW algorithms through simulations and experiments.

2. Principles and simulations

2.1 SMF coupling efficiency

To achieve efficient coupling of a space laser to a single-mode fiber (SMF), mode field matching is necessary. This means matching the amplitude and phase distribution of the electromagnetic field of the laser beam coupled into the SMF with the laser propagating in the SMF [37]. Assuming that the transmission link is far enough and the influence of the transmission link on the beam is negligible, the light wave incident on the receiving aperture after long-distance transmission can be considered a plane wave, which can be expressed by Eq. (1).

(1)$${E_A}(r )= P(r )= \left\{ {\begin{array}{c} {1,r < {D / 2}}\\ {0,else} \end{array}} \right.$$

where $P(r )$ is the aperture function and D is the diameter of the aperture.

The plane wave is focused by a coupling lens to form an Airy spot ${E_O}$ and coupled into the fiber. The distribution of the laser propagating in the SMF is approximately a Gaussian beam which can be expressed as:

(2)$${F_O}(r )= \sqrt {\frac{2}{{\pi \omega _0^2}}} \exp \left( { - \frac{{{r^2}}}{{\omega_0^2}}} \right)$$

where ${\omega _0}$ is the mode field radius of the SMF.

Ideally, the tip of the fiber should be on the focal plane of the lens and its core should coincide with the Airy spot. In this ideal situation, the relationship between coupling efficiency and the optical system can be simplified as follows:

(3)$$\eta = \frac{{{{\left|{\int\!\!\!\int {E_O^\ast {F_O}ds} } \right|}^2}}}{{\int\!\!\!\int {{{|{{E_O}} |}^2}ds} \int\!\!\!\int {{{|{{F_O}} |}^2}ds} }} = 2\frac{{{{[{1 - \exp ({ - {\psi^2}} )} ]}^2}}}{{{\psi ^2}}}$$

(4)$$\psi = {{\pi D{\omega _0}} / {({2\lambda f} )}}$$

where $\lambda$ is the wavelength and f is the focal length. $E_O^\ast $ is the complex conjugation of the Airy spot ${E_O}$. And ${E_O}$ is focused by a coupling lens from the incident light ${E_A}$ and can be calculated by the angular spectrum method in the simulation.

The formulas mentioned above provide guidance for designing optical systems to achieve the highest possible coupling efficiency under ideal conditions. It can be deduced that when $\psi = 1.12$, the coupling efficiency reaches its maximum $\eta = 81.45\%$. However, achieving high coupling efficiency is only possible under ideal conditions. In reality, there are factors such as carrier platform vibration and atmospheric turbulence that cause random angular jitters. Studies have shown that misalignment caused by angular jitters of microradian magnitude can significantly decrease the coupling efficiency [37]. Therefore, it is necessary to dynamically correct the misalignment between the fiber tip and the Airy spot. One commonly used corrector is the Fast-Steering Mirror (FSM), but its large volume and heavy weight result in large inertia and limited correction frequency. An alternative correction device called the Adaptive Fiber Coupler (AFC) can directly drive the SMF to align with the Airy spot. The AFC has the advantages of small volume, light weight, small inertia, high frequency, and low cost [16].
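As a quick numerical check of Eqs. (3) and (4), the following Python sketch evaluates the closed-form efficiency and scans $\psi$ for its maximum. The function names are illustrative only, and the optical parameters are the simulation values quoted in Sec. 2.3 (D = 10 mm, ${\omega _0}$ = 5µm, $\lambda$ = 1550 nm, f = 45.24 mm).

```python
import math

def coupling_efficiency(psi: float) -> float:
    """Eq. (3): eta = 2 * (1 - exp(-psi^2))^2 / psi^2."""
    return 2.0 * (1.0 - math.exp(-psi ** 2)) ** 2 / psi ** 2

def design_parameter(D: float, w0: float, wavelength: float, f: float) -> float:
    """Eq. (4): psi = pi * D * w0 / (2 * lambda * f), all lengths in metres."""
    return math.pi * D * w0 / (2.0 * wavelength * f)

# Coarse scan of psi in [0.5, 2.0] to locate the maximum of Eq. (3).
psis = [0.5 + 0.001 * k for k in range(1501)]
best_psi = max(psis, key=coupling_efficiency)

# psi for the simulation parameters of Sec. 2.3 (values taken from the text).
psi_sim = design_parameter(D=10e-3, w0=5e-6, wavelength=1550e-9, f=45.24e-3)

print(f"psi at maximum: {best_psi:.3f}")
print(f"eta at psi = 1.12: {100 * coupling_efficiency(1.12):.2f} %")
print(f"psi for the simulated system: {psi_sim:.3f}")
```

The scan is deliberately coarse; it is only meant to confirm that the quoted optimum and the simulated system parameters are mutually consistent.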

Note that the theoretical calculations mentioned above are performed assuming a plane wave. Although the beam received in an earth-to-satellite uplink may be considered a spherical wave, it is commonly simplified as a plane wave for convenience of analysis [37,38], especially in kilometer-level laser communication within horizontal turbulent links.

Meanwhile, it is important to acknowledge that the ARW algorithm itself, being a model-free optimization algorithm, does not specifically differentiate between a received beam as a spherical wave or a plane wave. This is because both the fast-steering mirror and the AFC can only correct tip/tilt aberrations and do not have the capability to address defocus. Correcting higher-order aberrations requires a comprehensive Adaptive Optics system, which exceeds the scope of fiber coupling. When employing the ARW algorithm for on-board reception, adjustments to system parameters such as focal length may be necessary to accommodate the characteristics of the received spherical wave. However, these modifications do not affect the applicability of the ARW algorithm itself.

A typical application scenario of dynamic fiber coupling based on AFC is shown in Fig. 1. The coupling efficiency is affected by factors such as the atmosphere and platform vibration. A small portion of the coupled power is used as the metric for the controller to guide AFC to move and maximize the coupling efficiency. Most of the coupled power enters different equipment depending on the application scenario, and a power meter is typically used in demonstrations. No matter what kind of optimization algorithm is used, in order to achieve closed-loop control, it is common to require a small portion of the coupling optical power to be fed back to the control system as a performance index. This portion of energy typically accounts for around 2% or 5%. However, in very long-distance wireless optical communication, particularly in satellite-to-ground laser communication, this can pose challenges, as increasing transmission power is often difficult or unattainable. Nonetheless, in the context of a horizontal turbulent link on the ground with a distance of several kilometers, this unavoidable splitting operation is not a critical issue. Instead, the significant decline in coupling efficiency caused by spot drift remains the primary concern.

In the context of single-aperture fiber coupling, the initial static position offset between the initial spot and the fiber tip can be significant, especially after the coarse alignment by the Acquisition, Tracking, and Positioning (ATP) system. In situations where the aiming antenna of the ATP system and coupling antenna are not coaxial, the coupler must independently achieve fine alignment. This necessitates a control algorithm with a large effective range to enable convergence. Similar to an engine starter, correcting the initial large static deviation allows subsequent dynamic coupling abilities to come into play. This requirement becomes even more critical in the case of Phased Fiber Laser Arrays (PFLA). As a tiled aperture coherent combining, the AFC is a fundamental component of the PFLA. After completing the overall pointing using the ATP system, each sub-aperture still requires accurate alignment within a large effective range due to individual initial tip/tilt aberrations. This ensures the formation of a conformal receiving and transceiving system. As the PFLA evolves towards ultra-large apertures, the probability of encountering large initial static deviations in individual sub-apertures increases. The proposed ARW algorithm effectively addresses these challenges, enabling optimal utilization of each sub-aperture.

Overall, control algorithms for fiber coupling should deliver practical value in three crucial aspects. Firstly, they should offer a broader effective range, serving as an engine starter to ensure that the fiber tip can enter the effective convergence range following coarse alignment. Secondly, they should effectively maintain coupling efficiency in disturbed environments. Lastly, they should facilitate rapid recovery from sudden and substantial disturbances.

2.2 Algorithms for fiber coupling

The most commonly used control algorithm for dynamic fiber coupling is SPGD. The SPGD algorithm first generates a set of 2 small-amplitude random voltage perturbations $\Delta {U^{(n )}} = \delta {\Omega ^{(n )}}$ to estimate the gradient of the metric, where $\delta$ stands for the amplitude of perturbations and ${\Omega ^{(n )}}$ is the direction of perturbations. The positive voltages ${U^{(n )}} + \Delta {U^{(n )}}$ and negative voltages ${U^{(n )}} - \Delta {U^{(n )}}$ are applied to the AFC in turn. The captured performance metrics are recorded as $J_ + ^{(n )}$ and $J_ - ^{(n )}$, respectively, and the gradient of the metric can be estimated by:

(5)$$\frac{{\partial J({{U^{(n )}}} )}}{{\partial U_x^{(n )}}} \approx \frac{{J_ + ^{(n )} - J_ - ^{(n )}}}{{2\delta \Omega _x^{(n )}}}$$

(6)$$\frac{{\partial J({{U^{(n )}}} )}}{{\partial U_y^{(n )}}} \approx \frac{{J_ + ^{(n )} - J_ - ^{(n )}}}{{2\delta \Omega _y^{(n )}}}$$

Because ${\Omega ^{(n )}}$ is the direction of perturbations and can be decided by the sign of $\Delta {U^{(n )}}$, the optimization formula is given by:

(7)$${U^{({n + 1} )}} = {U^{(n )}} + \alpha \frac{{J_ + ^{(n )} - J_ - ^{(n )}}}{{2\Delta {U^{(n )}}}} = {U^{(n )}} + \alpha ({J_ +^{(n )} - J_ -^{(n )}} )\frac{{\Delta {U^{(n )}}}}{{2{\delta ^2}}}$$

where $\alpha$ is the gain coefficient.
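The update in Eqs. (5)–(7) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' controller: the Gaussian `toy_metric` is a hypothetical stand-in for the photodetector reading, and the values of `alpha` and `delta` are arbitrary choices for this toy problem.

```python
import math
import random

def spgd_step(u, metric, alpha, delta, omega=None):
    """One SPGD iteration following Eqs. (5)-(7).

    u      : current control voltages (ux, uy)
    metric : callable returning the performance metric J for given voltages
    omega  : perturbation signs (+1/-1 per axis); drawn at random if omitted
    """
    if omega is None:
        omega = tuple(random.choice((-1.0, 1.0)) for _ in u)
    du = tuple(delta * w for w in omega)                  # Delta U = delta * Omega
    j_plus = metric(tuple(a + b for a, b in zip(u, du)))  # metric at U + Delta U
    j_minus = metric(tuple(a - b for a, b in zip(u, du))) # metric at U - Delta U
    scale = alpha * (j_plus - j_minus) / (2.0 * delta ** 2)
    return tuple(a + scale * b for a, b in zip(u, du))    # Eq. (7)

def toy_metric(u):
    """Hypothetical metric (assumption): a Gaussian peak centred at the origin."""
    return math.exp(-(u[0] ** 2 + u[1] ** 2))

random.seed(0)
u = (1.0, 0.0)
for n in range(200):
    u = spgd_step(u, toy_metric, alpha=0.2, delta=0.1)
print("final voltages:", u, "metric:", toy_metric(u))
```

Because both axes share the same metric difference $J_ + ^{(n )} - J_ - ^{(n )}$, each step moves along the perturbation direction only; the random signs are what let the iterate explore both axes over many steps.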

With the success of deep learning, optimization algorithms in deep learning have also been noticed in fiber coupling. However, deep learning and fiber coupling are two different application scenarios. When applying optimization algorithms to fiber coupling, it is necessary to consider whether the advantages of the original algorithm can be gained and whether the disadvantages of the original algorithm can be accepted. The training tasks of neural networks differ significantly from optical tasks such as fiber coupling, coherent combining, and wavefront-sensorless adaptive optics. Neural network training involves high-dimensional, multi-extremum, and static problems with accurately calculable gradients. In contrast, fiber coupling tasks exhibit a low-dimensional environment with dynamic disturbances. The prominence of the extreme value problem is reduced, and gradients are estimated through perturbations, leading to uncertainty. Hence, it is not straightforward to assume that the most advanced optimization algorithm used in deep learning training tasks will be optimal for coherent combining or fiber coupling.

For example, the AdaGrad algorithm can adaptively decay the learning rate by calculating the square root of the sum of squares of the historical gradients and has two advantages in deep learning tasks [39]. The first advantage is the ability to take larger steps in the gentler direction of the parameter space, thereby speeding up the training speed and smoothing the steep direction. However, this advantage is not present when applied in fiber coupling. The premise of speeding up the gentler direction and smoothing the steep direction is that the learning rate decays of each parameter are different, which requires accurate gradient directions. In fiber coupling, the gradient is estimated by loading voltage perturbations of the same amplitudes with random signs in the X and Y directions, resulting in equal gradient amplitudes in both directions and leading to the same learning rate decays.

The other advantage of the AdaGrad algorithm is that it can take a relatively large learning rate at the start of training, allowing for quick learning in the beginning. The AdaGrad algorithm applied in the fiber coupling scenario is presented in Table 1 [36]. It can be observed that gradients can be estimated after perturbations, similar to the SPGD algorithm. However, in the AdaGrad algorithm, the learning rate is inversely proportional to the square root of the sum of squares of gradient modulus. This means that when the initial point is situated in an area where the gradient of the objective function is very small, the learning rate can be exceptionally high at the start of iteration, enabling the iterative point to swiftly move out of this region. As the iterative points approach the region near the optimal point where the gradient is large, the adaptive update law rapidly reduces the algorithm’s learning rate to ensure convergence.

Table 1. The procedure of AdaGrad algorithm for fiber coupling

The disadvantage of the AdaGrad algorithm is that the learning rate decays monotonically to 0 very quickly, which is unacceptable in dynamic fiber coupling. In deep learning, the object to be optimized or the function to be fitted usually does not change over time. However, in the application of dynamic fiber coupling, the light spot to be coupled changes all the time. Figure 2 below shows the convergence simulation of the AdaGrad closed loop process dealing with multiple impulse disturbances. It can be seen that the AdaGrad algorithm can only work effectively at the beginning, but as the number of iterations increases, the convergence speed becomes slower and slower until it does not converge at all.

The most popular algorithm in deep learning is the Adam algorithm, which has also been tried in fiber coupling [13]. The Adam algorithm combines the RMSprop and Momentum algorithms. RMSprop is an improvement of AdaGrad [40] that solves the problem of the monotonically decaying learning rate by applying an exponential moving average to the squared gradients in the learning rate decay. The Momentum algorithm uses the exponential moving average of the historical gradient, which can accelerate convergence in the “canyon terrain” and effectively deal with the local minimum/maximum problem [41]. However, fiber coupling is different from deep learning. In most cases, deep learning can be considered a function-fitting task in a high-dimensional space, which means it is usually a static multi-extremum optimization problem. In contrast, the objective function to be optimized in the fiber coupling task is always changing, and the multi-extremum problem is mild. Although there are diffraction rings in the Airy spot, they are very weak relative to the main lobe. Especially in a real, noisy environment, the extremum problem is not significant, so there is no need to overcome “saddle points” as in neural network training. Moreover, the gradient information in fiber coupling is estimated through perturbations, making the direction information less accurate and limiting the effectiveness of Momentum’s acceleration capability. Additionally, the dynamic disturbances in real fiber coupling scenarios, which are random in nature, can cause the historical gradient information used by Momentum to push the spot out of the effective range. Therefore, the Momentum algorithm and the Momentum component in the Adam algorithm are not suitable for optical fiber coupling. The actual requirements for fiber coupling are centered around expanding the algorithm’s effective range and swiftly returning to the center of the light spot despite disturbances. The aspect of the Adam algorithm that proves effective for fiber coupling is the RMSprop algorithm.
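The difference between the two accumulators discussed above can be made concrete with a small Python sketch. It tracks only the effective gain $\alpha /\sqrt S$ under a hypothetical constant gradient magnitude; the function names, the constant-gradient input, and the `eps` guard are assumptions made for illustration.

```python
import math

def adagrad_gain(grads, alpha=1.0, eps=1e-8):
    """Effective gain alpha / sqrt(sum of squared gradients) after each step."""
    s, gains = 0.0, []
    for g in grads:
        s += g * g                            # AdaGrad: accumulate forever
        gains.append(alpha / math.sqrt(s + eps))
    return gains

def rmsprop_gain(grads, alpha=1.0, beta=0.5, eps=1e-8):
    """Effective gain with an exponential moving average of squared gradients."""
    s, gains = 0.0, []
    for g in grads:
        s = beta * s + (1.0 - beta) * g * g   # RMSprop: old gradients fade out
        gains.append(alpha / math.sqrt(s + eps))
    return gains

grads = [1.0] * 100   # hypothetical constant gradient magnitude (assumption)
ada = adagrad_gain(grads)
rms = rmsprop_gain(grads)
print(f"AdaGrad gain: step 1 = {ada[0]:.3f}, step 100 = {ada[-1]:.3f}")
print(f"RMSprop gain: step 1 = {rms[0]:.3f}, step 100 = {rms[-1]:.3f}")
```

With the cumulative sum, the gain keeps shrinking as $1/\sqrt n$ even though the gradient never changes, which is why a later disturbance is corrected ever more slowly. With the moving average, the gain settles at a level set by the recent gradients.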

Building upon the analysis of optical fiber coupling scenes and the limitations of existing algorithms, we propose a novel algorithm that is based on the RMSprop algorithm. By adaptively adjusting both the gain rate and the perturbation rate with warm-up operations, the new algorithm can better adapt to the fiber coupling application.

As demonstrated in Table 2, gradients in the ARW algorithm are also estimated through perturbations. However, the rate of perturbation decreases as the number of iteration steps increases. Additionally, the learning rate is inversely proportional to the square root of the sum of squares of gradient modulus S. After performing an exponential moving average, the parameter S is obtained without bias correction but through a shrinking operation. Consequently, the ARW algorithm incorporates two significant enhancements. First, previous optimization algorithms have focused on the learning rate, rather than the perturbation rate. This is because there is no such parameter in deep learning, and the direction and amplitude of the gradient are directly calculated. However, in the application of fiber coupling, the gradient of the objective function cannot be accurately calculated but is estimated by two or more perturbations. Therefore, the parameter of perturbation rate, which decides the amplitude of perturbations, is very important. At the beginning of the algorithm, there is usually a large initial static deviation between the fiber tip and the light spot. If the perturbation rate is too small, the gradient estimation is not accurate, or the perturbation range is too small to collect enough light intensity. As a result, the algorithm cannot effectively converge, so a large perturbation rate is usually required. However, if the perturbation rate is too large, the converged algorithm will fluctuate in a large range, which also has a negative impact on the coupling efficiency. Therefore, we improve the perturbation rate, as shown in step 4 and also in Eq. (8), which is much larger at first and can be gradually reduced with the increase in the number of iterations. In this way, the algorithm will have a large effective range without increasing the fluctuation range after convergence. We call this improved perturbation rate the warm-up of perturbations.

(8)$$\delta = \frac{{\tau + t}}{t}{\delta _0}$$

where t is the current iteration number, ${\delta _0}$ is the initial perturbation rate, $\delta$ is the actual perturbation rate and $\tau$ is the perturbation warm-up rate.
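To show how quickly the warm-up in Eq. (8) relaxes, the sketch below tabulates $\delta$ for a few iteration numbers. The defaults ${\delta _0} = 0.2$ and $\tau = 4$ are the simulation values reported in Sec. 2.3; the function name and the convention that t starts at 1 are assumptions.

```python
def perturbation_rate(t: int, delta0: float = 0.2, tau: float = 4.0) -> float:
    """Eq. (8): delta = (tau + t) / t * delta0, for iteration t = 1, 2, ..."""
    if t < 1:
        raise ValueError("iteration index t is assumed to start at 1")
    return (tau + t) / t * delta0

for t in (1, 2, 5, 10, 100):
    print(f"t = {t:3d}  delta = {perturbation_rate(t):.3f}")
```

At the first iteration the perturbation is $(1 + \tau )$ times the nominal value, and it approaches ${\delta _0}$ as t grows, so the large perturbation is only paid for during acquisition.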

For the same reason, we also change the original learning rate update method in the RMSprop algorithm. The original exponential moving average method usually requires a bias correction as shown in Eq. (9). Different from the bias correction, we enlarge the learning rate in the first few iterations by shrinking the weighted sum of squares of gradients, as shown in step 8 and also in Eq. (10). We call this improved gain rate the warm-up of gain rate.

(9)$${S_t} = \frac{{{S_{t - 1}}}}{{1 - {\beta ^t}}}$$

(10)$${S_t} = \frac{{{S_{t - 1}}}}{{1 + \exp ({{{ - t} / \gamma }} )}}$$

where S is the weighted sum of squares of gradients, $\beta$ is the exponential moving average rate and $\gamma$ is the gain warm-up rate.
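The opposite effects of Eq. (9) and Eq. (10) are easy to see numerically. The sketch below computes the multiplier each equation applies to S and the resulting change in gain, assuming the gain scales as $1/\sqrt S$ as in RMSprop. The function names are illustrative, and $\beta = 0.5$, $\gamma = 2$ are the values reported in Sec. 2.3.

```python
import math

def bias_correction_factor(t: int, beta: float = 0.5) -> float:
    """Multiplier applied to S by Eq. (9): 1 / (1 - beta**t), always >= 1."""
    return 1.0 / (1.0 - beta ** t)

def gain_warmup_factor(t: int, gamma: float = 2.0) -> float:
    """Multiplier applied to S by Eq. (10): 1 / (1 + exp(-t / gamma)), always < 1."""
    return 1.0 / (1.0 + math.exp(-t / gamma))

print(" t   Eq.(9) on S   Eq.(10) on S   gain x (Eq.9)   gain x (Eq.10)")
for t in (1, 2, 4, 8, 16):
    b = bias_correction_factor(t)
    w = gain_warmup_factor(t)
    # The gain is assumed to scale as 1/sqrt(S), so it changes by 1/sqrt(factor).
    print(f"{t:2d}   {b:10.4f}   {w:11.4f}   {1 / math.sqrt(b):12.4f}   {1 / math.sqrt(w):13.4f}")
```

Bias correction enlarges S in the first iterations and therefore lowers the gain, whereas the shrinking operation reduces S and raises the gain. Both multipliers tend to 1 as t grows, so the two schemes differ only during start-up.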

2.3 Simulation analysis

In order to simulate the actual coupling system more realistically, we carefully selected simulation parameters. Specifically, the simulation assumed a laser beam with a wavelength of 1550 nm, a coupling lens with a focal length of 45.24 mm and an optical aperture in the form of a circular hole with a diameter of 10 mm. The incident light wave on the coupling lens is described in Eq. (1), which forms an Airy spot in the focal plane through the angular spectrum diffraction method. The fiber tip is located in the center of the focal plane, which has a mode field radius of 5µm. The optical field distribution inside the fiber is described in Eq. (2), allowing the coupling efficiency calculated using Eq. (3) to reach the theoretical maximum of 81.45%, which can be verified using Eq. (4). To simulate the effects of slight atmospheric turbulence and platform vibrations on fiber coupling, random tip/tilt aberrations were added to the incident light field, resulting in an optical axis offset with a $\theta$ angle. This results in a spot drift with a distance of d at the focal plane of the coupling lens, as illustrated in Fig. 3. Specifically, the fiber tip is located at the center of the focal plane, and the light spot drift distance at the focal plane is 10µm, representing an incident optical axis angle of approximately 221.04µrad.

(11)$$\theta = \arctan \left( {\frac{d}{f}} \right)$$

If the algorithm can correct a spot drift caused by an angle offset of at most $\theta$, that is, the algorithm is effective within a range of $2\theta$, we define $2\theta$ as the effective range of the algorithm, or the receiving field of view of the fiber coupling system. To compare the effective ranges of the SPGD and ARW algorithms, a simulation test was carried out. As shown in Fig. 3, the fiber tip is initially located at the center of the field of view. Because of the angle $\theta$ between the incident light and the optical axis of the coupling lens, the light spot has a deviation d in the Y direction. A sinusoidal disturbance is introduced in the X direction, so that the test verifies the algorithms’ ability to suppress dynamic disturbances and measures their maximum effective range at the same time. As shown in Fig. 4, when the spot deviation increases from 0µm and 2µm up to 8µm, both algorithms can quickly converge and suppress the sinusoidal disturbance. When the deviation increases to 10µm, the ARW algorithm also converges quickly, while the SPGD algorithm converges only gradually, after a large number of iterations. When the deviation increases to 12µm, the ARW algorithm can converge, but the SPGD algorithm no longer can. The comparison shows that the effective range of the SPGD algorithm is 442.08µrad, corresponding to 10µm, and that its convergence is very slow at that deviation. The effective range of the ARW algorithm is 530.50µrad, corresponding to 12µm, which is 20% larger than that of SPGD, and its convergence speed does not slow down at that maximum effective range.
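The angles quoted above follow directly from Eq. (11) and the 45.24 mm focal length. A short Python check, with an illustrative function name and unit convention:

```python
import math

def tilt_angle_urad(d_um: float, f_mm: float = 45.24) -> float:
    """Eq. (11): theta = arctan(d / f), returned in microradians."""
    return math.atan((d_um * 1e-6) / (f_mm * 1e-3)) * 1e6

for d in (10.0, 12.0):
    theta = tilt_angle_urad(d)
    print(f"d = {d:4.1f} um  theta = {theta:7.2f} urad  effective range 2*theta = {2 * theta:7.2f} urad")

# Relative gain in effective range when going from 10 um to 12 um.
ratio = tilt_angle_urad(12.0) / tilt_angle_urad(10.0)
print(f"range ratio: {ratio:.3f}")
```

For such small angles $\arctan (d/f) \approx d/f$, so the effective range scales linearly with the tolerated spot deviation, which is where the 20% figure comes from.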

To compare the convergence speed of the two algorithms within the effective range in more detail, we conducted simulations of the SPGD algorithm and the ARW algorithm 50 times each, with static deviations of 8µm and 10µm, as shown in Fig. 5. The thick blue line represents the average of the 50 convergence processes. We considered the coupling efficiency reaching 95% of the ideal coupling efficiency as the criterion for convergence and calculated the average coupling efficiency over 20 steps after convergence as the steady-state coupling efficiency. The results show that when the static deviation is 8µm, the SPGD algorithm requires an average of 8.30 steps to converge, with a steady-state coupling efficiency of 80.64%. The ARW algorithm requires an average of 7.18 steps to converge, with a steady-state coupling efficiency of 78.41%. When the static deviation is 10µm, the SPGD algorithm requires an average of 22.68 steps to converge, with a steady-state coupling efficiency of 80.42%. The ARW algorithm requires an average of 7.98 steps to converge, with a steady-state coupling efficiency of 78.13%.
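The convergence criterion described above can be expressed as a small helper. This is a sketch under stated assumptions: the trace is taken to hold one coupling efficiency value per iteration, "20 steps after convergence" is interpreted as the 20 samples following the first threshold crossing, and the function name is hypothetical.

```python
def convergence_stats(trace, ideal=0.8145, threshold=0.95, window=20):
    """Return (steps_to_converge, steady_state_efficiency) for one run.

    trace : coupling efficiency per iteration; trace[0] is the value after step 1
    ideal : ideal coupling efficiency (81.45 % from Eq. (3))
    """
    target = threshold * ideal
    for i, eta in enumerate(trace):
        if eta >= target:
            # Average over the `window` samples after the first crossing
            # (assumed interpretation of the steady-state definition).
            tail = trace[i + 1 : i + 1 + window]
            steady = sum(tail) / len(tail) if tail else None
            return i + 1, steady
    return None, None   # never reached the threshold

# Hypothetical trace, for illustration only (not data from the paper).
example = [0.10, 0.50, 0.78] + [0.80] * 20
steps, steady = convergence_stats(example)
print(steps, steady)
```

Averaging this pair over repeated runs would give figures of the same kind as the average step counts and steady-state efficiencies reported in the text.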

Optimal parameters are chosen for both algorithms in the above simulations. The perturbation rate $\delta$ for SPGD and the initial perturbation rate ${\delta _0}$ for ARW are both 0.2, which lets both algorithms achieve the maximum effective range while keeping the coupling efficiency fluctuation after convergence acceptable. The initial gain rate for SPGD is ${\alpha _0} = 4$, which gives the coupling device the maximum receiving field of view and lets it move fast without causing the algorithm to diverge in the simulation. The initial gain rate for ARW is ${\alpha _0} = 0.3$, the exponential moving average rate is $\beta = 0.5$, the gain warm-up rate is $\gamma = 2$, and the perturbation warm-up rate is $\tau = 4$. The simulation results show that ARW converges faster than SPGD, especially when there is a large static deviation. When the static deviation is 10µm, the number of iteration steps for the ARW algorithm to reach convergence is only 35.19% of that of the SPGD algorithm (7.98 versus 22.68 steps). However, once the two algorithms have converged, the steady-state coupling efficiency of the SPGD algorithm is higher than that of the ARW algorithm. This is because the ARW algorithm has stronger exploration characteristics. From steps 7, 8, and 9 of the ARW algorithm flow in Table 2, it can be seen that when the algorithm converges, the newly obtained gradient approaches 0, and the gain rate of ARW then increases, producing a wider range of exploration near the convergence point. Although this reduces the steady-state coupling efficiency of the ARW algorithm after convergence, it lets the algorithm react faster when encountering large disturbances. Moreover, because the light spot is always in motion in a real fiber coupling environment, the steady-state coupling efficiency gap between ARW and SPGD will not be as large as in this simulation.

Table 2. The procedure of ARW algorithm for fiber coupling

3. Experimental results

To further investigate the performance of the ARW algorithm for fiber coupling and verify it in a real-world system, we compared the SPGD method with our ARW method on a fiber coupling platform. The platform consisted of a laser, an SMF, a collimating lens, an FSM for generating disturbances, a fiber coupling device, and an optical power meter. The fiber coupling device included a coupling lens, an AFC with an SMF in it, a PD, an HVA, and a controller, as shown in Fig. 1. The experimental setup is depicted in Fig. 6. It should be noted that although the AFC was used to construct the coupling platform in this experiment, neither SPGD nor ARW requires a specific coupling device, and both algorithms are applicable to coupling systems with an FSM or other dynamic coupling devices.

Fig. 1. A typical AFC-based dynamic fiber coupling system. HVA: High Voltage Amplifier. SMF: Single Mode Fiber. AFC: Adaptive Fiber Coupler. PD: Photodetector.

Fig. 2. The AdaGrad algorithm encounters impulse disturbances in the closed loop.

Fig. 3. The fiber coupling simulation with an initial deviation.

Fig. 4. A simulation comparison of SPGD and ARW algorithms on fiber coupling dealing with sinusoidal disturbances under different initial deviations. (a) The SPGD algorithm. (b) The ARW algorithm.

Fig. 5. Simulations of the SPGD and ARW algorithms for fiber coupling, repeated 50 times each, at initial deviations of 8µm and 10µm. (a) The SPGD algorithm with an initial deviation of 8µm. (b) The SPGD algorithm with an initial deviation of 10µm. (c) The ARW algorithm with an initial deviation of 8µm. (d) The ARW algorithm with an initial deviation of 10µm.

Fig. 6. Experimental setup. SMF: Single Mode Fiber. FSM: Fast Steering Mirror.

Fig. 7. An experimental comparison of SPGD and ARW algorithms on convergence speed dealing with impulse disturbances in the closed loop. (a) and (c) are the experimental data read from the photodetector with SPGD and ARW algorithms, respectively. (b) and (d) are images that align each convergence process in (a) and (c), respectively, so as to facilitate the comparison of convergence speed.

The light coupled into the single-mode fiber is divided into two parts, of which 5% of the energy enters the PD. During optimization, the voltage output of the PD is used as the performance metric. The other 95% of the energy enters the optical power meter for real-time display of the optical power coupled into the fiber. First, the optimal parameters of the two algorithms are obtained through tuning. The perturbation rate $\delta$ for SPGD and the initial perturbation rate ${\delta _0}$ for ARW are both 0.2, which lets both algorithms achieve the maximum effective range while keeping the coupling efficiency fluctuation after convergence acceptable. The initial gain rate for SPGD is ${\alpha _0} = 7$. The initial gain rate for ARW is ${\alpha _0} = 0.4$, the exponential moving average rate is $\beta = 0.5$, the gain warm-up rate is $\gamma = 2$, and the perturbation warm-up rate is $\tau = 4$.

In the experiment comparing the convergence speed, deflection voltages with an amplitude of 300 mV were manually loaded onto the FSM several times in the closed loop to simulate sudden disturbances, which are often the main cause of bit errors in wireless optical communications. The algorithm must converge back to the spot center as quickly as possible to minimize the bit error rate. As shown in Fig. 7, the ARW algorithm converges faster than the SPGD algorithm. The convergence criterion remains 95% of the maximum performance metric. SPGD requires an average of 45 steps to converge, while ARW needs only 16.38 steps, which is 36.4% of the steps required by SPGD and is consistent with the conclusion of the simulation analysis.
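The 95% convergence criterion can be illustrated with a minimal sketch. It assumes that "converged" means the first sample at or above 95% of the maximum of a recorded metric trace; the function name `steps_to_converge` and the sample trace are hypothetical and are not taken from the experiment.

```python
def steps_to_converge(trace, fraction=0.95):
    """Return the index of the first sample at or above fraction * max(trace).

    `trace` is a sequence of positive metric samples (for example PD voltages),
    one per iteration. Returns None for an empty trace.
    """
    if len(trace) == 0:
        return None
    threshold = fraction * max(trace)
    for step, value in enumerate(trace):
        if value >= threshold:
            return step
    return None  # only reached if the maximum is negative


# Hypothetical trace for illustration, not measured data.
example_trace = [0.10, 0.50, 0.90, 0.96, 1.00, 0.99]
print(steps_to_converge(example_trace))  # prints 3
```

Averaging this count over the repeated impulse disturbances would give a per-algorithm figure comparable to the step counts quoted above.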

In the experiment on the effective range of the algorithms, a sinusoidal disturbance with an amplitude of 300 mV and a frequency of 1 Hz is applied to the FSM to simulate dynamic disturbances such as turbulence or platform vibrations. Then a pair of large bias voltages (-5 V, -5 V), the maximum control voltages that the coupling system can output, is loaded onto the AFC to simulate the large initial deviation that may occur in practical applications. The performances of the SPGD and ARW algorithms are shown in Fig. 8 and Fig. 9, respectively. Figure 8(a) and Fig. 9(a) show the optical power meter data across the open-loop stage, the large initial deviation stage, the closed-loop stage, and the reopened-loop stage. Figure 8(b) and Fig. 9(b) show the voltages recorded from the PD from the large initial deviation stage to the closed-loop stage. Figure 8(c) and Fig. 9(c) show the control voltages of the AFC during the processes of Fig. 8(b) and Fig. 9(b), respectively. Figure 8 shows that, when facing bias voltages as large as (-5 V, -5 V), the control voltages calculated by the SPGD algorithm barely change, leaving the fiber tip unable to locate the spot.

Fig. 8. The closed-loop experiment of the SPGD algorithm under a large initial deviation. (a) Data from the power meter. (b) Data from the photodetector. (c) Control voltages.

Fig. 9. The closed-loop experiment of the ARW algorithm under a large initial deviation. (a) Data from the power meter. (b) Data from the photodetector. (c) Control voltages.

In comparison, the control voltages of the ARW algorithm in Fig. 9 explore a large range soon after the loop is closed, so that the fiber tip can quickly locate the spot and move to its center. The algorithm is also able to correct the sinusoidal disturbance introduced by the FSM after convergence. The control voltages show that the algorithm tracks the sinusoidal disturbance well, and the voltage values from the PD remain stable after the loop is closed. When the loop is reopened, the optical power meter reading is significantly higher than before because the large initial deviation has been corrected, although large fluctuations may still appear. These results demonstrate the ARW algorithm's broad effective range and its ability to counteract dynamic disturbances.

4. Conclusion

In this paper, the ARW algorithm is proposed and employed in fiber coupling systems to replace the conventional SPGD algorithm. First, the principles of fiber coupling and the SPGD algorithm are introduced, with a detailed analysis of the special features of fiber coupling. Second, the ARW algorithm is proposed and implemented in numerical simulation. The numerical results show that the ARW algorithm performs satisfactorily for a fiber coupling system with a large initial deviation. Additionally, the ARW algorithm is compared with the traditional SPGD algorithm in terms of convergence speed. The numerical results indicate that the ARW algorithm effectively alleviates the difficulty of converging back to the spot center after sudden disturbances. Finally, an experimental investigation on a fiber coupling platform is presented. The experimental results show that the proposed algorithm suppresses the fluctuations of the metric function introduced by sinusoidal disturbances while also improving the convergence speed. The large effective range of the proposed method has also been verified.

Funding

National Natural Science Foundation of China (62005286, 62175241, U2141255); Science Fund for Distinguished Young Scholars of Sichuan Province (2022JDJQ0042).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. W. Q. Cai, I. N’Doye, B. S. Ooi, M. S. Alouini, and T. M. Laleg-Kirati, “Modeling and experimental study of the vibration effects in urban free-space optical communication systems,” IEEE Photonics J. 11(6), 1–13 (2019).

2. Y. S. Yuan, J. Q. Zhang, J. J. Dang, W. J. Zheng, G. C. Zheng, P. Fu, J. Qu, B. J. Hoenders, Y. F. Zhao, and Y. J. Cai, “Enhanced fiber-coupling efficiency via high-order partially coherent flat-topped beams for free-space optical communications,” Opt. Express 30(4), 5634–5643 (2022).

3. F. P. Guiomar, M. A. Fernandes, J. L. Nascimento, V. Rodrigues, and P. P. Monteiro, “Coherent free-space optical communications: opportunities and challenges,” J. Lightwave Technol. 40(10), 3173–3186 (2022).

4. M. J. Steinbock, M. W. Hyde, and J. D. Schmidt, “LSPV+7, a branch-point-tolerant reconstructor for strong turbulence adaptive optics,” Appl. Opt. 53(18), 3821–3831 (2014).

5. M. F. Spencer and T. J. Brennan, “Deep-turbulence phase compensation using tiled arrays,” Opt. Express 30(19), 33739–33755 (2022).

6. S. S. Zhong, C. Xu, L. Duan, F. Zhang, and J. A. Duan, “Optimizing the efficiency of a laser diode and single-mode fiber coupling using multi-aspherical lenses,” Opt. Fiber Technol. 68, 102781 (2022).

7. A. Bekkali, H. Fujita, and M. Hattori, “New generation free-space optical communication systems with advanced optical beam stabilizer,” J. Lightwave Technol. 40(5), 1509–1518 (2022).

8. Y. D. Liu, Y. C. Tsai, Y. K. Lu, L. J. Wang, M. C. Hsieh, S. M. Yeh, and W. H. Cheng, “New scheme of double-variable-curvature microlens for efficient coupling high-power lasers to single-mode fibers,” J. Lightwave Technol. 29(6), 898–904 (2011).

9. H. Takenaka, M. Toyoshima, and Y. Takayama, “Experimental verification of fiber-coupling efficiency for satellite-to-ground atmospheric laser downlinks,” Opt. Express 20(14), 15301–15308 (2012).

10. B. Li, Y. T. Liu, S. F. Tong, L. Zhang, and H. F. Yao, “Adaptive single-mode fiber coupling method based on coarse-fine laser nutation,” IEEE Photonics J. 10(6), 1–12 (2018).

11. Q. Q. Kong, Y. Q. Jing, H. Shen, S. W. Deng, Z. G. Han, and R. H. Zhu, “Alignment and efficiency-monitoring method of high-power fiber-to-fiber coupling,” Chin. Opt. Lett. 18(2), 021402 (2020).

12. L. Zhang, X. A. Yu, B. Q. Zhao, T. Wang, and S. F. Tong, “Method for 10 Gbps near-ground quasi-static free-space laser transmission by nutation mutual coupling,” Opt. Express 30(19), 33465–33478 (2022).

13. Q. T. Hu, L. L. Zhen, M. Yao, S. W. Zhu, X. Zhou, and G. Z. Zhou, “Adaptive stochastic parallel gradient descent approach for efficient fiber coupling,” Opt. Express 28(9), 13141–13154 (2020).

14. M. A. Vorontsov, T. Weyrauch, L. A. Beresnev, G. W. Carhart, L. Liu, and K. Aschenbach, “Adaptive array of phase-locked fiber collimators: analysis and experimental demonstration,” IEEE J. Sel. Top. Quantum Electron. 15(2), 269–280 (2009).

15. W. Luo, C. Geng, Y.-Y. Wu, Y. Tan, Q. Luo, H.-M. Liu, and X.-Y. Li, “Experimental demonstration of single-mode fiber coupling using adaptive fiber coupler,” Chin. Phys. B 23(1), 014207 (2014).

16. G. Huang, C. Geng, F. Li, Y. Yang, and X. Y. Li, “Adaptive SMF coupling based on precise-delayed SPGD algorithm and its application in free space optical communication,” IEEE Photonics J. 10(3), 1–12 (2018).

17. D. Zhi, Y. Ma, R. Tao, P. Zhou, X. Wang, Z. Chen, and L. Si, “Highly efficient coherent conformal projection system based on adaptive fiber optics collimator array,” Sci. Rep. 9(1), 2783 (2019).

18. Y. Ma, G. Luo, S. He, Z. Chen, R. Su, P. Ma, P. Zhou, and L. Si, “Cantilevered adaptive fiber-optics collimator based on piezoelectric bimorph actuators,” Appl. Opt. 61(11), 3195–3200 (2022).

19. A. J. Wright, D. Burns, B. A. Patterson, S. P. Poland, G. J. Valentine, and J. M. Girkin, “Exploration of the optimisation algorithms used in the implementation of adaptive optics in confocal and multiphoton microscopy,” Microsc. Res. Tech. 67(1), 36–44 (2005).

20. M. A. Vorontsov, G. W. Carhart, and J. C. Ricklin, “Adaptive phase-distortion correction based on parallel gradient-descent optimization,” Opt. Lett. 22(12), 907–909 (1997).

21. Y.-H. Gong, K.-X. Yang, H.-L. Yong, J.-Y. Guan, G.-L. Shentu, C. Liu, F.-Z. Li, Y. Cao, J. Yin, S.-K. Liao, J.-G. Ren, Q. Zhang, C.-Z. Peng, and J.-W. Pan, “Free-space quantum key distribution in urban daylight with the SPGD algorithm control of a deformable mirror,” Opt. Express 26(15), 18897–18905 (2018).

22. K.-X. Yang, M. Abulizi, Y.-H. Li, B.-Y. Zhang, S.-L. Li, W.-Y. Liu, J. Yin, Y. Cao, J.-G. Ren, and C.-Z. Peng, “Single-mode fiber coupling with a M-SPGD algorithm for long-range quantum communications,” Opt. Express 28(24), 36600–36610 (2020).

23. M. Chen, C. Liu, D. Rui, and H. Xian, “Highly sensitive fiber coupling for free-space optical communications based on an adaptive coherent fiber coupler,” Opt. Commun. 430, 223–226 (2019).

24. G. Cauwenberghs, “A fast stochastic error-descent algorithm for supervised learning and optimization,” in NIPS (1992).

25. H. Yang and X. Li, “Comparison of several stochastic parallel optimization algorithms for adaptive optics system without a wavefront sensor,” Opt. Laser Technol. 43(3), 630–635 (2011).

26. G. Xie, Y. Ren, H. Huang, M. P. J. Lavery, N. Ahmed, Y. Yan, C. Bao, L. Li, Z. Zhao, Y. Cao, M. Willner, M. Tur, S. J. Dolinar, R. W. Boyd, J. H. Shapiro, and A. E. Willner, “Phase correction for a distorted orbital angular momentum beam using a Zernike polynomials-based stochastic-parallel-gradient-descent algorithm,” Opt. Lett. 40(7), 1197–1200 (2015).

27. F. Li, C. Geng, G. Huang, Y. Yang, X. Li, and Q. Qiu, “Experimental demonstration of coherent combining with tip/tilt control based on adaptive space-to-fiber laser beam coupling,” IEEE Photonics J. 9(2), 1–12 (2017).

28. L. Xu, J. L. Wang, L. Q. Yang, and H. Zhang, “Design and Performance Analysis of NadamSPGD Algorithm for Sensor-Less Adaptive Optics in Coherent FSOC Systems,” Photonics 9(1), 15–23 (2015).

29. H. Zhang, L. Xu, Y. F. Guo, J. T. Cao, W. Liu, and L. Q. Yang, “Application of AdamSPGD algorithm to sensor-less adaptive optics in coherent free-space optical communication system,” Opt. Express 30(5), 7477–7490 (2022).

30. H. Zhao, J. An, M. J. Yu, D. A. K. Lv, K. D. Kuang, and T. Q. Zhang, “Nesterov-accelerated adaptive momentum estimation-based wavefront distortion correction algorithm,” Appl. Opt. 60(24), 7177–7185 (2021).

31. D. B. Che, Y. Y. Li, Y. H. Wu, J. K. Song, and T. F. Wang, “Theory of AdmSPGD algorithm in fiber laser coherent synthesis,” Opt. Commun. 492, 126953 (2021).

32. J. K. Song, Y. Y. Li, D. B. Che, J. Guo, and T. F. Wang, “Coherent beam combining based on the SPGD algorithm with a momentum term,” Optik 202, 163650 (2020).

33. J. K. Song, Y. Y. Li, D. B. Che, and T. F. Wang, “Numerical and experimental study on coherent beam combining using an improved stochastic parallel gradient descent algorithm,” Laser Phys. 30(8), 085102 (2020).

34. G. Q. Yang, L. S. Liu, Z. H. Jiang, J. Guo, and T. F. Wang, “Incoherent beam combining based on the momentum SPGD algorithm,” Opt. Laser Technol. 101, 372–378 (2018).

35. J. Y. Chen, J. S. Liu, L. Han, M. R. Ci, D. B. Che, L. H. Guo, and H. J. Yu, “Theory of AdaDelSPGD Algorithm in Fiber Laser-Phased Array Multiplex Communication Systems,” Appl. Sci. 12(6), 3009 (2022).

36. J. J. Peng, B. Qi, H. Li, and Y. Mao, “AS-SPGD algorithm to improve convergence performance for fiber coupling in free space optical communication,” Opt. Commun. 519, 128397 (2022).

37. J. Ma, F. Zhao, L. Y. Tan, S. Y. Yu, and Q. Q. Han, “Plane wave coupling into single-mode fiber in the presence of random angular jitter,” Appl. Opt. 48(27), 5184–5189 (2009).

38. L. Jiang, T. Dai, X. Yu, Z. Dai, C. Wang, and S. Tong, “Analysis of scintillation effects along a 7 km urban space laser communication path,” Appl. Opt. 59(27), 8418–8425 (2020).

39. J. Duchi, E. Hazan, and Y. Singer, “Adaptive subgradient methods for online learning and stochastic optimization,” Journal of Machine Learning Research 12(61), 2121–2159 (2011).

40. A. Graves, “Generating sequences with recurrent neural networks,” arXiv, arXiv:1308.0850 (2013).

41. I. Sutskever, J. Martens, G. Dahl, and G. Hinton, “On the importance of initialization and momentum in deep learning,” in International conference on machine learning (PMLR, 2013), pp. 1139–1147.

Optics Express

Algorithm 1 The procedure of AdaGrad algorithm
Require: The initial gain rate $\alpha_0$, the perturbation rate $\delta$, the objective function $J(U_x, U_y)$, and the number of iterations $T$.
Ensure: The optimal parameters $(U_x, U_y)$ for the maximal $J$.
1: $S \leftarrow 0$
2: Randomly initialize the parameters $U_x^{0}$ and $U_y^{0}$.
3: for $t = 1$ to $T$ do
4:     Randomly generate the values of $(\Delta U_x, \Delta U_y)$ based on the value of $\delta$.
5:     $\Delta J \leftarrow J(U_x^{t-1} + \Delta U_x, U_y^{t-1} + \Delta U_y) - J(U_x^{t-1} - \Delta U_x, U_y^{t-1} - \Delta U_y)$
6:     $S \leftarrow S + \Delta J^{2}$
7:     $\alpha \leftarrow \frac{\alpha_0}{\sqrt{S} + 10^{-8}}$
8:     $U_x^{t} \leftarrow U_x^{t-1} + \alpha \Delta J \frac{\Delta U_x}{\delta}$
9:     $U_y^{t} \leftarrow U_y^{t-1} + \alpha \Delta J \frac{\Delta U_y}{\delta}$
10: end for
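The steps of Algorithm 1 can be sketched in Python as follows. This is a minimal illustration rather than the authors' implementation. It assumes Bernoulli perturbations of amplitude $\delta$ on each axis, since the listing only states that the perturbations are generated from $\delta$, and it uses a hypothetical Gaussian-shaped `toy_metric` in place of the measured coupling metric. The names `adagrad_spgd` and `toy_metric` are illustrative.

```python
import math
import random


def adagrad_spgd(J, ux, uy, alpha0, delta, steps, seed=0):
    """Maximize J(ux, uy) following the steps of Algorithm 1."""
    rng = random.Random(seed)
    S = 0.0  # running sum of squared metric differences
    for _ in range(steps):
        # Assumption: Bernoulli perturbations of amplitude delta on each axis.
        dux = delta * rng.choice((-1.0, 1.0))
        duy = delta * rng.choice((-1.0, 1.0))
        # Two-sided metric difference (step 5).
        dJ = J(ux + dux, uy + duy) - J(ux - dux, uy - duy)
        S += dJ * dJ                            # step 6
        alpha = alpha0 / (math.sqrt(S) + 1e-8)  # step 7
        ux += alpha * dJ * dux / delta          # step 8
        uy += alpha * dJ * duy / delta          # step 9
    return ux, uy


def toy_metric(ux, uy):
    # Hypothetical Gaussian-shaped metric peaking at (0, 0); not the paper's model.
    return math.exp(-(ux * ux + uy * uy) / 2.0)


print(adagrad_spgd(toy_metric, 1.0, -1.0, alpha0=0.4, delta=0.2, steps=200))
```

Because $S$ only grows, the effective step $\alpha \Delta J$ is bounded by $\alpha_0$ on every iteration and shrinks as the iterations accumulate.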

Algorithm 2 The procedure of our proposed ARW algorithm
Require: The initial gain rate $\alpha_0$, the exponential moving average rate $\beta$, the gain warm-up rate $\gamma$, the initial perturbation rate $\delta_0$, the perturbation warm-up rate $\tau$, the objective function $J(U_x, U_y)$, and the number of iterations $T$.
Ensure: The optimal parameters $(U_x, U_y)$ for the maximal $J$.
1: $S \leftarrow 0$
2: Randomly initialize the parameters $U_x^{0}$ and $U_y^{0}$.
3: for $t = 1$ to $T$ do
4:     $\delta \leftarrow \frac{\tau + t}{t} \delta_0$
5:     Randomly generate the values of $(\Delta U_x, \Delta U_y)$ based on the value of $\delta$.
6:     $\Delta J \leftarrow J(U_x^{t-1} + \Delta U_x, U_y^{t-1} + \Delta U_y) - J(U_x^{t-1} - \Delta U_x, U_y^{t-1} - \Delta U_y)$
7:     $S \leftarrow \beta S + (1 - \beta) \Delta J^{2}$
8:     $S \leftarrow \frac{S}{1 + \exp(-t / \gamma)}$
9:     $\alpha \leftarrow \frac{\alpha_0}{\sqrt{S} + 10^{-8}}$
10:     $U_x^{t} \leftarrow U_x^{t-1} + \alpha \Delta J \frac{\Delta U_x}{\delta}$
11:     $U_y^{t} \leftarrow U_y^{t-1} + \alpha \Delta J \frac{\Delta U_y}{\delta}$
12: end for
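A minimal Python sketch of Algorithm 2 is given below, under the same caveats: the Bernoulli perturbation model is an assumption, `toy_metric` is a hypothetical stand-in for the measured metric, and the names `perturbation_rate` and `arw` are illustrative rather than taken from the paper. The parameter values in the usage line are the ones reported for the ARW experiment.

```python
import math
import random


def perturbation_rate(t, delta0, tau):
    """Step 4: starts at (tau + 1) * delta0 for t = 1 and decays towards delta0."""
    return (tau + t) / t * delta0


def arw(J, ux, uy, alpha0, beta, gamma, delta0, tau, steps, seed=0):
    """Maximize J(ux, uy) following the steps of Algorithm 2."""
    rng = random.Random(seed)
    S = 0.0  # moving average of squared metric differences
    for t in range(1, steps + 1):
        delta = perturbation_rate(t, delta0, tau)
        # Assumption: Bernoulli perturbations of amplitude delta on each axis.
        dux = delta * rng.choice((-1.0, 1.0))
        duy = delta * rng.choice((-1.0, 1.0))
        dJ = J(ux + dux, uy + duy) - J(ux - dux, uy - duy)
        S = beta * S + (1.0 - beta) * dJ * dJ    # step 7
        S = S / (1.0 + math.exp(-t / gamma))     # step 8, gain warm-up
        alpha = alpha0 / (math.sqrt(S) + 1e-8)   # step 9
        ux += alpha * dJ * dux / delta           # step 10
        uy += alpha * dJ * duy / delta           # step 11
    return ux, uy


def toy_metric(ux, uy):
    # Hypothetical Gaussian-shaped metric peaking at (0, 0); not the paper's model.
    return math.exp(-(ux * ux + uy * uy) / 2.0)


print(arw(toy_metric, 3.0, -3.0, alpha0=0.4, beta=0.5, gamma=2.0,
          delta0=0.2, tau=4.0, steps=200))
```

With $\tau = 4$ and $\delta_0 = 0.2$, the perturbation amplitude starts at 1.0 on the first iteration and decays towards 0.2. The division in step 8 shrinks $S$ most strongly for small $t$, which enlarges the gain during the first iterations.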
