Optica Publishing Group

Orthant-symmetric four-dimensional geometric shaping for fiber-optic channels via a nonlinear interference model

Open Access

Abstract

Maximizing the data throughput of optical fiber communication via signal shaping has usually been regarded as challenging because of nonlinear interference and implementation/optimization complexity. To overcome these challenges, we propose an efficient four-dimensional (4D) geometric shaping (GS) approach to design 4D 512-ary and 1024-ary modulation formats by maximizing the generalized mutual information (GMI) using a 4D nonlinear interference (NLI) model, which makes these modulation formats more nonlinearity-tolerant. In addition, we propose and evaluate a fast, low-complexity orthant-symmetry-based modulation optimization algorithm via neural networks, which improves the optimization speed and GMI performance for both linear and nonlinear fiber transmission systems. The optimized modulation formats with spectral efficiencies of 9 and 10 bit/4D-sym demonstrate a GMI improvement of up to 1.35 dB compared with their quadrature amplitude modulation (QAM) counterparts in the additive white Gaussian noise (AWGN) channel. Numerical simulations of optical transmission over two types of fibers show that the 4D NLI model-learned modulation formats could extend the transmission reach by up to 34% and 12% with respect to the QAM formats and the AWGN-learned 4D modulation formats, respectively. Results for the effective signal-to-noise ratio (SNR) are also presented, which confirm that the extra gains in the optical fiber channel come from the enhanced SNR obtained by reducing the modulation-dependent NLI.

© 2023 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement

1. Introduction

Optical communication systems have evolved dramatically over the past years with the help of advanced techniques. To meet the growth of Internet traffic, efforts to increase spectral efficiency (SE) have focused on high-order modulation formats. Conventional high-order QAM formats with a uniform distribution have a gap of up to 1.53 dB to the Shannon capacity over the additive white Gaussian noise (AWGN) channel [1]. Signal shaping has therefore been investigated widely in the literature to further increase the SE.

Shaping methods fall into two categories: probabilistic shaping (PS) and geometric shaping (GS). For the AWGN channel with an average energy constraint, both shaping methods can approach the Shannon limit as the cardinality of the constellation tends to infinity [2]. PS changes the probabilities of the symbols and has been investigated with different architectures [3–5]. However, PS requires a shaper and a deshaper to match a target distribution, e.g., a distribution matcher and a distribution dematcher. GS, on the other hand, provides shaping gains by modifying the locations of the constellation points [6–10], and multi-dimensional (MD) GS could provide potentially larger shaping gains by exploiting the correlation in MD space [11]. Meanwhile, MD GS increases the complexity of the demapper, since more complex Euclidean distances in MD space need to be calculated. Therefore, designing efficient MD modulation formats with low implementation complexity and higher nonlinearity tolerance is still a challenging problem for the nonlinear optical fiber channel.

To design MD modulation formats efficiently, constraints such as constant modulus [12–14], orthant symmetry [15] and shell shaping [16] have been added to the constellation optimization to improve the nonlinear tolerance and reduce the optimization complexity. However, heuristic ideas such as the constant-modulus constraint cannot maximize the performance in the nonlinear channel without accurate knowledge of the modulation-dependent nonlinear interference (NLI). Using the split-step Fourier method (SSFM) as the channel model captures the modulation-dependent NLI and lets the optimization account for it, which makes the modulation format more nonlinearity-tolerant, but at the cost of a time-consuming optimization [17]. The possibility of predicting the NLI power as a function of the modulation format features, i.e., geometrical shape and statistical properties, was first introduced in [18,19]. A fast and accurate NLI model, such as the Gaussian noise (GN) and enhanced GN (EGN) models [20,21], is therefore crucial for optimizing and analyzing the performance of modulation formats in optical communication systems. However, these models are only suitable for estimating the NLI of polarization-multiplexed (PM) 2D formats, where the data on the two polarizations are assumed to be independent and identically distributed. Recently, a new analytical model that predicts the NLI power of general dual-polarization (DP) 4D formats has been proposed in [22]. With the help of this 4D NLI model, geometrically shaped DP-4D modulation formats have been investigated to reduce the NLI and maximize the mutual information (MI) in the multi-span nonlinear optical channel [23]. However, it is still unknown whether the method is beneficial for a bit-interleaved coded modulation (BICM) system in the optical channel, where GMI and effective SNR are the more appropriate performance metrics.

In contrast to our previous work, which focused only on optimizing the formats for the AWGN channel [13,24], in this paper we use the recently proposed analytical 4D NLI model [22,25] to design geometrically shaped 4D modulation formats for BICM systems by trading off linear shaping gain against nonlinear tolerance in the optical channel. The 4D modulation formats optimized to reduce the impact of cross-polarization statistics on the NLI power are called 4D NLI model-learned formats and are compared with 4D AWGN-learned formats. Both AWGN-learned and 4D NLI model-learned modulation optimizations are considered. We found that the model-learned modulation formats not only provide shaping gains but also mitigate the nonlinear impairments with respect to the AWGN-learned formats. Numerical simulation results show that the model-learned modulation formats with SEs of 9 and 10 bit/4D-sym have lower nonlinear distortion in terms of the effective SNR and extend the transmission reach. This approach is applicable to general DP-4D formats with an odd number of bits, e.g., 9 bit/4D-sym, for which a single 2D format with the same SE does not even exist. In addition, the impact of a finite-resolution digital-to-analog converter (DAC) on the optimized 4D modulation formats is also investigated.

2. System model

The system model of the multi-span optical fiber system and the corresponding schematic diagram of the 4D modulation optimization considered in this paper are shown in Fig. 1. Figure 1(a) shows the generic structure of a bit-interleaved coded modulation (BICM) transmission system over a multi-span fiber-optic channel. At the transmitter, a binary forward error correction (FEC) encoder encodes the information bits into a binary sequence, which is then mapped to the 4D transmitted symbol vector $\boldsymbol{X}$ by a 4D mapper using the designed constellation and labeling pair $\{\mathcal{X}, \mathcal{B}\}$. The 4D symbols $\boldsymbol{X}$ are modulated on two arbitrary orthogonal polarization states x and y and are transmitted over an $N_{\text{sp}}$-span optical fiber link, in which each span consists of $80$ km of optical fiber followed by an erbium-doped fiber amplifier (EDFA). At the receiver, the received signals are processed by digital signal processing (DSP) algorithms including chromatic dispersion compensation (CDC), matched filtering and sampling. The output signals $\boldsymbol{Y}$ are then processed by a 4D demapper, where the soft information is computed and passed to a binary FEC decoder. In this paper, we assume that the laser and the equalization are ideal, so that no additional phase noise is introduced into the signal and no residual phase noise remains after carrier phase estimation. Such effects depend on the laser linewidth, symbol rate and carrier phase estimation algorithm, and are thus beyond the scope of this investigation.

Fig. 1. (a) The system model of the multi-span optical transmission system employing BICM and soft-decision forward error correction (SD-FEC) under consideration. (b) 4D modulation optimization via an autoencoder for the two considered channel models, i.e., the AWGN channel and the 4D NLI model.

As shown in Fig. 1(a), the performance of different modulation formats is compared in terms of generalized mutual information (GMI) and effective SNR. We consider a discrete modulation format $\mathcal{X}$ with cardinality $M = 2^m = |\mathcal{X}|$, where $m$ denotes the number of bits per symbol, and $N=4$ real dimensions corresponding to the two polarizations of the lightwave (each polarization contributes two real dimensions, in-phase $I$ and quadrature $Q$).

Figure 1(b) shows the end-to-end neural network (NN) for optimizing the DP-4D modulation formats. The constellation alphabet $\mathcal{X}$ is generated by an NN that takes the binary labeling $\mathcal{B}$, mapped to one-hot vectors, as input. The optimized alphabet $\mathcal{X}$ is used at the mapper and demapper shown in Fig. 1(a). For the modulation optimization, we consider two scenarios: the AWGN channel and the nonlinear optical fiber channel.

For the AWGN channel, the SNR is determined by the normalized power of the transmitted constellation symbols and the noise variance, and is defined as $\text{SNR} = \frac{E_{\text{s}}}{\sigma^2} = \frac{1}{\sigma^2}$, where the 4D signal power of every constellation is normalized to $E_{\text{s}} = 1$ in this paper, i.e., unit energy per dual-polarization symbol, and $\sigma^2$ is the total noise power.
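The normalization above can be illustrated with a minimal Python sketch. The function names, the toy 16-point alphabet and the equal split of the total noise power $\sigma^2$ over the four real dimensions are assumptions made for illustration; they are not taken from the paper.

```python
import numpy as np

def normalize_unit_energy(points):
    """Scale an (M, N) array so that the average symbol energy E[||x||^2] is 1."""
    points = np.asarray(points, dtype=float)
    avg_energy = np.mean(np.sum(points ** 2, axis=1))
    return points / np.sqrt(avg_energy)

def awgn(symbols, snr_db, rng):
    """Add white Gaussian noise whose total power over all N dimensions is 1/SNR."""
    n_dims = symbols.shape[1]
    sigma2_total = 10.0 ** (-snr_db / 10.0)         # E_s = 1, so sigma^2 = 1 / SNR
    sigma_per_dim = np.sqrt(sigma2_total / n_dims)  # assumed: noise split equally per dimension
    return symbols + sigma_per_dim * rng.standard_normal(symbols.shape)

# Hypothetical toy 4D alphabet: the 16 sign patterns of (+-1, +-1, +-1, +-1).
toy = np.array([[(-1) ** ((k >> d) & 1) for d in range(4)] for k in range(16)])
toy = normalize_unit_energy(toy)
received = awgn(toy, snr_db=12.0, rng=np.random.default_rng(0))
```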

For the nonlinear optical fiber channel, the nonlinear interference is modeled under Gaussian noise assumption and the noise power is dependent on the parameters of the fiber link, e.g., the physical properties of the fiber transmission system, optical signal power and the amplified spontaneous emission (ASE) noise power. The effective SNR ($\text {SNR}_{\text {eff}}$) is used to represent the SNR after DSP at the receiver, given by [20, Sec. VI]

$$\text{SNR}_{\text{eff}} = \frac{P}{ \sigma_{\text{ASE}}^2 + \sigma_{\text{NLI}}^2 },$$
where $P$, $\sigma _{\text {NLI}}^2$ and $\sigma _{\text {ASE}}^2$ denote the transmitted power, nonlinear noise power and ASE noise power, respectively. The nonlinear noise power $\sigma _{\text {NLI}}^{2}$ is estimated with nonlinear coefficient $\eta$ obtained by the NLI model, and thus can help to find the nonlinearity-tolerant 4D modulation format for a targeted optical fiber system with the fiber link parameters as input (see the highlighted blue lines in Fig. 1). Note that the nonlinear interference, especially for multi-level modulation formats, may not be Gaussian additive noise [19].

For the end-to-end learning, GMI is employed as the loss function to optimize the modulation format, which needs the knowledge of the constellation $\mathcal {X}$, bit labeling $\mathcal {B}$ and SNR (or $\text {SNR}_{\text {eff}}$).

3. Achievable information rates and modulation optimization problem

3.1 Performance metrics

The amount of information per symbol that can be reliably transmitted over a given channel is known as the achievable information rate (AIR). The AIR for different coded modulation structures can be quantified by the MI and/or the GMI. In general, MI applies to nonbinary codes or multilevel codes with a multi-stage decoder, while GMI is suitable for the simpler BICM system. As a robust performance metric for optical systems with a BICM scheme [26], the GMI can be used as a cost function for a given channel law $f_{\boldsymbol{Y}|\boldsymbol{X}}$; it can be estimated via Gauss-Hermite quadrature (GHQ) and is given by [27, Eqs. (17), (18)]

$$\text{GMI} = G(\mathcal{X},\mathcal{B},f_{\boldsymbol{Y}|\boldsymbol{X}}) = m + \frac{1}{M}\sum_{k=1}^{m}\sum_{b\in\{0,1\}}\sum_{i\in\mathcal{I}_k^b}\int_{\mathbb{R}^N} f_{\boldsymbol{Y}|\boldsymbol{X}}(\boldsymbol{y}|\boldsymbol{x}_i)\cdot \log_2 \frac{\sum_{j\in\mathcal{I}_k^b} f_{\boldsymbol{Y}|\boldsymbol{X}}(\boldsymbol{y}|\boldsymbol{x}_j)}{\sum_{p=1}^{M} f_{\boldsymbol{Y}|\boldsymbol{X}}(\boldsymbol{y}|\boldsymbol{x}_p)}\,d\boldsymbol{y},$$
where $\mathcal{I}_k^b$ is the set of indices of the constellation points whose label has bit value $b \in \{0,1\}$ at the $k$-th bit position. Note that a more common approach to estimating the GMI for a binary SD-FEC scheme in fiber optics is to use a mismatched receiver, e.g., with a Gaussian distribution as the channel law. Replacing the channel law with one closer to the actual channel is expected to give a more accurate approximation of the GMI. For example, when phase noise is taken into account, the approximated partially coherent AWGN model in [28] can be considered.
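As a rough illustration of how such a GMI can be evaluated numerically, the sketch below estimates the GMI as the sum of the bit-wise mutual informations $I(B_k;\boldsymbol{Y})$ for equiprobable symbols under a Gaussian channel law. It uses plain Monte Carlo sampling instead of the GHQ mentioned above, and it assumes that the total noise power is split equally over the real dimensions; the function name and arguments are hypothetical. Memory grows as `n_samples * M * N`, so the sample count should be kept moderate for 512-ary or 1024-ary formats.

```python
import numpy as np

def gmi_awgn_monte_carlo(points, labels, snr_db, n_samples=5000, seed=0):
    """Monte Carlo estimate of the GMI (bit/symbol) for a labeled constellation.

    points: (M, N) constellation with unit average energy.
    labels: (M, m) array of 0/1 bit labels, one row per point.
    """
    rng = np.random.default_rng(seed)
    M, N = points.shape
    m = labels.shape[1]
    var_dim = 10.0 ** (-snr_db / 10.0) / N       # noise variance per real dimension
    idx = rng.integers(0, M, size=n_samples)     # uniformly drawn transmitted symbols
    y = points[idx] + np.sqrt(var_dim) * rng.standard_normal((n_samples, N))
    # Log-likelihoods log f(y|x_p), up to a constant that cancels in the ratio.
    d2 = np.sum((y[:, None, :] - points[None, :, :]) ** 2, axis=2)
    loglik = -d2 / (2.0 * var_dim)
    lik = np.exp(loglik - loglik.max(axis=1, keepdims=True))  # numerically stabilized
    total = lik.sum(axis=1)
    gmi = 0.0
    for k in range(m):
        # Likelihood mass of the points whose k-th bit equals the transmitted bit.
        same_bit = labels[None, :, k] == labels[idx, k][:, None]
        num = (lik * same_bit).sum(axis=1)
        # For equiprobable bits, I(B_k; Y) = 1 + E[log2(num / total)].
        gmi += 1.0 + np.mean(np.log2(num / total))
    return gmi

# Usage with a Gray-labeled QPSK toy example (not one of the 4D formats of the paper).
qpsk = np.array([[1, 1], [1, -1], [-1, 1], [-1, -1]]) / np.sqrt(2.0)
gray = np.array([[0, 0], [0, 1], [1, 0], [1, 1]])
gmi_estimate = gmi_awgn_monte_carlo(qpsk, gray, snr_db=10.0)
```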

3.2 Modulation optimization problem

As shown in Eq. (2), the GMI-based optimization for maximizing AIR requires a joint optimization of the constellation coordinates $\mathcal {X}$ and its labeling $\mathcal {B}$ for a given channel law and energy constraint $\sigma _x^2$, which is defined as

$$\{\mathcal{X}^*,\mathcal{B}^*\}=\mathop{\textrm{argmax}}\limits_{\mathcal{X},\mathcal{B}:E[||X||^2]\leq \sigma^2_x} \{G(\mathcal{X},\mathcal{B}, f_{\boldsymbol{Y}|\boldsymbol{X}})\},$$
where $\mathcal {X}^*$ and $\mathcal {B}^*$ indicate the optimized constellation and labeling, respectively.

Note that the optimization problem above searches for an optimal solution (i.e., the final constellation $\mathcal{X}^{*}$ and labeling $\mathcal{B}^{*}$) over multiple parameters in MD space. However, for GMI-based modulation optimization, the optimization complexity and computational demand increase with the cardinality and dimension of the modulation formats [10,11,17]. In [10], an approach for geometric shaping using an analytically derived gradient is proposed to reduce the complexity of numerical optimization. To further reduce complexity and relax the requirements on the optical transceiver, adding constraints, e.g., orthant symmetry [15] and shell shaping [16], is also an option for high-order multi-dimensional modulation optimization. In general, shaping the modulation to be Gaussian-like leads to higher nonlinear interference and thus results in a larger SNR penalty in nonlinear optical fiber transmission systems. To mitigate the NLI that stems from the shape of the modulation format, NLI models have been considered a reliable option for estimating the NLI power [29,30] and optimizing the modulation formats [10,11,23,31]. In this work, we focus on designing nonlinearity-tolerant 4D modulation formats with low optimization complexity.

4. Proposed 4D modulation optimization

In this section, we introduce the methodology used for the proposed 4D modulation optimization. The main approach is to reduce the modulation optimization complexity by adding an orthant-symmetry (OS) constraint and to improve the transmission performance by taking into account the nonlinear interference during fiber propagation. We first briefly describe the considered OS constraint and the neural network architecture for modulation optimization. The details of the 4D NLI model-learned optimization procedure are presented later in the section.

4.1 OS-based 4D modulation optimization

To reduce the complexity of the multi-parameter optimization and the transceiver requirements, the OS constraint has been proposed to design MD modulation formats by mirroring the labeled first-orthant constellation to the other orthants [15]. We first briefly review the OS-based modulation optimization and show how enforcing symmetry reduces the optimization search space. This allows high-cardinality modulation formats to be optimized with fast convergence, which is used in this paper for 4D modulation optimization.

The orthant-symmetric constellation alphabet $\mathcal{X}^{\circ} = [\mathcal{X}_1; \mathcal{X}_2; \cdots; \mathcal{X}_{2^N}]$ consists of $2^N$ orthant constellations, where $\mathcal{X}_k$ with $k \in \{1, 2, \ldots, 2^N\}$ is the $k$-th orthant constellation, generated by multiplying the first-orthant constellation $\mathcal{X}_1$ by the mirror matrix $\mathcal{H}_k$. The mirror matrix of the $k$-th orthant is defined as $\mathcal{H}_k = \text{diag}[(-1)^{o_{k,1}}, (-1)^{o_{k,2}}, \ldots, (-1)^{o_{k,N}}]$, where $\text{diag}[\cdot]$ denotes the diagonal matrix with the given elements on the main diagonal and $\boldsymbol{O}_k = [o_{k,1}, o_{k,2}, \ldots, o_{k,N}] \in \{0,1\}^N$ is the binary representation of the decimal integer $k-1$, which runs from $0$ to $2^N-1$. For example, with $N=4$, $\boldsymbol{O}_2 = [0,0,0,1]$ represents the decimal value 1 and corresponds to the mirror matrix $\mathcal{H}_2$ and the 2nd orthant. For a general OS constellation $\mathcal{X}^{\circ}$, the first-orthant constellation $\mathcal{X}_1 = [\boldsymbol{X}_1; \boldsymbol{X}_2; \cdots; \boldsymbol{X}_{2^{m-N}}]$ has $2^{m-N}$ $N$-dimensional constellation points, and each point $\boldsymbol{X}_j = [x_{j,1}, x_{j,2}, \ldots, x_{j,N}]$ with $j \in \{1, 2, \ldots, 2^{m-N}\}$ is a row vector. The first-orthant constellation consists of all the $N$-dimensional symbols whose components are all positive ($\boldsymbol{X}_j \in \mathbb{R}_+^N$).

The bit labeling of an OS constellation is defined as $\mathcal{B} = [\mathcal{B}_1; \mathcal{B}_2; \dots; \mathcal{B}_{2^N}]$, where the submatrix $\mathcal{B}_k = [\boldsymbol{O}_k, \boldsymbol{L}_{m-N}]$ has dimension $2^{m-N} \times m$ and corresponds to the $k$-th orthant. The bits $\boldsymbol{O}_k$ select the orthant and the bits $\boldsymbol{L}_{m-N}$ select the specific constellation point within that orthant. The whole bit labeling $\mathcal{B}$ is therefore generated by changing only the orthant bits $\boldsymbol{O}_k$ while reusing a good labeling $\boldsymbol{L}_{m-N}$ for one orthant. According to the definition above, the orthant-symmetric labeled constellation $\{\mathcal{X}^{\circ},\mathcal{B}\}$ can be constructed via Algorithm 1.
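The construction described by these definitions can be sketched in a few lines of Python. This follows the definitions of $\mathcal{H}_k$, $\boldsymbol{O}_k$ and $\mathcal{B}_k$ given in the text and is not necessarily a step-by-step reproduction of Algorithm 1; the function name and the tiny example input are hypothetical. The loop index `k` is zero-based here, so `k = 1` corresponds to the 2nd orthant of the text.

```python
import numpy as np

def orthant_symmetric(first_orthant, first_labels):
    """Mirror a labeled first-orthant constellation into all 2^N orthants.

    first_orthant: (2^(m-N), N) array with positive entries (X_1).
    first_labels:  (2^(m-N), m-N) array of 0/1 bits (L_{m-N}).
    Returns (points, labels) with shapes (2^m, N) and (2^m, m).
    """
    first_orthant = np.asarray(first_orthant, dtype=float)
    first_labels = np.asarray(first_labels, dtype=int)
    n_pts, n_dims = first_orthant.shape
    points, labels = [], []
    for k in range(2 ** n_dims):
        # Orthant bits: N-bit binary representation of k, most significant bit first.
        o_k = np.array([(k >> (n_dims - 1 - d)) & 1 for d in range(n_dims)])
        h_k = np.diag((-1.0) ** o_k)                 # mirror matrix H_k
        points.append(first_orthant @ h_k)           # k-th orthant: X_1 H_k
        orthant_bits = np.tile(o_k, (n_pts, 1))
        labels.append(np.hstack([orthant_bits, first_labels]))  # B_k = [O_k, L_{m-N}]
    return np.vstack(points), np.vstack(labels)

# Hypothetical example with m = 5, N = 4: two points in the first orthant.
x1 = [[1.0, 1.0, 1.0, 1.0], [3.0, 1.0, 1.0, 1.0]]
l1 = [[0], [1]]
x_os, b_os = orthant_symmetric(x1, l1)
```

Only `x1` (and its labeling) would be free parameters in an optimization; the remaining orthants follow deterministically from the mirroring.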

With the OS constraint under consideration, the conventional optimization problem in Eq. (3) can be simplified as

$$ \{\mathcal{X}^*_1,\mathcal{B}^* _{1}\}= \mathop{\textrm{argmax}}\limits_{\mathcal{X}_1,\mathcal{B}_1:E[||X||^2]\leq \sigma^2_x} \{\text{GMI}(\mathcal{X}^{{\circ}},\mathcal{B}, f_{\boldsymbol{Y}|\boldsymbol{X}})\}, $$
where $\mathcal{X}^*_1$ and $\mathcal{B}^*_1$ denote the optimal first-orthant constellation and labeling, respectively, and $\mathcal{X}^{\circ}$ and $\mathcal{B}$ denote the orthant-symmetric constellation and labeling, which are generated by mirroring the first-orthant labeled constellation to the remaining orthants.

For the general modulation optimization in Eq. (3), the optimal solution is searched for in an $M \times N$ continuous space. In contrast, optimizing the modulation format with the OS constraint in Eq. (4) decreases the optimization space by a factor of $2^N$, from $M \times N$ to $2^{m-N} \times N$. For example, for $m=10$ and $N=4$, the number of real coordinates to optimize drops from $1024 \times 4 = 4096$ to $64 \times 4 = 256$. The comparison of optimization performance and speed is shown in Sec. 5.

4.2 Modulation optimization via a neural network

The GS optimization over the considered system shown in Fig. 1 is performed using an end-to-end NN learning approach via an autoencoder [32]. To evaluate the performance of adding OS constraint, the NN-based optimization architectures without OS constraint and with OS constraint are both applied.

As shown in Fig. 2, to optimize the modulation formats without the OS constraint, an NN learns the mapping from the binary labeling $\mathcal{B}$ to the constellation points $\mathcal{X}$. The NN is denoted by $f_\theta(\cdot)$, where $\theta$ denotes the network parameters (i.e., weights $W$ and bias $b$), and is defined as follows:

  • NN $f_\theta(\cdot)$: $\mathcal{B} \rightarrow \mathcal{X}$ first maps the input binary labeling vectors to the one-hot vectors $V^{\text{OH}}$ of size $M \times M$, where $M$ is the number of input neurons of the NN. The vectors $V^{\text{OH}}$ are then used as the input of the NN to obtain the output $\mathcal{X}$ according to $\mathcal{X} = f_\theta(V^{\text{OH}}) = W^T \cdot V^{\text{OH}} + b$, where $N$ denotes the number of output neurons of the NN.

Fig. 2. Architectures of the modulation optimization without OS constraint (above) and with OS constraint (below).

In contrast, the NN learning flow for the optimization with the OS constraint is shown in the green area of Fig. 2. Compared with the conventional modulation optimization (red area in Fig. 2), the structure with the OS constraint only needs to optimize the constellation points $\mathcal{X}_1$ in the first orthant, which span a small subspace of size $2^{m-N} \times N$ of the whole constellation. The NN is defined as follows:

  • NN $f_{\theta_1}(\cdot)$: $\mathcal{B}_1 \rightarrow \mathcal{X}_1$ first maps the first-orthant labeling $\mathcal{B}_1$ to the $2^{m-N} \times 2^{m-N}$ one-hot vectors $V_1^{\text{OH}}$, where $\theta_1$ denotes the NN parameters of the first orthant in this scenario. The output is the optimized first-orthant constellation $\mathcal{X}_1 = f_{\theta_1}(V_1^{\text{OH}})$. The numbers of input and output neurons of the NN are $2^{m-N}$ and $N$, respectively. The OS modulation format $\mathcal{X}^{\circ}$ is obtained via Algorithm 1 using the optimized first-orthant labeled constellation $\mathcal{X}_1$.

Algorithm 1. OS labeled constellation generation.

For both scenarios above, the optimization is performed using the Adam optimizer [33] with a learning rate of 0.1, which was tuned in our previous work [24]. The performance difference between two learning rates has been investigated in [24], and the Rectified Adam optimizer [34] can also be used to rectify the variance of the adaptive learning rate. Moreover, the input and output of the NN are connected directly, without hidden layers or activation functions; hence, in this work the weights directly represent the coordinates of the constellation points and the bias equals zero.
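The statement that the weights coincide with the constellation coordinates can be checked with a small numerical sketch. It uses a row-vector convention (one one-hot vector per row), which is an assumption made here for convenience, and the function name and random initial weights are hypothetical.

```python
import numpy as np

def linear_mapper(weights, bias=None):
    """Output of a single linear layer when it is fed all one-hot input vectors."""
    n_in = weights.shape[0]
    one_hot = np.eye(n_in)          # row i is the one-hot vector of the i-th label
    out = one_hot @ weights         # each output row selects one row of the weight matrix
    if bias is not None:
        out = out + bias
    return out

rng = np.random.default_rng(1)
w = rng.standard_normal((64, 4))    # e.g. 2^(m-N) = 64 inputs and N = 4 outputs for m = 10
constellation = linear_mapper(w)    # identical to w when the bias is zero
```

With this structure, a gradient step on the weights is a gradient step on the constellation points themselves, so no separate decoding of the network output is needed.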

Before the training process for constellation optimization, we employ an initialization step that pre-trains the NN parameters based on the mean square error (MSE) between the NN output and the initial constellation. Because the GMI-based optimization problem is non-convex, a proper choice of the initial modulation format is important: it can reduce the training time and help find solutions close to the optimal one. For instance, conventional QAM and amplitude phase shift keying (APSK) with Gray labeling have been shown to be good initial constellations [32]. Moreover, GPU-accelerated code can be used to reduce the training time from days to hours. However, for high-cardinality constellations (e.g., 4D 1024-ary modulation formats), the GPU memory requirement and running time increase significantly. We found that the gradient information consumes most of the memory. To make the optimization suitable for commonly used GPUs, we divide the GHQ process into several batches. In each batch, only part of the variables are stored with gradient information, and the batches together cover all variables within one epoch. After the parameters have been updated for the required variables, their memory can be released. With this method, 4D modulation formats with large cardinality can be optimized.
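The idea of splitting the GHQ evaluation into batches can be illustrated with the following sketch, which accumulates a Gaussian expectation over a tensor-product Gauss-Hermite grid one batch of nodes at a time. It shows only the forward accumulation in NumPy; how gradients are stored and released per batch depends on the automatic-differentiation framework and is not shown. The function name, grid size and batch size are assumptions for illustration.

```python
import itertools
import numpy as np

def ghq_expectation_batched(g, sigma, n_dims=4, n_nodes=6, batch_size=256):
    """Approximate E[g(Z)] for Z ~ N(0, sigma^2 I) on a Gauss-Hermite product grid.

    The n_nodes**n_dims grid points are visited in batches, so that only
    batch_size nodes are held in memory at any time.
    """
    t, w = np.polynomial.hermite.hermgauss(n_nodes)   # nodes and weights for exp(-t^2)
    node_iter = itertools.product(range(n_nodes), repeat=n_dims)
    total = 0.0
    while True:
        chunk = list(itertools.islice(node_iter, batch_size))
        if not chunk:
            break
        idx = np.array(chunk)                          # (batch, n_dims) node indices
        z = np.sqrt(2.0) * sigma * t[idx]              # change of variables z = sqrt(2) * sigma * t
        weight = np.prod(w[idx], axis=1) / np.pi ** (n_dims / 2.0)
        total += np.sum(weight * g(z))                 # accumulate this batch's contribution
    return total

# Sanity check: the mean energy of 4D noise with per-dimension variance sigma^2 is 4 * sigma^2.
mean_energy = ghq_expectation_batched(lambda z: np.sum(z ** 2, axis=1), sigma=0.3)
```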

4.3 Improving the nonlinearity tolerance via 4D NLI model

As shown in Fig. 1, two optimization scenarios are considered. For the AWGN channel, we optimize the modulation at the fixed SNR corresponding to the GMI threshold of $0.85m$. For the optical channel, in order to investigate the potential of DP-4D modulation formats, we employ the 4D NLI model [22,29] to optimize the modulation formats and characterize the trade-off between linear shaping gain and nonlinear tolerance. More details about the 4D NLI model are given in the Appendix.

According to the derivation of the NLI model, the NLI noise power $\sigma_{\text{NLI}}^2$ in Eq. (1) can be written as a function of the NLI power coefficient $\eta$ and the launch power $P$, namely $\sigma_{\text{NLI}}^2 = \eta P^{3}$. As shown in Eq. (7) in the Appendix, the NLI power coefficient $\eta$ depends on the considered system configuration $\mathcal{P}$ (e.g., fiber parameters) and on $\mathcal{X}$. Therefore, $\eta$ is denoted as $\eta(\mathcal{P}, \mathcal{X})$ in the rest of the paper. For general dual-polarization modulation formats, $\eta(\mathcal{P}, \mathcal{X}) = \frac{\sigma_{\text{NLI}}^2}{P^{3}}$ can be obtained via Eq. (7). To improve the accuracy of the effective SNR by considering the signal-ASE noise interaction, Eq. (1) can be further modified as [25]

$$\text{SNR}_{\text{eff}} = \frac{P}{\sigma_{\text{ASE}}^2 + \eta{(\mathcal{P}, \mathcal{X})} ( P^3 + 3\xi P_{\text{ASE}}P^2)},$$
where $\xi$ denotes the signal-ASE NLI accumulation coefficient that can be calculated via [25, Eq. (3)].

To design the nonlinear-tolerant modulation formats with the help of 4D NLI model, the OS-based modulation optimization problem in Eq. (4) can be redefined as

$$ \{\mathcal{X}^*_1,\mathcal{B}^*_1\} = \mathop{\textrm{argmax}}\limits_{\mathcal{X}_1,\mathcal{B}_1:E[||X||^2]\leq \sigma^2_x} \{\text{GMI}(\mathcal{X}^{\circ},\mathcal{B}, \text{SNR}_{\text{eff}}^{\text{opt}})\}, $$
where $\text{SNR}_{\text{eff}}^{\text{opt}}$ denotes the effective SNR at the optimum launch power $P^{\text{opt}}$, which is calculated by setting $\frac{\partial \text{SNR}_{\text{eff}}}{\partial P}$ to zero. By taking the modulation-dependent NLI into account, Eq. (6) can be considered an extension of Eq. (3) and Eq. (4) that includes the expected change in SNR due to the changed shape of the constellation. Thus, the effect of a change in $\mathcal{X}$ on $\text{SNR}_{\text{eff}}$ is reflected in the gradient of the GMI cost function when using the neural-network-based modulation optimization in Fig. 1. Although the 4D NLI model does not explicitly assume the channel law $f_{\boldsymbol{Y}|\boldsymbol{X}}$ to be Gaussian, the estimated NLI power can still be used to compute Eq. (2) under a circularly symmetric Gaussian assumption.
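The optimum launch power can be made concrete with a short sketch. Setting the derivative of the effective SNR of Eq. (5) with respect to $P$ to zero gives the cubic condition $2\eta P^3 + 3\eta\xi P_{\text{ASE}}P^2 = \sigma_{\text{ASE}}^2$, which has exactly one positive root and reduces to $P^{\text{opt}} = (\sigma_{\text{ASE}}^2 / (2\eta))^{1/3}$ when $\xi = 0$. The code below solves this condition by bisection; the function names and all numerical values are hypothetical and only serve to exercise the formula.

```python
def snr_eff(p, sigma2_ase, eta, xi=0.0, p_ase=0.0):
    """Effective SNR with NLI power eta * (P^3 + 3 * xi * P_ASE * P^2); xi = 0 gives eta * P^3."""
    return p / (sigma2_ase + eta * (p ** 3 + 3.0 * xi * p_ase * p ** 2))

def optimum_launch_power(sigma2_ase, eta, xi=0.0, p_ase=0.0):
    """Positive root of 2*eta*P^3 + 3*eta*xi*P_ASE*P^2 - sigma_ASE^2 = 0, found by bisection."""
    def stationarity(p):
        return 2.0 * eta * p ** 3 + 3.0 * eta * xi * p_ase * p ** 2 - sigma2_ase
    lo = 0.0
    hi = (sigma2_ase / (2.0 * eta)) ** (1.0 / 3.0)   # root for xi = 0, an upper bound otherwise
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if stationarity(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

# Hypothetical values (watts and 1/W^2), chosen only for illustration.
p_opt = optimum_launch_power(sigma2_ase=2e-5, eta=500.0)
snr_opt = snr_eff(p_opt, sigma2_ase=2e-5, eta=500.0)
```

In a model-learned optimization, $\eta$ would be re-evaluated for each candidate constellation, so `p_opt` and `snr_opt` would change as the constellation changes.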

5. Optimization results and analysis

We consider the geometric shaping optimization of different 4D modulation formats at SEs of 9 and 10 bit/4D-sym over the AWGN channel and over single-channel fiber-optic transmission. Optimizing the modulation format with an SE of 9 bit/4D-sym validates that the approach is applicable to general DP-4D formats with an odd number of bits, for which a single 2D format does not exist. At the same SE, the performance of the optimized 4D modulation formats with the OS constraint (denoted as $M$ w/ OS) and without the OS constraint (denoted as $M$ w/o OS) is compared with that of conventional QAM formats. The cardinalities $M \in \{512, 1024\}$ are considered in this paper; thus 512-ary set-partitioning QAM (512SP-QAM) [35], PM-32QAM and the 2D geometrically shaped format with 32 points (2D-GS32) [6,10] are chosen as baselines. Note that 512SP-QAM and PM-32QAM are the initial constellations for the optimization without the OS constraint. For the optimization with OS, OS-based 4D constellations are used as the initial constellations; they are generated from the first-orthant constellation selected from PM-32QAM, i.e., the points with all components positive. Moreover, the set-partitioning operation is applied to the first orthant for the 9 bit/4D-sym modulation formats. In the remainder of this paper, the 4D modulation formats optimized for the AWGN channel are referred to as “AWGN-learned”, whereas the 4D formats optimized with the 4D NLI model are referred to as “model-learned”. For the AWGN channel, the 4D modulation formats are optimized at the GMI threshold corresponding to around 15%–25% overhead (OH). The corresponding optimization SNRs for the 9 and 10 bit/4D-sym modulation formats are in the ranges of around 11–12.5 dB and 12.5–14 dB, respectively.
For the nonlinear fiber-optic channel, we consider two types of optical fiber, i.e., standard single-mode fiber (SMF) and non-zero dispersion-shifted fiber (NZDSF), to compare the nonlinear performance of the optimized 4D modulation formats. The simulation parameters of the multi-span optical transmission system are shown in Table 1. It is also important to emphasize that most existing optimization approaches (including the one proposed in this paper) are not guaranteed to find the global optimum. However, the obtained constellations show a notable performance improvement over the QAM formats and also reduce the modulation-dependent nonlinear interference.

Table 1. Simulation parameters.

5.1 4D modulation optimization for AWGN channel

To validate the effectiveness of adding the OS constraint to the 4D modulation optimization, we first compare in Fig. 3 the performance of the 4D modulation formats optimized with the OS constraint (denoted as AWGN-learned w/ OS) and without it (denoted as AWGN-learned w/o OS). The left panels of Figs. 3(a) and (b) depict the optimization procedure of solving Eq. (3) (without OS constraint) and Eq. (4) (with OS constraint) over the AWGN channel for the modulation formats with SEs of 9 and 10 bit/4D-sym at SNRs of 12 dB and 14 dB, respectively. The conventional QAM formats with the same SE, i.e., 512SP-QAM [35] and PM-32QAM, and the 2D geometrically shaped format with 32 points (2D-GS32) [6,10] are also shown as baselines. The GMI performance of the optimized 4D modulation formats in the AWGN channel is obtained after 1500 optimization iterations. Note that we target the GMI thresholds corresponding to 15% to 25% overhead (OH) for SD-FEC. Therefore, the target GMI ranges for the 9 and 10 bit/4D-sym modulations are 7.2–7.8 bit/4D-sym and 8–8.7 bit/4D-sym, respectively (see the highlighted green area in Fig. 3).
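The quoted GMI ranges follow from the overhead figures if the FEC overhead is taken as the ratio of redundant to information bits, so that the code rate is $1/(1+\text{OH})$. This definition is the common convention and is assumed here; the helper name is hypothetical.

```python
def gmi_threshold(m, overhead):
    """Information rate (bit/4D-sym) left after a FEC overhead given as a fraction."""
    return m / (1.0 + overhead)

for m in (9, 10):
    low = gmi_threshold(m, 0.25)    # 25% overhead gives the lower end of the range
    high = gmi_threshold(m, 0.15)   # 15% overhead gives the upper end of the range
    print(f"m = {m}: {low:.2f} to {high:.2f} bit/4D-sym")
```

Under this assumption the ranges evaluate to about 7.20 to 7.83 bit/4D-sym for $m = 9$ and 8.00 to 8.70 bit/4D-sym for $m = 10$, in line with the values stated above.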

Fig. 3. GMI vs. iterations and GMI vs. SNR for the AWGN-learned 4D modulation formats over AWGN channel. The target SNRs are 12 dB and 14 dB for 512-ary and 1024-ary 4D modulation optimization, respectively.

As shown in Fig. 3, the OS-based 4D modulation formats take less time to converge and achieve a better GMI performance after 1500 iterations than the formats without the OS constraint. In Fig. 3(a), AWGN-learned w/ OS (red lines) provides GMI gains of 0.25 and 0.8 bit/4D-sym with respect to AWGN-learned w/o OS (blue lines) and 512SP-QAM, respectively. These GMI gains translate into linear performance gains in terms of SNR of around 0.5 dB and 1.35 dB, respectively. Note that 512SP-QAM is generated by applying the set-partitioning operation to PM-32QAM without Gray labeling, which results in a larger mismatch between the number of nearest neighbors and the length of the bit labeling. For the 4D 1024-ary formats in Fig. 3(b), the 4D AWGN-learned format with OS also requires less time to converge and performs slightly better than the 4D AWGN-learned format without OS. In addition, the optimized 4D formats provide a GMI gain of 0.23 bit/4D-sym with respect to PM-32QAM. At the same GMI of 8.5 bit/4D-sym, 2D-GS32 provides a 0.35 dB shaping gain in terms of SNR with respect to PM-32QAM in two-dimensional space, whereas the 4D AWGN-learned modulation formats, optimized in 4D space, achieve an SNR gain of 0.5 dB with respect to PM-32QAM. The extra SNR gain of 0.15 dB comes from the joint use of the MD space. A similar performance advantage for 4D-GS modulation formats has also been shown in [10].

According to the results shown in Fig. 3, the OS-based modulation formats have the notable advantage of converging faster than the modulation formats without the OS constraint. For example, the optimization with OS needs approximately 200–300 iterations to converge close to the optimum, which is only 10%–20% of the iterations required by the optimization without OS to achieve a similar GMI. In general, an unconstrained optimization can ideally reach the optimum given enough optimization time. However, for multi-dimensional high-order modulation formats it becomes extremely difficult to reach the optimum of the non-convex GMI-based optimization problem. In contrast, the OS constraint limits the optimization space and exploits multi-dimensional symmetry, which can lead to a better result in less time. These results indicate that the OS constraint can be considered a reliable option for MD modulation optimization. In the remainder of this paper, AWGN-learned 4D formats without OS are used as a baseline, together with uniform QAM and 2D-GS where available.
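The reduction of the search space can be illustrated with a short sketch: under orthant symmetry only the points in the first orthant are free parameters, and the remaining orthants follow by sign flips, so a 4D 1024-ary format is described by 64 free points instead of 1024. The code below is an illustrative construction under that assumption and does not reproduce Algorithm 1; the function name `orthant_symmetric` and the labeling convention (sign bits first, then a natural binary index within the orthant) are hypothetical.

```python
import itertools

import numpy as np


def orthant_symmetric(first_orthant):
    """Mirror first-orthant points into all 2^N orthants.

    first_orthant: (K, N) array, K a power of two.
    Returns (points, labels) with 2^N * K rows. The first N label bits are the
    sign bits (0: positive, 1: negative); the remaining log2(K) bits index the
    point inside its orthant (illustrative labeling, not an optimized one).
    """
    base = np.abs(np.asarray(first_orthant, dtype=float))
    n_base, n_dims = base.shape
    inner_bits = n_base.bit_length() - 1
    if 2**inner_bits != n_base:
        raise ValueError("number of first-orthant points must be a power of two")
    inner = np.array(
        [[(i >> (inner_bits - 1 - b)) & 1 for b in range(inner_bits)]
         for i in range(n_base)],
        dtype=int,
    ).reshape(n_base, inner_bits)
    points, labels = [], []
    for sign_bits in itertools.product((0, 1), repeat=n_dims):
        signs = 1.0 - 2.0 * np.array(sign_bits)  # bit 0 -> +1, bit 1 -> -1
        points.append(base * signs)
        labels.append(np.hstack([np.tile(sign_bits, (n_base, 1)), inner]))
    return np.vstack(points), np.vstack(labels)


# 64 random first-orthant points expand to a 4D 1024-ary format with 10-bit labels.
rng = np.random.default_rng(0)
points, labels = orthant_symmetric(rng.random((64, 4)))
print(points.shape, labels.shape)
```

Because each orthant is a mirrored copy of the first one, the resulting constellation has zero mean by construction, and an optimizer only needs to update the first-orthant coordinates.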

5.2 4D NLI model-learned modulation and nonlinear tolerance analysis

In general, maximizing the linear shaping gain by introducing a Gaussian-like distribution can result in a higher AIR over the AWGN channel. However, these gains may not be maintained, and may even disappear, in the fiber-optic channel due to the Kerr nonlinear effect. In this paper, we employ the 4D NLI model to estimate the 4D modulation-dependent NLI and solve Eq. (6) to obtain nonlinear-tolerant 4D formats. Note that although it is possible to optimize a constellation for an arbitrary channel model, calculating the exact channel law requires significant computational resources and also depends on the implemented DSP chain. To avoid these bottlenecks, we use the 4D NLI model to estimate the NLI power under the assumption that the NLI can be treated as additional AWGN, and thus maximize the mismatched GMI (discussed in Sec. 3). Signal propagation in a single-channel optical transmission system over SMF and NZDSF is considered, and the parameters are shown in Table 1.

Figure 4 shows the coordinates, in two 2D projections, of two model-learned modulation formats with 10 bit/4D-sym obtained with the 4D NLI model for SMF (Fig. 4(a)) and NZDSF (Fig. 4(b)). The launch powers are 1 dBm and -1.5 dBm, and the transmission distances are 3040 km and 2160 km, for the learned constellations shown in Fig. 4(a) and Fig. 4(b), respectively. To clearly show the 4D symbol dependency, the colors spanning the colorbar indicate the 4D symbol energy of each constellation point. In other words, the 2D projected symbols in the first and second 2D projections form a valid 4D symbol only if they have the same color.

Fig. 4. Constellation diagrams of 4D 1024-ary modulation formats in $2 \times$ 2D projection after 4D NLI model-learned optimization for the SMF (a) and NZDSF (b). The color per point denotes the 4D symbol energy value. Four binary bits are highlighted to represent the orthants.

To further analyse the nonlinear tolerance of the optimized 4D formats, the empirical cumulative distribution functions (CDFs) of the energy per 4D symbol for the AWGN-learned w/o OS format and the model-learned w/ OS formats over SMF and NZDSF are compared in Fig. 5. The evolution of the variance of the 4D symbol energy during the optimization process for the three optimized 4D modulation formats is also shown in the inset. The peak symbol energies of the two model-learned modulation formats are highlighted with red markers (A for SMF and B for NZDSF). It can be observed that the blue line has a larger peak 4D symbol energy and a larger energy variation, since the AWGN-learned optimization changes the shape of the constellation to approach a Gaussian-like distribution. In contrast, the two green lines show a reduced peak symbol energy as well as a reduced variance of the 4D symbol energy thanks to the 4D NLI model, which should in principle result in lower nonlinear interference. Moreover, the format model-learned for NZDSF (solid green) has a smaller peak symbol energy and a lower energy variation than the format model-learned for SMF (dashed green). This effect can be explained by the stronger short-autocorrelation nonlinear noise in low-dispersion optical fibers, e.g., NZDSF. Therefore, with the help of the 4D NLI model, the proposed approach is expected to be more tolerant to NLI in transmission systems with stronger nonlinearity.
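The energy statistics discussed here are straightforward to compute for any 4D constellation. The sketch below evaluates the empirical CDF, the peak, and the variance of the 4D symbol energy after normalizing to unit average energy. The helper name `energy_stats` is hypothetical, and the two example formats (PM-QPSK and PM-16QAM) are chosen only because their statistics are easy to verify by hand; they are not the formats of Fig. 5.

```python
import itertools

import numpy as np


def energy_stats(points):
    """Empirical CDF, peak and variance of the per-symbol energy.

    points: (M, N) real constellation coordinates (N = 4 for DP-4D formats).
    Energies are normalized so that the average symbol energy is 1.
    """
    points = np.asarray(points, dtype=float)
    energy = np.sum(points**2, axis=1)
    energy = energy / energy.mean()
    sorted_energy = np.sort(energy)
    cdf = np.arange(1, energy.size + 1) / energy.size
    return sorted_energy, cdf, float(energy.max()), float(energy.var())


# Constant-modulus example: PM-QPSK (16 points, all with the same 4D energy).
pm_qpsk = np.array(list(itertools.product((-1, 1), repeat=4)), dtype=float)
# Multi-level example: PM-16QAM (256 points, amplitude levels +-1 and +-3).
pm_16qam = np.array(list(itertools.product((-3, -1, 1, 3), repeat=4)), dtype=float)

for name, fmt in (("PM-QPSK", pm_qpsk), ("PM-16QAM", pm_16qam)):
    _, _, peak, variance = energy_stats(fmt)
    print(f"{name}: peak energy {peak:.2f}, energy variance {variance:.2f}")
```

For these two examples the normalized peak energy is 1 and 1.8 and the energy variance is 0 and 0.16, respectively, which illustrates how strongly the 4D energy spread can differ between formats at the same average power.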

Fig. 5. Empirical CDF of the 4D-symbol energy distribution for three 4D 1024-ary modulation formats. All the formats are normalized to $E_s=1$ (i.e., unit energy for dual polarization). Inset: The variation of the variance of 4D symbol’s energy for the 4D modulation during the optimization procedure.

Consider, for example, the two 4D symbols with peak power (markers A and B) in Fig. 5. These 4D symbols are also highlighted as combinations of two 2D symbols in Fig. 4(a) and Fig. 4(b), respectively. Note that, due to the orthant symmetry, at least sixteen 4D symbols have the same energy as marker A or marker B. We can observe that, even though the energy of the first 2D symbol of point A is much smaller than that of both 2D symbols of point B, the total energy of 4D symbol A is larger than that of symbol B. Beyond the peak-energy point, evaluating the overall 4D modulation-dependent NLI performance requires taking the energies of all 4D symbols into account (see the 4D energy variation shown in the inset of Fig. 5). Therefore, in terms of 4D symbol energy variation, the NLI tolerance of the model-learned format for NZDSF in principle increases compared to the model-learned format for SMF as well as the AWGN-learned format. This implies that, to analyse the nonlinearity of a general DP-4D modulation format, the energy of the 4D symbol over the two polarizations should be investigated, instead of only the energies of the two independent 2D symbols. These predictions will be confirmed in Sec. 6.

6. Numerical results for nonlinear fiber optical transmission system

In this section, we first validate the accuracy of the 4D NLI model for SNR prediction by comparing it with optical fiber transmission simulated via the split-step Fourier method (SSFM). Then, numerical simulation results over SMF and NZDSF are presented for three groups of modulation formats, i.e., conventional QAM formats, AWGN-learned modulation formats without the OS constraint, and 4D NLI model-learned modulation formats with the OS constraint.

6.1 Validation of the SNR prediction via 4D NLI model

To study the accuracy of the 4D NLI model in predicting the power of the nonlinear interference, especially the modulation-dependent interference, Fig. 6 shows the effective SNR as a function of launch power for 10 bit/4D-sym modulation formats over 3040 km SMF and 2160 km NZDSF. The solid lines denote the SNR prediction via the 4D NLI model given by Eq. (5), and the markers represent the numerical simulation results via SSFM, which correspond to the results in Fig. 7(b) and Fig. 8(b). It can be observed that the model predictions (solid lines) and the simulation results (markers) match well for all the considered modulation formats.
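The shape of such SNR-versus-power curves can be reproduced with a simplified model. The sketch below assumes $\text{SNR}_\text{eff} = P/(\sigma_\text{ASE}^2 + \eta P^3)$, i.e., the NLI power is written as $\eta P^3$ and the signal-ASE interaction term of Eq. (5) is neglected. The numerical values of the ASE power and of $\eta$ are hypothetical placeholders, not the parameters behind Fig. 6.

```python
import numpy as np


def effective_snr_db(p_dbm, sigma_ase2_mw, eta):
    """Effective SNR (dB) for launch power p_dbm, with sigma_NLI^2 = eta * P^3.

    sigma_ase2_mw: ASE noise power in mW; eta: NLI coefficient in 1/mW^2.
    """
    p_mw = 10 ** (np.asarray(p_dbm, dtype=float) / 10)
    return 10 * np.log10(p_mw / (sigma_ase2_mw + eta * p_mw**3))


def optimum_power_dbm(sigma_ase2_mw, eta):
    """Launch power maximizing the simplified SNR: P_opt^3 = sigma_ASE^2 / (2 * eta)."""
    return 10 * np.log10((sigma_ase2_mw / (2 * eta)) ** (1 / 3))


def peak_snr_db(sigma_ase2_mw, eta):
    """Effective SNR at the optimum launch power."""
    return float(effective_snr_db(optimum_power_dbm(sigma_ase2_mw, eta),
                                  sigma_ase2_mw, eta))


SIGMA_ASE2_MW = 0.01  # hypothetical ASE power
ETA_REF = 0.005       # hypothetical NLI coefficient of a reference format
ETA_SHAPED = ETA_REF * 10 ** (-0.3 / 10)  # hypothetical format with 0.3 dB lower eta

gain_db = peak_snr_db(SIGMA_ASE2_MW, ETA_SHAPED) - peak_snr_db(SIGMA_ASE2_MW, ETA_REF)
print(f"optimum launch power: {optimum_power_dbm(SIGMA_ASE2_MW, ETA_REF):.2f} dBm")
print(f"peak SNR gain from 0.3 dB lower eta: {gain_db:.2f} dB")
```

In this simplified model the peak SNR scales as $\eta^{-1/3}$, so a reduction of the NLI coefficient by $x$ dB raises the peak effective SNR by $x/3$ dB. This is why even a visible reduction of the modulation-dependent NLI shows up as a comparatively small effective SNR gain at the optimum launch power.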

Fig. 6. The effective SNR vs. launch power for 10 bit/4D-sym modulation formats over SMF (a) and NZDSF (b).

As expected, all the geometrically-shaped modulation formats incur an SNR penalty with respect to PM-32QAM, since the constellation points are relocated to approach a Gaussian-like distribution. However, the 4D model-learned format shows a smaller SNR penalty than the 2D AWGN-optimized format and the 4D AWGN-learned formats with and without OS. In addition, the effective SNR differences between the 4D model-learned format and the three AWGN-optimized formats become larger in NZDSF. For instance, the model-learned format provides around 0.08 dB effective SNR gain with respect to the 4D AWGN-learned format at the optimum launch power.

6.2 Numerical results over SMF and NZDSF

We consider 4D modulation formats of 9 and 10 bit/4D-sym transmitted over a single-channel optical fiber communication system with a symbol rate of 45 GBaud and a root-raised-cosine (RRC) filter with a roll-off factor of 0.1. Sequences of $2^{17}$ symbols are propagated over the SMF with the parameters shown in Table 1. The symbol sequence is up-sampled and transmitted over 80 km SMF spans, with propagation simulated through a split-step Fourier solution of the nonlinear Manakov equation. The simulation step size is 0.1 km and the EDFA noise figure is 5 dB. After propagation, the received signal is digitally compensated for chromatic dispersion, matched filtered, and downsampled. Any constant phase rotation is then compensated. Finally, the GMI and the effective SNR are calculated to evaluate the performance of the different modulation formats.
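For readers unfamiliar with the propagation step, the following is a minimal sketch of a symmetric split-step Fourier solver for the Manakov equation over a single fiber span. It is not the simulator used for the reported results: the function name `manakov_ssfm`, the sign convention of the dispersion operator, and the fiber parameters in the usage example (typical SMF-like values) are assumptions, and amplification, ASE noise, and receiver DSP are omitted.

```python
import numpy as np


def manakov_ssfm(field, dt, length_km, step_km, beta2, gamma, alpha_db_km):
    """Propagate a dual-polarization field over one span (symmetric split-step).

    field: (2, n) complex samples for the x and y polarizations, in sqrt(W).
    dt: sample period in s; beta2 in s^2/km; gamma in 1/(W km).
    """
    alpha = alpha_db_km * np.log(10) / 10  # power attenuation in 1/km
    omega = 2 * np.pi * np.fft.fftfreq(field.shape[1], d=dt)
    # Half-step linear operator: chromatic dispersion and attenuation.
    half_linear = np.exp((0.5j * beta2 * omega**2 - alpha / 2) * step_km / 2)
    a = np.asarray(field, dtype=complex)
    for _ in range(int(round(length_km / step_km))):
        a = np.fft.ifft(np.fft.fft(a, axis=1) * half_linear, axis=1)
        # Manakov nonlinearity: phase rotation driven by the total power of both
        # polarizations, scaled by the 8/9 polarization-averaging factor.
        total_power = np.sum(np.abs(a) ** 2, axis=0)
        a = a * np.exp(1j * (8 / 9) * gamma * total_power * step_km)
        a = np.fft.ifft(np.fft.fft(a, axis=1) * half_linear, axis=1)
    return a


# Usage with hypothetical SMF-like parameters and a random low-power test signal.
rng = np.random.default_rng(0)
tx = 0.02 * (rng.normal(size=(2, 1024)) + 1j * rng.normal(size=(2, 1024)))
rx = manakov_ssfm(tx, dt=1 / 90e9, length_km=80, step_km=0.1,
                  beta2=-21.7e-24, gamma=1.3, alpha_db_km=0.2)
loss_db = 10 * np.log10(np.sum(np.abs(tx) ** 2) / np.sum(np.abs(rx) ** 2))
print(f"span loss: {loss_db:.1f} dB")
```

Since dispersion and the Kerr phase rotation are both energy-preserving, the only power change over the span is the fiber attenuation, which gives a simple sanity check for an implementation of this kind.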

Figure 7(a) depicts the GMI as a function of the transmission distance over the SMF for 4D modulation formats with an SE of 9 bit/4D-sym. It can be seen that the model-learned modulation format provides 11% and 31% reach increases at a GMI of 7.5 bit/4D-sym over the AWGN-learned modulation format and 512SP-QAM, respectively. In addition, the green line shows a 0.28 bit/4D-sym GMI gain at 4640 km with respect to the blue line. The increase over the 0.25 bit/4D-sym gain shown in Fig. 3 can be explained by the nonlinear tolerance of the model-learned format, whereas the AWGN-learned format suffers a larger effective SNR penalty. Figure 7(b) shows the GMI performance at the optimal launch power for 2D and 4D modulation formats with an SE of 10 bit/4D-sym, including PM-32QAM, 2D and 4D modulation formats optimized for the AWGN channel, and 4D formats optimized with the 4D NLI model to trade off the linear and nonlinear shaping gains. Both the 2D and 4D constellations shaped for AWGN (2D-GS32 and 4D AWGN-learned) exhibit better performance than PM-32QAM. It should be noted that although the 4D AWGN-learned format shows clearly better performance in the AWGN channel (see Fig. 3(b)), its resulting gain in the optical channel is smaller. In contrast, the 4D model-learned format provides a larger gain and has the best GMI performance. It can also be observed that the model-learned modulation format achieves an 8% reach increase compared to PM-32QAM and extends the reach by more than one span (100 km) over the 2D-GS32 and AWGN-learned formats at the GMI threshold of 8.5 bit/4D-sym.

Fig. 7. GMI vs. transmission reach for different modulation formats over SMF at the optimal launch power: (a) 4D 512-ary formats, (b) 4D 1024-ary formats.

Fig. 8. GMI vs. transmission reach for different modulation formats over NZDSF at the optimal launch power: (a) 4D 512-ary formats, (b) 4D 1024-ary formats.

A similar simulation setup was also implemented for the NZDSF transmission system with the parameters shown in Table 1. Figure 8(a) shows the GMI as a function of the transmission distance over the NZDSF for 4D 512-ary modulation formats at the optimal launch power. In this scenario, the model-learned modulation format provides a 12% gain in transmission reach and a 0.29 bit/4D-sym gain in GMI with respect to the AWGN-learned format without OS. Compared to 512SP-QAM, the model-learned modulation format achieves a reach increase of more than 34% at the GMI threshold of 7.5 bit/4D-sym. Compared to the results shown in Fig. 7(a), the gap between the blue and green lines becomes larger, as the NZDSF link exhibits stronger nonlinearity than the SMF link. Figure 8(b) shows that the model-learned modulation format with OS provides a 210 km (10%) reach increase compared to PM-32QAM and a 90 km (4%) reach increase with respect to the AWGN-optimized modulation formats (2D and 4D) at the GMI threshold of 8.5 bit/4D-sym. Note that the effective SNRs of the 4D modulation formats at the distances of 3040 km and 2160 km, highlighted in Fig. 7(b) and Fig. 8(b) with triangle markers, have been analyzed in Fig. 6.

To better understand the nonlinear tolerance benefit in the optical channel, the GMI gains of the model-learned modulation formats in Fig. 7 and Fig. 8 can be compared to those in Fig. 3, where no NLI is present. It can be observed that (i) while the AWGN-learned 4D 512-ary format with OS provides a 0.25 bit/4D-sym GMI gain over the AWGN channel (see Fig. 3(a)) compared to the AWGN-learned format without OS, the overall gains of the formats optimized with the help of the 4D NLI model and the OS constraint become larger (>0.28 bit/4D-sym, see Fig. 7(a) and Fig. 8(a)) in the nonlinear optical fiber channel; (ii) even though the AWGN-learned 4D 1024-ary format with OS provides almost no GMI gain over the AWGN channel compared to the AWGN-learned format without OS, the gain provided by the model-learned modulation format becomes more evident (>0.1 bit/4D-sym) in the nonlinear optical fiber channel; and (iii) the GMI gap between the AWGN-learned format and the model-learned format is larger over the NZDSF link than over the SMF link, because the NZDSF link contributes higher nonlinear interference. These results demonstrate the effectiveness of reducing the modulation-dependent NLI term by including the 4D NLI model in the optimization. The additional shaping gains imply that the model-learned formats can be more beneficial in the nonlinear optical channel, especially in scenarios with stronger nonlinearity, and that the AWGN-learned modulation formats are more sensitive to the nonlinear interference, resulting in larger effective SNR and GMI penalties. For optical communication systems, the shaping gain should be considered as the combination of linear shaping and nonlinear mitigation gains. In general, the AWGN-learned modulation formats can on the one hand maximize the linear shaping gain to approach the Shannon capacity, but on the other hand suffer from the nonlinear interference penalty. With the help of the 4D NLI model, the model-learned modulation formats can mitigate the nonlinear effect to preserve the linear shaping gain in the nonlinear channel, and are superior in the highly nonlinear scenario, as shown in Fig. 7 and Fig. 8.

6.3 Performance penalty with finite-resolution DAC

Although 4D geometrically-shaped modulation formats are shown to provide considerable performance improvement, the gains that remain under practical hardware limitations, e.g., a DAC with a limited effective number of bits (ENOB), are of greater practical interest. Hence, we also evaluate the performance of the model-learned format and PM-32QAM with a finite-resolution DAC. The transmitted symbol sequence at 2 samples per symbol is filtered by a pulse-shaping filter with a roll-off factor of 0.1, and the signals are then linearly quantized with a variable number of DAC quantization levels. Clipping is not considered throughout this paper. To isolate the effect of DAC quantization, an ideal analog-to-digital converter with infinite resolution is assumed in this example.
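The linear quantization step can be sketched as follows. The paper does not specify the quantizer details, so the code assumes a uniform quantizer whose $2^b$ levels span the full range of the waveform (hence no clipping), applied independently to each real component; the names `quantize` and `quantization_snr_db` and the Gaussian test waveform are illustrative.

```python
import numpy as np


def quantize(signal, bits):
    """Uniformly quantize a real waveform to 2**bits levels without clipping.

    The levels are spread evenly between the minimum and maximum of the
    waveform, so every sample lies within half a step of its quantized value.
    """
    signal = np.asarray(signal, dtype=float)
    low, high = signal.min(), signal.max()
    if high == low:
        return signal.copy()
    step = (high - low) / (2**bits - 1)
    return low + step * np.round((signal - low) / step)


def quantization_snr_db(signal, bits):
    """Signal-to-quantization-noise ratio (dB) of the quantized waveform."""
    signal = np.asarray(signal, dtype=float)
    error = quantize(signal, bits) - signal
    return 10 * np.log10(np.mean(signal**2) / np.mean(error**2))


# Hypothetical test waveform standing in for one real component of the
# pulse-shaped signal (e.g., the in-phase part of the x polarization).
rng = np.random.default_rng(0)
waveform = rng.normal(size=100_000)
for bits in (4, 5, 6, 8):
    print(f"{bits} bits: SQNR {quantization_snr_db(waveform, bits):.1f} dB")
```

Each additional bit halves the quantization step and therefore improves the signal-to-quantization-noise ratio by roughly 6 dB, which is consistent with penalties growing quickly once the resolution drops below a few bits.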

Figure 9 shows the required SNR for a target GMI of 8.5 bit/4D-sym as a function of the DAC resolution over the AWGN channel. The inset shows the model-learned constellation in two 2D projections (X-pol and Y-pol) after quantization with a DAC resolution of 5 bits. The results in Fig. 9 show that, at high DAC resolutions, the 4D shaped modulation format provides around 0.5 dB gain over PM-32QAM. We can also observe that the performance of the model-learned format (0.28 dB penalty) degrades more rapidly than that of PM-32QAM (0.22 dB penalty) at 5-bit quantization, and that quantization with less than 5-bit resolution results in significant penalties ($>1$ dB) for all modulation formats. Even though finite quantization resolution reduces the performance of all the modulation formats due to the larger quantization distortion for neighboring points, the model-learned format still provides a 0.44 dB shaping gain at 5-bit quantization. These results imply that the shaping gain deteriorates with fewer quantization bits and that at least 5-bit resolution is required to retain notable shaping gains. A similar observation for 2D geometric constellations has also been discussed in [36].
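The quoted numbers are mutually consistent. Assuming each penalty is measured relative to the required SNR of the same format at high DAC resolution, and writing $G_\infty$ for the high-resolution gain and $\Delta_\text{4D}$, $\Delta_\text{QAM}$ for the two 5-bit penalties (symbols introduced here only for this check), the remaining gain at 5-bit quantization is

$$G_{5\text{-bit}} \approx G_\infty - (\Delta_\text{4D} - \Delta_\text{QAM}) = 0.5 - (0.28 - 0.22) = 0.44~\text{dB}.$$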

Fig. 9. Required SNR vs. DAC resolution of two modulation formats for achieving the GMI of 8.5 bit/4D-sym. The inset denotes the model-learned 4D constellation after 5-bit DAC quantization.

7. Conclusion

In this paper, using the recently proposed 4D nonlinear interference model, we have optimized a set of DP-4D constellations with spectral efficiencies of 9 and 10 bit/4D-sym and with lower 4D symbol energy variation. This approach is applicable to a wide range of DP-4D formats, including 4D formats for which a single underlying 2D format might not even exist. The performance of different 4D modulation formats is evaluated and analysed for both the AWGN channel and the optical fiber channel. Numerical results of multi-span optical transmission show that the AWGN-learned modulation formats are more sensitive to the nonlinear interference, whereas the model-learned formats can trade off the linear shaping gain and the nonlinear mitigation gain. Transmission reach improvements between 8% and 34% are shown for the proposed 4D formats in two types of nonlinear fiber links. We believe that the 4D NLI model is a good alternative for designing nonlinear-tolerant DP-4D modulation formats in future high-capacity transmission systems.

Although the presented method and results are based on a 4D NLI model that only considers self-channel interference and is accurate for a single-channel transmission scenario, the same approach can be applied more generally to wavelength-division multiplexing (WDM) systems. Extending this work to optimize DP-4D modulation formats for the WDM transmission scenario will be addressed in future work, which first requires deriving an accurate 4D NLI model that takes the cross-channel interference and multiple-channel interference into account, as discussed in [37].

Future work will also focus on investigating the impact of carrier phase estimation (CPE) on the residual phase noise after DSP, i.e. laser phase noise and modulation dependent nonlinear phase noise. For such a study, a fully differentiable DSP chain enabling full end-to-end learning of coherent transceivers will certainly play a key role.

Appendix: 4D NLI model for single channel

To design multi-dimensional modulation formats, the 4D NLI model is crucial for estimating the nonlinear interference. The 4D NLI model was introduced and derived in detail for single-channel optical transmission systems in [22]. Therefore, here we only recall the main defining formulas and key assumptions.

The 4D NLI model is derived under the following three assumptions: 1) a single-channel signal is transmitted over uncompensated links, i.e., the fiber channel is a multi-span fiber system using EDFAs; 2) the sequence of symbols generated by the transmitter is a cyclostationary process of period $W$, and the random variables within each period are statistically independent; 3) the symbols are modulated with a single, real pulse which is assumed to be strictly band-limited within the frequency range $[-R_s/2, R_s/2]$.

Under the assumptions above, the NLI power over the x- and y-polarizations was derived using a frequency-domain, first-order perturbation approach under the hypothesis of single-channel transmission with a quasi-rectangular pulse spectrum, and can be denoted as $\sigma _{\text {NLI}}^2 = (\sigma _{\text {x}}^2,\sigma _{\text {y}}^2)$. According to the estimated NLI power in [22, Eqs. (42,43)], the corresponding NLI power coefficient for the x polarization can be expressed as

$$\begin{aligned} \eta_\text{x} \triangleq \sigma_{\text{x}}^2/P^3 & =\left(\frac{8}{9}\right)^2\frac{\gamma^2}{P^3}\Big[R_s^3(\Phi_1\chi_1+\Phi_2\chi_2+\Phi_3\chi_3)+R_s^2(\Psi_1\chi_4 +2\Re\{\Psi_2\chi_5+\Psi_3\chi_5^*\}+\Psi_4\chi_6\\ & +2\Re\{\Lambda_1\chi_7+\Lambda_2\chi_7^*\}+\Lambda_3\chi_8+2\Re\{\Lambda_4\chi_9+\Lambda_5\chi_9^*\}+\Lambda_6\chi_{10})+R_s\Xi_1\chi_{11}\Big], \end{aligned}\tag{7}$$
where $P$ is the transmitted signal power, $\gamma$ is the fiber nonlinearity coefficient, and $R_s$ is the symbol rate. The coefficients $\Phi _1, \Phi _2, \Phi _3, \Psi _1, \Psi _2, \Psi _3, \Psi _4, \Lambda _1, \Lambda _2,\ldots, \Lambda _6$ and $\Xi _1$ are functions of different intra- and cross-polarization moments associated with the modulation format, and the coefficients $\chi _i, i=1,2,\ldots,11$ are frequency-dependent integrals over the channel bandwidth, which are given in [22, Table 8]. The expression in Eq. (7) can be immediately generalized to the y polarization component by simply applying the transformation $\text {x} \rightarrow \text {y}$ and $\text {y} \rightarrow \text {x}$.

Funding

National Natural Science Foundation of China (62171175, 62001151); Fundamental Research Funds for the Central Universities (JZ2022HGTB0262); Open Fund of IPOC (Beijing University of Posts and Telecommunications), China (IPOC2022A08).

Disclosures

The authors declare no conflicts of interest.

Data availability

Data underlying the results presented in this paper are not publicly available at this time but may be obtained from the authors upon reasonable request.

References

1. G. Forney, R. Gallager, G. Lang, F. Longstaff, and S. Qureshi, “Efficient modulation for band-limited channels,” IEEE J. Select. Areas Commun. 2(5), 632–647 (1984).

2. U. Wachsmann, R. F. Fischer, and J. B. Huber, “Multilevel codes: Theoretical concepts and practical design rules,” IEEE Trans. Inf. Theory 45(5), 1361–1391 (1999).

3. G. Böcherer, F. Steiner, and P. Schulte, “Bandwidth efficient and rate-matched low-density parity-check coded modulation,” IEEE Trans. Commun. 63(12), 4651–4665 (2015).

4. T. Fehenberger, A. Alvarado, G. Böcherer, and N. Hanik, “On probabilistic shaping of quadrature amplitude modulation for the nonlinear fiber channel,” J. Lightwave Technol. 34(21), 5063–5073 (2016).

5. A. Amari, S. Goossens, Y. C. Gültekin, O. Vassilieva, I. Kim, T. Ikeuchi, C. Okonkwo, F. M. J. Willems, and A. Alvarado, “Introducing enumerative sphere shaping for optical communication systems with short blocklengths,” J. Lightwave Technol. 37(23), 5926–5936 (2019).

6. S. Zhang, F. Yaman, E. Mateo, T. Inoue, K. Nakamura, and Y. Inada, “Design and performance evaluation of a GMI-optimized 32QAM,” in 2017 Eur. Conf. on Opt. Commun. (ECOC), (2017).

7. S. Li, C. Häger, N. Garcia, and H. Wymeersch, “Achievable information rates for nonlinear fiber communication via end-to-end autoencoder learning,” in Eur. Conf. Opt. Commun. (ECOC), (IEEE, 2018), pp. 1–3.

8. D. Pilori, A. Nespola, F. Forghieri, and G. Bosco, “Non-linear phase noise mitigation over systems using constellation shaping,” J. Lightwave Technol. 37(14), 3475–3482 (2019).

9. A. Rode, B. Geiger, and L. Schmalen, “Geometric constellation shaping for phase-noise channels using a differentiable blind phase search,” in Opt. Fiber Commun. Conf. (OFC), (2022), pp. 1–3.

10. E. Sillekens, G. Liga, D. Lavery, P. Bayvel, and R. I. Killey, “High-cardinality geometrical constellation shaping for the nonlinear fibre channel,” J. Lightwave Technol. 40, 6374–6387 (2022).

11. B. Chen, Y. Lei, G. Liga, Z. Liang, W. Ling, X. Xue, and A. Alvarado, “Geometrically-shaped multi-dimensional modulation formats in coherent optical transmission systems,” J. Lightwave Technol. 41(3), 897–910 (2023).

12. K. Kojima, T. Yoshida, T. Koike-Akino, D. S. Millar, K. Parsons, M. Pajovic, and V. Arlunno, “Nonlinearity-tolerant four-dimensional 2A8PSK family for 5-7 bits/symbol spectral efficiency,” J. Lightwave Technol. 35(8), 1383–1391 (2017).

13. B. Chen, C. Okonkwo, H. Hafermann, and A. Alvarado, “Polarization-ring-switching for nonlinearity-tolerant geometrically shaped four-dimensional formats maximizing generalized mutual information,” J. Lightwave Technol. 37(14), 3579–3591 (2019).

14. R.-J. Essiambre, R. Ryf, M. Kodialam, B. Chen, M. Mazur, J. I. Bonetti, R. Veronese, H. Huang, A. Gupta, F. A. Aoudia, E. C. Burrows, D. F. Grosz, L. Palmieri, M. Sellathurai, X. Chen, N. K. Fontaine, and H. Chen, “Increased reach of long-haul transmission using a constant-power 4D format designed using neural networks,” in 2020 European Conference on Optical Communications (ECOC), (2020), pp. 1–4.

15. B. Chen, A. Alvarado, S. van der Heide, M. van den Hout, H. Hafermann, and C. Okonkwo, “Analysis and experimental demonstration of orthant-symmetric four-dimensional 7 bit/4D-sym modulation for optical fiber communication,” J. Lightwave Technol. 39(9), 2737–2753 (2021).

16. S. Goossens, Y. C. Gültekin, O. Vassilieva, I. Kim, P. Palacharla, C. Okonkwo, and A. Alvarado, “Introducing 4D geometric shell shaping for mitigating nonlinear interference noise,” J. Lightwave Technol. 41(2), 599–609 (2023).

17. V. Oliari, B. Karanov, S. Goossens, G. Liga, O. Vassilieva, I. Kim, P. Palacharla, C. Okonkwo, and A. Alvarado, “High-cardinality hybrid shaping for 4D modulation formats in optical communications optimized via end-to-end learning,” arXiv, arXiv:2112.10471 (2021).

18. A. Mecozzi and R.-J. Essiambre, “Nonlinear Shannon limit in pseudolinear coherent systems,” J. Lightwave Technol. 30(12), 2011–2024 (2012).

19. R. Dar, M. Feder, A. Mecozzi, and M. Shtaif, “Properties of nonlinear noise in long dispersion-uncompensated fiber links,” Opt. Express 21(22), 25685–25699 (2013).

20. P. Poggiolini, G. Bosco, A. Carena, V. Curri, Y. Jiang, and F. Forghieri, “The GN-model of fiber non-linear propagation and its applications,” J. Lightwave Technol. 32(4), 694–721 (2014).

21. A. Carena, G. Bosco, V. Curri, Y. Jiang, P. Poggiolini, and F. Forghieri, “EGN model of non-linear fiber propagation,” Opt. Express 22(13), 16335–16362 (2014).

22. G. Liga, A. Barreiro, H. Rabbani, and A. Alvarado, “Extending fibre nonlinear interference power modelling to account for general dual-polarisation 4D modulation formats,” Entropy 22(11), 1324 (2020).

23. G. Liga, B. Chen, and A. Alvarado, “Model-aided geometrical shaping of dual-polarization 4D formats in the nonlinear fiber channel,” in Opt. Fiber Commun. Conf. (OFC), (IEEE, 2022), pp. 1–3.

24. W. Ling, B. Chen, and Y. Lei, “Orthant-symmetric multi-dimensional geometrically-shaped modulation optimization,” in Int. Conf. on Opt. Commun. and Netw. (ICOCN), (2021), pp. 1–3.

25. Z. Liang, B. Chen, Y. Lei, G. Liga, and A. Alvarado, “Analytical SNR prediction in long-haul optical transmission using general dual-polarization 4D formats,” in Eur. Conf. Opt. Commun. (ECOC), (2022), pp. 1–4.

26. A. Alvarado and E. Agrell, “Four-dimensional coded modulation with bit-wise decoders for future optical communications,” J. Lightwave Technol. 33(10), 1993–2003 (2015).

27. A. Alvarado, T. Fehenberger, B. Chen, and F. M. J. Willems, “Achievable information rates for fiber optics: Applications and computations,” J. Lightwave Technol. 36(2), 424–439 (2018).

28. M. Sales-Llopis and S. J. Savory, “Approximating the partially coherent additive white Gaussian noise channel in polar coordinates,” IEEE Photonics Technol. Lett. 31(11), 833–836 (2019).

29. G. Liga, B. Chen, A. Barreiro, and A. Alvarado, “Modeling of nonlinear interference power for dual-polarization 4D formats,” in Opt. Fiber Commun. Conf. (OFC), (2021).

30. H. Rabbani, H. Hosseinianfar, H. Rabbani, and M. Brandt-Pearce, “Analysis of nonlinear fiber Kerr effects for arbitrary modulation formats,” J. Lightwave Technol. 40(16), 5567–5574 (2022).

31. A. Soleimanzade and M. Ardakani, “EGN-based optimization of the APSK constellations for the non-linear fiber channel based on the symbol-wise mutual information,” J. Lightwave Technol. 40(7), 1937–1952 (2022).

32. K. Gümüs, A. Alvarado, B. Chen, C. Häger, and E. Agrell, “End-to-end learning of geometrical shaping maximizing generalized mutual information,” in Opt. Fiber Commun. Conf. (OFC), (2020), pp. 1–3.

33. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” in Proc. ICLR, (2014).

34. L. Liu, H. Jiang, P. He, W. Chen, X. Liu, J. Gao, and J. Han, “On the variance of the adaptive learning rate and beyond,” in 8th International Conference on Learning Representations (ICLR), (2020).

35. J. K. Fischer, C. Schmidt-Langhorst, S. Alreesh, R. Elschner, F. Frey, P. W. Berenguer, L. Molle, M. Nölle, and C. Schubert, “Generation, transmission, and detection of 4-D set-partitioning QAM signals,” J. Lightwave Technol. 33(7), 1445–1451 (2015).

36. O. Jovanovic, F. Da Ros, D. Zibar, and M. P. Yankov, “Geometric constellation shaping for fiber-optic channels via end-to-end learning,” arXiv, arXiv:1810.00774v1 (2022).

37. R. Dar, M. Feder, A. Mecozzi, and M. Shtaif, “Pulse collision picture of inter-channel nonlinear interference in fiber-optic communications,” J. Lightwave Technol. 34(2), 593–607 (2016).
