Optica Publishing Group

Deep learning and hierarchical graph-assisted crosstalk-aware fragmentation avoidance strategy in space division multiplexing elastic optical networks

Open Access

Abstract

Space division multiplexing elastic optical networks (SDM-EONs) with multi-core fiber (MCF) are a promising candidate for future optical networks due to their high transmission capacity and flexibility. However, the inherent defects of inter-core crosstalk and spectrum fragmentation can degrade the performance of SDM-EONs. A deep learning and hierarchical graph-assisted crosstalk-aware fragmentation avoidance (DLHGA) strategy is proposed in this paper. First, we introduce a deep learning (DL) model to predict future requests so as to schedule resources reasonably in advance. Then, considering the inter-core crosstalk of MCF, we abstract the core, spectrum and time resources as a three-dimensional (3D) model with hierarchical graphs. The resource allocation process is thereby simplified to mitigating crosstalk and fragmentation from the inter-core and intra-core perspectives, respectively. Finally, based on DL traffic prediction and the different characteristics of the hierarchical graph layers, we present an adaptive resource allocation algorithm that relieves core adjacency and downgrades the modulation format to achieve efficient resource scheduling. We evaluate our DLHGA strategy in different topologies, and the results show that it can efficiently improve network performance in terms of inter-core crosstalk and bandwidth blocking probability compared to earlier approaches.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

With the advent of big data, cloud computing, high-definition video and the 5G era, the network carries numerous types of applications, and the requirements for flexibility and capacity are increasing. Elastic optical networks (EONs) with fine granularity play an increasingly important role in the current optical transport network. EONs with optical orthogonal frequency division multiplexing (O-OFDM) can use fine spectrum granularity and flexible modulation to utilize spectrum resources effectively. However, with the exponential growth of network traffic, EONs based on single-mode, single-core fiber are gradually approaching the physical limit of total achievable capacity [1]. In order to extend the transmission capacity and flexibility of the network, space division multiplexing (SDM) technology combined with multi-fiber (MF), multi-core (MC) and multi-mode (MM) approaches is under extensive investigation. Meanwhile, space division multiplexing elastic optical networks (SDM-EONs) with multi-core fiber (MCF) are considered the most promising candidate for the future [2].

SDM-EONs can greatly increase network transmission capacity by introducing more parallel cores into traditional single-core EONs [3]. However, this also brings a new challenge, i.e., inter-core crosstalk. Inter-core crosstalk causes physical impairment to the signal when the same spectrum slots of adjacent cores are occupied, affecting the quality of transmission (QoT) of the requests [4]. In order to cope with inter-core crosstalk and promote the core multiplexing degree in SDM-EONs, a variety of strategies have been studied that may generally be classified as either best-effort approaches or strict constrained approaches [5]. The best-effort approach tries its best to avoid or minimize the occupancy of the same spectrum slot between adjacent cores during the establishment of lightpaths. For instance, in [6], the authors introduced an on-demand strategy that calculates the core adjacency according to the core pitch and selects the core with the lowest adjacency to avoid inter-core crosstalk as far as possible. This approach can efficiently mitigate the impact of crosstalk, but it brings higher resource cost and is mainly applicable to light-load situations. The strict constrained approach relies on estimation of inter-core crosstalk during resource allocation. In particular, a lightpath is only provisioned for a request when the crosstalk levels of not only this new lightpath but also other already established lightpaths meet a predefined crosstalk threshold. Therefore, the key to this approach is acquiring, accurately and in a timely manner, the inter-core crosstalk values of all adjacent cores in the current network state. The work in [7] proposed a trench-assisted MCF and quantitatively analyzed the inter-core crosstalk based on the energy coupling theory.
The work in [8] defined two flow tables, which indicate the resource occupancy before and after the allocation of requests, and proposed node-arc-based integer linear programming and mixed integer linear programming strategies to strictly constrain the inter-core crosstalk. The authors of [9] made full use of a back propagation (BP) neural network to predict the network parameters, including the core, mode, wavelength and propagation distance at the time of request allocation. The inter-core crosstalk is then calculated in advance and compared with the crosstalk threshold, so that it can be strictly constrained in the corresponding resource allocation method. In summary, the strict constrained approach is complex in terms of the crosstalk calculations; nevertheless, it yields the best allocation efficiency while guaranteeing acceptable crosstalk levels for established connections. Therefore, it is necessary to consider these two approaches jointly.

In SDM-EONs, the traditional routing, modulation and spectrum allocation (RMSA) problem is extended to routing, core, modulation and spectrum allocation (RCMSA). The increase in transmission capacity in turn causes lightpaths to be set up and torn down more frequently. Due to the constraints of spectrum continuity and contiguity, the spectrum resources in the network are split into a large number of isolated spectrum blocks. These spectrum blocks are called spectrum fragmentation and render spectrum resources unavailable, which results in lower resource utilization. Therefore, RCMSA in SDM-EONs must balance the tradeoff between inter-core crosstalk and spectrum fragmentation. A crosstalk-aware defragmentation strategy was designed in [10]: the inter-core crosstalk is strictly constrained based on the crosstalk threshold, and defragmentation is then implemented considering spectrum compactness, which significantly improves resource utilization. In [11], a multi-dimensional resource model based on the characteristics of MCF was proposed. Defragmentation is carried out in the spectrum and time dimensions based on spectrum and time compactness while avoiding request interruption as much as possible. However, this work mainly targets advance reservation (AR) requests. In [12], the authors divided the spectrum into areas dedicated to immediate reservation (IR) requests and areas shared by IR/AR requests according to the request type during resource allocation, and then reduced spectrum fragmentation by dynamically controlling the boundary between the two areas. The work in [13] introduced a crosstalk-aware spectrum and core allocation strategy in which inter-core crosstalk is mitigated by avoiding the allocation of the same spectrum slot between adjacent cores. Besides, fragmentation is avoided by making each sub-area of a core serve only the requests with the same number of spectrum slots.
In [14], we applied the Elman neural network (ENN) technique to forecast traffic requests and used a two-dimensional rectangular packing model to allocate spectrum so as to decrease unnecessary spectrum fragmentation. We also defined core priority areas so that overlapped spectrum can be avoided by alternately using the spectrum resources of each sub-area between adjacent cores.

In this paper, given the diversity and dynamics of network applications, we exploit a deep learning (DL) model to predict future requests. That is, a deep learning and hierarchical graph-assisted crosstalk-aware fragmentation avoidance (DLHGA) resource allocation strategy is proposed. Appropriate resource allocation algorithms are used to mitigate inter-core crosstalk and spectrum fragmentation, thereby improving network resource utilization and decreasing blocking probability. Compared to our previous work in [14], we make three major contributions. First, combining historical network traffic data and DL techniques, the future traffic request sequence is predicted more accurately. In particular, in order to further improve the rationality of resource scheduling, we also predict the source and destination nodes of the requests ([14] only predicted the size, arrival time and holding time of future requests). Second, a hierarchical-graph-assisted three-dimensional (3D) network resource model is proposed to mitigate the inter-core crosstalk. To our knowledge, this is the first hierarchical-graph model presented for SDM-EONs. Meanwhile, we design a best-effort approach based on the graph dyeing theory and a strict constrained approach based on dynamic distance adaptive modulation. Third, we develop DL-assisted RCMSA algorithms based on the proposed hierarchical graph to adaptively allocate resources layer by layer, addressing the dual challenges of inter-core crosstalk and spectrum fragmentation.

The rest of the paper is organized as follows. In Section 2, we introduce a traffic prediction model based on DL. In Section 3, a 3D network resource model considering inter-core crosstalk is established. In Section 4, we design the resource allocation algorithms layer by layer based on the 3D network resource model. Network performance evaluation is presented in Section 5. Finally, Section 6 concludes the paper.

2. DL traffic prediction model

With the extensive application of machine learning (ML) technology, intelligent optical networks have realized the prediction, classification, and recognition of network traffic [15,16]. However, optical network traffic is bursty and non-linear, so traditional linear prediction methods perform unsatisfactorily. ML has more powerful autonomous learning and nonlinear prediction abilities than traditional linear prediction methods. Furthermore, DL algorithms based on neural networks, such as deep neural networks, convolutional neural networks, and recurrent neural networks (RNNs), are widely used for predicting and classifying time series. DL algorithms exploit greater self-learning capability through complex architectures to capture the inherent features of massive traffic data. In other words, DL performs much better than traditional ML at accurately characterizing the inherent correlations in the variations of future traffic [17]. The authors in [18] first applied DL techniques to predict network traffic matrices. Further, based on the holding time of the requests, the authors in [19] proposed a DL-assisted resource allocation strategy in data center optical networks to improve traffic prediction accuracy. In summary, if the network state information or some crucial parameters of new requests can be predicted by appropriate methods, the spectrum fragmentation in SDM-EONs can be effectively mitigated.

In [20], BP is used to predict the traffic load of lightpaths so as to establish them in advance, but BP is not suitable for processing long-term time series. ENN was used to predict the size, arrival time and holding time of the requests [14], but ENN easily falls into local optima during training [21]. Obviously, it is not suitable for predicting time series with long dependencies. In general, the above works are not suitable for our problem. Long short-term memory (LSTM) is an important improvement over the general RNN, which solves the gradient problem in weight updating. In addition, LSTM not only handles long-term time series prediction well, but also deals with nonlinear high-dimensional time series; it is good at extracting the features of long-term traffic and mitigating error accumulation [22]. To this end, we use LSTM to achieve fast and accurate prediction of time-varying traffic in SDM-EONs.

In this paper, we design the DL model based on LSTM to capture more attributes of the requests. Unlike the work in [20], which only predicts the size of requests, we also predict the arrival time and holding time of the requests in order to plan resources reasonably. In addition, compared with [14], in order to make full use of the adaptive modulation of EONs to improve spectrum efficiency and quality of service (QoS), we also need to know the source and destination node information of the requests in advance. Thus, we predict five attributes of the requests, and denote the request at time ${t}$ as ${{R_t}(s,{\textrm { }}d,{\textrm { }}b,{\textrm { }}{t_a},{\textrm { }}{t_h})}$, where ${s}$, ${d}$, ${b}$, ${t_a}$, ${t_h}$ are the source node, destination node, size, arrival time and holding time of the request, respectively. Figure 1 shows the DL model. It contains three main parts: an input layer, a deep LSTM network, and an output layer. The deep LSTM network is composed of three stacked LSTM layers and extracts network traffic features with excellent self-learning capability.

Fig. 1. Architecture of the DL model based on LSTM

Given an input request vector ${R_t}$, the hidden layer vector ${h_t^n}$ and output vector ${R'_t}$ are calculated as follows:

$$h_t^1 = \sigma \left( {{R_t}} \right)$$
$$h_t^n = \sigma (h_t^{n - 1},h_{t - 1}^n)$$
$${R'_t} = \Theta ({W^{n,n + 1}}h_t^n + {\alpha ^{n + 1}})$$
where ${h_t^n}$ denotes the output of the ${n}$-th (${n}$ = 1, 2, 3) hidden layer at time ${t}$. ${\sigma }$ is the sigmoid activation function in the deep LSTM network. ${W}$ represents the weight matrix from the hidden layer to the output layer, and ${\alpha }$ denotes a bias vector of the output layer. ${\Theta }$ is the tanh activation function.
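Under these definitions, a single forward time step of the stacked network can be sketched in Python; the toy recurrent unit passed as `cell`, the layer count, and the array shapes below are illustrative stand-ins rather than the paper's trained model:

```python
import numpy as np

def deep_rnn_step(R_t, h_prev, cell, W_out, alpha_out):
    """One time step through the stacked network of Eqs. (1)-(3).

    cell(x, h) is any recurrent unit (an LSTM cell in the paper);
    h_prev[n] holds the previous hidden state of each stacked layer.
    W_out and alpha_out map the top hidden state to the predicted
    request R'_t via Eq. (3), with Theta = tanh. All names here are
    illustrative.
    """
    h_t, x = [], R_t
    for h_old in h_prev:            # Eq. (1) for the first layer, Eq. (2) above it
        h_new = cell(x, h_old)
        h_t.append(h_new)
        x = h_new                   # feed the output upward to the next layer
    R_pred = np.tanh(W_out @ h_t[-1] + alpha_out)  # Eq. (3)
    return R_pred, h_t
```

The loop makes the stacking explicit: layer ${n}$ consumes the output of layer ${n-1}$ at the current time step and its own state from the previous step.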

The only component that differs between a standard RNN and an LSTM is the architecture of each hidden layer. The LSTM has a complex structure, the LSTM cell, in its hidden layer, as shown in Fig. 2. It has three gates: the input gate, the forget gate and the output gate. These gates control the information flow through the cell and the recurrent network. To calculate the cell output ${C_t}$ and hidden layer output ${h_t^n}$ in Fig. 2, we divide the process into three steps:

Fig. 2. Architecture of the LSTM cell

Step 1: For a given input ${r_t}$ at time ${t}$, cell state ${C_{t - 1}}$ and hidden layer output ${h_{t - 1}^n}$ at time ${t - 1}$, the input gate ${i_t}$, forget gate ${f_t}$, output gate ${o_t}$, and input cell gate ${{\tilde C_t}}$ in Fig. 2 can be calculated by the following formulas:

$${i_t} = \sigma ({W_{ix}}{r_t} + {W_{ih}}h_{t - 1}^n + {\alpha _i})$$
$${f_t} = \sigma ({W_{fx}}{r_t} + {W_{fh}}h_{t - 1}^n + {\alpha _f})$$
$${o_t} = \sigma ({W_{ox}}{r_t} + {W_{oh}}h_{t - 1}^n + {\alpha _o})$$
$${\tilde C_t} = \Theta ({W_{Cx}}{r_t} + {W_{Ch}}h_{t - 1}^n + {\alpha _C})$$
where ${W_ *}$ are the weight matrices connecting ${r_t}$ and ${h_{t - 1}^n}$ to the three gates and the cell input, and ${{\alpha _ * }}$ are the corresponding bias terms.

Step 2: Calculate the cell output state ${{C_t}}$:

$${C_t} = {f_t} * {C_{t - 1}} + {i_t} * {\tilde C_t}$$
Step 3: Calculate the hidden layer output ${h_t^n}$:
$$h_t^n = {o_t} * \Theta ({C_t})$$
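The three steps above can be collected into a single cell update. This NumPy sketch follows Eqs. (4)-(9) directly; for compactness the input and hidden weight matrices ${W_{*x}}$, ${W_{*h}}$ are stacked into one matrix per gate, and the container names `W`, `alpha` are illustrative only:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(r_t, h_prev, C_prev, W, alpha):
    """One LSTM cell update following Eqs. (4)-(9).

    W and alpha are dicts holding, per gate ('i', 'f', 'o') and for
    the cell input ('C'), a stacked weight matrix [W_{*x} | W_{*h}]
    and a bias vector alpha_*.
    """
    z = np.concatenate([r_t, h_prev])           # stack input and previous hidden state
    i_t = sigmoid(W['i'] @ z + alpha['i'])      # input gate,  Eq. (4)
    f_t = sigmoid(W['f'] @ z + alpha['f'])      # forget gate, Eq. (5)
    o_t = sigmoid(W['o'] @ z + alpha['o'])      # output gate, Eq. (6)
    C_tilde = np.tanh(W['C'] @ z + alpha['C'])  # cell input,  Eq. (7)
    C_t = f_t * C_prev + i_t * C_tilde          # cell state,  Eq. (8)
    h_t = o_t * np.tanh(C_t)                    # hidden output, Eq. (9)
    return h_t, C_t
```

The forget and input gates in Eq. (8) decide how much of the old state ${C_{t-1}}$ survives and how much new information ${\tilde C_t}$ enters, which is what lets the cell carry long-term traffic features.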
A pseudocode description of the training process of the DL model (TPDL) is provided in Algorithm 1.

[Algorithm 1: Training process of the DL model (TPDL)]

In Algorithm 1, each sample ${r_t}$ in the dataset represents the present request, and ${r_{t + 1}}$ is the next request. First, we divide the dataset into a training set and a testing set, then initialize the weight matrices, bias vectors and epoch counter (Lines 1-3). The training process of the DL model is shown in Lines 4-13, using a batch learning approach that trades off the accuracy and complexity of training. The final output ${{R'_t}\left ( {s,{\textrm { }}d,{\textrm { }}b,{\textrm { }}{t_a},{\textrm { }}{t_h}} \right )}$ represents the prediction results for the requests (including the source node, destination node, size, arrival time and holding time). At this point, the loss function can be calculated. The parameters are returned if ${Loss\;<\;\xi }$ or the number of epochs reaches ${\delta = 100}$ (${\xi }$ is the predefined minimum error threshold, ${\delta }$ is the predefined maximum number of epochs). Otherwise, the Adam algorithm [23] is used to update the parameters ${{W_ * }}$ and ${{\alpha _ * }}$. Finally, after being trained by Algorithm 1, the DL model is applied to predict the next requests, and the prediction results are used to estimate the inter-core crosstalk in Section 3 and to support the resource allocation for RCMSA in Section 4.
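The control flow of Algorithm 1 (batch loss, early stop on ${Loss\;<\;\xi }$ or reaching ${\delta }$ epochs, otherwise a parameter update) can be sketched as follows; a linear predictor and plain gradient descent stand in for the deep LSTM and the Adam optimizer, so only the loop structure is faithful to the paper:

```python
import numpy as np

def train_tpdl(X, Y, xi=1e-3, delta=100, lr=0.1):
    """Skeleton of the TPDL training loop of Algorithm 1 (sketch).

    X holds the input samples r_t row by row and Y the targets
    r_{t+1}. A linear model W, alpha replaces the deep LSTM so the
    control flow stays visible; the paper updates the parameters
    with Adam [23] where plain gradient descent is used here.
    """
    W = np.zeros((Y.shape[1], X.shape[1]))
    alpha = np.zeros(Y.shape[1])
    for epoch in range(1, delta + 1):
        pred = X @ W.T + alpha
        err = pred - Y
        loss = np.mean(err ** 2)           # batch loss over the training set
        if loss < xi:                      # early stop: Loss < xi
            break
        W -= lr * (err.T @ X) / len(X)     # parameter update (Adam in the paper)
        alpha -= lr * err.mean(axis=0)
    return W, alpha, loss
```

The loop either terminates early once the loss drops below ${\xi }$ or stops after ${\delta }$ epochs, exactly the two exit conditions of Algorithm 1.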

3. 3D network resource model based on hierarchical graph

In this section, in order to alleviate the inter-core crosstalk in SDM-EONs, we first design a 3D network resource model according to the characteristics of MCF. Then, a resource allocation model is proposed, which makes a tradeoff between a best-effort approach between cores and a strict constrained approach within a core. To the best of our knowledge, this is the first time inter-core crosstalk has been mitigated from the perspectives of both a best-effort approach and a strict constrained approach. As shown in Fig. 3, inter-core crosstalk is generated when the same spectrum slots of adjacent cores are occupied. It causes serious physical-layer impairment and decreases QoS, thus increasing the network blocking probability [4].

Fig. 3. Example of crosstalk in MCF

In order to mitigate inter-core crosstalk and increase the multiplexing degree of cores per fiber, the authors in [7] designed a trench-assisted MCF and, based on the energy coupling theory, quantitatively analyzed the inter-core crosstalk ${XT}$ of MCF. The statistical mean inter-core crosstalk of a homogeneous MCF can be expressed as follows:

$$XT = \frac{{N - N \cdot {\textrm{exp}}\left( { - \left( {N + 1} \right) \cdot 2 \cdot L \cdot H} \right)}}{{1 + N \cdot {\textrm{exp}}\left( { - \left( {N + 1} \right) \cdot 2 \cdot L \cdot H} \right)}}$$
$$H = \frac{{2 \cdot {\kappa ^2} \cdot \gamma }}{{\mu \cdot \Lambda }}$$
where ${N}$, ${L}$, ${H}$ denote the number of adjacent cores, the transmission distance and the mean increase in inter-core crosstalk per unit length, respectively; ${\kappa }$, ${\gamma }$, ${\mu }$, ${\Lambda }$ are the coupling coefficient, bend radius, propagation constant and core pitch of the fiber, respectively. From formulas (10) and (11), we can see that inter-core crosstalk is positively correlated with the number of adjacent cores ${N}$ and the transmission distance ${L}$. Thus, we avoid inter-core crosstalk from two aspects: a best-effort approach (the number ${N}$ of adjacent cores) and a strict constrained approach (the transmission distance ${L}$). When choosing a core, we try our best to avoid adjacency between cores using the graph dyeing theory; that is, we adopt the best-effort approach. Then, to obtain the optimal spectrum placement, we adopt dynamic adaptive modulation within the core; that is, the strict constrained approach.
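As a quick check of these relations, Eqs. (10) and (11) can be coded directly; the default fiber parameter values below are placeholders, not the values used in the experiments:

```python
import math

def mean_crosstalk(N, L, kappa=3.16e-5, gamma=55e-3, mu=4e6, pitch=45e-6):
    """Statistical mean inter-core crosstalk of Eqs. (10)-(11).

    N is the number of adjacent cores and L the transmission
    distance (same length unit as the fiber parameters). The
    default kappa, gamma, mu and pitch are illustrative placeholders.
    """
    H = 2 * kappa**2 * gamma / (mu * pitch)        # Eq. (11): XT increase per unit length
    e = math.exp(-(N + 1) * 2 * L * H)
    return (N - N * e) / (1 + N * e)               # Eq. (10)
```

With these placeholder parameters, ${XT}$ is zero at ${L = 0}$ and grows monotonically with ${L}$ and ${N}$, matching the correlation noted above.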

3.1 Avoidance of inter-core adjacency

Generally, inter-core crosstalk occurs when the same spectrum slots overlap on adjacent cores. However, the inter-core crosstalk between different spectrum slots or non-adjacent cores can be neglected [5]. To make the best effort to reduce the number of adjacent cores and avoid inter-core adjacency, we adopt the graph dyeing theory to divide adjacent cores into different priorities and select the core according to priority. Specifically, the core cross-section is abstracted as a ring-free graph ${Q}$, and the cores are abstracted as a vertex set ${A\left ( Q \right )}$, where ${\Psi }$ is a division of the vertex set ${A\left ( Q \right )}$ into ${K}$ colors such that any adjacent vertices are dyed with different colors. The division process is shown in formulas (12) to (14):

$${A_e} = {\Psi ^{ - 1}}\left( e \right) = \left\{ {y \in A\left( Q \right)\left| {\Psi \left( y \right) = e} \right.} \right\},{\textrm{ }}\left( {e = 1,2, \ldots ,K} \right)$$
$$\Psi = \left( {{A_1},{A_2}, \ldots ,{A_K}} \right)$$
$$s.t.{\textrm{ }}:\left\{ \begin{gathered} {A_e} \cap {A_u} = \emptyset \\ {\textrm{ }}\bigcup_{e = 1}^K {{A_e} = A\left( Q \right)} \\ \end{gathered} \right.$$
where ${{A_e}}$, i.e., ${{\Psi ^{ - 1}}\left ( e \right )}$, is an independent set of the vertex set ${A\left ( Q \right )}$.

Taking the most commonly used seven-core fiber as an example, as shown in Fig. 4(a), the seven cores can be divided into three independent sets using formulas (12) to (14). These three sets are labeled ${{P_1}}$, ${{P_2}}$, ${{P_3}}$ in order of priority from high to low, and are dyed yellow, green and blue, respectively. It is worth mentioning that our dyeing method can be extended: for example, a twelve-core fiber can be divided into three colors, and a nineteen-core fiber into four. The cores ${\left \{ {{C_1},{\textrm { }}{C_2},{\textrm { }}{C_3}} \right \}}$ are not adjacent to each other, and the inter-core crosstalk between non-adjacent cores is extremely low and can be neglected [11]. Therefore, the cores ${\left \{ {{C_1},{\textrm { }}{C_2},{\textrm { }}{C_3}} \right \}}$ have the highest priority and are labeled ${\left \{ {{C_1},{\textrm { }}{C_2},{\textrm { }}{C_3}} \right \} \in {P_1}}$. The cores ${\left \{ {{C_4},{\textrm { }}{C_5},{\textrm { }}{C_6}} \right \}}$ have the secondary priority and are labeled ${\{ {C_4},{\textrm { }}{C_5},{\textrm { }}{C_6}\} \in {P_2}}$. Obviously, the central core ${\left \{ {{C_7}} \right \}}$ has the largest number of adjacent cores and thus suffers the most serious inter-core crosstalk, so it has the lowest priority and is labeled ${\{ {C_7}\} \in {P_3}}$. On this basis, a hierarchical-graph-assisted core (${C}$), spectrum (${F}$), time (${T}$) 3D network resource model is designed as in Fig. 4(b). The yellow area composed of the cores ${\left \{ {{C_1},{\textrm { }}{C_2},{\textrm { }}{C_3}} \right \}}$ of priority ${{P_1}}$ is the first layer of the 3D model, the green area composed of the cores ${\{ {C_4},{\textrm { }}{C_5},{\textrm { }}{C_6}\}}$ of priority ${{P_2}}$ is the second layer, and the blue area composed of the core ${\{ {C_7}\}}$ of priority ${{P_3}}$ is the last layer.
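The priority division above is an instance of greedy graph dyeing. The sketch below reproduces the ${{P_1}}$/${{P_2}}$/${{P_3}}$ split for a seven-core fiber; the adjacency list (ring order C1, C4, C2, C5, C3, C6 around the central C7) is our reading of Fig. 4(a):

```python
# Adjacency of the seven-core fiber: six cores on a hexagonal ring,
# C7 in the centre adjacent to all of them (assumed geometry).
SEVEN_CORE = {
    'C1': ['C4', 'C6', 'C7'], 'C2': ['C4', 'C5', 'C7'],
    'C3': ['C5', 'C6', 'C7'], 'C4': ['C1', 'C2', 'C7'],
    'C5': ['C2', 'C3', 'C7'], 'C6': ['C1', 'C3', 'C7'],
    'C7': ['C1', 'C2', 'C3', 'C4', 'C5', 'C6'],
}

def dye_cores(adjacency):
    """Greedy graph dyeing per Eqs. (12)-(14): returns core -> color,
    with adjacent cores never sharing a color. Cores are visited in
    dict insertion order, so C1..C3 are dyed before C4..C6 and C7.
    """
    colors = {}
    for core in adjacency:
        used = {colors[n] for n in adjacency[core] if n in colors}
        colors[core] = next(k for k in range(len(adjacency)) if k not in used)
    return colors
```

Running `dye_cores(SEVEN_CORE)` yields color 0 for {C1, C2, C3}, color 1 for {C4, C5, C6} and color 2 for C7, i.e. the sets ${{P_1}}$, ${{P_2}}$, ${{P_3}}$.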

Fig. 4. 3D network resource model based on hierarchical graph

3.2 Adjustment of intra-core crosstalk threshold

Up to this point, the hierarchical-graph 3D network resource model fully considers inter-core crosstalk from the perspective of the number of adjacent cores and provides a best-effort solution, but it only achieves medium resource utilization, and some connections may still greatly exceed acceptable inter-core crosstalk values. Therefore, for heavy-load situations we further develop a strict constrained approach based on dynamic distance-adaptive modulation within the core, from the perspective of transmission distance. Existing studies adopt a fixed crosstalk threshold [7-9], which brings some inevitable drawbacks: if the threshold is set too high, the QoT of the signal decreases; conversely, it leads to unnecessary core unavailability and increases the network blocking rate. The application of O-OFDM enables the network manager to implement dynamically adaptive modulation according to the intra-core transmission distance of the requests. The acceptable crosstalk thresholds for different modulation formats also differ, as shown in Table 1. Considering inter-core crosstalk from the perspective of transmission distance, we propose a dynamic distance adaptive modulation with downgrading (DDAMD) algorithm, described in Algorithm 2.

[Algorithm 2: Dynamic distance adaptive modulation with downgrading (DDAMD)]

Table 1. Transmission rate, maximal transmission distance and crosstalk threshold of different modulation formats [2,24]

The proposed DDAMD algorithm searches for an available RCMSA. In line 1, the modulation format set $M$ is initialized and sorted in descending order. In line 2, the crosstalk threshold set $\Phi$ is initialized according to the modulations in $M$. In lines 3-4, using the source and destination node information from the proposed DL model, we calculate the transmission distance $L$ of the candidate lightpath in advance for a new request ${R'_i}$ by the Dijkstra algorithm and select the available core ${C_q}$ based on core priority. In lines 5-6, the available modulation formats are acquired for each path according to the transmission distance limitation, and we preferentially select the highest available modulation format. Once a modulation format is selected, we calculate the number of spectrum slots $SS$ according to the request size from the DL model. At the same time, we obtain the spectrum placements that satisfy the continuity and contiguity constraints on each candidate path for the new request. The inter-core crosstalk $X{T_i}$ for ${R'_i}$ and the inter-core crosstalk $X{T_{ed}}$ for each affected request ${R'_{ed}}$ can be calculated by Eqs. (10) and (11). In lines 7-18, if $X{T_i}$ and $X{T_{ed}}$ meet the strict crosstalk threshold constraint, this RCMSA is used. Otherwise, if the selected modulation format is not ${M_1}$ (BPSK), the modulation format is downgraded by one level and $SS$, $X{T_i}$ and $X{T_{ed}}$ are recalculated until the threshold constraint is satisfied; otherwise, ${R'_i}$ is blocked.
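The downgrade loop of Algorithm 2 can be sketched as follows. The modulation table holds illustrative reach and threshold numbers rather than the values of Table 1, and `estimate_xt` is a hypothetical callable standing in for the crosstalk estimate of Eqs. (10) and (11) over the chosen placement:

```python
# Illustrative modulation table, sorted from highest to lowest level
# (format name, reach in km, bits per symbol, XT threshold in dB).
# These numbers are placeholders, not the values of Table 1.
MODS = [
    ('16QAM', 500,  4, -25.0),
    ('8QAM',  1000, 3, -21.0),
    ('QPSK',  2000, 2, -18.0),
    ('BPSK',  4000, 1, -14.0),
]

def ddamd(distance_km, size_gbps, estimate_xt, slot_gbps=12.5):
    """DDAMD sketch: return (modulation, n_slots) or None (blocked).

    estimate_xt(name, n_slots) is a hypothetical callable returning
    the worst-case crosstalk (new and affected requests) in dB.
    """
    for name, reach, bits, xt_threshold in MODS:
        if distance_km > reach:           # distance-adaptive: skip unreachable formats
            continue
        n_slots = -(-size_gbps // (slot_gbps * bits))  # ceil, cf. the SS formula
        xt = estimate_xt(name, n_slots)
        if xt <= xt_threshold:            # strict crosstalk constraint met
            return name, int(n_slots)
        # otherwise downgrade: next iteration tries a lower format,
        # which needs more slots but tolerates a looser threshold
    return None                           # even BPSK fails: block the request
```

Downgrading trades spectrum for robustness: a lower-level format occupies more slots but has a looser acceptable crosstalk threshold and a longer reach.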

4. DL-assisted resource allocation layer by layer

Based on the DL traffic prediction model in Section 2, the source node, destination node, size, arrival time and holding time of future requests can be accurately predicted. In this section, the corresponding resource allocation is implemented ahead of time in the 3D network resource model of Section 3. Since inter-core crosstalk has been fully considered from the perspectives of inter-core adjacency and transmission distance in Section 3, the resource allocation in this section mainly addresses the spectrum fragmentation issue. When a new request arrives, we prefer to select a core in the first layer. In this case, the best-effort approach already mitigates the inter-core crosstalk, so we only need to consider the spectrum fragmentation and maximize the spectrum resource utilization of each core in the first layer. If the cores of the first layer are unavailable, we select a core in the second layer, where both inter-core crosstalk and spectrum fragmentation can occur. Therefore, a strict constrained approach based on the DDAMD algorithm is applied to mitigate the inter-core crosstalk and reduce spectrum fragmentation. The resource allocation of the first and second layers is designed in Sections 4.1 and 4.2, respectively. Furthermore, the core ${{C_{\textrm {7}}}}$ is generally used last, because the core ${{C_{\textrm {7}}}}$ of the third layer is adjacent to all the other cores, resulting in serious inter-core crosstalk. In the core ${{C_{\textrm {7}}}}$, we adopt a First-Fit algorithm that considers the crosstalk threshold constraint to mitigate the inter-core crosstalk.
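For the third layer, the threshold-constrained First-Fit can be sketched over a one-dimensional occupancy array; `xt_ok` is a hypothetical callable wrapping the crosstalk check of Eqs. (10) and (11) for a candidate start slot:

```python
def first_fit(occupied, n_slots, xt_ok):
    """First-Fit with a crosstalk-threshold constraint (sketch).

    occupied is a list of booleans (True = slot in use); the first
    contiguous block of n_slots free slots whose start index also
    passes the hypothetical crosstalk check xt_ok(start) is chosen.
    Returns the start index, or None if no feasible block exists.
    """
    free = 0
    for i, busy in enumerate(occupied):
        free = 0 if busy else free + 1
        if free >= n_slots and xt_ok(i - n_slots + 1):
            return i - n_slots + 1   # first feasible start index
    return None
```

Because `free >= n_slots` is re-tested at every further index of a long free run, a window rejected by the crosstalk check does not prevent later windows in the same run from being considered.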

4.1 Resource allocation without inter-core crosstalk in first layer

The resource occupancy of the first layer of the 3D network resource model at a certain moment is shown in Fig. 5. The unique feature of the first layer is that its cores are not adjacent to each other. However, as requests dynamically arrive and leave, the spectrum and time resources become fragmented, which affects resource utilization and leads to higher network blocking probability. Hence the first layer of the 3D network resource model mainly needs to consider spectrum fragmentation.

Fig. 5. Resource occupancy of the first layer in 3D network resource model

Further, the resource occupancy of a single core of each link in the 3D network resource model can be abstracted as a two-dimensional (2D) resource pool, as shown in Fig. 6. The horizontal axis of the 2D resource pool represents the spectrum resource ${F}$ (in units of slots corresponding to a bandwidth ${B}$ = 12.5 GHz), and the vertical axis represents the time resource ${T}$ (in units of slots corresponding to a time slot ${\tau }$ = 10 min [25]).

Fig. 6. 2D resource pool

The time dimension is also divided into non-overlapping time slots, which represent the unit of time for which spectrum is allocated. Combined with the current network resource conditions (link, core, spectrum and time resources), we focus on resource scheduling within ${0\;<\;t\;<\;\Delta T}$, where ${\Delta T}$ represents the feature-learning time interval of the DL model. In other words, ${\Delta T}$ represents how far into the future resources are allocated (reserved) to accommodate the requests that are expected to arrive later. Each request ${{R_i}\left ( {s,{\textrm { }}d,{\textrm { }}{b_i},{\textrm { }}{t_a},{\textrm { }}{t_h}} \right )}$ can be represented as a rectangular area ${{S_{{\textrm {ST}}}} = SS \times TS}$ in the 2D resource pool, where ${SS}$ is the number of spectrum slots required (corresponding to the size ${{b_i}}$ of the request, which is captured by the DL model) and ${TS}$ is the number of time slots required (corresponding to the holding time ${{t_h}}$ of the request, which is captured by the DL model). Then, the number of spectrum slots and time slots occupied by the area ${{S_{{\textrm {ST}}}}}$ can be obtained:

$$SS = \left\lceil {\frac{b}{{{\vartheta _{{\textrm{BPSK}}}} \cdot {{\log }_2}M}}} \right\rceil$$
$$TS = \left\lceil {\frac{{{t_h}}}{\tau }} \right\rceil$$
where ${M}$ denotes the modulation format used for the request, determined by dynamic distance adaptive modulation, and ${{\vartheta _{{\textrm {BPSK}}}}}$ is the transmission capacity supported per spectrum slot when using BPSK modulation.
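Eqs. (15) and (16) translate directly into code; `bits_per_symbol` is ${{\log }_2}M$ for the chosen format, and the defaults follow the 12.5 Gb/s-per-slot and 10 min time-slot values quoted above (the function name is illustrative):

```python
import math

def required_slots(b_gbps, t_h_min, bits_per_symbol, theta_bpsk=12.5, tau=10):
    """Spectrum and time slots of Eqs. (15)-(16).

    b_gbps is the request size b, t_h_min the holding time t_h in
    minutes, theta_bpsk the per-slot capacity under BPSK and tau the
    time-slot length in minutes.
    """
    SS = math.ceil(b_gbps / (theta_bpsk * bits_per_symbol))  # Eq. (15)
    TS = math.ceil(t_h_min / tau)                            # Eq. (16)
    return SS, TS
```

For example, a 100 Gb/s request held for 25 min under QPSK (2 bits per symbol) occupies a 4-slot-by-3-slot rectangle in the 2D resource pool.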

Given a number of requests that are forecasted to arrive within the time (i.e., requests with ${{t_a} \leqslant t + \Delta T}$), the problem of allocating spectrum on a specific core of a specific link in the network can be translated into a 2D rectangular packing [14,25,26]. In 2D rectangular packing, the objective is to place rectangular items on a predefined rectangular area so that no items overlap and the area covered by the placed items is maximized. With this insight, the spectrum allocation problem is simplified and we can build upon results from 2D rectangular packing. In particular, we still exploit the 2D rectangular packing spectrum allocation (2D-RPSA) algorithm, designed in section 4.2 of [14], to get an appropriate solution, which can enable 2D resource pool to accommodate as many requests as possible. Due to page constraints, we omit the introduction of the 2D-RPSA algorithm, and refer the reader to [14] for all the details.

4.2 Resource allocation with inter-core crosstalk in second layer

Since the cores ${\{ {C_4},{C_5},{C_6}\}}$ are adjacent to the cores ${\{ {C_1},{C_2},{C_3}\}}$ of the first layer, it is necessary to consider not only spectrum fragmentation but also inter-core crosstalk when allocating resources to a core in the second layer. To this end, a crosstalk-aware and fragmentation-avoiding resource allocation (CFRA) algorithm combined with Algorithm 2 is proposed. The resource allocation process is described in Algorithm 3.

[Algorithm 3: CFRA algorithm (pseudocode figure)]

In the CFRA algorithm, we first select a suitable core according to the priority and search for available spectrum slots on this core (Line 1). Then, we use Eqs. (10) and (11) to calculate the inter-core crosstalk ${X{T_i}}$ and $X{T_{ed}}$ generated by the candidate core and the corresponding spectrum placements (Lines 2-3). Finally, if ${X{T_i}}$ and $X{T_{ed}}$ are both below the corresponding crosstalk thresholds, we apply the 2D-RPSA algorithm of section 4.2 in [14]; otherwise, Algorithm 2 is invoked (Lines 4-7).
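For concreteness, a sketch of the statistical-mean crosstalk evaluation (Eqs. (10)-(11)) that the CFRA algorithm relies on, using the fiber parameter values listed in section 5.1 as defaults; the function signature and parameter names are ours, not the paper's:

```python
import math

def mean_crosstalk(n_adj, length, kappa=4e-4, gamma=0.05, mu=4e6, pitch=4e-5):
    """Statistical-mean inter-core crosstalk:
        H  = 2*kappa^2*gamma / (mu*pitch)
        XT = (n - n*exp(-(n+1)*2*L*H)) / (1 + n*exp(-(n+1)*2*L*H))

    n_adj : number of adjacent active cores n
    length: link length L in metres
    kappa : coupling coefficient, gamma: bending radius (m),
    mu    : propagation constant (1/m), pitch: core pitch (m)
    """
    h = 2 * kappa ** 2 * gamma / (mu * pitch)
    e = math.exp(-(n_adj + 1) * 2 * length * h)
    return (n_adj - n_adj * e) / (1 + n_adj * e)
```

As expected, the crosstalk grows with both the number of adjacent active cores and the link length, which is why the hierarchical layering keeps adjacent cores apart.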

4.3 DL-Assisted crosstalk-aware fragmentation-avoiding resource allocation

The routing, core, modulation and spectrum allocation (RCMSA) algorithm for SDM-EONs is shown in Algorithm 4. First, we run Algorithm 1 with the proposed DL model to predict the relevant attributes of the traffic requests arriving in the future (i.e., requests with ${{t_a} \leqslant t + \Delta T}$). The attributes of each request are the source node, destination node, size, arrival time and holding time. Then, combined with the 3D network resource model based on hierarchical graphs, we propose the deep learning and hierarchical graph-assisted crosstalk-aware fragmentation avoidance (DLHGA) resource allocation strategy.

[Algorithm 4: DLHGA resource allocation (pseudocode figure)]

The proposed DLHGA strategy consists of two main parts: the deep learning prediction model of Algorithm 1 and the RCMSA method of Algorithm 3. Before requests arrive, the DL model is trained on collected historical network traffic to predict future requests. When a request arrives, core and spectrum resources are allocated by Algorithm 3 in combination with the request attributes predicted in advance. The DLHGA strategy is executed periodically, once per ${\Delta T}$, to allocate resources in SDM-EONs. First, the corresponding network topology and the historical request sequence ${{\mathbf {R}}}$ are given as input. Lines 1-8 predict the five attributes of the requests with the proposed DL model and sort the requests in set ${{\mathbf {\hat R}}}$ by arrival time. Lines 9-12 establish lightpaths, select a core and calculate the suitable modulation format according to the source node ${s}$ and destination node ${d}$ of the connection acquired in advance by the DL traffic prediction model. In Line 10, ${k}$ shortest paths are calculated for each request ${{R'_i}}$; in Line 11, we select an available core ${{C_q}}$ according to the priority and compute the available set ${M_j}$ of modulation formats. Lines 13-26 then implement the adaptive allocation algorithms based on the selected core. In Lines 13-14, using the arrival time ${{t_a}}$, holding time ${{t_h}}$ and size ${b}$ predicted by the DL model, we apply the 2D-RPSA algorithm of section 4.2 in [14] on the first layer of the 3D network resource model. In Lines 16-17, we apply Algorithm 3 on the second layer, which makes full use of the requests predicted by the DL model for the next ${\Delta T}$. Since the third-layer core is adjacent to all the others, Lines 19-24 use a First-Fit algorithm with a crosstalk-threshold constraint to allocate requests on it. Finally, the algorithm returns the available RCMSA solution, or blocks the request.
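The per-layer dispatch of Algorithm 4 can be summarized as a small lookup. The grouping of the seven cores into layers below follows our reading of Fig. 4(a) and is only illustrative; the routine names stand in for the 2D-RPSA algorithm of [14], CFRA (Algorithm 3) and the XT-constrained First-Fit, respectively:

```python
# Hypothetical core-to-layer grouping for the 7-core MCF of Fig. 4(a):
# first layer {C1,C2,C3}, second layer {C4,C5,C6}, third (center) layer {C7}.
LAYERS = {1: ("C1", "C2", "C3"), 2: ("C4", "C5", "C6"), 3: ("C7",)}
ALLOCATORS = {1: "2D-RPSA", 2: "CFRA", 3: "FF-XT"}

def choose_allocator(core):
    """Map a candidate core to the allocation routine Algorithm 4
    applies on its layer (Lines 13-24)."""
    for layer, cores in LAYERS.items():
        if core in cores:
            return ALLOCATORS[layer]
    raise ValueError(f"unknown core {core}")
```

This makes the adaptivity of the strategy explicit: the routine gets stricter about crosstalk as the selected core gains more adjacent neighbors.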

5. Performance evaluation

5.1 Simulation setup

We develop a Python simulation tool to implement the proposed DLHGA strategy. For the experiments, we use two topologies: the 14-node, 21-link NSFNET and the 24-node, 43-link USNET, shown in Figs. 7(a) and 7(b), respectively. The number of candidate routes is ${k = 3}$, all links consist of a single MCF with 7 cores as shown in Fig. 4(a), and each core has 320 spectrum slots [13]. The bandwidth of each spectrum slot is assumed to be 12.5 GHz, and each time slot is 10 minutes [25]. The fiber parameters are ${\kappa {\textrm { = }}4 \times {10^{ - 4}}}$, ${\gamma {\textrm { = }}0.05{\textrm {m}}}$, ${\mu {\textrm { = }}4 \times {10^6}{{\textrm {m}}^{ - 1}}}$ and ${\Lambda = 4.0 \times {10^{ - 5}}{\textrm {m}}}$ [2]. To ensure QoS, adaptive modulation formats are adopted, including BPSK, QPSK, 8QAM, 16QAM and 32QAM; the corresponding crosstalk thresholds ${X{T_{{\textrm {th}}}}}$ are listed in Table 1. Connection requests arrive according to a Poisson process with rate ${\lambda }$, the holding time of each request follows a negative exponential distribution with mean ${{1 \mathord {\left /{\vphantom {1 \zeta }} \right. } \zeta }}$, and the traffic load is therefore ${{\lambda \mathord {\left /{\vphantom {\lambda \zeta }} \right. } \zeta }}$ Erlang. Request sequences are generated by uniformly selecting the source and destination nodes, with required capacities evenly distributed in [12.5 Gbps, 200 Gbps]. We set the time interval of the DL model to ${\Delta T = 12}$ hours, as in [14,27]. Each simulation is repeated 500 times, and the results are obtained from 100,000 requests with a 95% confidence interval.
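The traffic generation described above can be sketched as follows; the tuple layout, function name and time units are illustrative, not the simulator's actual interface:

```python
import random

def generate_requests(n, nodes, load_erlang, mean_holding=1.0, seed=0):
    """Generate a request sequence as in section 5.1: Poisson arrivals
    (exponential inter-arrival times with rate lambda = load/mean_holding),
    exponentially distributed holding times, uniformly chosen distinct
    (s, d) pairs, and sizes uniform in [12.5, 200] Gbps."""
    rng = random.Random(seed)
    lam = load_erlang / mean_holding
    t = 0.0
    reqs = []
    for _ in range(n):
        t += rng.expovariate(lam)                 # arrival time t_a
        s, d = rng.sample(nodes, 2)               # distinct endpoints
        b = rng.uniform(12.5, 200.0)              # requested capacity (Gbps)
        th = rng.expovariate(1.0 / mean_holding)  # holding time t_h
        reqs.append((s, d, b, t, th))
    return reqs
```

Each tuple carries exactly the five attributes ${(s, d, b, t_a, t_h)}$ that the DL model is later asked to predict.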

Fig. 7. Network topologies: (a) NSFNET, (b) USNET

5.2 Performance of the DL model

To more convincingly illustrate the advantages of the LSTM-based DL model, we first select appropriate LSTM parameters, including the number of input units, hidden layers, hidden units, and output-layer units. We set the number of input and output units to 72, since the time slot size is 10 min and each interval of ${\Delta T = 12}$ hours therefore contains 72 time slots. The numbers of hidden layers and units considered are shown in Table 2. We then apply Algorithm 1 to obtain the most suitable architecture and parameters for traffic prediction. Meanwhile, we collect the network traffic of the backbone optical network of the United States [14], using the actual traffic requests of the first week as historical data for feature-learning of the DL model. To verify the performance of the DL model, we consider the predictive accuracy, mean square error (MSE) and running time of LSTMs with different numbers of layers and hidden units. Evidently, a deeper neural network is not always better: when we stack up to four layers (LSTM-4Layer), no clear benefit from the additional depth can be observed. Therefore, we adopt the LSTM-3Layer model with 128 hidden units.
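For reference, the LSTM cell gating that underlies this model (Eqs. (4)-(9) in section 3) can be written out for a single scalar cell; the dictionary-based weight names below are our own shorthand, not the paper's notation, and the paper of course uses weight matrices rather than scalars:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_cell_step(r_t, h_prev, c_prev, W, alpha):
    """One scalar LSTM cell step:
        i_t = sigma(W_ix r_t + W_ih h_{t-1} + a_i)   input gate
        f_t = sigma(W_fx r_t + W_fh h_{t-1} + a_f)   forget gate
        o_t = sigma(W_ox r_t + W_oh h_{t-1} + a_o)   output gate
        C~  = tanh (W_Cx r_t + W_Ch h_{t-1} + a_C)   candidate state
        C_t = f_t * C_{t-1} + i_t * C~
        h_t = o_t * tanh(C_t)
    """
    i = sigmoid(W["ix"] * r_t + W["ih"] * h_prev + alpha["i"])
    f = sigmoid(W["fx"] * r_t + W["fh"] * h_prev + alpha["f"])
    o = sigmoid(W["ox"] * r_t + W["oh"] * h_prev + alpha["o"])
    c_tilde = math.tanh(W["cx"] * r_t + W["ch"] * h_prev + alpha["c"])
    c = f * c_prev + i * c_tilde  # cell state keeps long-term features
    h = o * math.tanh(c)          # hidden state passed to the next layer
    return h, c
```

The forget gate ${f_t}$ is what lets the model "automatically decide to retain and forget" parts of the long-term traffic features, which is the property contrasted against BP, ENN and plain RNN below.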


Table 2. Performance of different DL parameters

To further evaluate the predictive accuracy and generalization of the LSTM-3Layer DL model, we compare it with BP, ENN and RNN neural networks, as shown in Table 3. The prediction accuracy of each request attribute affects the performance of the proposed DLHGA strategy; in particular, the predictions of the source and destination nodes are the most important for the RCMSA algorithm [22]. We first analyzed the accuracy on the source and destination nodes and found that the proposed LSTM-3Layer achieves the highest accuracy, while BP and ENN achieve the lowest. This is because BP does not consider the temporal relations within the traffic time series, and ENN cannot effectively extract long-term traffic flow features. Second, we compared the prediction performance on the size, arrival time and holding time. Again, the proposed LSTM-3Layer achieves the highest accuracy; on destination-node prediction in particular, LSTM-3Layer is 19.69% more accurate than the RNN model. The reason is that RNN only considers the most recent state, whereas LSTM can automatically decide which parts of the long-term traffic flow features to retain or forget. Although LSTM-3Layer has the longest running time, the predictions are computed before requests arrive, so this cost is acceptable. It is worth mentioning that, to reduce the cumulative error of the DL traffic prediction model, we periodically feed the actual network traffic back into the model to update its parameters in real time and increase its prediction accuracy. In summary, LSTM-3Layer is the most suitable model for network traffic prediction, with the highest predictive accuracy and good generalization ability; after training, its MSE is 0.19 with a negligible running time.


Table 3. Performance of different prediction model

Figure 8 depicts the variation of traffic requests over two weeks. The actual traffic of the first week is used as historical data for feature-learning of the DL model; the network traffic of the second week is then predicted and compared with the actual traffic. As shown in Fig. 8, after being trained by Algorithm 1, the DL model captures the behavior of the past traffic profile, and its predictive accuracy remains around 97.48%. This demonstrates the remarkable generalization ability of the DL model and its capacity to accurately capture the attributes of future traffic.

Fig. 8. Traffic prediction using DL model

5.3 Performance of resource allocation

The performance of the proposed DLHGA strategy is compared with four benchmark strategies. First-Fit is a baseline that allocates the available cores and spectrum slots with the lowest indices. The XT-aware spectrum and core assignment (XT-SCA) strategy was proposed in [13] and applies two predefined methods: a core selection priority that takes a best-effort approach to mitigating inter-core crosstalk, and a core-classification sub-area scheme that reduces spectrum fragmentation. The ML-assisted fragmentation avoidance (MLFA) strategy was proposed in [14] and combines an ENN with a two-dimensional rectangular packing model. Finally, our proposed DLHGA strategy without traffic prediction is denoted HGA.

We compare the five strategies with respect to four metrics:

  • Crosstalk per slot (CpS), defined as the ratio of spectrum-slot arrangements that generate inter-core crosstalk to the number of used spectrum slots.
  • Average number of fragments (AF): assuming each request occupies at least two spectrum slots, spectrum blocks of a single slot are considered fragments. AF is the ratio of the total number of spectrum fragments in the network to the total number of requests.
  • Spectrum utilization (SU), i.e., the resource in use relative to the total allocated resource in the network.
  • Bandwidth blocking probability (BBP), the percentage of blocked traffic relative to the total bandwidth requested.
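The last two metrics reduce to simple ratios; a sketch with illustrative signatures (the argument shapes are our assumption, not the simulator's interface):

```python
def bandwidth_blocking_probability(requested, blocked):
    """BBP: total blocked bandwidth over total requested bandwidth.
    Both arguments are lists of request capacities (e.g. in Gbps)."""
    return sum(blocked) / sum(requested)

def spectrum_utilization(used_slots, allocated_slots):
    """SU: slots actually carrying traffic over total allocated slots."""
    return used_slots / allocated_slots
```

For instance, blocking one 50 Gbps request out of 250 Gbps offered gives a BBP of 20%, regardless of how many requests were blocked, which is why BBP is preferred over a simple request-count blocking ratio.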
Figure 9 compares the crosstalk per slot (CpS) of the five strategies. First-Fit causes serious inter-core crosstalk because it allocates spectrum resources from lower to higher slot indices on each core and ignores the adjacency between cores. XT-SCA uses an allocation method based on a predefined core priority and efficiently reduces inter-core crosstalk. MLFA prevents spectrum overlap by alternately using the spectrum resources of each sub-area between adjacent cores. Since XT-SCA and MLFA both adopt a best-effort approach, they exhibit roughly the same inter-core crosstalk performance across all traffic loads and topologies. On the other hand, HGA exhibits almost the same inter-core crosstalk as DLHGA, which can be explained by the fact that both introduce the same spectrum resource allocation method: they predefine the core priority (best-effort approach) under light load and implement dynamic distance-adaptive modulation with downgrading under crosstalk-threshold constraints (strict-constraint approach) under heavy load. Consequently, DLHGA performs significantly better than the other strategies on the CpS metric; in particular, DLHGA decreases CpS by about 19.5% and 12.6% relative to MLFA in NSFNET and USNET, respectively.

Fig. 9. Performance of crosstalk per slot: (a) NSFNET, (b) USNET

Figure 10 plots the average number of fragments (AF) in the two topologies. The First-Fit strategy generates the largest number of spectrum fragments, because it is a greedy strategy that always allocates a request in its first available set of slots, which leads to many fragments. XT-SCA utilizes the spectrum more efficiently than First-Fit, mainly due to its predefined core sub-area classification policy: requests demanding the same bandwidth are preferentially allocated to the core sub-area dedicated to that bandwidth, so the resources on each core are used efficiently. HGA yields a lower AF than XT-SCA, because the 2D rectangular packing used in HGA is an effective heuristic for resource allocation. However, as the traffic load increases, HGA yields a higher AF than MLFA, because MLFA reasonably allocates the predicted traffic in advance. The AF of DLHGA is in turn slightly better than that of MLFA: DLHGA combines the DL model and the DDAMD algorithm within the hierarchical graph-assisted 3D network resource model, so it predicts requests more accurately and allocates spectrum resources more reasonably. An overall scheduling solution can be made for the requests during ${\Delta T}$, so a globally better result is obtained through traffic prediction. Moreover, the AF in USNET is slightly lower than in NSFNET, because USNET offers more candidate routing paths and thus more allocation options. DLHGA performs especially well in the more complex USNET topology, which verifies that it can schedule resources effectively via traffic prediction.

Fig. 10. Performance of average number of fragments: (a) NSFNET, (b) USNET

Figure 11 depicts the spectrum utilization (SU) of the five strategies. First-Fit has the lowest SU among all the strategies: its spectrum allocation always starts at the low index, which produces more spectrum fragments as requests are set up and released. The SU of XT-SCA is higher than that of First-Fit owing to the predefined core classification method: requests demanding the same number of spectrum slots are allocated together, so the resources in each pre-classified core sub-area are used effectively. HGA is slightly better than XT-SCA on SU: since XT-SCA relies purely on a best-effort approach, it incurs a higher resource cost and renders some spectrum unavailable, whereas HGA combines the best-effort and strict-constraint approaches in the 3D network resource model, striking a balance between reducing inter-core crosstalk and improving resource utilization. MLFA performs much better than HGA, which shows the importance of traffic prediction, but MLFA still has a lower SU than DLHGA. This improvement is due to two factors: (1) DLHGA makes full use of adaptive modulation formats with downgrading, allowing more requests to be carried, and (2) DLHGA adopts the DL model, predicting requests more accurately so that fewer requests are blocked. Overall, DLHGA has the highest SU among all strategies, improving on MLFA by about 21.5% and 16.8% in NSFNET and USNET, respectively.

Fig. 11. Performance of spectrum utilization: (a) NSFNET, (b) USNET

Figure 12 shows the network bandwidth blocking probability (BBP) of the five strategies. The BBP of XT-SCA is lower than that of First-Fit, because its coordinated spectrum and core allocation reduces spectrum fragmentation, allowing XT-SCA to accommodate more requests. HGA and XT-SCA have similar blocking probabilities, and MLFA performs much better than HGA because it can schedule resources in advance. The BBP of DLHGA is on average 24.3% and 15.7% lower than that of MLFA in NSFNET and USNET, respectively. The first reason is that DLHGA achieves higher spectrum utilization by drastically reducing spectrum fragmentation in the network, so more requests can be carried with limited resources. The second reason is that DLHGA achieves lower inter-core crosstalk by adopting adaptive modulation to dynamically adjust the inter-core crosstalk threshold based on the hierarchical graph model, so fewer requests are blocked due to physical impairments. The BBP in USNET shows a similar tendency to NSFNET but is slightly lower, since USNET offers more alternative paths, making it easier to find candidate paths and cores for arriving requests.

Fig. 12. Performance of bandwidth blocking probability: (a) NSFNET, (b) USNET

6. Conclusion

While SDM technology improves the capacity and flexibility of EONs, it also introduces inter-core crosstalk and serious spectrum fragmentation. To address this problem, we proposed a novel strategy named DLHGA. The strategy first exploits a DL model based on a multilayer LSTM to obtain five important attributes of upcoming traffic requests in advance. Then, a 3D network resource model based on hierarchical graphs is built to reduce inter-core crosstalk and spectrum fragmentation. In the first layer, a best-effort approach based on graph coloring theory avoids inter-core crosstalk, and more reasonable resource scheduling is achieved using the size, arrival time and holding time of future requests predicted by the DL model. In the second layer, a strictly constrained approach based on distance-adaptive modulation with downgrading overcomes the tradeoff between inter-core crosstalk and resource cost, using the predicted source and destination node information. Simulation results show that, compared with baseline strategies, traffic prediction using the DL model, combined with carefully designed inter-core and intra-core resource allocation based on hierarchical graphs for MCF, is quite effective in improving network performance by alleviating the impact of both spectrum fragmentation and inter-core crosstalk.

Funding

National Natural Science Foundation of China (61401052); China Scholarship Council (201608500030); Chongqing Municipal Education Commission (KJ1400418, KJ1500445); Chongqing University of Posts and Telecommunications (A2015-09); Program for Innovation Team Building at Institutions of Higher Education in Chongqing (CXTDX201601020).

Disclosures

There are no conflicts of interest.

References

1. A. D. Ellis, J. Zhao, and D. Cotter, “Approaching the non-linear Shannon limit,” J. Lightwave Technol. 28(4), 423–433 (2010). [CrossRef]  

2. A. Muhammad, G. Zervas, and R. Forchheimer, “Resource allocation for space-division multiplexing: Optical white box versus optical black box networking,” J. Lightwave Technol. 33(23), 4928–4941 (2015). [CrossRef]  

3. G. M. Saridis, D. Alexandropoulos, G. Zervas, and D. Simeonidou, “Survey and evaluation of space division multiplexing: From technologies to optical networks,” IEEE Commun. Surv. Tutorials 17(4), 2136–2156 (2015). [CrossRef]  

4. H. Yuan, M. Furdek, A. Muhammad, A. Saljoghei, L. Wosinska, and G. Zervas, “Space-division multiplexing in data center networks: on multi-core fiber solutions and crosstalk-suppressed resource allocation,” J. Opt. Commun. Netw. 10(4), 272–288 (2018). [CrossRef]  

5. M. Klinkowski, P. Lechowicz, and K. Walkowiak, “Survey of resource allocation schemes and algorithms in spectrally-spatially flexible optical networking,” Opt. Switch. Netw. 27, 58–78 (2018). [CrossRef]  

6. S. Fujii, Y. Hirota, H. Tode, and K. Murakami, “On-demand spectrum and core allocation for reducing crosstalk in multicore fibers in elastic optical networks,” J. Opt. Commun. Netw. 6(12), 1059–1071 (2014). [CrossRef]  

7. J. Tu, K. Saitoh, M. Koshiba, K. Takenaga, and S. Matsuo, “Design and analysis of large-effective-area heterogeneous trench-assisted multi-core fiber,” Opt. Express 20(14), 15157–15170 (2012). [CrossRef]  

8. M. Yang, Y. Zhang, and Q. Wu, “Routing, spectrum, and core assignment in SDM-EONs with MCF: Node-arc ILP/MILP methods and an efficient XT-aware heuristic algorithm,” J. Opt. Commun. Netw. 10(3), 195–208 (2018). [CrossRef]  

9. Q. Yao, H. Yang, R. Zhu, A. Yu, W. Bai, Y. Tan, J. Zhang, and H. Xiao, “Core, mode, and spectrum assignment based on machine learning in space division multiplexing elastic optical networks,” IEEE Access 6, 15898–15907 (2018). [CrossRef]  

10. Y. Zhao, L. Hu, R. Zhu, X. Yu, X. Wang, and J. Zhang, “Crosstalk-aware spectrum defragmentation based on spectrum compactness in space division multiplexing enabled elastic optical networks with multicore fiber,” IEEE Access 6, 15346–15355 (2018). [CrossRef]  

11. Y. Zhao, L. Hu, R. Zhu, X. Yu, Y. Li, W. Wang, and J. Zhang, “Crosstalk-aware spectrum defragmentation by re-provisioning advance reservation requests in space division multiplexing enabled elastic optical networks with multi-core fiber,” Opt. Express 27(4), 5014–5032 (2019). [CrossRef]  

12. S. Sugihara, Y. Hirota, S. Fujii, H. Tode, and T. Watanabe, “Dynamic resource allocation for immediate and advance reservation in space-division-multiplexing-based elastic optical networks,” J. Opt. Commun. Netw. 9(3), 183–197 (2017). [CrossRef]  

13. H. Tode and Y. Hirota, “Routing, spectrum, and core and/or mode assignment on space-division multiplexing optical networks,” J. Opt. Commun. Netw. 9(1), A99–A113 (2017). [CrossRef]  

14. Y. Xiong, Y. Yang, Y. Ye, and G. N. Rouskas, “A machine learning approach to mitigating fragmentation and crosstalk in space division multiplexing elastic optical networks,” Opt. Fiber Technol. 50, 99–107 (2019). [CrossRef]  

15. J. Mata, I. de Miguel, R. J. Durán, N. Merayo, S. K. Singh, A. Jukan, and M. Chamania, “Artificial intelligence (AI) methods in optical networks: A comprehensive survey,” Opt. Switch. Netw. 28, 43–57 (2018). [CrossRef]  

16. Y. Zhao, B. Yan, Z. Li, W. Wang, Y. Wang, and J. Zhang, “Coordination between control layer AI and on-board AI in optical transport networks,” J. Opt. Commun. Netw. 12(1), A49–A57 (2020). [CrossRef]  

17. J. Guo and Z. Zhu, “When deep learning meets inter-datacenter optical network management: Advantages and vulnerabilities,” J. Lightwave Technol. 36(20), 4761–4773 (2018). [CrossRef]  

18. T. H. H. Aldhyani and M. R. Joshi, “Integration of time series models with soft clustering to enhance network traffic forecasting,” in 2016 Second International Conference on Research in Computational Intelligence and Communication Networks (ICRCICN), (2016), pp. 212–214.

19. S. K. Singh and A. Jukan, “Machine-learning-based prediction for resource (re)allocation in optical data center networks,” J. Opt. Commun. Netw. 10(10), D12–D28 (2018). [CrossRef]  

20. Y. Xiong, J. Shi, Y. Yang, Y. Lv, and G. N. Rouskas, “Lightpath management in sdn-based elastic optical networks with power consumption considerations,” J. Lightwave Technol. 36(9), 1650–1660 (2018). [CrossRef]  

21. I. Goodfellow, Y. Bengio, and A. Courville, Deep Learning (MIT Press, 2016).

22. B. Li, W. Lu, S. Liu, and Z. Zhu, “Deep-learning-assisted network orchestration for on-demand and cost-effective vnf service chaining in inter-dc elastic optical networks,” J. Opt. Commun. Netw. 10(10), D29–D41 (2018). [CrossRef]  

23. D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

24. J. Perelló, J. M. Gené, A. Pagès, J. A. Lazaro, and S. Spadaro, “Flex-grid/SDM backbone network design with inter-core XT-limited transmission reach,” J. Opt. Commun. Netw. 8(8), 540–552 (2016). [CrossRef]  

25. R. Zhu, Y. Zhao, H. Yang, X. Yu, Y. Tan, J. Zhang, N. Wang, and J. P. Jue, “Multi-dimensional resource assignment in spatial division multiplexing enabled elastic optical networks with multi-core fibers,” in 2016 15th International Conference on Optical Communications and Networks (ICOCN), (2016), pp. 1–3 .

26. Y. Xiong, X. Fan, and S. Liu, “Fairness enhanced dynamic routing and spectrum allocation in elastic optical networks,” IET Commun. 10(9), 1012–1020 (2016). [CrossRef]  

27. N. Wang, J. P. Jue, X. Wang, Q. Zhang, H. C. Cankaya, Q. She, W. Xie, and M. Sekiya, “Scheduling large data flows in elastic optical inter-datacenter networks,” in 2015 IEEE Global Communications Conference (GLOBECOM), (2015), pp. 1–6 .




Table 1. Transmission rate, maximal transmission distance and crosstalk threshold of different modulation formats [2,24]
