Multi-link faults localization and restoration based on fuzzy fault set for dynamic optical networks

Yongli Zhao; Xin Li; Huadong Li; Xinbo Wang; Jie Zhang; Shanguo Huang

doi:10.1364/OE.21.001496

1. Introduction

Dynamic all-optical networks with a GMPLS control plane are very attractive for the future core network because they can reduce the power consumption of electronic switches and deliver ultra-large files efficiently for datacenter applications and backup. Because these networks carry huge bandwidth, any fault in a fiber or other component can cause a huge amount of data loss, so it is very important to incorporate efficient network monitoring, fast fault diagnosis, and restoration mechanisms into the optical network. Multi-link faults are an important fault type that can severely degrade network performance; they may occur within a shared risk link group (SRLG) [1], where the failure of a common resource can break multiple links simultaneously. Accordingly, several SRLG-based schemes have been proposed to provide survivability against multi-link faults [2, 3]. For example, a dynamic shared-path protection (DSPP) algorithm based on SRLGs has been proposed to protect against multi-link failures in WDM mesh networks [2], and a path-protection algorithm called Enhanced Shared Backup Paths Protection (ESBPP) is designed to provide complete survivability for double-link failures in wavelength-division-multiplexing optical networks. Compared with previous algorithms for double-link failures, ESBPP achieves a higher resource-utilization ratio and a lower blocking probability [3].

Before carrying out protection and restoration, the network manager must first determine how many faults have occurred and where each one is located. To localize multi-link faults in large-scale optical networks, the network manager must deploy an efficient localization algorithm that identifies the most likely faulty elements from the received information. Many fast localization algorithms have been proposed in the literature for optical network monitoring, especially for SRLG localization [4–6]. These methods may use different network models, channel descriptions, information, and processing methodologies to localize link faults [7]. Centralized fault localization mechanisms usually provide a list of components whose failure explains the observed alarms; for example, pre-computation and sequential diagnosis mechanisms are used to maintain scalability [8]. Distributed mechanisms, in contrast, rely on keep-alive or notification messages to locate the root of a fault. One of the most representative distributed localization mechanisms is the link management protocol (LMP) [9], which is part of the GMPLS protocol suite. LMP always requires a supervisory channel, which should not be in-fiber so that the supervisory channel is not lost in the event of a link fault.

A distributed fault localization protocol called the Limited perimeter Vector Matching (LVM) protocol has been proposed for localizing single-link faults in all-optical networks [10]. This protocol assumes that no power monitoring is available at intermediate nodes and that only destination nodes and edge nodes can detect the power loss or quality degradation of an optical signal. The parallel limited perimeter vector matching (P-LVM) protocol has also been proposed for localizing multi-link faults in all-optical networks [11]. To handle multi-link faults, it first associates each fault with its own small perimeter area, so that the faults are separated from one another, and then localizes them in parallel in a distributed manner. In [12], we implemented a fuzzy fault set (FFS) based multi-link fault localization mechanism in multi-domain large-capacity optical networks, which achieves higher scalability, speed, and success rate than the extended LVM protocol.

On the other hand, multi-link fault localization depends mainly on the information obtained, such as the bit-error rate (BER) of each lightpath. Although analog optical monitoring techniques [13] can estimate the BER of links by measuring the optical signal-to-noise ratio (OSNR), these techniques are expensive and their results may not be accurate [8, 14]. Moreover, other schemes such as m-cycles, m-paths, and m-trails occupy extra bandwidth resources [15–17]. Because the multi-link fault localization problem is NP-hard [18], processing time may become an issue in large-scale mesh optical networks, which limits the scalability of the schemes above. We have therefore proposed a novel BER monitoring approach for transparent and translucent dynamic optical networks, which can monitor, detect, and localize multiple soft link faults without any additional optical monitoring equipment [19].

Unlike the centralized and distributed fault localization mechanisms, our proposed multi-link fault localization algorithm can be performed at any node of the network based on the fuzzy fault set (FFS), which is constructed from BER information collected in a distributed manner. Compared with all-optical out-of-band monitoring mechanisms such as monitoring cycles (m-cycles), m-paths, and m-trails, it needs no supervisory lightpaths. Restoration is necessary once the localization phase is complete. Building on the BER monitoring approach and the multi-fault localization mechanism, this paper proposes a novel restoration algorithm for multiple soft link faults based on FFS.

The rest of this paper is organized as follows. Section 2 states the problem, and Section 3 introduces the concept of the fuzzy fault set (FFS). Section 4 designs the necessary protocol extension, and Section 5 proposes the FFS-based multi-link fault localization mechanism. Section 6 focuses on the multi-link fault restoration algorithm. Section 7 discusses numerical results obtained on the GMPLS-enabled optical network testbed. Finally, Section 8 concludes the paper.

2. Problem statement

To illustrate the essential nature of multi-link fault localization, including its mathematical model, we consider only optical networks in which intra-domain nodes have no receivers for estimating the BER. Only edge nodes and destination nodes are equipped with receivers to estimate the BER. The optical flow therefore traverses the lightpath without any optical-to-electrical conversion except at edge nodes and destination nodes.

We assume each fiber is wavelength multiplexed, so the number of lightpaths traversing a link can be as large as the multiplexing degree. When a link fails, the destination nodes of the lightpaths that traverse it detect signal loss or a high BER. As shown in Fig. 1, the established lightpaths in the network are P₁: c-f-e, P₂: b-c-f-g-a, and P₃: c-f-g-h-d. Destination nodes a and d detect signal loss or an abnormal BER, but destination node e does not detect any problem, so the faults affect lightpaths P₂ and P₃. Here we consider only link faults. In the single-fault case, we can determine that Link₁ is the failed link. Under multiple faults, however, we cannot determine which links have failed, because several combinations can cause the observed status, such as Link₃&Link₄, Link₃&Link₅, and Link₃&Link₄&Link₅. In other words, we cannot determine whether each of these links has failed, so we define them as network risk resources. We introduce the concept of the fuzzy fault set (FFS) and put the network risk resources into the FFS.

Fig. 1 Multi-link faults in optical networks
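
The ambiguity in this example can be made concrete with a small sketch. It is only an illustration under stated assumptions: links are named by their endpoint pairs (the Link₁ to Link₅ numbering of Fig. 1 is not reproduced here), every link of a lightpath that reports no problem is treated as healthy, and candidate fault combinations are enumerated by brute force. The function names are hypothetical.

```python
from itertools import combinations

# Lightpaths from the example; links are labelled by their endpoints because
# the Link numbering of Fig. 1 is not available here (assumption).
LIGHTPATHS = {
    "P1": ["c", "f", "e"],
    "P2": ["b", "c", "f", "g", "a"],
    "P3": ["c", "f", "g", "h", "d"],
}
ALARMED = {"P2", "P3"}  # destinations a and d report signal loss or abnormal BER


def links_of(route):
    """Return the undirected links along a route as 'x-y' labels."""
    return {"-".join(sorted(pair)) for pair in zip(route, route[1:])}


def suspect_links(lightpaths, alarmed):
    """Links on alarmed lightpaths that no healthy lightpath traverses."""
    suspects = set()
    for name in alarmed:
        suspects |= links_of(lightpaths[name])
    for name, route in lightpaths.items():
        if name not in alarmed:
            suspects -= links_of(route)
    return suspects


def explanations(lightpaths, alarmed, max_faults):
    """List suspect-link combinations that hit every alarmed lightpath."""
    suspects = sorted(suspect_links(lightpaths, alarmed))
    found = []
    for size in range(1, max_faults + 1):
        for combo in combinations(suspects, size):
            if all(links_of(lightpaths[p]) & set(combo) for p in alarmed):
                found.append(combo)
    return found
```

Under these assumptions, `explanations(LIGHTPATHS, ALARMED, 1)` returns the single link f-g, whereas allowing up to two simultaneous faults yields nine candidate combinations, which is why the suspect links are better treated as risk resources than as confirmed faults.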

Symbol	Meaning
i	DLP index
j	Link index
p	Lightpath index
ω	Wavelength index
G = (V, E, η)	An arc-weighted network topology
V	Vertex set
E	Edge set
η	Weight function η: E → R₊ mapping the risk of the fuzzy fault set
N	N = |V|
L	L = |E|
W	The number of available wavelengths per fiber link
Λ	Λ = {λ₁, λ₂, …, λ_W}, the set of available wavelengths on each fiber link
D	The total number of DLPs; it also denotes the total number of symptoms observed by the network fault management center
P_i,k	The k-th alternate restoration path connecting node S_i to node D_i (the source and destination of the i-th DLP)
R_i	The set of restoration paths computed for DLP i
P	The set of alternate restoration paths computed between the source and destination nodes of each DLP in the network. Clearly |P| ≤ DK.

*Algorithm 1* Bip_G
Require: Bip_G to be set up
F ← ∅
*for* $a \in A$ *do*
  Find all likely faults Fault(a) and record the dependency
  F ← F $\cup$ Fault(a)
*end for*
*for* $n \in NLP$ *do*
  Find all normal elements Normal(n)
  F ← F $\setminus$ Normal(n)
*end for*
Return A, F and the dependency
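
A minimal Python sketch of Algorithm 1 follows. The interpretation is an assumption, since the excerpt does not define the sets: A is taken to be the alarmed lightpaths, NLP the lightpaths that report no problem, Fault(a) every link on lightpath a, and Normal(n) every link on lightpath n. Pruning exonerated links from the recorded dependency is also an assumption, and the names `links_of` and `build_bipartite` are hypothetical.

```python
def links_of(route):
    """Return the undirected links along a route as 'x-y' labels."""
    return {"-".join(sorted(pair)) for pair in zip(route, route[1:])}


def build_bipartite(lightpaths, alarmed):
    """Return (A, F, dependency) in the spirit of Algorithm 1.

    lightpaths -- dict mapping a lightpath name to its node sequence
    alarmed    -- names of lightpaths whose destination raised an alarm
    """
    A = sorted(alarmed)
    F = set()
    dependency = {}
    for a in A:
        fault_a = links_of(lightpaths[a])      # Fault(a): every link on a
        dependency[a] = fault_a                # record alarm -> candidate links
        F |= fault_a
    for n in sorted(set(lightpaths) - set(alarmed)):
        F -= links_of(lightpaths[n])           # Normal(n): links known to work
    # Assumption: edges that point to exonerated links are dropped.
    dependency = {a: candidates & F for a, candidates in dependency.items()}
    return A, F, dependency
```

The returned `dependency` mapping plays the role of the bipartite graph: one side holds the alarms, the other the candidate links, and an edge means that a fault on the link could explain the alarm.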

*Algorithm 2* Tol_R
Require: Tol_R to be calculated
Tol_R ← 0
*for* $F_{i} \in FT$ *do*
  *for* $l \in L$ and $l \in F_{i}$ *do*
    Tol_R ← Tol_R + $μ_{F_{i}}(l)$
  *end for*
*end for*
Return Tol_R
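
The accumulation in Algorithm 2 can be sketched as follows. This is an illustration under assumptions rather than the authors' implementation: each fuzzy fault set F_i is represented as a dictionary from link label to membership degree μ, the set L is read as the links of the route being evaluated, and the membership values in the usage note are invented for the example.

```python
def total_risk(route_links, fuzzy_fault_sets):
    """Sum the membership degrees of a route's links over all fuzzy fault sets.

    route_links      -- iterable of link labels on the route (assumed meaning of L)
    fuzzy_fault_sets -- list of dicts {link label: membership degree in [0, 1]}
    """
    tol_r = 0.0
    for ffs in fuzzy_fault_sets:          # for F_i in FT
        for link in route_links:          # for l in L ...
            if link in ffs:               # ... and l in F_i
                tol_r += ffs[link]        # Tol_R = Tol_R + mu_{F_i}(l)
    return tol_r
```

For instance, with the hypothetical sets `{"f-g": 0.5, "a-g": 0.25}` and `{"a-g": 0.25, "g-h": 0.5}`, a route over links c-f, f-g, and a-g accumulates a total risk of 1.0, while a route that avoids all three suspect links scores 0.0.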

*Algorithm 3* Risk based Restoration Algorithm for DLP $i$ (RRA)
Require: DLP $i$ to be recovered; return a physical route and a free wavelength for DLP $i$
Input: K, R_i, W
*for* each DLP $i$ *do*
  *for* k = 1 to K *do*
    *for* ω = 1 to W *do*
      Compute $γ_{i, k}^{W}$, $c_{j, k}^{ω}$
    *end for*
  *end for*
  *if* ($\sum_{k = 1}^{K} σ_{i, k} \geq π_{i}$) and ($c_{j, k}^{ω} \leq R$) *then*
    Set up the DLP
    k ← 1
    p ← 1
    *while* ($k \leq K$) and ($p \leq π_{i}$) *do*
      ω ← 1
      *while* ($ω \leq W$) and ($p \leq π_{i}$) *do*
        *if* $c_{j, k}^{ω} \leq R$ *then*
          $c_{j, k}^{ω} \leftarrow +\infty$ (note: the cost on wavelength $λ_{ω}$ of all paths in $B_{i, k}$ that share at least one common fiber link with $P_{i, k}$)
          p ← p + 1
        *end if*
        ω ← ω + 1
      *end while*
      k ← k + 1
    *end while*
  *else*
    The DLP cannot be set up
  *end if*
*end for*
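
The control flow of Algorithm 3 can be approximated by the greedy first-fit sketch below. The mapping to the paper's symbols is an assumption, because γ, σ, π_i, R, and B_{i,k} are not defined in this excerpt: `demand` stands in for π_i, `risk_limit` for R, `costs[k][w]` for the cost of candidate path k on wavelength w, and `conflicts[k]` for the candidate paths that share a fiber link with path k. All names are hypothetical.

```python
import math


def restore_dlp(costs, demand, risk_limit, conflicts):
    """Greedy first-fit assignment of (path, wavelength) channels for one DLP.

    costs[k][w]  -- assumed risk cost of candidate path k on wavelength w
    demand       -- number of channels the DLP needs (stands in for pi_i)
    risk_limit   -- largest acceptable cost (stands in for R)
    conflicts[k] -- indices of candidate paths sharing a fiber link with path k
    Returns a list of (k, w) pairs, or None if the DLP cannot be set up.
    """
    costs = [list(row) for row in costs]  # work on a copy, keep the input intact
    usable = sum(c <= risk_limit for row in costs for c in row)
    if usable < demand:                   # quick feasibility check
        return None
    chosen = []
    for k, row in enumerate(costs):       # candidate paths in order
        for w in range(len(row)):         # wavelengths in order
            if len(chosen) == demand:
                return chosen
            if row[w] <= risk_limit:
                chosen.append((k, w))
                row[w] = math.inf         # the channel is now taken
                for other in conflicts.get(k, ()):
                    # Same wavelength is unusable on paths sharing a fiber link.
                    costs[other][w] = math.inf
    return chosen if len(chosen) == demand else None
```

The design choice here is first fit over paths and then wavelengths, mirroring the nested loops of the pseudocode; a cost-ordered search would be an equally plausible reading.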

Optics Express