
Spare Capacity Allocation in Multi-Layer Networks

Yu Liu†, David Tipper‡, Kom Vajanapoom‡

†OPNET Technologies, Inc., 200 Regency Forest Drive, Cary, North Carolina 27511, USA
email: [email protected]

‡Department of Information Science and Telecommunications, University of Pittsburgh, Pittsburgh, Pennsylvania 15260, USA
email: [email protected], [email protected]

Abstract: In this paper, we consider the problem of provisioning spare capacity in multi-layer backbone networks in order to meet survivability requirements. To capture the failure propagation across network layers, two different multi-layer spare capacity allocation optimization problems are formulated in matrix format. They use the failure propagation matrix to determine the location and the amount of spare capacity in each network layer. For scalability, a fast and efficient approximation algorithm based on our previous successive survivable routing (SSR) technique is developed. Numerical results for a variety of networks show that near optimal solutions are found by the proposed heuristic algorithm within limited time.

I. INTRODUCTION

Survivability in the face of failures has become an essential property of backbone transport networks. Current backbone data networks are converging towards a two layer architecture of IP/MPLS or GMPLS over an optical transport layer. Survivability techniques in two layer networks can be classified as survivability at the bottom layer, survivability at the top layer, and survivability at both layers, depending on the layer in which the survivability technique is deployed [1]. In the bottom layer approach, recovery from a failure is performed only at the bottom layer (e.g., recovering failed lightpaths in an optical transport network). This scheme is simple and provides fast recovery of aggregate traffic. However, its major drawback is that it cannot recover from failures that occur in the top layer, such as the failure of a top layer router or its interfaces. In the survivability at top layer scheme, failure recovery is performed only at the top layer (e.g., recovering failed label switched paths (LSPs) in an MPLS network using fast reroute methods). The advantage of this scheme is that it can recover from failures that occur in both layers. It also allows service differentiation between top layer flows by recovering each individual flow in the top layer, which is not possible in the bottom layer survivability scheme, where an aggregate of top layer flows is recovered. Among the drawbacks of this approach are its complexity and slower speed of fault recovery. One of the major problems in the survivability of such multi-layer networks is failure propagation, which occurs when the failure of a bottom-layer link or node results in the simultaneous failure of multiple top-layer links [1], [2], [3]. If failure propagation is not considered appropriately in multi-layer networks, the top layer survivability technique may fail to recover the communication services upon failure. Several approaches have been proposed to design survivable virtual topologies in the top layer while taking failure propagation into account [1], [2], [3], [4], [5], [6], [7], [8]. In part due to failure propagation, each layer of a network will typically employ self-healing capabilities to address faults occurring in its own layer. In this multi-layer scheme, coordination between layers is required to provide an efficient recovery process upon a failure. This coordination is called an escalation strategy. It determines which layer performs recovery first in response to a particular failure, and when and how responsibility is transferred to another layer if the current layer fails to recover from the failure [1], [9].

In this paper, we provide models for the spare capacity allocation (SCA) problem in multi-layer networks. The focus of our study is on SCA models for restoration at the upper layer. First, we derive two models for the SCA problem in multi-layer networks using failure-independent path restoration. The first model captures the failure propagation by extending the matrix-based SCA formulation in [10]. The second model improves on the first one by increasing the amount of spare capacity sharing. Modifications to the basic models for the failure-dependent path restoration and the stub release cases are also provided. Numerical results show that the proposed models can be solved by a heuristic algorithm, called the successive survivable routing (SSR) algorithm, to find near optimal solutions.

0-7803-9439-9/05/$20.00 ©2005 IEEE

TABLE I
ACRONYMS

| Acronym | Meaning |
|---|---|
| SCA | Spare capacity allocation |
| SPM | Spare provision matrix |
| SSR | Successive survivable routing |
| BB | Branch and bound used in CPLEX |
| InP | Integer programming |
| FID | Failure-independent path restoration |
| FD | Failure-dependent path restoration |
| FDStubR | Failure-dependent path restoration with stub release |
| iff | if and only if |

II. THE SPARE CAPACITY ALLOCATION MODEL

In this section, the spare capacity allocation (SCA) problem for a single layer network presented in [10] is briefly reviewed to provide background. We assume the network under study uses failure-independent path restoration (FID) for an arbitrary failure condition. FID is also called path restoration with disjoint routes, where a backup path is always disjoint from its working path. The disjointness can be with respect to either links or nodes. We assume all traffic flows require a 100% restoration level for any failure. This level of restoration requires that all affected flows be detoured to their backup paths upon any given failure. Provisioning enough spare capacity is the prerequisite for such restoration. The acronyms used in this paper are listed in Table I. The notation adopted is summarized in Table II.

TABLE II
NOTATION

| Symbol | Meaning |
|---|---|
| $N, L, R, K$ | Numbers of nodes, links, flows and failure scenarios |
| $n, l, r, k$ | Indices of nodes, links, flows and failures |
| $P = \{p_r\} = \{p_{rl}\}$ | Working path link incidence matrix |
| $Q = \{q_r\} = \{q_{rl}\}$ | Backup path link incidence matrix |
| $M = \mathrm{Diag}(\{m_r\})$ | Diagonal matrix of demand bandwidth $m_r$ of flow $r$ |
| $G = \{g_{lk}\}_{L \times K}$ | Spare provision matrix, $g_{lk}$ is spare capacity on link $l$ for failure $k$ |
| $G^r = \{g^r_{lk}\}_{L \times K}$ | Contribution of flow $r$ to $G$ |
| $s = \{s_l\}_{L \times 1}$ | Vector of link spare capacity |
| $\phi = \{\phi_l\}_{L \times 1}$ | Spare capacity cost function |
| $W, S$ | Total working, spare capacity |
| $\eta = S/W$ | Network redundancy |
| $o(r), d(r)$ | Origin/destination nodes of flow $r$ |
| $v_r = \{v_{rl}\}$ | Vector of cost on additional link spare capacity for flow $r$ |
| $B = \{b_{nl}\}_{N \times L}$ | Node link incidence matrix |
| $D = \{d_{rn}\}_{R \times N}$ | Flow node incidence matrix |
| $F = \{f_{kl}\}_{K \times L}$ | Failure link incidence matrix, $f_{kl} = 1$ iff link $l$ fails in failure $k$ |
| $U = \{u_{rk}\}_{R \times K}$ | Flow failure incidence matrix, $u_{rk} = 1$ iff failure $k$ will affect flow $r$'s working path |
| $T = \{t_{rl}\}_{R \times L}$ | Flow tabu-link matrix, $t_{rl} = 1$ iff link $l$ should not be used on flow $r$'s backup path |

We model an uncapacitated network by a directed graph of $N$ nodes, $L$ links, and $R$ flows. A flow $r$, $1 \le r \le R$, is specified by its origin/destination node pair $(o(r), d(r))$ and traffic demand $m_r$. Working and backup paths of flow $r$ are represented by two $1 \times L$ binary row vectors $p_r = \{p_{rl}\}$ and $q_r = \{q_{rl}\}$ respectively. The $l$-th element in one of the vectors equals one if and only if (iff) the corresponding path uses link $l$. The path link incidence matrices for working and backup paths are the collections of these vectors, forming two $R \times L$ matrices $P = \{p_{rl}\}$ and $Q = \{q_{rl}\}$ respectively. Let $M = \mathrm{Diag}(\{m_r\}_{R \times 1})$ denote the diagonal matrix representing the demands of each flow.

The topology is represented by the node-link incidence matrix $B = (b_{nl})_{N \times L}$, where $b_{nl} = 1$ or $-1$ iff node $n$ is the origin or destination node of link $l$. $D = (d_{rn})_{R \times N}$ is the flow node incidence matrix, where $d_{rn} = 1$ or $-1$ iff $o(r) = n$ or $d(r) = n$. We characterize $K$ failure scenarios in a binary matrix $F = \{f_k\}_{K \times 1} = \{f_{kl}\}_{K \times L}$. The row vector $f_k$ in $F$ is for failure scenario $k$ and its element $f_{kl}$ equals one iff link $l$ fails in scenario $k$. In this way, each failure scenario includes a set of one or more links that fail simultaneously in the scenario. For a failed node, all its adjacent links are marked as failed. We also define a flow failure incidence matrix $U = \{u_r\}_{R \times 1} = \{u_{rk}\}_{R \times K}$, where $u_{rk} = 1$ iff flow $r$ is affected by failure $k$, and $u_{rk} = 0$ otherwise. A flow tabu-link matrix $T = \{t_r\}_{R \times 1} = \{t_{rl}\}_{R \times L}$ has $t_{rl} = 1$ iff the backup path of flow $r$ should not use link $l$, and $t_{rl} = 0$ otherwise. We can find $U$ and $T$ given $P$ and $F$ as shown in equations (7) and (8) respectively. A binary matrix multiplication operation "$\odot$" is used in these two equations. It is identical to normal matrix multiplication except that the general numerical addition $1 + 1 = 2$ is replaced by the boolean addition $1 + 1 = 1$, as described in [11]. Using this binary operator, the complicated logical relations among links, paths and failure scenarios are simplified into two matrix operations.

We let $G = \{g_{lk}\}_{L \times K}$ denote the spare provision matrix (SPM), whose elements $g_{lk}$ are the minimum spare capacity required on link $l$ when link $k$ fails. Note that $K = L$ when the SCA protects all single link failures. Given the backup paths $Q$, the demand bandwidth matrix $M$, and the working paths $P$, the spare provision matrix (SPM) can be determined in (3). The minimum spare capacity required on each link is denoted by the column vector $s = \{s_l\}_{L \times 1}$, which is found in (2). The function max in (2) asserts that an element in $s$ is equal to the maximum element in the corresponding row of $G$. It is equivalent to $s \ge G$ in this optimization model, which is a linear programming format. The operator $\ge$ between a column vector $s$ and a matrix $G$ guarantees that any element in $s$ is not less than any element in the corresponding row of $G$. Let $\phi_l$ denote the cost function of spare capacity on link $l$. $\phi = \{\phi_l\}_{L \times 1}$ is a column vector of these cost functions and $\phi(s)$ gives the cost vector of the spare capacities on all links. The total cost of spare capacity on the network is $e^T \phi(s)$, where $e$ is the unit column vector of size $L$. For simplicity, we assume all cost functions are identity functions, i.e., one bandwidth unit costs one monetary unit.
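The binary product used for $U$ and $T$ in (7) and (8) can be illustrated with a minimal Python sketch. The two-flow, four-link instance and the helper names `transpose` and `bool_matmul` are hypothetical and chosen only for illustration; they do not come from [10] or [11].

```python
def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


def bool_matmul(a, b):
    """Binary matrix product: like the usual product, but with 1 + 1 = 1."""
    return [[int(any(x and y for x, y in zip(row, col))) for col in zip(*b)]
            for row in a]


# Hypothetical instance: 2 flows, 4 links, 3 failure scenarios.
P = [[1, 0, 0, 0],   # flow 0 works on link 0
     [0, 0, 1, 0]]   # flow 1 works on link 2
F = [[1, 0, 0, 0],   # scenario 0: link 0 fails
     [0, 0, 1, 0],   # scenario 1: link 2 fails
     [1, 1, 0, 0]]   # scenario 2: links 0 and 1 fail together

U = bool_matmul(P, transpose(F))  # equation (7): scenarios that hit each flow
T = bool_matmul(U, F)             # equation (8): links each backup path must avoid

print("U =", U)
print("T =", T)
```

In this instance flow 0 is affected by scenarios 0 and 2, so its tabu row marks links 0 and 1, even though its working path uses link 0 only.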

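Given the notation and definitions above, the spare capacity allocation (SCA) problem can be formulated as follows.

\begin{align}
\min_{Q,\,s} \quad & S = e^T s \tag{1} \\
\text{s.t.} \quad & s = \max G \tag{2} \\
& G = Q^T M U \tag{3} \\
& T + Q \le 1 \tag{4} \\
& Q B^T = D \tag{5} \\
& Q: \text{binary} \tag{6} \\
& U = P \odot F^T \tag{7} \\
& T = U \odot F \tag{8}
\end{align}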
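As a worked illustration of (1)-(3), the following sketch evaluates the spare provision matrix and the resulting spare capacity for a fixed choice of backup paths. The triangle network, the demands, and the function name `spare_provision_matrix` are assumptions made for this example only.

```python
def spare_provision_matrix(Q, m, U):
    """G = Q^T M U: G[l][k] is the spare capacity needed on link l under failure k."""
    R, L, K = len(Q), len(Q[0]), len(U[0])
    return [[sum(Q[r][l] * m[r] * U[r][k] for r in range(R)) for k in range(K)]
            for l in range(L)]


# Hypothetical triangle network: 3 links, single-link failures (K = L).
Q = [[0, 1, 1],   # flow 0 (working on link 0) is backed up over links 1 and 2
     [1, 0, 1]]   # flow 1 (working on link 1) is backed up over links 0 and 2
m = [2, 3]        # demands m_r
U = [[1, 0, 0],   # flow 0 is affected by the failure of link 0
     [0, 1, 0]]   # flow 1 is affected by the failure of link 1

G = spare_provision_matrix(Q, m, U)
s = [max(row) for row in G]   # equation (2): row-wise maximum
S = sum(s)                    # equation (1): total spare capacity e^T s

print("G =", G)
print("s =", s, "S =", S)
```

Link 2 carries both backup paths but needs only 3 units rather than 5, because the two flows are never restored under the same failure. This is the spare capacity sharing that the row-wise maximum in (2) captures.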

This SCA problem has the objective of minimizing the total spare capacity in (1) subject to the constraints (2)-(8). The decision variables are the backup path selection (i.e., the backup path matrix $Q$) and the spare capacity allocation (i.e., the vector $s$). Constraints (2) and (3) associate these variables, i.e., the spare capacity allocation $s$ is derived from the backup paths in $Q$. Constraint (4) guarantees that no backup path uses a link that might fail simultaneously with any link on its working path. The flow conservation constraint (5) guarantees that the backup paths given in $Q$ are feasible paths of flows in a directed network. Note that the incidence matrices $U$ and $T$ are precomputed. The matrix $U$ indicates the failure cases that will influence the working paths. The matrix $T$ indicates the links that should be avoided in the backup paths. A more detailed explanation of the model is in [10, eq. (7)-(14)].

The SCA model formulated above is a mixed integer programming problem. It is NP-complete. Hence, solving the problem for large networks is infeasible using standard integer programming solution methods. In [10], we proposed the successive survivable routing (SSR) heuristic algorithm to solve it. The SSR algorithm finds near optimal solutions by routing backup paths iteratively. Each backup path computation uses a shortest path algorithm. The link routing metric is the incremental spare capacity, computed from the most recent spare provision matrix, which in turn is based on the previously routed backup paths. After all flows find their backup paths, SSR continues to update existing backup paths whenever a new one could use less spare capacity. This process keeps reducing the total spare capacity until it converges (i.e., no more backup path updates). Different random orderings of the flows for routing backup paths are used to provide various solutions and avoid local minima. The best order gives the final near optimal solution [10].
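The iteration described above can be sketched in a few lines of Python. This is a simplified illustration, not the implementation from [10]: it assumes an undirected graph, single-link failures ($K = L$), a fixed flow order instead of random orderings, and a small per-hop epsilon as a tie-breaker. The names `shortest_path` and `ssr` and the 4-node example are hypothetical.

```python
import heapq


def shortest_path(n_nodes, links, cost, src, dst):
    """Dijkstra over undirected links; returns the link indices on the path.
    Assumes dst is reachable at finite cost (a disjoint backup path exists)."""
    adj = {n: [] for n in range(n_nodes)}
    for l, (a, b) in enumerate(links):
        adj[a].append((b, l))
        adj[b].append((a, l))
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, n = heapq.heappop(heap)
        if d > dist.get(n, float("inf")):
            continue
        for nxt, l in adj[n]:
            nd = d + cost[l]
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, (n, l)
                heapq.heappush(heap, (nd, nxt))
    path, n = [], dst
    while n != src:
        n, l = prev[n]
        path.append(l)
    return path


def ssr(n_nodes, links, flows, working, sweeps=10):
    """flows: (src, dst, demand) tuples; working: sets of working link indices."""
    L = len(links)
    backup = [set() for _ in flows]

    def spm(skip):
        # G[l][k]: spare needed on link l when link k fails, ignoring flow `skip`.
        G = [[0] * L for _ in range(L)]
        for r, (_, _, m) in enumerate(flows):
            if r != skip:
                for l in backup[r]:
                    for k in working[r]:
                        G[l][k] += m
        return G

    for _ in range(sweeps):
        changed = False
        for r, (src, dst, m) in enumerate(flows):
            G, cost = spm(skip=r), []
            for l in range(L):
                if l in working[r]:
                    cost.append(float("inf"))   # tabu link for this flow
                    continue
                new = max(G[l][k] + (m if k in working[r] else 0) for k in range(L))
                cost.append(new - max(G[l]) + 1e-6)  # incremental spare + tie-breaker
            path = set(shortest_path(n_nodes, links, cost, src, dst))
            old = sum(cost[l] for l in backup[r]) if backup[r] else float("inf")
            if sum(cost[l] for l in path) < old - 1e-9:
                backup[r], changed = path, True
        if not changed:
            break
    return backup, sum(max(row) for row in spm(skip=None))


# Hypothetical 4-node ring with one chord; three unit flows on single-link working paths.
links = [(0, 1), (1, 2), (2, 3), (3, 0), (0, 2)]
flows = [(0, 1, 1), (1, 2, 1), (2, 3, 1)]
working = [{0}, {1}, {2}]
backup, spare = ssr(4, links, flows, working)
print(backup, spare)
```

The link cost mirrors the idea of the incremental spare capacity metric: a link that already holds enough spare capacity for other failures costs almost nothing to reuse, which steers backup paths towards sharing.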

III. SCA MULTI-LAYER MODELS

This section is the new contribution of this paper. We extend the SCA model to a two layer network. In the top layer network, the notation of the previous section is reused, and the same notation with the superscript "b" is used for the bottom layer. A top-layer link is carried by a bottom-layer path. Such overlay information is defined by the interlayer link incidence matrix $H = \{h_{ij}\}_{L \times L^b}$, where $1 \le i \le L$, $1 \le j \le L^b$. Element $h_{ij}$ equals one iff the top-layer link $i$ uses the bottom-layer link $j$. Given the top-layer spare capacity allocation vector $s$, its equivalent bottom-layer spare capacity vector $s^b$ is given by:

$$s^b = H^T s. \tag{9}$$

Usually, each bottom layer link carries several top layer links; therefore a failure of a single bottom layer link could tear down a large number of top layer links simultaneously. In order to provide restoration at the upper layer, the interlayer link incidence matrix $H$ should guarantee that a failure of any single bottom-layer link would not partition the top layer topology. This property is called immunity from failure propagation. A math programming model to find such a mapping $H$ is provided in [5]. It is called the survivable topology layout problem in [12, Section 6.2], where the matrix-based formulation is provided. The overlay information $H$ is used as the input to the following SCA models.

First we consider the SCA problem for the failure-independent (FID) path restoration case. In FID each flow has a single backup path disjoint from any failures that affect its working path. Based on the layers where the restoration scheme exists, the SCA models can be classified as follows.

A. Restoration at the bottom layer

If the restoration only happens at the bottom layer, each top-layer link has a preplanned bottom-layer backup path besides its working path. The top-layer traffic flow is not aware of the bottom-layer restoration. This scheme is simple and provides fast restoration. However, it might not protect against all top-layer failures, such as the failure of a top-layer router or its interfaces. The SCA model for the overlay network here is simplified to one at the bottom layer only. The only modification beyond [10] is that the bottom-layer working paths are derived from the layout information $H$ in (10).

$$P^b = H \tag{10}$$

B. Restoration at the top layer

When the restoration is at the top layer, equations (1)-(8) in the previous section are extended based on two alternative usages of the overlay information $H$. We denote the two proposed models by the superscripts [A] and [B], which are added to the key SCA model components that change in the multi-layer model.

Model A: Use of overlay information for failure propagation. To protect against failure propagation of any single bottom layer link failure, the overlay information $H$ is used to derive the failure scenario matrix $F$ for the top layer SCA model in (11). The flow failure incidence matrix $U$ and the spare provision matrix $G^{[A]}$ are modified in (12) and (13) to replace (7) and (3).

$$F = F^b \odot H^T \tag{11}$$

$$U = P \odot F^T = P \odot H \odot F^{bT} \tag{12}$$

$$G^{[A]} = Q^T M U = Q^T M (P \odot H \odot F^{bT}) \tag{13}$$

In addition, the objective function to minimize the total spare capacity (1) is replaced by (14), where $\phi^T = e^T H^T$ is used to compute the actual spare capacity on the bottom layer reserved by the top-layer links.

$$\min_{Q} S^{[A]} = e^T s^b = e^T H^T s = \phi^T \max G^{[A]} \tag{14}$$

Model B: Use of the overlay information for both failure propagation and cross-layer spare capacity reservation. In addition to using the overlay information for failure propagation, the second model computes spare capacity sharing with a finer granularity. It keeps track of spare capacity sharing at the bottom layer, instead of at the top. Since every top link passes over one or more bottom links, spare capacity sharing on these bottom links can provide a finer granularity than on the top links. This can further reduce the total spare capacity. The disadvantage of this model is that the top layer needs a way to record and reserve spare capacity on the bottom-layer links. This might require complicated cross-layer protocols in the control plane.

Fig. 1. Network 0: 5-node overlay network

An example is given to illustrate the advantage of the finer granularity. In Fig. 1, two working flows a-b and c-d at the top layer pass over the bottom-layer links 1 and 5 respectively. Their backup paths might be a-c-b and c-b-a-d. They pass over the top-layer links 2 (a-c) and 3 (a-d) respectively. The bottom-layer paths of these two links overlap on the bottom-layer link 2 (a-e), whose spare capacity can be shared by the above backup paths. In this example, the top layer uses the overlay information $H$ not only to avoid the failure propagation, but also to reserve the spare capacity shared at the bottom layer in order to achieve lower redundancy.

This spare capacity sharing scheme is equivalent to: (i) converting all working and backup paths at the top layer into ones at the bottom by multiplexing them with $H$, as modeled in (9); (ii) then minimizing the objective function in (15).

$$\min_{Q} S^{[B]} = e^T s^b = e^T \max(H^T G^{[A]}) \tag{15}$$

Since $\max(H^T G^{[A]}) \le H^T \max(G^{[A]})$, the total spare capacity using Model B will be equal to or smaller than that in Model A.

$$S^{[B]} \le S^{[A]} \tag{16}$$
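The difference between the two objectives can be checked numerically. The sketch below builds a small hypothetical overlay (four top-layer links over three bottom-layer links, with matrices chosen for illustration rather than taken from Fig. 1) and evaluates (11)-(15) directly; the helper names are assumptions.

```python
def transpose(m):
    return [list(col) for col in zip(*m)]


def matmul(a, b):
    """Ordinary matrix product on lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def bool_matmul(a, b):
    """Binary matrix product with 1 + 1 = 1."""
    return [[int(any(x and y for x, y in zip(row, col))) for col in zip(*b)]
            for row in a]


# Hypothetical overlay: 4 top-layer links over 3 bottom-layer links.
H = [[1, 0, 0],   # top link 0 rides on bottom link 0
     [0, 1, 0],   # top link 1 rides on bottom link 1
     [0, 0, 1],   # top links 2 and 3 both ride on bottom link 2
     [0, 0, 1]]
Fb = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # single bottom-link failures
P = [[1, 0, 0, 0], [0, 1, 0, 0]]         # working paths of two top-layer flows
Q = [[0, 0, 1, 0], [0, 0, 0, 1]]         # their backup paths
M = [[2, 0], [0, 3]]                     # demands on the diagonal

F = bool_matmul(Fb, transpose(H))          # equation (11)
U = bool_matmul(P, transpose(F))           # equation (12)
G_A = matmul(matmul(transpose(Q), M), U)   # equation (13)

s_top = [[max(row)] for row in G_A]                        # column vector s
S_A = sum(row[0] for row in matmul(transpose(H), s_top))   # equation (14)
S_B = sum(max(row) for row in matmul(transpose(H), G_A))   # equation (15)

print("F =", F)
print("S_A =", S_A, "S_B =", S_B)
```

The last row of `F` shows failure propagation: the failure of bottom link 2 takes down top links 2 and 3 together. Model A reserves 2 + 3 units on bottom link 2, while Model B reserves only the larger of the two, since the two backup paths are used under different failures.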

This model also requires modifications in the successive survivable routing (SSR) algorithm in [10]. Since the spare provision matrix $G^{[A]}$ is converted to its bottom layer equivalent in (15), the incremental spare capacity vector $v_r$ used at the top layer in [10, Eq. (17)] has to be derived from its equivalent $v_r^b$ at the bottom layer in (17) below.

$$v_r = H v_r^b \tag{17}$$

where $v_r^b = s^{b*}(e - t_r^b) - s^{b-r}$. Here $s^{b*}$ and $s^{b-r}$ are the bottom layer equivalents of $s^*$ and $s^{-r}$, whose definitions are in step 3 of the SSR section in [10, Section V].

C. Restoration at both Layers

When both layers have path restoration schemes implemented and share their spare capacity, they might achieve better redundancy. This concept is called common pool survivability in [9]. Assuming both layers are immune from single bottom-layer link failures, spare capacity sharing between both layers can be done if their spare provision matrices are exchanged. The top-layer spare provision matrix $G^{[A]}$ is transformed and merged with the spare provision matrix $G^b$ at the bottom layer in (18).

$$G^{[C]} = G^b + H^T G^{[A]} \tag{18}$$

The objective function for the SCA problem is updated to (19).

$$\begin{aligned}
\min_{Q,\,Q^b} S^{[C]} &= e^T s^b = e^T \max G^{[C]} \\
&= e^T \max(G^b + H^T G^{[A]}) \\
&\le e^T \max G^b + e^T \max(H^T G^{[A]}) \\
&= S^b + S^{[B]}
\end{aligned} \tag{19}$$
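The inequality in (19) can be illustrated with a small numerical sketch. Both matrices below are hypothetical values for a three-link bottom layer, picked so that the two layers peak under different failures; the helper `add` is an assumption for this example.

```python
def add(a, b):
    """Element-wise sum of two matrices of equal size."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Hypothetical 3-link bottom layer with 3 failure scenarios.
G_b = [[0, 1, 0],      # bottom-layer spare provision matrix G^b
       [0, 0, 0],
       [1, 0, 0]]
HtG_A = [[0, 0, 0],    # top-layer matrix mapped to the bottom layer, H^T G^[A]
         [0, 0, 0],
         [2, 3, 0]]

G_C = add(G_b, HtG_A)                   # equation (18)
S_C = sum(max(row) for row in G_C)      # common pool objective in (19)
S_b = sum(max(row) for row in G_b)      # bottom layer alone
S_B = sum(max(row) for row in HtG_A)    # top layer alone, Model B

print("S_C =", S_C, "<=", "S_b + S_B =", S_b + S_B)
```

On bottom link 2 the bottom layer needs its unit under failure 0, while the top layer peaks under failure 1, so the common pool needs 3 units there instead of 1 + 3.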

In the SSR algorithm, both layers perform their single layer SSR algorithms but their link metrics are different. The top layer has the knowledge of $H$ and uses it to convert its bottom-layer link metric $v_r^b$ back to $v_r$ using (17). In this way, both layers share a common spare capacity provision matrix $G^{[C]}$. They cooperate to further improve spare capacity sharing.

D. Failure-dependent path restoration at the top layer

All of the models above assume failure-independent (FID) path restoration. The SCA problem for failure-dependent (FD) path restoration is given in [13]. Its extension to multi-layer networks is below.

We use Model A, where the overlay information $H$ is used for the failure propagation. As shown in (11), any arbitrary bottom-layer failure has been captured in the failure matrix $F$. Similar to [13, Eq. (14)], in order to compute backup paths for individual failures, equation (13) that finds the spare provision matrix $G^{[A]}$ should be replaced by $G^{[D]}$ in (20).

$$G_k^{[D]} = Q_k^T M U_k, \quad 1 \le k \le K \tag{20}$$

The $k$-th column vector $G_k^{[D]} = \{g_{lk}\}_{L \times 1}$ of $G^{[D]}$ is determined by the $k$-th column vector $U_k = \{u_{rk}\}_{R \times 1}$ of the flow failure incidence matrix $U$ in (12), the demand matrix $M$, and the backup path matrix $Q_k$.
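A short sketch of (20) follows, building $G^{[D]}$ column by column from per-failure backup path matrices. The two-flow, three-link data and the function name `fd_spare_provision` are hypothetical and serve only to show the indexing.

```python
def fd_spare_provision(Q_by_failure, m, U):
    """Column k of G^[D] is Q_k^T M U_k, where Q_k holds the backup paths
    used under failure k and U_k is the k-th column of U."""
    K, R, L = len(Q_by_failure), len(m), len(Q_by_failure[0][0])
    return [[sum(Q_by_failure[k][r][l] * m[r] * U[r][k] for r in range(R))
             for k in range(K)] for l in range(L)]


# Hypothetical data: 2 flows, 3 links, 2 failure scenarios.
Q_by_failure = [
    [[0, 1, 0], [0, 0, 0]],   # failure 0: flow 0 reroutes over link 1
    [[0, 0, 1], [0, 0, 1]],   # failure 1: both flows reroute over link 2
]
m = [2, 1]                    # demands m_r
U = [[1, 1],                  # flow 0 is affected by both failures
     [0, 1]]                  # flow 1 is affected by failure 1 only

G_D = fd_spare_provision(Q_by_failure, m, U)
s = [max(row) for row in G_D]

print("G_D =", G_D)
print("s =", s)
```

Unlike the FID case, flow 0 uses a different backup path for each failure, which is what the index $k$ on $Q_k$ expresses.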

IV. NUMERICAL RESULTS

The SCA models and the SSR algorithm were studied on nine different multi-layer networks. The top and bottom layer topologies are provided in Figs. 1-9. Two cases of top layer topologies were studied: full mesh and partial mesh. In the full mesh case all top layer nodes were interconnected in a full mesh. In the partial mesh case the top layer has a sparser interconnection and the topology is given in Figs. 1-9. The numbers of links and nodes in both layers are provided in Table III. We first consider the failure-independent (FID) path restoration case for both Models A and B in Section III-B. The numerical study for Section III-A is the same as in [10]. Section III-C is left for future work.

Two algorithms, branch and bound (BB) and successive survivable routing (SSR), are used to find the solutions for Models A and B in Table III. The total spare capacity $S$ from both models is listed in the results, and the total working capacity $W$ is given for comparison purposes. Since SSR finds near optimal solutions, the range of solutions from 64 random cases is listed in the format of the minimum and the maximum results separated by a hyphen '-'. BB results are obtained from the commercial software CPLEX, which could find the optimal solution for small networks.

All demands were unit bandwidth requests at the top layer. The demand matrices are given as $M = I_{|R| \times |R|}$, where $|R|$ is the number of flows at the top layer, $|R| = |N| (|N| - 1)/2$ for full meshed demands, and $|N|$ is the number of top layer nodes. The working paths are given from the shortest-hop paths. The total working capacity at the bottom layer can be derived by $W = e^T P H e$. We assume all flows and links are symmetric so the problem can be modeled in an undirected graph. The bottom layer failure matrix is $F^b = I_{|L^b| \times |L^b|}$, where $|L^b|$ is the number of the bottom-layer links.

TABLE III
RESULTS OF TOTAL SPARE CAPACITY ALLOCATIONS AND THE CPU TIME IN SECONDS

| Net | $\lvert N^b \rvert$ | $\lvert L^b \rvert$ | Top layer topology | $\lvert N \rvert$ | $\lvert L \rvert$ | $\lvert R \rvert$ | $W$ | $S^{[A]}$ BB | $S^{[A]}$ SSR | $S^{[B]}$ BB | $S^{[B]}$ SSR | Time of $S^{[A]}$ BB | Time of $S^{[A]}$ SSR | Time of $S^{[B]}$ BB | Time of $S^{[B]}$ SSR |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Net 0 | 5 | 7 | Full mesh | 4 | 6 | 6 | 9 | 11 | 11-13 | 11 | 11 | ~0 | ~0 | ~0 | ~0 |
| Net 1 | 10 | 22 | Full mesh | 6 | 15 | 15 | 27 | 15 | 16-22 | 15 | 16-22 | 0.07 | 1.67 | 0.08 | 2.36 |
| | | | Partial mesh | 6 | 9 | 15 | 34 | 32 | 32-35 | 31 | 32-33 | 0.03 | 1.63 | 0.04 | 2.06 |
| Net 2 | 12 | 25 | Full mesh | 7 | 21 | 21 | 39 | 28 | 30-37 | 28 | 29-35 | 4.09 | 1.78 | 0.54 | 3.63 |
| | | | Partial mesh | 7 | 12 | 21 | 54 | 45 | 46-50 | 45 | 46-50 | 0.06 | 1.73 | 0.07 | 2.79 |
| Net 3 | 13 | 23 | Full mesh | 8 | 28 | 28 | 63 | 39 | 40-53 | 37 | 40-49 | 29.37 | 2.01 | 1.09 | 5.7 |
| | | | Partial mesh | 8 | 14 | 28 | 72 | 49 | 49-57 | 48 | 49-57 | 0.1 | 1.8 | 0.17 | 3.46 |
| Net 4 | 17 | 31 | Full mesh | 10 | 45 | 45 | 106 | 48 | 56-68 | 48 | 54-68 | 88.04 | 3.03 | 11.98 | 20.55 |
| | | | Partial mesh | 10 | 16 | 45 | 136 | 97 | 98-103 | 97 | 98-104 | 0.34 | 2.01 | 0.29 | 7.12 |
| Net 5 | 18 | 27 | Full mesh | 10 | 45 | 45 | 134 | 121 | 129-141 | 118 | 124-131 | 56.53 | 2.91 | 64.49 | 13.67 |
| | | | Partial mesh | 10 | 18 | 45 | 157 | 114 | 115-125 | 114 | 115-123 | 0.55 | 2.02 | 30.36 | 6.78 |
| Net 6 | 23 | 33 | Full mesh | 10 | 45 | 45 | 157 | 121 | 132-147 | 117 | 125-135 | 1,145 | 3.01 | 290 | 23.52 |
| | | | Partial mesh | 10 | 22 | 45 | 184 | 162 | 164-172 | 160 | 161-170 | 1.12 | 2.16 | 0.71 | 9.99 |
| Net 7 | 26 | 30 | Full mesh | 8 | 28 | 28 | 116 | 103 | 103-130 | 102 | 105-111 | 1.07 | 2.01 | 3.54 | 7.14 |
| | | | Partial mesh | 8 | 13 | 28 | 126 | 100 | 100-108 | 100 | 100-111 | 0.34 | 1.77 | 1.37 | 4.64 |
| Net 8 | 50 | 82 | Full mesh | 12 | 66 | 66 | 309 | 217 | 230-251 | - | 221-253 | 2353 | 8.16 | 2 days | 232.95 |
| | | | Partial mesh | 12 | 24 | 66 | 389 | 320 | 323-339 | 320 | 323-338 | 1.9 | 3.81 | 2.61 | 65.68 |

Note: Branch and Bound (BB) results are from AMPL/CPLEX v9.0 on a Sun Fire V240 Server with 1 GHz CPU and 2 GByte memory. SSR results are obtained on a PC with an Intel Pentium M 1.3 GHz CPU. SSR results are provided as the range "min-max" among 64 random cases. The unit for BB and SSR results is the bandwidth unit. The unit for the time is the second.
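As a quick consistency check on the setup, the sketch below recomputes the number of full meshed flows $|R| = |N|(|N|-1)/2$ for the top-layer sizes in Table III and evaluates the redundancy $\eta = S/W$ of Table II for one row. The function names are hypothetical; the numbers are read from Table III.

```python
# (top-layer nodes |N|, flows |R|) pairs as listed in Table III.
table_rows = [(4, 6), (6, 15), (7, 21), (8, 28), (10, 45), (12, 66)]


def full_mesh_flows(n_nodes):
    """Number of node pairs, i.e. full meshed demands, among n_nodes nodes."""
    return n_nodes * (n_nodes - 1) // 2


consistent = all(full_mesh_flows(n) == r for n, r in table_rows)


def redundancy(spare, working):
    """Network redundancy eta = S / W."""
    return spare / working


# Net 1, full mesh, Model A solved by BB: S = 15, W = 27.
eta_net1_full = redundancy(15, 27)

print("flow counts consistent:", consistent)
print("eta (Net 1, full mesh, Model A, BB):", round(eta_net1_full, 3))
```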

From the results in Table III, one can see that the SSR algorithm closely approximates the optimal BB solution. In network 0, both BB and SSR find the optimal solutions for both models. In the other 8 networks, the SSR results using Model B have a slightly smaller range for the solution than those found from Model A. This indicates that the spare capacity can be reduced in Model B, but the reduction is very small, roughly 5% or less, when using Model B solved by either BB or SSR.

The CPU time to find the optimal solution depends on both the network size and topology. SSR can find all 64 solutions in less than 30 seconds in all cases except network 8, where SSR finds solutions in minutes while BB could not find solutions after two days. In most combinations of top layer topology, bottom layer topology and SCA model, BB takes less time than SSR. There are 10 out of the total 34 cases where SSR takes less time than BB. For example, when using Model A on network 6 with full mesh demands, it takes about 20 minutes to get the optimal results in BB but 3 seconds in SSR. In addition, when using Model B on network 8 with full mesh demands, BB could not find a solution after 2 days while SSR uses about 4 minutes. BB performs very fast in most cases, which indicates that the optimal solution can be found quickly in CPLEX v9 in certain cases. However, this speed is not guaranteed, and BB might take much longer, as on network 8. Using Model A, it takes about 2353 seconds for BB to find the optimal solution while it takes 8.16 seconds for SSR to obtain a near optimal solution. Using Model B, BB could not find an optimal solution after 2 days. In addition, BB is known to scale poorly on large networks. For these reasons, SSR could be a good alternative for SCA on large networks.
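The count of cases in which SSR is faster than BB can be reproduced from the CPU times in Table III. The sketch below is only a tally of the published numbers; it assumes the near-zero Net 0 entries can be taken as 0 and treats the unfinished two-day BB run as a lower bound of two days.

```python
TWO_DAYS = 2 * 24 * 3600  # BB did not finish within two days; used as a lower bound

# (BB time for S[A], SSR time for S[A], BB time for S[B], SSR time for S[B]), in seconds.
times = [
    (0.0, 0.0, 0.0, 0.0),            # Net 0 full mesh (near-zero entries, taken as 0: assumption)
    (0.07, 1.67, 0.08, 2.36),        # Net 1 full mesh
    (0.03, 1.63, 0.04, 2.06),        # Net 1 partial mesh
    (4.09, 1.78, 0.54, 3.63),        # Net 2 full mesh
    (0.06, 1.73, 0.07, 2.79),        # Net 2 partial mesh
    (29.37, 2.01, 1.09, 5.7),        # Net 3 full mesh
    (0.1, 1.8, 0.17, 3.46),          # Net 3 partial mesh
    (88.04, 3.03, 11.98, 20.55),     # Net 4 full mesh
    (0.34, 2.01, 0.29, 7.12),        # Net 4 partial mesh
    (56.53, 2.91, 64.49, 13.67),     # Net 5 full mesh
    (0.55, 2.02, 30.36, 6.78),       # Net 5 partial mesh
    (1145, 3.01, 290, 23.52),        # Net 6 full mesh
    (1.12, 2.16, 0.71, 9.99),        # Net 6 partial mesh
    (1.07, 2.01, 3.54, 7.14),        # Net 7 full mesh
    (0.34, 1.77, 1.37, 4.64),        # Net 7 partial mesh
    (2353, 8.16, TWO_DAYS, 232.95),  # Net 8 full mesh
    (1.9, 3.81, 2.61, 65.68),        # Net 8 partial mesh
]

# One (BB, SSR) pair per topology and per model: 17 rows x 2 models.
cases = [(bb, ssr) for a_bb, a_ssr, b_bb, b_ssr in times
         for bb, ssr in ((a_bb, a_ssr), (b_bb, b_ssr))]
ssr_faster = sum(1 for bb, ssr in cases if ssr < bb)

print(ssr_faster, "of", len(cases), "cases have SSR faster than BB")
```

All cases where SSR wins are on networks 2 to 8, and all but one of them use full mesh top-layer topologies, which matches the observation that BB time grows quickly with problem size.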

Next, we use Model A and SSR to compare path restoration schemes, i.e., FID, FD, and FD with Stub Release, to protect against either single link or single node failures. Model B is not used in this comparison because it requires a complicated cross-layer capacity reservation protocol and yields little gain in the total spare capacity in the previous results.

The SCA results on Network 5 for the failure-dependent (FD) path restoration are shown in Table IV. FD path restoration yields slightly lower total spare capacity than FID. This is reasonable since multiple backup paths are allowed for each flow at the top layer, which increases the chance of sharing spare capacity and hence reduces redundancy. Using the FD path restoration scheme with the stub release function (FDStubR), the total spare capacity can be further reduced. The stub release feature allows a backup path to reuse the working capacity released by the working paths that are interrupted by the failure. The model of stub release is discussed in [12, Ch. 6].
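The effect of spare capacity sharing and stub release described above can be illustrated with a small sketch. It is a toy calculation, not the matrix-based SCA models or the SSR algorithm of this paper: the function name `spare_capacity`, the flow dictionaries, and the link labels are all hypothetical, and the backup paths are given as input rather than optimized. The spare capacity on a link is taken as the maximum backup load over all single link failures, and with stub release the working capacity freed on the surviving links of an interrupted working path is subtracted from that load.

```python
from collections import defaultdict


def spare_capacity(flows, failures, stub_release=False):
    """Return {link: spare units} needed to survive each single link failure.

    flows: list of dicts with 'demand', 'working' (list of links) and
           'backup' (dict: failed link -> list of backup links), i.e. a
           failure-dependent (FD) backup path per failure. Hypothetical format.
    """
    spare = defaultdict(int)
    for failed in failures:
        load = defaultdict(int)      # backup load per link under this failure
        released = defaultdict(int)  # working capacity freed by stub release
        for flow in flows:
            if failed not in flow["working"]:
                continue  # this flow is not interrupted by the failure
            for link in flow["backup"][failed]:
                load[link] += flow["demand"]
            if stub_release:
                for link in flow["working"]:
                    if link != failed:
                        released[link] += flow["demand"]
        for link, units in load.items():
            need = max(0, units - released[link])
            # Spare capacity is shared: only the worst failure counts.
            spare[link] = max(spare[link], need)
    return dict(spare)


# Illustrative data only (assumed, not taken from the paper's networks).
flows = [
    {"demand": 2, "working": ["a", "b"],
     "backup": {"a": ["c", "b"], "b": ["a", "d"]}},
    {"demand": 1, "working": ["e"], "backup": {"e": ["c"]}},
]
failures = ["a", "b", "e"]

fd = spare_capacity(flows, failures)
fd_stub = spare_capacity(flows, failures, stub_release=True)
total_fd = sum(fd.values())
total_fd_stub = sum(fd_stub.values())
```

In this toy instance link `c` carries backup traffic for two different failures but only needs the larger of the two loads, and enabling `stub_release` removes the spare capacity that the first flow's backup paths would otherwise need on its own surviving working links.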

Furthermore, the SCA problem for protecting against any single node failure at the bottom layer can be derived by combining Model A and the node failure extension in [10]. The results for the above three path restoration methods, FID, FD, and FDStubR, are compared in Table IV. FDStubR provides the lowest redundancy, while FD is slightly higher than FDStubR but much lower than FID.

It is important to note that the lower spare capacity values in the node failure case do not indicate better efficiency. The lower values could come from dropped demands that cannot be recovered when one of their end nodes fails. In contrast, under single link failures, 100% of the demands are guaranteed to be recovered on 2-connected networks.
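The caveat above can be made concrete with a minimal sketch, assuming working paths are given as node sequences. The function name `classify_under_node_failure` and the sample flows are hypothetical and are not part of the paper's models. A flow that only transits the failed node can be rerouted, whereas a flow that starts or ends at the failed node is dropped and therefore contributes no spare capacity requirement.

```python
def classify_under_node_failure(flows, failed_node):
    """Split the flows hit by a node failure into (recoverable, dropped)."""
    recoverable, dropped = [], []
    for flow in flows:
        path = flow["path"]  # working path as a node sequence
        if failed_node not in path:
            continue  # flow is unaffected by this failure
        if failed_node in (path[0], path[-1]):
            dropped.append(flow)      # an end node failed: cannot be recovered
        else:
            recoverable.append(flow)  # transit node failed: needs a backup path
    return recoverable, dropped


# Illustrative data only (assumed, not taken from the paper's networks).
node_flows = [
    {"name": "f1", "demand": 2, "path": [1, 2, 3]},
    {"name": "f2", "demand": 1, "path": [2, 4]},
    {"name": "f3", "demand": 3, "path": [1, 4]},
]
recoverable, dropped = classify_under_node_failure(node_flows, failed_node=2)
restored_demand = sum(f["demand"] for f in recoverable)
dropped_demand = sum(f["demand"] for f in dropped)
```

Only `restored_demand` has to be covered by spare capacity, so a lower total under node failures can simply reflect a larger `dropped_demand` rather than a more efficient design.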

V. CONCLUSION

In this paper, several variations of the SCA problem for multi-layer networks were formulated as integer programming problems. Numerical results were given showing that the successive survivable routing algorithm can be used to efficiently find near optimal spare capacity solutions.

This paper presents SCA across two-layer networks. The SCA problem could be extended to networks with more than two layers in future work. Related problems such as layer topology design and link capacity design could also be combined with it to meet various design requirements.

The SCA problem uses path protection, which has applications in MPLS or optical networks. Currently, the link capacity constraint is not included; it should be added in future work. In addition, performance metrics such as the number of failed demands and the average route distance should be studied as well. Beyond path protection on mesh topologies, comparisons with ring or circle-based methods would also be beneficial.

TABLE IV
COMPARISON OF SCA FOR PATH RESTORATIONS IN FIG. 6

Restoration    Link failure    Node failure
FID            115-125         106-115
FD             104-118         97-109
FDStubR        103-115         94-107

Note: The numerical results are in bandwidth units.

REFERENCES

[1] J. Vasseur, M. Pickavet, and P. Demeester, Network Recovery: Protection and Restoration of Optical, SONET-SDH, IP and MPLS, Morgan Kaufmann, 2004.

[2] W. Grover, Mesh-based Survivable Networks: Options and Strategies for Optical, MPLS, and ATM Networking, Prentice Hall PTR, 2003.

[3] M. Pioro and D. Medhi, Routing, Flow, and Capacity Design in Communication and Computer Networks, Morgan Kaufmann, 2004.

[4] O. Crochat, J.-Y. Le Boudec, and O. Gerstel, "Protection interoperability for WDM optical networks," IEEE/ACM Transactions on Networking, vol. 8, no. 3, pp. 384-395, June 2000.

[5] E. Modiano and A. Narula-Tam, "Survivable routing of logical topologies in WDM networks," in Proceeding of IEEE INFOCOM, Apr. 2001.

[6] M. Kurant and P. Thiran, "On survivable routing of mesh topologies in IP-over-WDM networks," in Proceeding of IEEE INFOCOM, 2005.

[7] F. Ducatelle and L. M. Gambardella, "FastSurv: A new efficient local search algorithm for survivable routing in WDM networks," in Proceeding of IEEE Global Communications Conference, Dallas, TX, USA, Nov. 2004.

[8] F. Giroire, A. Nucci, N. Taft, and C. Diot, "Increasing the robustness of IP backbones in the absence of optical level protection," in Proceeding of IEEE INFOCOM, 2003.

[9] P. Demeester and M. Gryseels, "Resilience in multilayer networks," IEEE Communications Magazine, vol. 37, no. 8, pp. 70-76, Aug. 1999.

[10] Y. Liu, D. Tipper, and P. Siripongwutikorn, "Approximating optimal spare capacity allocation by successive survivable routing," IEEE/ACM Transactions on Networking, vol. 13, no. 1, pp. 198-211, Feb. 2005.

[11] B. Kolman, R. C. Busby, and S. Ross, Discrete Mathematical Structures, Prentice Hall, 1996.

[12] Y. Liu, Spare capacity allocation method, analysis and algorithm, Ph.D. dissertation, School of Information Sciences, University of Pittsburgh, 2001, http://www.sis.pitt.edu/~yliu/dissertation/.

[13] Y. Liu and D. Tipper, "Spare capacity allocation for non-linear cost and failure-dependent path restoration," in Third International Workshop on Design of Reliable Communication Networks (DRCN), Budapest, Hungary, October 7-10, 2001.


Fig. 2. Net 1 (N = 6, L = 9, Nb = 10, Lb = 22)

Fig. 3. Net 2 (N = 7, L = 12, Nb = 12, Lb = 25)

Fig. 4. Net 3 (N = 8, L = 14, Nb = 13, Lb = 23)

Fig. 5. Net 4 (N = 10, L = 16, Nb = 17, Lb = 31)

Fig. 6. Net 5 (N = 10, L = 18, Nb = 18, Lb = 27)

Fig. 7. Net 6 (N = 10, L = 22, Nb = 23, Lb = 33)

Fig. 8. Net 7 (N = 8, L = 13, Nb = 26, Lb = 30)

Fig. 9. Net 8 (N = 12, L = 24, Nb = 50, Lb = 82)
