Chapter 1 Parallel Constraint-Based Local Search: An ...

Chapter 1

Parallel Constraint-Based LocalSearch: An Application to DesigningResilient Long-Reach Passive OpticalNetworks

Alejandro Arbelaez, Deepak Mehta, Barry O’Sullivan, and Luis Quesada

Abstract Many network design problems arising in areas as diverse as VLSIcircuit design, QoS routing, traffic engineering, and computational sustain-ability require clients to be connected to a facility under path-length con-straints and budget limits. These problems can be seen as instances of theRooted Distance-Constrained Minimum Spanning-Tree problem (RDCMST),which is NP-hard. An inherent feature of these networks is that they are vul-nerable to a failure. Therefore, it is often important to ensure that all clientsare connected to two or more facilities via edge-disjoint paths. We call thisproblem the Edge-disjoint RDCMST (ERDCMST). Previous work on RD-CMST has focused on dedicated algorithms and, therefore, it is difficult touse these algorithms to tackle ERDCMST. We present a constraint-basedparallel local search algorithm for solving ERDCMST. Traditional ways ofextending a sequential algorithm to run in parallel perform either portfolio-based search in parallel or parallel neighbourhood search. Instead, we exploitthe semantics of the constraints of the problem to perform multiple moves inparallel by ensuring that they are mutually independent. The ideas presentedin this chapter are general and can be adapted to other problems as well. Theeffectiveness of our approach is demonstrated by experimenting with a set of

Alejandro ArbelaezInsight Centre for Data Analytics, University College Cork, Ireland

e-mail: [email protected]

Deepak MehtaInsight Centre for Data Analytics, University College Cork, Irelande-mail: [email protected]

Barry O’Sullivan

Insight Centre for Data Analytics, University College Cork, Irelande-mail: [email protected]

Luis QuesadaInsight Centre for Data Analytics, University College Cork, Ireland

e-mail: [email protected]

1

2 Alejandro Arbelaez, Deepak Mehta, Barry O’Sullivan, and Luis Quesada

problem instances taken from real-world passive optical network deploymentsin Ireland, Italy, and the UK. Our results show that performing moves in par-allel can significantly reduce the elapsed time and improve the quality of thesolutions of our local search approach.

1.1 Introduction

Many network design problems arising in areas as diverse as VLSI circuitdesign [20], QoS routing [32], traffic engineering [27], and computational sus-tainability [11] require clients to be connected to a facility under path-lengthconstraints and budget limits. Here the length of the path can be interpretedas distance, delay, signal loss, etc. For example, in a multicast communica-tion [12] setting where a single node is broadcasting to a set of clients, itis important to restrict the path delays from the server to each client. InLong-Reach Passive Optical Networks (LR-PON) [23], a metro node (MN)is connected to a set of exchange sites via optical fibres; the length of thefibre between an exchange site (ES) and its metro node is bounded by thesignal loss. The goal is to minimise the cost resulting from the total lengthof fibres [21]. In VLSI circuit design, path delay is a function of the maxi-mum interconnection path length while power consumption is a function ofthe total interconnection length [19]. In package shipment services, guaranteeconstraints are expressed as restrictions on total travel time from an originto a destination, and the organisation wants to minimise the transportationcosts [24].

Many of these network design problems are instances of the RootedDistance-Constrained Minimum Spanning-Tree Problem (RDCMST) [19],which is NP-hard. The objective is to find a minimum cost spanning treewith the additional constraint that the length of the path from a specifiedroot node (or facility) to any other node (client) must not exceed a giventhreshold. Many networks are complex systems that are vulnerable to a fail-ure. A major fault occurrence would be a complete failure of the facility,which would affect all the clients connected to the facility. Therefore it isimportant to provide network resilience. We restrict our attention to net-works where all clients are required to be connected to two facilities via twoedge-disjoint paths so that whenever a single facility fails or a single link failsall clients are still connected to at least one facility. We define this problemas the Edge-disjoint Rooted Distance-Constrained Minimum Spanning-TreeProblem (ERDCMST). Given a set of facilities and a set of clients such thateach client is associated with two facilities, the problem is to find a set ofdistance-constrained spanning trees rooted at each facility with minimum to-tal cost. Additionally, each client is connected to its two facilities via twoedge-disjoint paths. This would effectively mean that each pair of distance-bounded spanning trees would be mutually disjoint in terms of edges.

1 Parallel Constraint-based LS: An Application to Designing Resilient LRPON 3

Previous work on RDCMST [16, 25] has focused on dedicated algorithms,which are hard to extend with side constraints, and therefore it is difficultto use these algorithms to tackle ERDCMST. We present a mixed integerprogramming formulation of ERDCMST and a constraint-based local searchalgorithm, which can easily be extended to apply widely. We present twoefficient local move operators and an incremental way of maintaining theobjective function, which is often a key element for the efficiency of a localsearch algorithm. Our local search algorithm is able to solve both RDCMSTand ERDCMST. These move operators were proposed and evaluated withrespect to edge-disjointness, capacity, and distance constraints in [6] and [5].

We extend our sequential algorithm and propose a parallel version of ourconstraint-based local search algorithm. The traditional way of extending asequential algorithm to run in parallel is to perform either portfolio-basedsearch in parallel or parallel neighbourhood search. Instead, we exploit thesemantics of the constraints of the problem to perform multiple moves in par-allel by ensuring that they do not conflict with each other. The effectivenessof our approach is demonstrated by experimenting with a set of problem in-stances taken from real-world passive optical network deployments in Ireland,the UK, and Italy. Our results show that performing moves in parallel cansignificantly reduce the time required to find a target solution and improvesthe anytime behaviour of our local search algorithm.

1.2 Formal Specification and Complexity

Let G be a directed graph with set of nodes N and set of edges L. We assumethat G is a complete graph without self-loops, so |L| = |N |× (|N |−1) in ourcase. Each edge has a cost and a distance value associated with it. Let Mbe a set of facilities and let ui ∈ U be a set of (users or) clients. We defineN as the disjoint union of M and U . Let Ui ⊆ U be the set of clients thatare associated with facility mi ∈ M. We use Ti to denote the tree networkassociated with facility i. We also use Ni = Ui ∪ {mi} to denote the setof nodes in the tree Ti associated with the facility mi, and a set of edgesLi ⊆ N2

i .Ti is a subgraph of G that contains a directed path from facility mi to all

of its clients and contains no cycles. The length of a path (or path-length)between two nodes is the sum of the distances of the edges connecting the twonodes, and the cost of Ti is the sum of the cost of its edges. In this chapter,without loss of generality, we assume that the cost is symmetrical, i.e., thecost of an edge 〈i, j〉 is equal to the cost of the edge 〈j, i〉. As mentionedbefore, we also assume that the graph is complete since non-existing edgesin the original graph can be represented by edges with a very large distancewith respect to the distance threshold. Additionally, we remove edges fromnodes to themselves.


Definition 1 Rooted Minimum Spanning Tree Problem (RMST). Given agraph G = (Ni, Li) with a facility mi ∈M and a real value cjk denoting thecost of each edge (j, k) ∈ Li, RMST is to find a spanning tree of minimumcost in G.

Definition 2 Rooted Distance-Constrained Minimum Spanning-Tree Prob-lem (RDCMST). Given a graph G = (Ni, Li) with a facility mi ∈M, the setof clients Ui, two real values cjk and djk denoting the cost and the distance ofeach edge (j, k) ∈ Li, and a real value λ, RDCMST is to find a spanning treeTi with minimum cost in G such that the length of the path from the facilitymi to any client uj ∈ Ui is not greater than λ.

Definition 3 Edge-Disjoint Rooted Distance-Constrained Minimum Spanning-Trees Problem (ERDCMST). Given a graph G = (N , L) with a set of facilitiesM, a set of clients U , a set of edges L, two real values cjk, and djk denot-ing the cost and distance of each edge (j, k) ∈ L, an association of clientswith two facilities π : U → M2, and a real value λ, ERDCMST is to find aspanning tree Ti of G for each facility mi such that

1. The length of the unique path for all mi to any of its clients is not greaterthan λ.

2. For each client uk, the two paths connecting uk to mi and mj, whereπ(uk) = 〈mi,mj〉, are edge disjoint.

3. The sum of the costs of the edges in all the spanning trees is minimum.

Figure 1.1 shows an example with two facilities F1 and F2 and N = {a,b, c, d, e, f}, black (respectively grey) edges denote the set of edges usedto reach F1 (respectively F2), the value of λ is 16 and the total cost of thesolution is 46 for this illustrative example. Figure 1.1a shows a valid solutionsatisfying both distance and edge-disjointness constraints. The distance fromthe facilities to any node is less than or equal to λ = 16. The paths connectingthe set of nodes to the facilities are edge disjoint. Figure 1.1b shows a cheapersolution replacing a grey edge 〈F2, b〉 with 〈a, b〉, but this solution is not edgedisjoint since a failure in 〈a, b〉 would disconnect node b from both facilities.

Complexity. ERDCMST involves finding a rooted distance-bounded span-ning tree for every facility whose total cost is minimum. This problem isknown to be NP-hard [19].

1.3 A Mathematical Model for ERDCMST

We present a mixed integer programming formulation of ERDCMST.


d

f

e

ab

c

F2F15 7

8

3

4

2

11

63

24

(a) Satisfying both distance and

edge disjointness

d

f

e

ab

c

F2F15 7

8

32

11

63

24

3

(b) Satisfying distance constraint

but violating edge disjointness

Fig. 1.1: Example of an instance of the ERDCMST problem. λ=16

Variables

• Let xijk be a variable over {0, 1} that denotes whether an edge from nodej ∈ Ni to k ∈ Ni of facility i ∈M is selected (1) or not (0).

• Let f ij be a variable that denotes the upper bound on the length of thepath from the facility i to its client j.

We note that the partial order enforced by f helps to rule out cycles inthe solution.

Constraints

Each facility is directly connected to at least one of its clients:

∀mi∈M :∑

uj∈Ui

xiij ≥ 1.

The total number of edges in any tree Ti is equal to |Ui|:

∀mi∈M :∑

uj∈Ni

∑uk∈Ui,uj 6=uk

xijk = |Ui|.

The graph cannot have an edge between a node and itself and the distancefrom the root node (or facility) to itself is 0:


∀mi∈M∀uj∈Ni: xijj = 0.

∀mi∈M : f ii = 0.

The length of the path from any client to its facility is bounded by λ:

∀mi∈M∀uj∈Ni: f ij ≤ λ.

If there is an edge from uj ∈ Ui to uk ∈ Ui then the length of the pathfrom mi to uk is equal to the sum of the length of the path from mi to ujplus the distance between uj and uk. We use the big-M method to modelthis implication as a linear constraint. The value of the constant C has tobe greater than λ plus the maximum distance between any pair of edges inorder to be consistent with the implication:

∀mi∈M∀{uj ,uk}∈Ni: C(1− xijk) + f ik ≥ f ij + djk.

If mi and mi′ are the facilities of the client j, and if there exists any pathin the subnetwork associated with facility i that includes the edge 〈uj , uk〉,then facility i′ cannot use the same edge. Therefore, we enforce the followingconstraint:

∀{mi,mi′}∈M∀{uj ,uk}∈Ui∩Ui′: xijk + xi

′

jk ≤ 1.

Objective

The objective is to minimise the total cost:

min∑

mi∈M

∑{uj ,uk}∈Ni

cjk · xijk.

1.4 Iterated Constraint-Based Local Search

For large networks containing more than 10,000 clients and 100 facilities wecannot expect to solve the previously presented problems using systematicsearch. In this section we present an iterated constraint-based local searchapproach for solving our problem.

The Iterated Constraint-Based Local Search (ICBLS) [14, 29] frameworkdepicted in Algorithm 1 comprises two phases. First, in a local search phase,the algorithm improves the current solution, little by little, by performingsmall changes. The algorithm employs a move operator in order to movefrom one solution to another in the hope of improving the value of the ob-jective function. Second, in the perturbation phase, the algorithm perturbsthe incumbent solution (s∗) in order to escape from difficult regions of the


search (e.g., a local minimum). The acceptance criterion decides whether toupdate s∗ or not. The algorithm accepts s′∗ with probability p, and s∗ other-wise. Furthermore, we limit the space of candidate solutions to valid solutionssatisfying the constraints of the problem.

Algorithm 1: Iterated Constraint-Based Local Search(move-op, s)

1 s∗ := ConstraintBasedLocalSearch(move-op, s);

2 repeat

3 s′ := Perturbation(s∗);4 s′∗ := ConstraintBasedLocalSearch(move-op, s′);5 s∗ := AcceptanceCriterion(s∗, s′∗);

6 until a given stopping criterion is met ;

Our algorithm requires two parameters: s the initial solution where allclients are able to reach their facilities while satisfying all constraints (i.e.,the upper bound in the length and disjointness), and the move operation(move-op) which is a function. We switch from the local search phase to theperturbation phase when a local minimum is observed. In the perturbationphase we perform a given number of random moves. In this chapter, unlessotherwise stated, we use the trivial solution of connecting all clients directlyto their respective facilities as the initial solution. The stopping criterion iseither a timeout or a given number of iterations.

1.4.1 Move operators

We describe the node, subtree, and edge move operators. We use Ti to denotethe tree associated with facility i. An edge between two clients uj and uk isdenoted by 〈uj , uk〉. We have defined node and subtree in [6], and we takethe edge operator from the literature [10]. Informally speaking the locationof a node uj (or the subtree emanating from it) in a tree is a tuple (upj , Sj)where upj

denotes the predecessor of uj and Sj denotes the list of immediatesuccessors of uj in the tree.

Node operator

In Figure 1.2b we move a given node ui from the current location to anotherin the tree. The node (or subtree) changes location if either the predecessor orany successor of the node is different. As a result of the move, all successors ofui will be directly connected to the predecessor node of ui. ui can be placedas a new successor for another node or in the middle of an existing edge inthe tree.


11

4

108 9

5

2

6 7

3

1

(a) Initial state of the tree

11

4

108 9

5

2

6 7

3

15

(b) Node operator

11

4

108 9

5

2

6 7

3

1

1110

5

(c) Subtree operator

11

4

108 9

5

2

6 7

3

1

(d) Pham et al.’s edge operator

Fig. 1.2: Move operators. Grey arrows indicate edges removed from the cur-rent solution

Subtree operator

In Figure 1.2c we move a given node ui and the subtree emanating fromui from the current location to another in the tree. As a result of this, thepredecessor of ui is no longer connected to ui, and all successors of ui arestill directly connected to ui. ui can be placed as a new successor for anothernode or in the middle of an existing edge.

Edge operator

In this chapter we limit our attention to moving a node or a complete sub-tree. In [10] the authors proposed to move edges in the context of the Con-strained Optimum Path problem. The Pham et al. move operator (Figure1.2d) chooses an edge in the tree and finds another location for it withoutbreaking the flow.


1.4.2 Operations and Complexities

We first present the complexities of the node and subtree operators as theyshare similar features. For an efficient implementation of the move operators,it is necessary to maintain bij : the length of the path from uj to the farthest

leaf associated with the tree Ti, and the previously described variable f ij : thedistance from the facility to the client. Let upj

be the immediate predecessorof uj and let Sj be the set of immediate successors of uj in a given tree.

In order to complete a move the node and subtree operators require toexecute the following four steps:

1. Randomly select a node (uj) from a facility (mi) from the current solution;2. Delete uj from Ti if the node operator is used, or uj and the emanating

subtree of the node in Ti if the subtree operator is used;3. Identify the best location, i.e., a new predecessor upj and a potential new

successor unsj for uj in Ti satisfying all constraints;4. Insert uj as a new successor of upj

, and if there is a new successor, addunsj as a new successor of upj

.

Table 1.1 summarises the complexities of the move operators. In this tablen denotes the maximum number of clients associated with a single facility.The last row (move) indicates the time complexity of completing a move witha given move operator, i.e., completing the four previously mentioned steps.

Table 1.1: Complexities of different operations

Node Subtree Edge

Delete O(n) O(n) O(1)

Feasible delete O(n) O(1) O(1)

Feasible insert O(1) O(1) O(n)Best location O(n) O(n) O(n3)

Insert O(n) O(n) O(n)

Move O(n) O(n) O(n3)

Feasible delete

Checking whether a solution is feasible after the node operator deletes a nodeuj in Ti requires linear complexity with respect to the number of clients inSj since it is necessary to check whether the new distance from the root tothe furthest leaf for all nodes uk ∈ Sj satisfies the distance constraint:

f ipj+ bik + dpj ,k ≤ λ.


Delete

Deleting a node or the subtree emanating from the node uj in Ti requireslinear time with respect to the number of clients of facility mi. For bothoperators, it is necessary to update bij′ for all the nodes j′ in the path fromthe facility mi to the client upj in Ti. In addition, the node operator updatesf ij′ for all the nodes j′ in the subtree emanating from uj . After deleting anode uj or a subtree emanating from uj , the objective function is updatedas follows:

obj := obj − cj,pj .

Furthermore, the node operator needs to add to the objective function thecost of disconnecting each successor element of uj and reconnecting them toupj :

obj := obj +∑k∈Sj

(ck,pj− ckj).

Feasible insert

Checking feasibility of a move can be performed in constant time by usingf ij and bij . If uj is inserted between the two nodes of edge 〈up, uq〉 then wecheck the following:

f ip + dpj + djq + biq ≤ λ.

If uj is inserted as a new successor of up we check the distance constraintas follows:

f ip + dpj + bij ≤ λ.

Best location

Selecting the best location involves traversing all clients associated with thefacility and selecting the one with the maximum reduction in the objectivefunction. Both operators need to traverse the tree in order to evaluate thecost of breaking all existing edges or adding a new node or subtree to thecurrent solution. In this chapter we use a depth-first exploration of the tree.

Insert

A move can be performed in linear time. We recall that this move opera-tor might replace an existing edge 〈up, uq〉 with two new edges 〈up, uj〉 and〈uj , uq〉. This operation requires us to update f ij for all nodes in the subtree

emanating from uj , and bij in all nodes in the path from the facility actingas a root node down to the new location of uj . The objective function must


be updated as follows:

obj := obj + cpj + cjq − cpq.

Now we switch our attention to the edge operator. This operator doesnot benefit from using bij . The reason is that moving a given edge fromone location to another might require changing the direction of a certainnumber of edges in the tree as shown in Figure 1.2d. Deleting an edge requiresconstant time. This operation generates two separated subtrees and no datastructures need to be updated. Checking the feasibility of adding an edge〈up′ , uq′〉 to connect the two subtrees requires linear time. It is necessary totraverse the new tree to obtain the distance from uq′ to the farthest leaf inthe tree associated with facility i. Performing a move requires linear time.It involves updating f ij for the new emanating tree of uq′ . Finding the bestlocation requires cubic time complexity as the number of possible locationsis bounded by n2 (total number of possible edges for connecting the twosubtrees) and for each possible move it is necessary to check feasibility. Dueto the high complexity (O(n3)) of completing a move with the edge operator,hereafter we limit our attention to the node and subtree operators.

As pointed out earlier in this chapter, maintaining the backward distancedoes not provide any reduction in the complexity of the moves for the edgeoperator. Notice that deleting an edge does not necessarily require us toupdate the distance of the affected nodes in the tree. In this case, only theaffected edge is removed from the solution.

Disjointness

To ensure disjointness among spanning trees we maintain a 2·|U| matrix,where |U| represents the number of clients. Checking disjointness for everyclient requires constant time complexity and only involves checking that thetwo integers indicating the predecessors in the primary and secondary facil-ities are different. Therefore the complexities presented in Table 1.1 remainvalid when disjointness is considered.

Move

As pointed out in Table 1.1 completing a move for the node and subtree op-erators requires linear time complexity and it involves deleting a node uj (orthe complete subtree emanating from uj), checking feasibility for all potentialcandidate locations, and then inserting the node or subtree in the most suit-able location. We randomly select uj in order to balance the greediness andthe complexity of the move. Alternatively, one can select the best deletion-insertion pair. However, this option make the complexity of completing a


move O(n2) as it involves deleting all nodes in the current solution and re-inserting them. Additionally, the use of the best deletion-insertion pair maylead to premature convergence to local minima.

In [25] the authors proposed two move operators for RDCMST. Edge-Replace is similar to the edge operator described above. The difference is thatin this case the authors only take into consideration the distance constraint,and therefore, the authors use the cheapest alternative edge to reconnect thetwo trees. Similarly to the edge operator, Edge-Replace might change thedirection of some affected edges and therefore the new solution might notsatisfy the edge-disjointness constraint. An alternative solution might be tolimit the set of candidate neighbours to those not changing the direction ofthe tree. This would be a particular case of the subtree operator where thesubtree is reconnected to an existing leaf of the current solution. Component-Renew removes an edge in the tree. Nodes that are separated from the rootnode are sequentially added to the tree using Prim’s algorithm. Nodes thatviolate the length constraint are added to the tree using a pre-computedroute from the root node to any node in the tree. It is worth noticing thatthe Component-Renew operator cannot be applied to ERDCMST as the pre-computed route from the root node to a given node might not be availabledue to the disjointness constraint.

1.5 Sequential Algorithm

We describe a constraint-based local search algorithm, which is parameterisedby a move operator. A (node or subtree) move is composed of four sub-operations: delete, best location, insert, and feasibility check. A move involvesremoving a node or a subtree from the tree and adding it in the best location.In order to find the best location a sequence of feasibility checks are carriedout.

The location of a node in a tree is defined by its parent node and its setof successor nodes. If the node operator is used then a new location for anode uj can be found by either (1) selecting an existing edge 〈ua, ub〉 andinserting uj between the two nodes such that the parent node of uj is ua andthe set of successors is the singleton set {ub} or (2) selecting any node uaand adding a new edge 〈ua, uj〉 such that the parent node of uj is ua andthe set of successor nodes is empty. Similarly, if the subtree operator is usedthen a new location for a subtree emanating from a node uj can be found byeither (1) selecting an existing edge 〈ua, ub〉 and inserting uj between the twonodes such that the parent node of uj is ua and the set of successors of uj isupdated by adding the node ub; or (2) selecting a node ua and adding a newedge 〈ua, uj〉 such that the parent node of uj is ua and the set of successornodes of uj remains unchanged. We use Locations(uj , Ti,move-op) to denote


m1 m2

m3 m4

u1, u2, u3, u4

u5, u6 u7, u8, u9 u10, u11

Fig. 1.3: Example of the facility connectivity graph (fcg) for a problem with4 facilities (i.e., m1, m2, m3, and m4), and 11 clients

all the possible locations of a node (or a subtree emanating from) uj in treeTi using move-op.

Let list be a set of potential nodes for which we might be able to findbetter locations. If the list is empty then it means that the algorithm hasreached a local minimum for the chosen move operator. We also compute agraph, which we call the facility connectivity graph, denoted by fcg, wherethe vertices represent the facilities and an edge between a pair of facilitiesrepresent that a change in the tree of a facility might help in finding betterlocations for some nodes in the tree associated with the other facility. Anedge between two vertices in fcg is added if the facilities share at least twoclients. Notice that if two facilities do not share at least two clients thenthey are independent from the disjointness point of view. Figure 1.3 showsan example of an fcg of the problem with four facilities and 11 clients: clients{u1, u2, u3, u4} are common to m1 and m2; clients {u7, u8, u9} are commonto m2 and m4; clients {u5, u6} are common to m1 and m4; and clients {u10,u11} are common to m1 and m3.

The pseudo-code of our constraint-based local search is shown in Algo-rithm 2. It starts by initialising list and fcg (Lines 2 and 3). It repeatedlyselects a client uj of a facility mi randomly from Tree Ti (Line 4), saves itscurrent location and deletes it from the tree (Lines 6-7). The delete opera-tion depends on the move operation (move-op) and deletes a single node orthe entire subtree emanating from the node. The cost of the current loca-tion is saved in cost (Line 8). For a chosen node or subtree, in each iteration(Lines 9-16), the algorithm identifies the best location. Line 10 verifies thatthe new move does not break any constraint and CostLoc returns the cost ofusing the new location using the given move operator (Line 11). If the costof the new location is better then the best set of candidates is reinitialisedto that location (Lines 12-14). If the cost is equal to the best known costso far then the best set of candidates is updated by adding that location


Algorithm 2: ConstraintBasedLocalSearch (move-op, {T1, . . . , T|M|})1 list := {(mi, uj)|mi ∈M∧ uj ∈ Ui};2 fcg := {(mi,mj)|Ni ∩Nj | ≥ 2};3 while list 6= ∅ do4 Select (mi, uj) randomly from list;5 if FeasibleDelete(Ti, uj , move-op) then

6 oldLoc := (upj , Sj);

7 Delete(Ti, uj , move-op);

8 cost := CostLoc((upj , Sj), uj ,move-op);

9 for (u′pj , S′j) in Locations(uj , Ti,move-op)− {oldLoc} do

10 if FeasibleInsert((u′pj , S′j), uj ,move-op) then

11 cost′ := CostLoc((u′pj , S′j), uj ,move-op);

12 if cost′ < cost then13 BestLoc := {(u′pj , S

′j)};

14 cost := cost′;

15 else if cost′ = cost then16 BestLoc := BestLoc ∪{(u′pj , S

′j)};

17 if BestLoc 6= ∅ then18 Select (u′pj , S

′j) randomly from BestLoc;

19 loc := (u′pj , S′j);

20 list := list ∪ {(mk, ul)|(mi,mk) ∈ fcg ∧ ul ∈ Nk ∩ (Sj ∪ {uj})};21 list := list ∪ {(mi, ul)|ul ∈ Ni};

22 else

23 loc := oldLoc;

24 Ti :=Insert(Ti, loc, uj , move-op);

25 list := list− {(mi, uj)};

26 return {T1 . . . Tn};

(Lines 15-16). The new location for a given node is randomly selected fromthe best candidates if there are more than one (Lines 18).

Instead of verifying that a local minimum is reached by exhaustively check-ing all moves for all clients of all facilities, we maintain a list of pairs of clientsand facilities. The list is initialised at Line 1 with all pairs of facilities andclients. In each iteration a pair consisting of a facility and a client, (mi, uj),is selected and removed from the list (Line 25). When list is empty the al-gorithm reaches a local minimum. If an improvement is observed then onesimple way to ensure the correctness of the local minimum is to populate listwith all pairs of facilities and clients. However, this can be very expensive interms of time. We instead exploit the facility connectivity graph (fcg) wherean edge between two facilities is added (in Line 2) if they share at least twoclients. If an edge between mi and mk exists in fcg then it means that ifthere is a change in Ti then it might be possible to make another change inTk such that the total cost can be improved. Consequently, we only need to


consider the clients of affected facilities. This mechanism helps to significantlyreduce the time taken by reducing the number of useless moves. We furtherstrengthen the condition for populating list by observing the fact that theset of clients that can be affected in a facility mk is a subset of {uj}∪Sj . Thereason is that all the other clients of mk are not subject to any constraintfrom the node uj of Ti (Line 20). Furthermore all the clients of Ti are alsoadded to the list.

The perturbation phase of the algorithm works similarly to Algorithm 2,by selecting a random node, deleting this node or its subtree in the currentsolution, and then instead of selecting the best location, the algorithm selectsa random valid location. In particular, for the perturbation phase, we replaceLines 10-16 with Line 16, i.e., BestLoc := BestLoc ∪ {(u′pj

, S′j)}. Therefore,the algorithm selects a new random location for uj .

We note that Algorithm 2 can tackle both RDCMST and ERDCMST.In the case of RDCMST, there is only one facility, and FeasibleInsert onlychecks the path-length constraint.

1.6 Parallel Algorithm

Parallelisation has been widely studied to speedup and improve the perfor-mance of local search algorithms to tackle a large variety of problems includ-ing TSP [7], Capacitated Network Design [9], Steiner Tree [31], SAT [17], andCSPs [8] just to name a few. These approaches employ the Multi-walk and/orSingle-walk framework [30] to devise the parallel algorithm. In particular wefocus our attention on constraint-based local search solvers.

1.6.1 Multi-walk and Single-walk

Multi-walk (also known as parallel portfolio) involves executing several al-gorithms (or different copies of the same one with different random seeds)in parallel, with or without cooperation, until a solution is found or a giventimeout is reached. The implicit assumption is that different processes handledifferent parts of the search space. The multi-walk method has two importantproperties. First, no load balancing is required to parallelise the sequentialalgorithm. Second, the speedup of the parallel algorithm strictly depends onthe performance of the sequential one. As discussed in the literature (e.g.,[28] and [26]) a high variance of the sequential algorithm usually means goodparallel speedup factors.

The Single-walk methods involves using parallelism inside a single searchprocess. In this approach, a typical way to develop the algorithm involvesparallelising the exploration of the neighbourhood, e.g., dividing the neigh-


bourhood into several sub-neighbourhoods and searching them in parallel tofind the best move.

We observe the two mentioned levels of parallelism in the context of SAT(see [3] for a recent survey). An elaboration on multi-walk approaches withand without cooperation can be found in [1, 4], where the cooperation isimplemented by sharing the best solution in order to properly craft a newstarting point. In SAT, the single-walk approaches are implemented by flip-ping multiple variables at the same time [22].

The Comet solver [18] has been proposed in the context of constraint-based local search. Comet provides abstractions for implementing multi-walkparallelism (with and without cooperation) and single-walk parallelism.

1.6.2 Parallel Moves for ERDCMST

The aim of our parallel approach is to reduce the elapsed time for finding atarget solution (i.e., optimal or near-optimal solution) by taking advantage ofthe availability of multiple cores and processors. In order to accomplish this,we propose a novel approach to perform multiple moves in parallel, whichcan be applied both in single-walk and multi-walk settings.

A move for ERDCMST can be defined as selecting and removing a nodefrom a tree and adding it back to the tree; preferably in a different location.The general idea is to partition the set of all moves in such a way that whenmultiple moves are performed by selecting them from different elements ofthe partition no constraint is violated. We use this approach to develop aparallel algorithm for the ERDCMST problem.

Informally speaking the algorithm takes into account the disjointness con-straint to divide the problem space into mutually exclusive subproblems andasynchronously perform multiple moves in parallel. We use the fcg to decom-pose the problem. Let us recall that if N facilities do not share any client wecan use the LS algorithm to optimise the local solution of individual facilitiesin parallel. For instance, in Figure 1.3 m3 and m2 do not share clients, so wecan improve the solution by executing two parallel copies of Algorithm 2 lim-iting the input solution to m3 for one core, and m2 for the other one. In thissection we describe two methods to decompose the problem. The first methodcomputes a set of independent facilities. The second method randomly selectsa set of facilities and resolves the conflicts between clients before executingthe LS algorithm.

Let Γij be the set of all potential predecessors of client uj in Ti whenconsidering either a node move or a subtree move. Ideally, we would like tofind a set of nodes (or clients) whose sets of locations are pairwise mutu-ally exclusive so that moving all those nodes simultaneously in their trees isconflict-free. The advantage is that finding a best location for all such nodes


Algorithm 3: Random Independent set(fcg, card)

1 S := ∅;2 while fcg 6= ∅ and |S| < card do3 v := random vertex in fcg;

4 S := S ∪ v;

5 Remove v and its neighbours from fcg;

6 return S;

can be done in parallel without restricting access to the data structures orcreating duplicate copies of the same data structure.

As the sets of locations for the selected nodes must be independent, weselect at most one node from a tree. It is indeed possible to find two nodes inthe same tree whose sets of potential predecessors are mutually exclusive, butfinding such nodes is not straightforward. Therefore, the number of movesthat can be performed simultaneously is bounded by the number of facilities.

As mentioned before, changing the location of a node within a tree Ti is notonly constrained by the other nodes of Ti but also by the nodes of the othertrees sharing nodes with Ti because of the disjointness constraint. In order todetermine the number of subproblems, we use the previously defined facilityconnectivity graph. In particular, we explore the following two mechanisms:

1. Independent set defines partitions by computing independent sets in fcg.In this approach, we know beforehand that each client has at most onepredecessor, so all elements in the set can be safely executed in paral-lel without violating the disjointness constraint. Algorithm 3 computes arandom set of independent elements in fcg. These elements will then beused in the parallel section of the algorithm to solve the problem. cardrefers to the cardinality of the independent set to be computed. Ideally,card should match the number of cores available. However, the degree ofparallelism of the algorithm is bounded by the cardinality of the maximumindependent set in fcg. For instance, let us revisit the fcg in Figure 1.3.In this case, we will have at most two independent facilities (i.e., m3 andm4). Therefore, even if card is greater than two, Algorithm 3 will returnat most two vertices. In practice we expect sparse graphs in real networks,so the cardinality of independent sets should be more than a few tens ofelements for realistically sized networks.

2. Random conflict selects, uniformly at random, n facilities and resolvesthe conflicts between clients a priori. It is recalled that two facilities canbe in conflict if and only if they share at least two clients. Let us saythat two facilities mi and m′i are selected, and the clients uj and uj′ areconnected to both facilities. Let C = Γij ∩ Γi′j′ be a non-empty set. Toresolve the conflict we modify the sets Γij and Γi′j′ such that they becomemutually exclusive. We recall that Γij (Γi′j′) represents the set of potentialpredecessors for uj (uj′) in the tree Ti (T ′i ).


Algorithm 4: Iterated Constraint-based Parallel Local Search(move-op,t)

1 s∗ := Initial Solution;

2 repeat

3 {s∗1, . . . , s∗n} := s∗;4 {s′1, . . . , s′n} := s∗;5 P := CreatePartition(s∗);6 for each pi ∈ P do in parallel7 while local time limit t for parallelism has not been reached do

8 if s∗i is internally in a local minimum then9 s′i := Perturbation(s∗i );

10 s′∗i := ConstraintBasedLocalSearch(move-op, s′i);

11 s∗i := AcceptanceCriterion(s∗i , s′∗i );

12 s∗ := {s∗1, . . . s∗n};13 until a given stopping criterion is met ;

• If uk ∈ C is already a predecessor of uj in Ti then we remove uk fromΓi′j′ , or viceversa.

• If uk ∈ C is a predecessor of neither uj in Ti nor uj′ in T ′i then we removeuk randomly from either Γij or Γi′j′ . We can say that the algorithmdecides beforehand to which set uk should belong.

Unlike independent set where the degree of parallelism is limited by thesize of the maximum independent set, random conflict allows as manyprocesses as the number of facilities in the problem, which in practice goesup to a few hundred cores. The solution of a problem whose fcg is theone in Figure 1.3 will start by randomly selecting a set of facilities, e.g.,m1, m3, and m4, and then the random conflict method resolves conflictsbeforehand for conflicting clients. For instance, if edge 〈u10, u11〉 is presentneither in the current solution of m1 nor in the current solution of m3, thealgorithm randomly decides whether to forbid the edge in m1 or in m3.The algorithm repeats this process for all pairs of conflicting clients.

The Iterated Constraint-based Parallel Local Search algorithm (ICPLS)works in two phases. First, the algorithm selects a set of facilities, denotedby P . If the set is independent then the locations of the clients of the setfacilities are mutually exclusive. If the set is in conflict then the locations ofthe clients of the set facilities are modified by restricting their locations toresolve the conflicts. Second, for each facility pi ∈ P it performs, in parallel,the sequential local search algorithm for a given amount of time t to explorethe search space. As is the case for the sequential algorithm, s∗ denotesthe current incumbent solution of the problem, s′ denotes the solution afterperturbing the incumbent solution, s′∗ denotes the best solution obtainedafter executing the constraint-based local search framework. s∗i (respectivelys′i and s′∗i ) denotes the solution (or tree) associated with partition pi. In the


parallel section of the algorithm (Lines 6-11) we differentiate between thelocal minimum for each partition and the local minimum of the problem. Inthe sequential algorithm we diversify the current incumbent solution afterfinding a local minimum of the problem, i.e., a state in which no neighboursolution leads to an improvement in the objective. In the parallel algorithmwe diversify when the solution si associated with partition pi is in a localminimum but the global state of the whole problem is still unknown.

The sequential algorithm scans the list of active clients (list), deletingclients from the list when they cannot improve the objective function. Theperturbation starts when a local minimum in the problem is reached (i.e.,list = { }). Following the same approach in the parallel algorithm may intro-duce significant processor idle time, in particular when approaching a localminimum. Therefore, we start the perturbation locally for each tree as soonas an internal local minimum for a given tree is reached to minimise idletime. Moreover, after applying a move in a tree, only nodes of the same treeare added to list to reduce synchronisation among processors (Lines 23-24 inAlgorithm 2).

Algorithm 4 shows the ICPLS proposed in this chapter. The algorithmstarts with an initial solution. CreatePartitions computes P using either in-dependent set of random conflict. We recall that random conflict computesthe set of potential predecessors for all nodes connected to the selected facil-ities. In the parallel section of the algorithm (Lines 6-11), we check whetherthe tree associated with pi is internally in a local minimum and if so perturbthe solution associated with the partition (Lines 8-9). Then, the Constraint-based Local Search procedure is invoked with the current solution si associ-ated with pi using the acceptance criterion of the sequential algorithm. It isworth noticing that unlike the sequential algorithm, where a local minimumis sure to be reached after invoking the local search procedure, in the parallelalgorithm it might be the case that the parallel algorithm has not reachedthe local minimum due to the local time limit of each parallel execution.

1.7 Application: Long-Reach Passive Optical Networks

To demonstrate the effectiveness of our approach we consider a real-worldproblem arising in the domain of optical networks. Long-Reach Passive Op-tical Networks (LR-PONs) are attracting increasing interest as they providea low-cost and economically viable solution for fibre-to-the-home network ar-chitectures [21]. An example of a Long-Reach PON is shown in Figure 1.4.In LR-PON each metro node is connected to tens of thousands of customersvia tens of hundreds of exchange sites. A major fault occurrence would be acomplete failure of the metro node that terminates the LR-PON, which couldaffect tens of thousands of customers. The dual homing protection mecha-nism for LR-PON enables customers to be connected to two metro nodes via


Fig. 1.4: Example of a Long-Reach Passive Optical Network

a local exchange site, so that whenever a single node fails all customers arestill connected to a backup [15]. Simply connecting two metro nodes to anexchange site is not sufficient to guarantee the connectivity because if a linkis common in the routes of the fibre going from the exchange site to its twometro nodes then both metro nodes would be disconnected. The part of theLR-PON that one wants to protect is between the metro node and the oldlocal exchange site. An important property of such a network is resilience toa single metro node failure. Nevertheless, connecting an exchange site to anadditional metro node has a cost overhead. Here a metro node is a facilityand an exchange site is a client.

In the optical network fibres are distributed from the metro nodes to theexchanges through cables that form a tree distribution network. As the as-sociation between metro nodes and exchange sites is already given, we couldtreat each one-to-many relation (i.e., tree) independently if disjointness werenot an issue. However, the paths from the metro nodes to the exchange sitesmay share edges since a pair of exchange sites may be associated with thesame pair of metro nodes. Therefore, we need to make sure that the routesthat we choose for connecting the exchange sites to their metro nodes (i.e.,main metro node and backup metro node) are disjoint. Otherwise, this wouldvoid the purpose of having double coverage. In Figure 1.5 we show two waysof connecting a given set of exchange sites to a metro node. In the first case(Figure1.5a) we simply connect each exchange site. Certainly the option ofconnecting each exchange site directly to the metro node leads to shorterconnection paths. However, the drawback of connecting each exchange site


(a) (b)

Fig. 1.5: (a) Local exchanges are directly connected to the metro node. (b)Local exchanges are connected to the metro node through a spanning tree.

directly is the total amount of cable used. In the second case (Figure 1.5b)we compute a minimum spanning tree rooted at the metro node. Certainlythis option minimises the total length of cable but the drawback is that wemight be violating the maximum cable distance allowed between the metronode and any of its exchange sites.

We are interested in both restricting the length of the paths and the totalamount of cable used. Keeping both requirements is known to be a hard prob-lem [13]. As mentioned before this problem is computationally complex [19].Notice that our problem is even more complicated than the bounded span-ning tree problem. The objective of our problem is to determine the optimalroutes of cables in the context of an already existing association of metronodes with exchange sites such that the total cable length required for con-necting each exchange site to two metro nodes is minimised subject to themaximum distance constraint and disjointness constraint. In other words, theidea is to find two edge-disjoint paths for all exchange sites to their respectivemetro nodes by maximising sharing (since we want to minimise the cost ofdigging, for example), and reducing the amount of cable.

1.8 Empirical Evaluation

In this section we present experimental results for the sequential and paral-lel versions of the local search algorithm proposed in this chapter to tackleERDCMST. Moreover, we also demonstrate the effectiveness of the algorithmwhen compared to a dedicated algorithm for the RDCMST. We present re-sults for random and real-world instances. The real-world instances corre-spond to the optical networks of three EU countries: Ireland, with 1,121exchange sites and 18, 20, 22, and 24 Metro Nodes; the UK, with 5,393 ex-change sites and 75, 80, 85, and 90 Metro Nodes; and Italy, with 10,709


exchange sites and 140, 150, 160, and 170 Metro Nodes. We present exper-imental results for the ERDCMST problem using both the sequential andparallel LS algorithms.

All the experiments were performed on a four-node cluster; each node fea-tures two Intel Xeon E5-2640 processors at 2.5 Ghz, and 64 GB of RAMmemory. Each processor has six cores for a total of 12 cores per node. Thelocal search algorithm was implemented in C++ and used openMP to im-plement the parallel version using shared memory. In all the experiments weuse the following parameters for Algorithm 1: p = 5%, 20 random movesin the perturbation phase of the algorithm, and the algorithm stops aftera given time limit is reached. Additionally, we use the trivial solution, i.e.,direct connections from the MNs to the ESs, as an initial solution for thealgorithm.

We now switch our attention to the ERDCMST problem. To this end, weevaluate the proposed algorithms using the following scenarios:

• Real-life: We consider real-life instances from our industrial partners withreal networks in Ireland, the UK, and Italy. Figure 1.6 shows the histogramwith the distribution of the distances from the ESs to their MNs. In thefollowing experiments we use λ = 67 for the Ireland and λ = 62.5 forthe UK and Italy. Additionally, Figure 1.7 shows the resulting facilityconnectivity graph for the three countries.

• Random: We generated 10 random instances extracted from the previousreal-life network in Ireland. Each instance is generated by using 18 facilitiesand for each facility we randomly selected |U | ∈ {100, 200, . . ., 1000} nodes.Instances were generated by iteratively selecting a random client from theIrish dataset and balancing the load of clients per metro node.

1.8.1 ERDCMST Results: Sequential LS

Table 1.2 reports results for the Random experimental scenario, which depictthe median value across 11 independent executions of the node and subtreeoperators; the best solution obtained with CPLEX; the best solution obtainedwith CPLEX using the solution of the first execution of LS with the subtreeoperator as warm start (LS+CPLEX); and the best known LBs for eachinstance obtained with CPLEX using a larger time limit.1 The time limitfor each local search experiment was set to 30 minutes, and 4 hours for theCPLEX-based approaches.

In these experiments we observe that the subtree operator generally out-performs the node operator. We attribute this to the fact that moving a

1 In this chapter CPLEX corresponds to solving the MIP model with IBM ILOG CPLEXOptimisation Studio version 12.5.1.


0 10 20 30 40 50 60 700

50

100

150

200

Distance

Exch

anges

Primary Backup

(a) Ireland with 18 MNs

0 10 20 30 40 50 600

100

200

300

400

Distance

Exch

anges

Primary Backup

(b) UK with 75 MNs

0 10 20 30 40 50 600

200

400

600

800

Distance

Exch

anges

Primary Backup

(c) Italy with 140 MNs

Fig. 1.6: Histogram of distances from exchanges to primary and backup metronodes

Table 1.2: Results for the small-sized instances of ERDCMST problem where|M | = 18, λ = 67, 30 minutes time limit for LS approaches, 4 hours forCPLEX, and 5 hours for LS+CPLEX

|U | LS (Subtree) LS (Node) CPLEX LS+CPLEX LB

100 4674 4674 4674 4674 4674

200 6966 6988 6962 6962 6962

300 8419 8575 8404 8404 8152400 9728 10008 9728 9721 9329

500 11203 11672 11318 11203 10298

600 11885 12559 12276 11924 10517700 13148 13981 13812 13140 11485800 14040 15133 15118 13977 12402

900 14770 16098 16438 14839 128601000 15962 17479 18174 16009 13943

complete subtree helps to maintain the structure of the tree in a single iter-ation of the algorithm. The node operator might eventually reconstruct thestructure, however, more iterations would be required. For the CPLEX-based


(a) Ireland with 18 MNs (b) UK with 75 MNs

(c) Italy with 140 MNs

Fig. 1.7: Facility connectivity graph (fcg) for each country


Table 1.3: Results for Ireland, UK, and Italy with 30 minutes time limit (walltime) for the LS algorithm and 4 hours time limit for CPLEX (wall time)

Country |M | Subtree CPLEX LB Gap-Subtree Gap-CPLEX

18 17155 26787 14809 13.67 44.71Ireland 20 16884 83746 14845 12.07 82.27

|U |=1121 22 16715 79919 14990 10.32 81.24

24 16173 26918 14570 9.91 45.87

75 66367 285014 54720 17.54 80.80

UK 80 65380 301190 54975 15.91 81.74

|U |=5393 85 64189 281546 55035 14.26 80.4590 62763 220041 55087 12.23 74.96

140 90796 – 76457 15.79 –

Italy 150 89519 – 76479 14.56 –|U |=10708 160 89497 – 76794 14.19 –

170 88497 – 77013 12.97 –

approaches we report the optimal solution for 100 and 200 clients, while themedian execution of the local search approaches reported the optimal solu-tion for 100 clients, and the subtree operator reached the optimal solutionin five out of the 11 executions for 200 clients. After |U | = 500 LS domi-nates the performance and a margin ranging between 1% (|U | = 500) to 12%(|U |=900). LS+CPLEX was only able to improve the average performanceof the subtree operator (reached after executing LS for one hour) by a verysmall factor, i.e., up to 0.4% for |U | = 800, after running CPLEX for fourhours. We also experimented with instances with |U | < 100 and |U | > 800.In the first case the three algorithms and the mixed approach (LS+CPLEX)reported similar results. In the second case, only LS with the subtree operatorwas able to provide good quality solutions with a gap of 10% with respect tothe LB.

Our second set of experiments for the sequential version of the constraint-based local search algorithms are showed in Table 1.3 where we report resultsfor real ERDCMST instances from Ireland, Italy, and the UK. In this case, weused a time limit of 30 minutes for LS (using the subtree move operator), andfour hours for CPLEX. As can be observed, LS dominates the performancein all these experiments and, once again, the solution quality of LS doesnot degrade with the problem size. Indeed, the gap with respect to the LBfor local search varies from 9.9% to 13.6% for Ireland, 12.2% to 17.5% forUK, and 12.9% to 15.7% for Italy. CPLEX ran out of memory when solvinginstances from Italy. We report ‘–’ when no valid solution was obtained. Forthe UK instances CPLEX also ran out of memory before the time limit.Once again we would like to recall that algorithms such as BKRUS, PBH,and KBH cannot be used for the ERDCMST problem as they are dedicatedalgorithms for the RDCMST that rely on the use of shortest paths to build


valid solutions. However, in the ERDCMST problem such paths might notbe available due to disjointness.

1.8.2 ERDCMST Results: Parallel LS

In this section we evaluate the performance of the proposed parallel localsearch algorithm. To this end we use the same real-world instances used forthe sequential algorithm for Ireland, the UK, and Italy. We limit our attentionto the subtree operator as it greatly outperforms the node operator in thesequential setting.

We define the gain of the parallel algorithm as the relative percentagegain with respect to the sequential algorithm in the cost solution after agiven time limit and using a given number of cores. Let C(t, inst, c) be thecost of the best solution obtained after t seconds using c cores to solve inst.Let T (cost, inst, c) be the time to reach a solution whose cost is at least asgood as cost for a given instance inst using c cores.

Gain(inst, c, t) = C(t,inst,1)−C(t,inst,c)C(t,inst,c) × 100.

Tables 1.4, 1.5, and 1.6 show the results of the empirical evaluation of theparallel algorithm. In these tables we present the cost of the solution for thesequential algorithm, the parallel algorithms, and the relative cost gain ofthe parallel algorithm after the 30-minute time limit. We use the multi-walk(MW) framework, i.e., executing multiple copies of the algorithm with differ-ent random seeds, as a baseline for comparison. We also include the proposedparallel algorithms using both random conflict (RC) and independent set (IS)for selecting multiple moves. For each instance and each approach we reportthe median value across 11 executions with a time-limit of 30 minutes.

Figures 1.8, 1.9, and 1.10 show the average performance evolution of thealgorithms (parallel and sequential) to tackle one instance for each dataset,i.e., Ireland with 18 facilities, the UK with 75 facilities, and Italy with 140facilities. The x-axis indicates the runtime and the y-axis the median qualityof the solution of the 11 executions.

Ireland instance (Table 1.4 and Figure 1.8). The local search algorithms finda very good solution within a very short time window (gap of up to 13%with respect to the lower bound), and the variance between independentexecutions of the algorithm is very low. For this reason, when increasingthe number of cores we observe very little difference in the performance ofthe algorithms.2 We observe that the sequential algorithm is slightly betterthan the parallel ones with a percentage gain of between 0.68% to 1.05%within the 30-minute time limit. However, we note that the parallel algorithm

2 Similar behaviour for other local search algorithms has been observed in [2] in the context

of the satisfiability problem.


reaches good solutions faster than the sequential one as depicted in Figure 1.8.This figure also helps to illustrate the difference between using independentsets and random conflict for computing the partitions. Notice that eight-core random conflict algorithm reports a better performance than 12-coreindependent set. That is because the cardinality of the maximum independentset for this problem is nine and the number of parallel processes is bounded bythat number, thus voiding the advantage of having three more cores. Randomconflicts allows as many parallel processes as the number of metro nodes inthe problem.

UK instance (Table 1.5 and Figure 1.9). The UK instance is about four timesbigger (with respect to the number of clients) than the Irish instance. Exceptfor the multi-walk approach (where no significance improvement is seen),we observe that the parallel algorithms improve the quality of the solutionswhen increasing the number of cores. Summing up, the observed performancegain ranges from 0.81% to 1.99% (four cores), from 0.73% to 1.81% (eightcores), and 0.84% to 1.89% (12 cores). Moreover, as depicted in Figure 1.9,the parallel algorithm based on single walk also reaches a very high-qualitysolution much faster than the sequential algorithm, and the performanceincreases as the number of cores increases. However, in this case, we observesimilar performances between independent set and random conflict. That isbecause the independent sets are always larger than 12 and therefore bothapproaches exploit parallelism as much as possible.

Italy instance (Table 1.6 and Figure 1.10).The largest performance improve-ment of the parallel algorithm is observed for Italy (Table 1.6). We attributethis to the size of the problem: the larger the problem the better the parallelalgorithms perform. Here we observe a performance gain ranging from 1.76%

Table 1.4: Performance summary of the parallel algorithms (Multi-walk(MW), random conflict (RC) and independent set (IS)), with a 30-minutetime limit (wall time)

Country |M | Seq Alg4 Cores 8 Cores 12 Cores

Cost Gain Cost Gain Cost Gain

18 17155

MW 17110 +0.26 17092 +0.37 17085 +0.41

RC 17293 -0.83 17287 -0.86 17266 -0.77IS 17276 -0.69 17324 -0.76 17307 -0.85

20 16884MW 16841 +0.26 16829 + 0.33 16828 +0.33RC 17001 -0.68 16998 -0.78 17015 -0.75

Ireland IS 17007 -0.73 17014 -0.69 17006 -0.76

|U |=112122 16715

MW 16704 +0.07 16691 +0.14 16686 +0.17RC 16891 -1.01 16888 -1.05 16896 -1.04IS 16886 -1.00 16896 -1.02 16884 -0.98

24 16173MW 16152 +0.13 16148 +0.15 16136 +0.23RC 16315 -0.93 16318 -1.05 16347 -1.05

IS 16327 -0.86 16327 -0.89 16323 -0.86





75 66367

MW 66153 +0.32 66126 +0.36 66093 +0.41

RC 65083 +1.99 64988 +1.73 64972 +1.82

IS 65083 +1.96 64980 +1.81 64981 +1.89

80 65380

MW 65328 +0.08 65282 +0.15 65265 +0.18

RC 64290 +1.65 64227 +1.35 64217 +1.53

UK IS 64328 +1.62 64207 +1.45 64195 +1.55|U |=5393

85 64189

MW 64168 +0.03 64146 +0.07 64131 +0.09

RC 63487 +1.11 63433 +0.97 63421 +1.02IS 63486 +1.10 63403 +1.01 63414 +1.04

90 62763

MW 62726 +0.06 62651 +0.18 62641 +0.19

RC 62210 +0.86 62171 +0.73 62153 +0.84IS 62260 +0.81 62191 +0.73 62140 +0.87




140 90796MW 90669 +0.14 90633 +0.18 90621 +0.19RC 88573 +2.51 88382 +2.71 88332 +2.80

IS 88529 +2.56 88358 +2.74 88330 +2.71

150 89519

MW 89427 +0.1 89357 +0.18 89309 +0.24

RC 87517 +2.27 87411 +2.42 87414 +2.41

Italy IS 87526 +2.26 87462 +2.37 87379 +2.43|U |=10709

160 89537

MW 89421 +0.13 89360 +0.20 89309 +0.26

RC 87679 +2.15 87579 +2.24 87525 +2.29

IS 87666 +2.13 87564 +2.26 87528 +2.31

170 88497

MW 88433 +0.07 88359 +0.16 88359 +0.16

RC 86954 +1.76 86925 +1.83 86862 +1.89

IS 86955 +1.77 86869 +1.86 86869 +1.88

to 2.56 (four cores), 1.83% to 2.74% (eight cores), and 1.88% to 2.80% (12cores). Similarly to the Irish and the UK datasets, Figure 1.10 shows theperformance in time of the parallel algorithm, and once again it can be ob-served that the parallel version reaches very good solutions faster than thesequential algorithm.

As pointed out before, the sequential algorithm obtains better solutionsfor the Irish dataset after the 30-minute time limit, however, the parallelalgorithm computes near-optimal solutions faster than the sequential algo-rithm. For instance, as shown in Table 1.7, the average gain (with respect tothe sequential algorithm) after 100 seconds using four cores is 1.62% (inde-


101 102 1031.5

2

2.5

3

3.5

4x 104

Time (s)

Cost

1 core4 cores indset4 cores randconf8 cores indset8 cores randconf12 cores indset12 cores randconf

Fig. 1.8: Ireland (|M | = 18): Solution cost vs. wall clock time (range [2, 1800]seconds

101 102 1030.5

1

1.5

2

2.5

3

3.5x 105

Time (s)

Cost


Fig. 1.9: UK (|M | = 75): Solution cost vs. wall clock time (range [2, 1800]seconds

pendent set) and 1.69% (random conflict), and after 10 seconds we observe again of up to 4.29% for random conflict and 1.39 % for independent set. Onceagain we observe that when the cardinality of the maximum independent setis small with respect to the number of cores random conflict performs betterthan independent set. Interestingly, the relative gain of the parallel algorithmwith respect to the sequential one is more than 100% on two occasions for theUK (up to 130% for RC with 12 cores and 100 secs) and on eight occasionsfor Italy (up to 182% for IS and RC with 12 cores and 100 secs).


101 102 1030

1

2

3

4

5

6x 105

Time (s)

Cost


Fig. 1.10: Italy (|M | = 140): Solution cost vs. wall clock time (range [2, 1800]seconds

Table 1.7: Average gain with different time settings for each country

CountryTime 4 cores 8 cores 12 cores

Secs IS RC IS RC IS RC

Ireland

10 -3.70 -6.05 1.39 4.00 0.65 4.29100 1.64 1.56 1.62 1.69 1.66 1.64

1000 -0.75 -0.79 -0.82 -0.78 -0.78 -0.821800 -0.82 -0.86 -0.83 -0.93 -0.86 -0.90

UK

10 6.88 5.29 66.53 77.37 125.85 130.32100 3.86 3.82 4.70 4.58 5.06 5.011000 2.13 2.15 2.37 2.34 2.42 2.40

1800 1.37 1.40 1.25 1.19 1.33 1.30

Italy

10 30.99 30.34 78.36 79.13 131.27 128.50100 167.63 168.96 181.79 180.87 182.30 182.17

1000 2.75 2.71 2.94 2.92 3.00 2.99

1800 2.19 2.16 2.30 2.3 2.33 2.34

Finally, Table 1.8 concludes the experiments reporting the average speedupfactor (out of the four scenarios for each country) to reach the best solutionafter a given amount of time, i.e., 10, 100, 1000, 1800 seconds. We recallthat the speedup factor is the gain in the speed of the parallel algorithm.We compute the speedup factor of the parallel algorithm with c cores after tseconds to solve a given instance inst as follows:

Speedup(inst, c, t) = T (C(t,inst,1),inst,1)T (C(t,inst,1),inst,c) .

In Table 1.8 we report ‘–’ in those cases where a parallel solution wasnot obtained within the timeout. In the 10 seconds time-limit scenario, we


Table 1.8: Average speedup factor with different time settings for each coun-try

CountryTime 4 cores 8 cores 12 coresSecs IS RC IS RC IS RC

Ireland

10 0.86 0.79 1.19 1.76 1.06 2.62

100 6.90 5.86 9.45 10.34 8.74 12.781000 – – – – – –1800 – – – – – –

UK

10 1.07 1.02 2.13 2.13 3.37 3.00100 1.86 2.03 3.58 3.54 5.25 5.31

1000 10.02 9.81 18.43 16.80 26.31 24.261800 7.49 7.31 13.84 12.54 18.96 17.94

Italy

10 3.00 3.00 4.50 4.50 4.50 4.5

100 4.00 4.03 7.82 7.68 11.00 10.79

1000 7.71 7.97 15.54 15.65 21.18 22.301800 12.22 11.72 22.98 20.57 28.60 30.38

observe a speedup factor close to 1 for the UK and Italy. The speedup factorimproves considerably after 1000 seconds. It goes up to a factor of 26 for theUK (IS with 12 cores and 1000 secs) to a factor of 30 with eight cores forItaly (RC with 12 cores and 1800 secs).

1.9 Conclusions and Future Work

We have presented an efficient local search algorithm for solving the Edge-disjoint Rooted Distance-Constrained Minimum Spanning-Tree problem. Wepresented two move operators along with their complexities and an incre-mental evaluation of the neighbourhood and the objective function. Further-more, we have proposed a parallelisation scheme for the local search algo-rithm, which significantly reduces the time required by the sequential ver-sion to reach high-quality solutions. Any problem involving tree structurescould benefit from these ideas and the techniques presented are relevant fora constraint-based local search framework where this type of incrementalityis needed for network design problems. The effectiveness of our approach isdemonstrated by experimenting with a set of problem instances taken fromreal-world long-reach passive optical network deployments in Ireland, Italy,and the UK.

In the future we would like to extend ERDCMST with the notion of op-tional nodes, since this extension is a common requirement in several applica-tions of ERDCMST. Effectively this means that we would compute for everyfacility a Minimum Steiner Tree where all clients are covered but the path tothem may follow some optional nodes. We also plan to investigate alterna-


tive heuristics for selecting the most suitable node and subtree for deletionat each iteration of the local search algorithm.

Acknowledgments

This work was supported by DISCUS (FP7 Grant Agreement 318137),and Science Foundation Ireland (SF) Grant No. 10/CE/I1853. The InsightCentre for Data Analytics is also supported by SFI under Grant NumberSFI/12/RC/2289.

References

[1] A. Arbelaez and P. Codognet. Massivelly parallel local search for SAT.In 24th IEEE International Conference on Tools with Artificial Intelli-gence, ICTAI’12, pages 57–64, Athens, Greece, November 2012. IEEEComputer Society.

[2] A. Arbelaez and P. Codognet. From sequential to parallel local searchfor SAT. In 13th European Conference on Evolutionary Computation inCombinatorial Optimisation, EvoCOP’13, volume 7832 of Lecture Notesin Computer Science, pages 157–168. Springer, 2013.

[3] A. Arbelaez and P. Codognet. A survey of parallel local search forsat. In Theory, Implementation, and Applications of SAT Technology.Workshop at JSAI’13, Toyama, Japan, June 2013.

[4] A. Arbelaez and Y. Hamadi. Improving parallel local search for SAT. In5th International Conference on Learning and Intelligent OptimizationLION 5, volume 6683 of Lecture Notes in Computer Science, pages 46–60. Springer, 2011.

[5] A. Arbelaez, D. Mehta, B. O’Sullivan, and L. Quesada. Constraint-based local search for the distance-and capacity-bounded network designproblem. In 26th IEEE International Conference on Tools with ArtificialIntelligence, ICTAI’14, pages 178–185. IEEE, 2014.

[6] A. Arbelaez, D. Mehta, B. O’Sullivan, and L. Quesada. A constraint-based local search for edge disjoint rooted distance-constrained minimumspanning tree problem. In 12th International Conference on Integrationof AI and OR Techniques in Constraint Programming, CPAIOR’15, vol-ume 9075 of Lecture Notes in Computer Science, pages 31–46. Springer,2015.

[7] R. Baraglia, J. I. Hidalgo, and R. Perego. A parallel hybrid heuristic forthe TSP. In Applications of Evolutionary Computing, EvoWorkshops2001: EvoCOP, EvoFlight, EvoIASP, EvoLearn, and EvoSTIM, volume


2037 of Lecture Notes in Computer Science, pages 193–202. Springer,2001.

[8] Y. Caniou, D. Diaz, F. Richoux, P. Codognet, and S. Abreu. Perfor-mance analysis of parallel constraint-based local search. In 17th ACMSIGPLAN Symposium on Principles and Practice of Parallel Program-ming, PPOPP’12, pages 337–338. ACM, 2012.

[9] T. G. Crainic and M. Gendreau. Cooperative parallel tabu search forcapacitated network design. J. Heuristics, 8(6):601–627, 2002.

[10] P. Q. Dung, Y. Deville, and P. Van Hentenryck. Constraint-based lo-cal search for constrained optimum paths problems. In 9th Interna-tional Conference on Integration of AI and OR Techniques in ContraintProgramming for Combinatorial Optimzation Problems, CPAIOR’19,volume 7298 of Lecture Notes in Computer Science, pages 267–281.Springer, 2010.

[11] E. Eaton, C. P. Gomes, and B. C. Williams. Computational sustainabil-ity. AI Magazine, 35(2):3–7, 2014.

[12] C. Gao, Y. Shi, Y. T. Hou, H. D. Sherali, and H. Zhou. Multicastcommunications in multi-hop cognitive radio networks. IEEE Journalon Selected Areas in Communications, 29(4):784–793, 2011.

[13] J. M. Ho and D. T. Lee. Bounded diameter minimum spanning treesand related problems. In Proceedings of the fifth annual symposium onComputational geometry, SCG ’89, pages 276–282, New York, NY, USA,1989. ACM. ISBN 0-89791-318-3.

[14] H. Hoos and T. Stutzle. Stochastic Local Search: Foundations and Ap-plications. Morgan Kaufmann, 2005.

[15] D. K. Hunter, Z. Lu, and T. H. Gilfedder. Protection of long-reachPON traffic through router database synchronization. Journal of OpticalCommunications and Networking, 6(5):535–549, 2007.

[16] M. Leitner, M. Ruthmair, and G. R. Raidl. Stabilized branch-and-pricefor the rooted delay-constrained steiner tree problem. In J. Pahl, T. Rein-ers, and S. Vo, editors, INOC, volume 6701 of Lecture Notes in ComputerScience, pages 124–138. Springer, 2011. ISBN 978-3-642-21526-1.

[17] R. Martins, V. M. Manquinho, and I. Lynce. An overview of parallelSAT solving. Constraints, 17(3):304–347, 2012.

[18] L. Michel, A. See, and P. Van Hentenryck. Parallel and distributed localsearch in comet. Computers and Operations Research, 36:2357–2375,2009.

[19] J. Oh, I. Pyo, and M. Pedram. Constructing minimal spanning/steinertrees with bounded path length. Integration, 22(1-2):137–163, 1997.

[20] S. Pant. Design and Analysis of Power Distribution Networks in VLSICircuits. PhD thesis, The school of Electrical Engineering in The Uni-versity of Michigan, 2008.

[21] D. B. Payne. FTTP deployment options and economic challenges. InProceedings of the 36th European Conference and Exhibition on OpticalCommunication (ECOC 2009), 2009.


[22] A. Roli. Criticality and parallelism in structured SAT instances. In 8thInternational Conference on Principles and Practice of Constraint Pro-gramming, CP’02, volume 2470 of Lecture Notes in Computer Science,pages 714–719, Ithaca, NY, USA, 2002. Springer.

[23] M. Ruffini, L. Wosinska, M. Achouche, J. Chen, N. J. Doran, F. Farjady,J. Montalvo-Garcia, P. Ossieur, B. O’Sullivan, N. Parsons, T. Pfeiffer,X. Qiu, C. Raack, H. Rohde, M. Schiano, P. D. Townsend, R. Wessaly,X. Yin, and D. B. Payne. DISCUS: an end-to-end solution for ubiquitousbroadband optical access. IEEE Communications Magazine, 52(2):24–56, 2014.

[24] M. Ruthmair and G. R. Raidl. A kruskal-based heuristic for the rooteddelay-constrained minimum spanning tree problem. In R. Moreno-Dıaz,F. Pichler, and A. Quesada-Arencibia, editors, EUROCAST, volume5717 of Lecture Notes in Computer Science, pages 713–720. Springer,2009. ISBN 978-3-642-04771-8.

[25] M. Ruthmair and G. R. Raidl. Variable neighborhood search and antcolony optimization for the rooted delay-constrained minimum spanningtree problem. In R. Schaefer, C. Cotta, J. Kolodziej, and G. Rudolph,editors, PPSN (2), volume 6239 of Lecture Notes in Computer Science,pages 391–400. Springer, 2010. ISBN 978-3-642-15870-4.

[26] O. V. Shylo, T. Middelkoop, and P. M. Pardalos. Restart Strategiesin Optimization: Parallel and Serial Cases. Parallel Computing, 37(1):60–68, 2011.

[27] R. Sigua. Fundamentals of Traffic Engineering. University of the Philip-pines Press, 2008.

[28] C. Truchet, A. Arbelaez, F. Richoux, and P. Codognet. Estimatingparallel runtimes for randomized algorithms in constraint solving. J. ofHeuristics, 22(4):613–648, 2016.

[29] P. Van Hentenryck and L. Michel. Constraint-based local search. TheMIT Press, 2009.

[30] M. Verhoeven and E. Aarts. Parallel local search. Journal of Heuristics,1(1):43–65, 1995.

[31] M. Verhoeven and M. Severens. Parallel local search for steiner trees ingraphs. Annals of Operations Research, 90:185–202, 1999.

[32] X. Yuan and A. Saifee. Path selection methods for localized quality ofservice routing. In 10th International Conference on Computer Commu-nications and Networks, ICCCN01, pages 102–107. IEEE, 2001.

Chapter 1 Parallel Constraint-Based Local Search: An ...

Documents

Transcript of Chapter 1 Parallel Constraint-Based Local Search: An ...