A flexible multicast routing protocol for group...

26
A flexible multicast routing protocol for group communication Sudhir Aggarwal a, * , Sanjoy Paul b , Daniel Massey c , Daniela Caldararu d a Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USA b Bell Laboratories, Holmdel, NJ 07733, USA c Department of Computer Science, UCLA, Los Angeles, CA 90024, USA d Department of Computer Science, SUNY at Binghamton, NY 13902, USA Abstract Multicast routing, once dominated by a single routing protocol, is becoming increasingly diverse. It is generally agreed that at least three routing protocols, PIM, DVMRP, and CBT will be widely deployed and must interoperate. This signals a shift from the Mbone as one large domain to a collection of administrative domains where each domain selects its own multicast routing protocol. This paper proposes another multicast routing protocol, Conference Steiner Multicast (CSM), that is suited for domains that implement OSPF as the unicast routing protocol. CSM is targeted towards (sparse) multicast confer- encing and online discussion groups. Characteristics of such discussion groups include any member being a speaker or listener and dynamic changes in the group membership. CSM is futhermore well suited for domains with mobile hosts because its basic architecture can support a mobile environment. CSM is based on the use of a shared, heuristic Steiner minimal tree for interconnecting group members. A key component of the design is that it dynamically and reliably shifts to a dierent tree as changes warrant. CSM supports rudimentary entry control for security and permits application assistance over routing decisions (termed Application Assisted Routing). This paper describes the architecture of CSM as well as a prototype implementation. Several CSM routers have been interconnected to form a Multicast Steiner Backbone (Msbone). Standard applications such as vat, vic, and wb [V. Jacobson, Multimedia Conferencing on the Internet, Tutorial 4, ACM SIG-COMM 94, August 1994] have been modified to run on Msbone. CSM is designed to connect to the Mbone via interoperation with DVMRP, and as in- teroperation standards develop it should be capable of implementing these standards. Ó 2000 Elsevier Science B.V. All rights reserved. Keywords: Multicast routing; Conference; Steiner tree; Shared tree; Shortest path tree; Mobile multicast; Inter-domain multicast; Switching tree 1. Introduction Most communication on the Internet is based on a single sender, single receiver model. Examples include applications such as telnet, ftp, and web browsing. Many newer applications, however, do not fit into this model. Any application where the same data is sent to multiple receivers, such as an audio conference, is better supported by a model termed multicast where data is sent to a specific group and is only duplicated for delivery purposes when necessary. In the past decade, there has been an ever increasing interest in the design and Computer Networks 32 (2000) 35–60 www.elsevier.com/locate/comnet * Corresponding author. E-mail addresses: [email protected] (S. Aggar- wal), [email protected] (S. Paul). 1389-1286/00/$ - see front matter Ó 2000 Elsevier Science B.V. All rights reserved. PII: S 1 3 8 9 - 1 2 8 6 ( 9 9 ) 0 0 1 2 3 - 1

Transcript of A flexible multicast routing protocol for group...

Page 1: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

A ¯exible multicast routing protocol for group communication

Sudhir Aggarwal a,*, Sanjoy Paul b, Daniel Massey c, Daniela Caldararu d

a Bell Laboratories, 600 Mountain Avenue, Murray Hill, NJ 07974, USAb Bell Laboratories, Holmdel, NJ 07733, USA

c Department of Computer Science, UCLA, Los Angeles, CA 90024, USAd Department of Computer Science, SUNY at Binghamton, NY 13902, USA

Abstract

Multicast routing, once dominated by a single routing protocol, is becoming increasingly diverse. It is generally

agreed that at least three routing protocols, PIM, DVMRP, and CBT will be widely deployed and must interoperate.

This signals a shift from the Mbone as one large domain to a collection of administrative domains where each domain

selects its own multicast routing protocol.

This paper proposes another multicast routing protocol, Conference Steiner Multicast (CSM), that is suited for

domains that implement OSPF as the unicast routing protocol. CSM is targeted towards (sparse) multicast confer-

encing and online discussion groups. Characteristics of such discussion groups include any member being a speaker or

listener and dynamic changes in the group membership. CSM is futhermore well suited for domains with mobile hosts

because its basic architecture can support a mobile environment.

CSM is based on the use of a shared, heuristic Steiner minimal tree for interconnecting group members. A key

component of the design is that it dynamically and reliably shifts to a di�erent tree as changes warrant. CSM supports

rudimentary entry control for security and permits application assistance over routing decisions (termed Application

Assisted Routing).

This paper describes the architecture of CSM as well as a prototype implementation. Several CSM routers have been

interconnected to form a Multicast Steiner Backbone (Msbone). Standard applications such as vat, vic, and wb [V.

Jacobson, Multimedia Conferencing on the Internet, Tutorial 4, ACM SIG-COMM 94, August 1994] have been

modi®ed to run on Msbone. CSM is designed to connect to the Mbone via interoperation with DVMRP, and as in-

teroperation standards develop it should be capable of implementing these standards. Ó 2000 Elsevier Science B.V. All

rights reserved.

Keywords: Multicast routing; Conference; Steiner tree; Shared tree; Shortest path tree; Mobile multicast; Inter-domain multicast;

Switching tree

1. Introduction

Most communication on the Internet is basedon a single sender, single receiver model. Examples

include applications such as telnet, ftp, and webbrowsing. Many newer applications, however, donot ®t into this model. Any application where thesame data is sent to multiple receivers, such as anaudio conference, is better supported by a modeltermed multicast where data is sent to a speci®cgroup and is only duplicated for delivery purposeswhen necessary. In the past decade, there hasbeen an ever increasing interest in the design and

Computer Networks 32 (2000) 35±60

www.elsevier.com/locate/comnet

* Corresponding author.

E-mail addresses: [email protected] (S. Aggar-

wal), [email protected] (S. Paul).

1389-1286/00/$ - see front matter Ó 2000 Elsevier Science B.V. All rights reserved.

PII: S 1 3 8 9 - 1 2 8 6 ( 9 9 ) 0 0 1 2 3 - 1

Page 2: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

implementation of multicast capabilities[6,11,13,14,32,42,43]. The idea was ®rst imple-mented in the multicast backbone (Mbone) [15,42].The Mbone began as a small virtual network ontop of the existing Internet, workstations served asrouters, and multicast packets were exchanged byencapsulating them in unicast packets, termedtunneling. The Distance Vector Multicast RoutingProtocol (DVMRP) [42] speci®ed the protocol forrouting and delivery of multicast packets. TheMbone today covers thousands of networks withgrowing portions of the Mbone consisting of entireadministrative domains with production routershandling both unicast and multicast routing. It isnow almost universally agreed that the Mbone willnot consist of a single routing protocol. Insteaddi�erent domains will choose to implement di�er-ent multicast routing protocols, just as the case forunicast routing protocols.

DVMRP remains the clearly dominant routingprotocol. DVMRP, and Multicast Open ShortestPath First (MOSPF) [32], are designed for denselydistributed group members. Furthermore, theseschemes were designed in support of separatesource based trees for each sender. Importantprotocols that have focused also on sparsely dis-tributed group members, such as Protocol Inde-pendent Multicast (PIM) [12] and Core-BasedTree (CBT) [5], have received considerable atten-tion. More recently, protocols like EXPRESS [18]and Simple Multicast [7] have simpli®ed the mul-ticast address allocation problem faced by PIMand CBT, respectively.

Into this growing mix of routing protocols, thispaper explores the design and implementation ofConference Steiner Multicast (CSM), ®rst pro-posed in [1]. This protocol is not proposed as thesole routing protocol for multicasting, as no singleprotocol will likely su�ce. Instead CSM is de-signed as a protocol that could be used for a do-main such as a university or company. Theprotocol is well suited for domains that wish toeventually support mobile hosts and that use OpenShortest Path First (OSPF) for unicast routing.We also believe that the protocol warrants con-sideration as a backbone protocol, similar in na-ture to Border Gateway Multicast Protocol(BGMP) [29]. CSM could connect a site to the

Mbone through interoperation with DVMRP.While the standards for multicast routing protocolinteroperations are still undergoing active devel-opment, we anticipate that CSM will be able toimplement the standards as they take shape.

CSM is based on the use of a (shared) heuristicSteiner minimal tree for connecting members of agroup. However, by using our notion of Applica-tion Assisted Routing (AAR), the shared tree forany particular conference can be determinedthrough input from conference members. CSM hasthe following properties:1. It supports group communication that takes

place on a shared tree over which each usercan send or receive. It can be used to supportboth sparse and dense groups although we feelit is more appropriate for the support of sparsegroups.

2. It provides for basic entry control into a discus-sion group.

3. It supports application assisted routing wherethe application (speci®c discussion group) cansuggest the type of tree to be used for multicast.

4. It provides for dynamic joining and leaving ofmembers of the discussion group and the sameprimitives are used to implement the mobilityaspects of the protocol.

5. It provides for a mechanism to asynchronously(with respect to membership changes) switch toa `better' tree when necessary. The ability toseamlessly move to another tree is also a keycomponent for supporting mobility.

6. It can scale to large numbers of multicastgroups.

7. It separates issues of fault tolerance into stan-dard mechanisms for handling failures of rout-ers and links in the Internet and speci®cmechanisms for handling failures of CSM com-ponents.

8. It de®nes how inter-domain multicasting wouldbe handled.The idea of using shared trees for group com-

munication is not novel. For example, the CBT [5]protocol also advocates the use of a shared tree.We argue, however, that it is preferable to useheuristic Steiner minimal trees rather than shortestpath or ad hoc centered trees, as it will likely beessential to conserve network resources as the

36 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 3: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

number of multicast conferences in the networkincreases dramatically. Our approach is in contrastwith [14] which suggests that the cost of shortestpath source based trees is really not that bad ascompared with heuristic Steiner minimal trees.Furthermore although the cost of using core basedtrees [10] might also be cost e�ective, we see noadvantage to doing so as they may su�er fromconcentration problems unless the cores are cho-sen from a su�ciently wide set.

The idea of switching to another tree has alsobeen previously suggested (see for instance [14]again) but to our knowledge no speci®c mecha-nisms or architectures have been proposed. [10]discusses migration of cores but also does notpropose a precise architecture to e�ect the migra-tion. In this paper, we discuss these mechanisms ingreat detail. We also explore to a greater extentthan usual the issues of reliability, authenticationand interdomain operations.

Our work describes how mobile multicast canbe handled and this has not been proposed byother multicast protocols. The same ideas that areused to implement discussion groups with partici-pants leaving and joining fairly regularly can alsobe used to support multicasting where the appli-cations are mobile. Although we have not deter-mined the overhead of supporting mobility usingthis approach, we believe it is worth exploring.

Our implementation illustrates many of thefeatures of our protocol, shows CSM's ability tosupport existing conferencing applications, as wellas its potential for interoperation with other mul-ticast routing protocols.

In Section 2 of this paper, we discuss why aheuristic Steiner minimal tree is appropriate fordiscussion groups. In Sections 3 and 4, we presentthe CSM protocol architecture and show how itsupports conferencing and AAR. The extensionsnecessary for supporting mobile applications arediscussed in Section 5, resulting in mobile CSM.Implementation of CSM is described in Section 6,where we describe the development of our Msboneusing CSM capable routers. We show how existingMbone applications such as vat, vic, wb, etc. caninteroperate transparently, over both our Msboneand the current Mbone. Section 7 compares ourprotocol with other multicast protocols developed

over the past several years and the ®nal sectionpresents some conclusions and thoughts on futurework. Appendix A contains a list of abbreviationsused in this paper.

2. Multicast using Steiner trees

The problem of determining the minimumweight tree that spans a given subset of nodes D ofa graph G � �V ;E� with edge weights is know asthe Steiner Problem for graphs [17]. A Steinerminimal tree is de®ned as the optimal shared tree ifthe sum of the edge weights is to be minimized.The problem has been well studied and is knownto be an NP-complete problem [24]. Reasonableheuristics exist [26,35,38], however, for ®nding anapproximate solution in polynomial time. See also[31,33] for more e�cient heuristics in the class thatuses shortest path and minimal spanning tree al-gorithms. The time complexity of these heuristicsis typically O�jV j3). For example, we use theTakahashi Matsuyama heuristic [38], which isO�jDkV j2� to determine our heuristic Steiner trees[3]. These heuristics are guaranteed to produce atree whose cost is within twice that of the Steinerminimal tree. Thus, although ®nding a Steinerminimal tree is di�cult, it is fairly straightforwardto ®nd heuristic Steiner trees. See [21,40] for in-teresting surveys on such problems. Ad hoc cen-tered trees that use reasonable heuristics for®nding `good' shared trees may also be useful andare supported by our notion of Application As-sisted Routing. In the rest of this paper, we will usethe term CO Steiner tree to mean an approximate(close-to-optimal) Steiner minimal tree determinedusing a heuristic such as Takahashi Matsuyama.We will use the term Steiner tree to refer to aSteiner minimal tree or a CO Steiner tree.

2.1. Steiner trees and shortest path trees

Our architecture for CSM is only dependent onrunning an algorithm that determines a sharedtree. In fact our routers can run a variety of sharedtree algorithms based on tra�c considerations.Thus we implement Application Assisted Routingwhere the application itself provides information

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 37

Page 4: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

on the best-shared tree to use. As default, however,we create a CO Steiner tree. For example, if thetra�c was such that a single source was thedominant transmitter and delay was importantthen a heuristic such as [27,28] could be used toobtain a shared tree. Or if it was felt that the tra�cwas likely to come equally from a ®xed number ofparticipants, then an average centered shared tree[39,41] could be used. Heuristics to select cores forcore based trees [10] could also be used for theshared tree. However, as discussed in [10] theproblem of concentration can become a worseproblem for core based trees unless the cores areselected from a wide enough set.

Many computationally tractable heuristic al-gorithms have been developed in the literature todetermine CO Steiner trees. For current multi-casting, however, the dominant algorithms in usecreate a shortest path tree from the source (as-suming a single source) to the receivers [37,44].The shortest path tree is the tree that results whenthe shortest paths to each receiver are combined toform a tree by not duplicating common edges.Steiner trees have been deemed to be unsuitablefor multicasting as compared with shortest pathtrees because of two problems:1. the maximal delay is longer than that for short-

est path trees; and2. it is often argued that tra�c is concentrated on

the Steiner spanning tree links whereas in theshortest path trees the tra�c is more distributed.Shortest path trees are usually optimal if it is

desired to minimize the delay from each confer-ence participant, although even this may not betrue if the additional load imposed by shortestpath trees is signi®cant. If necessary, maximal de-lay bounds can be incorporated into the sharedtree heuristics. We believe that the low cost ofshared trees is a more compelling argument andthat the network delays are greatly dominated bydelays in the software processing of the protocolstack in the end applications. It was also shown by[14] that the maximum path lengths of CO Steinertrees seemed to be at most twice that of shortestpath trees when comparing single sources.

The second argument is actually misleadingwhen one considers the situation of having a largenumber of discussion groups, each of which is a

sparse subset of a larger graph. Actually, the no-tion of concentration is rarely de®ned precisely.Intutively, for a single discussion group it is truethat the tra�c is `concentrated' on the Steiner linksof that discussion group. This concentration,however, simply means that the Steiner tree linkssupport all the tra�c and that the other links inthe network support zero tra�c. For source basedshortest path trees more links in the network aresupporting tra�c related to that discussion group.In fact more tra�c is generated overall because theshortest path trees tend to bifurcate earlier andhence a packet multicast over a Steiner tree thatwould not be duplicated until later, is often du-plicated very early in a shortest path tree. SeeFig. 1. Many studies have shown that there areine�ciencies of shortest path trees over Steinertrees. These ine�ciencies translate into overallload in the network rather than problems of`concentration' when discussing large numbers ofmulticast groups.

In the next subsection we discuss some resultsof a study [9] that compares Steiner trees toshortest path trees when there are a large numberof discussion groups in the network.

2.2. Simulation results of Steiner trees vs. shortestpath trees

The essence of our argument is that as thenumber of discussion groups increases in the net-

Fig. 1. Steiner tree generates less tra�c compared to shortest

path tree.

38 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 5: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

work, the total tra�c increases more for source-based trees than for Steiner trees. With a largenumber of discussion groups, just as the totaltra�c is distributed throughout the network forsource-based trees, so will the total tra�c be dis-tributed throughout the network for Steiner trees.In fact, using the law of large numbers, one wouldexpect that any particular link would have a nor-mally distributed tra�c load with a mean thatwould be less for Steiner trees. Thus, source-basedtrees lose the much acclaimed advantages of dis-tributing tra�c in a real network with largenumbers of multicast sessions running at the sametime.

In our simulation study, CO Steiner trees usingthe Takahashi Matsuyama (TM) heuristic werecompared with shortest path trees. Although weused the TM heuristic, we could well have usedmore e�cient heuristics such as [31,33]. However,we used TM because it is readily available and ismore commonly used as a Steiner tree heuristic.Both directed and undirected graphs were con-sidered, but we report on only the undirectedcase.

Graphs were generated using a modi®ed versionof Waxman's [43] approach. In order to be closerto real world networks, graphs were generated by®rst generating a number of autonomous systemsand then interconnecting a small percentage of thenodes in each autonomous system with nodes inneighboring systems. The average node degreeranged from 5 to 10 in the study. For each au-tonomous system, the nodes were generated byuniformly distributing them in a rectangular area.The probability of connecting two nodes dependedon the distance between the two nodes:

P � e�ÿd�u;v�=La�;

where d�u; v� is the distance between the twonodes u and v, L the maximum distance betweenany two nodes in the graph, and a is a parame-ters in (0,1]. As a decreases, the density of shortedges relative to long edges increases. The aver-age node degree was used to determine the den-sity of edges in the graph. See Fig. 2 for a typicalgraph of 36 nodes distributed in four autono-mous systems, with an average node degree of 5and a � 0:2.

Conferences were generated based on Wax-man's construction with the ®rst member of theconference being randomly chosen and the othermembers chosen with the probability

P � e�ÿd=Ld�;

where d is the least distance between the newmember to be chosen and existing members in theconference. If d is small (of the order of 0.01)the conference group is called locally dense (all themembers will be concentrated in a small region ofthe ®rst member) while if d is of the order of 0.5,the conference is called sparse (all members will befairly far apart). L is again the maximum distancebetween two nodes in the graph. It was assumedthat in a conference any member was equally likelyto be a speaker. Note that a conference of a givennumber of members (usually between 10 and 20) isgenerated. Shortest path trees were computed foreach speaker to the rest of the group, and theTakahashi Matsuyama heuristic was used to createthe (unique) CO Steiner tree for the group.

Results are reported for a graph (network) with800 nodes in 16 autonomous systems for a sparseconference. The graphs had an average node de-gree of between 5 and 8. The number of membersin a conference was taken as 10. We plotted results

Fig. 2. Graph with 36 nodes and 4 autonomous systems.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 39

Page 6: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

as a function of the number of conferences whichranged from 1 to 80. We ®rst describe the resultsfor loading in the network by considering thefollowing metrics computed for both Steiner treesand shortest path trees:· Average load on a link: This is de®ned as the av-

erage of the loads over all the links (edges) in thenetwork. For each conference, each multicastpacket that travels over a link increases the loadof that link by 1 for the Steiner tree. For short-est path trees, with 10 conference members,there are 10 di�erent shortest path trees (onefor each sender in the group) and each shortestpath tree increases the load of any link on itstree by 1/10.

· Maximum load on a link: This is de®ned to bethe maximum load over all the links in thegraph.

· Standard deviation of the load on a link: Thestandard deviation of the load over the n linksof the graph.The average load is shown in Fig. 3. As we have

argued, the average load increases with the num-ber of conferences and is signi®cantly more forshortest path trees as the number of conferencesbecomes large. In a real sense this is a measure ofconcentration in the network.

It is widely stated that shortest path trees gen-erate a lower tra�c concentration as comparedwith shared trees. One possible meaning might bethat the authors believe the maximum load on alink or the standard deviation of load on a link

might be lower for shortest path trees. Figs. 4 and 5show that this is not true as the number of con-ferences increases.

The argument might hold only for a very smallnumber of conferences. In fact our simulation re-sults do indicate this. Interestingly, however, the`small number' is really small. Fig. 6 shows themaximum load in more detail for 1 to 10 confer-ences and even here Steiner trees quickly havelower maximum load performance.

In [4] it was recognized that since the averageload of a typical shortest path tree was more thanfor a shared tree, it might be better to de®neconcentration as the ratio of maximum load toaverage load. Even here it can be seen that as thenumber of conferences increases, Steiner trees

Fig. 3. The average load ± sparse conferences.

Fig. 5. Standard deviation of load per link ± sparse confer-

ences.

Fig. 4. The maximum load ± sparse conferences.

40 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 7: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

provide better (lower) concentration. In Fig. 7, thetrade-o� occurs after about 30 conferences.

The results above argue that whatever the notionof concentration is de®ned to be, as the number ofconferences increases, Steiner trees becomes sig-ni®cantly more bene®cial as compared with shortedpath trees with respect to load on the network.

As indicated previously, it is certainly true thatthe maximum path delay for a Steiner tree wouldbe greater than for a shortest path tree. This delayis the maximum path length between two nodes inthe Steiner tree and the maximum path lengthfrom the sender to a receiver over all possiblesenders in the shortest path trees for the confer-ence. Since this result does not depend on thenumber of conferences (all results are averaged

over many simulation runs) we plot the maximumdelay as a function of the number of members in aconference. See Fig. 8.

Note that the maximum delay for a Steiner treefor our detailed study of a conference with 10members is about 1.5 times the maximum delay fora shortest path tree. This result is consistent withother such reports in the literature [4,12,14]. Fur-ther results relating to locally dense conferencesand directed graphs can be found in [9]. We nowturn to a description of the CSM protocol design.

3. The conference Steiner multicast protocol

The architecture of CSM is designed for use inan environment where the discussion group typi-cally has more than one active speaker and thespeaker is also a listener, although it is also ap-propriate for the situation of only one speaker andthe rest listeners. The protocol has been designedso that it can scale to use in a network with a largenumber of such discussion groups. The scaling ispossible because the functions of determining theadmission policy and the shared tree for a partic-ular session are distributed throughout the net-work. A basic aspect of the protocol is thatmembers of the group can easily join or leave thediscussion group. This feature of the protocol willbe particularly exploited when we discuss the ex-tensions needed for mobile CSM in Section 5. Theprotocol is most appropriate in support of sparse

Fig. 7. Maximum load/average load ± sparse conferences. Fig. 8. End-to-end delay ± sparse conferences.

Fig. 6. The maximum load for 1±10 conferences ± sparse con-

ferences.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 41

Page 8: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

groups, although we believe that the protocolcould also be used for dense groups.

The protocol provides a low level of securitythrough its registration mechanism that can ®ltermembers who are allowed to join the discussiongroup. The security that is provided is of twotypes: ®rst, admission control permits refusing ofjoin requests and secondly, and more importantly,routing tables can only be updated by request fromother routers, thus preventing applications fromcausing havoc by unconstrainted joins or join re-quests. Of course, application level securitymechanisms should still be used for privacy andhigher level security.

A key aspect of the protocol is that it has anadaptive component that periodically determineshow much the current multicast tree di�ers fromthe CO Steiner tree and provides a mechanism toswitch to a more e�cient multicast tree. See [1,3]and also more recently [10]. For example, if thecurrent tree was 20% more costly than the COSteiner tree, it might be preferable to switch. Thisfeature of switching to another tree is done asyn-chronously with respect to the dynamic aspects ofgroup membership modi®cation due to the joiningand leaving of members, and provides highly reli-able delivery of packets during the critical transi-tion phase using the Dualcast protocol [3]. The useof Dualcast during the switch allows a seamlesstransition to a new tree by multicasting on twotrees simultaneously. In the context of mobilenetworks, a notion of multicasting on redundantlinks is used to e�ect hando�s between MobileSupport Stations. This can also be viewed as usinga version of the Dualcast protocol.

In this section we discuss the architecture of thebasic CSM protocol without the mobility aspects.The CSM protocol has three major phases: (1)initialization, in which a new discussion group isregistered and a CO Steiner tree or other sharedtree is determined for that group; (2) ongoing op-erations, in which members of the group sendmulticast messages to other members of the group,and during which time the membership of thegroups changes as members join and leave thegroup; and (3) switching, in which a new sharedtree is determined to be needed and dualcast is usedto e�ect a fault-tolerant transition to a new tree.

Before discussing the details of the protocol, wepresent some assumptions on the network envi-ronment and the notation that we shall use.

Assumptions

1. We describe CSM for an autonomous systemwhere the underlying unicast routing protocolis OSPF. Inter-domain operation can be sup-ported using BGP. We expect that the limita-tions of CSM due to its reliance on anunderlying link state protocol [30] will be lessimportant as an increasing number of domainssupport OSPF.

2. The Msbone is implemented by having cooper-ating routers running a multicast Steiner rout-ing daemon msrouted that implements theCSM protocol. The msrouteds are interconnect-ed using tunnels just as the mrouteds in MBone.The msrouteds are assumed to have the com-plete topology of the autonomous system justas in OSPF.

3. The cooperating router over Msbone could ac-tually run any shared tree algorithm using AARbased on input from the session initiator, butfor the rest of this paper, unless otherwise indi-cated, we assume that the router computes aCO Steiner tree. We use the term multicastrouter or simply router for a cooperating routerover the Msbone that is running msrouted.

4. We assume that there is some support at the ap-plication-layer to ®lter out duplicate packets.For many applications that use a higher levelprotocol such as RTP [36], this support is auto-matically included.

5. We assume that there is an application leveldaemon running in each area that is a multi-cast sessions directory (MSD) server. This ap-plication level daemon is accessible to anyonewishing to determine possible multicast ses-sions, and has the ¯avor of session directory(sdr) in Mbone [23]. This MSD server couldbe built in a decentralized hierarchical man-ner, but we do not address this issue in thispaper.

Notation

1. A discussion group G consists of a set of n ap-plications involved in the multicast session.

42 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 9: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

Each application can send/receive data multi-cast messages to/from all other applications inthe group. The ith application is Ai , its corre-sponding host node is Hi and its designatedmulticast router is DRi . The CO Steiner treeof our protocol will be a spanning tree coveringthe multiset fDR1; . . . ;DRng. This tree will betermed Tree(G). The DR runs msrouted, andis on the same subnet as the host. An applica-tion will be assumed to be known by an Appli-cation ID that incorporates information such asthe application's email address and UDP/TCPport number.

2. A multicast group becomes `operational' once ithas registered with its area MSD server. TheMSD server maintains information on eachgroup such as the following:

(a) Multicast IP address and port number,(b) Type of group,(c) Membership of the group,(d) Description of the multicast session.

The information entry for group G in MSD istermed Entry. We discuss the speci®c compo-nents of Entry when discussing registration.

3. Corresponding to each registered group G,some application, say A1, initiates the registra-tion process of the group. In the implementa-tion section we shall see that this is done by auser running a software interface tool calledumsd. The ®rst action of creating a group is ac-tually to start up a special application that exe-cutes the registration process and that will beresponsible for authentication of new groupmembers if the group is of a certain type(Closed group). This special application istermed Gatekeeper Application (GA) and usual-ly runs on the same local area network (or eventhe same host) as A1. For simplicity, we shallidentify this gatekeeper application with A1, al-though it can actually be a distinct applicationfrom the initiator. See the implementation sec-tion for more details. The Designated Router(DR) for GA is important and we term thisrouter Gatekeeper Router (GR). The gatekeeperapplication is speci®c to a group G and hence aGA should be ideally represented as GA(G) andthe corresponding GR should be represented asGR(G). However, the terms GA and GR will be

used instead of the terms GA(G) and GR(G) un-less it is necessary to do so.

4. The logical distinguishable entities in our archi-tecture that can send and receive messages arethus Application (A), Router (R), MSD, Desig-nated Router (DR), Gatekeeper Application(GA), and Gatekeeper Router (GR). For sup-porting mobile applications we will use one ad-ditional entity, a MobileSupportStation (Mss).

5. As a matter of convenience, the message prim-itives exchanged between the logical entitiesare named according to a convention. Eachprimitive is of the form Source_Destina-tion_Message (parameters) followed by re-quest/reply where Source is the sending entityand Destination is the target entity. Message isthe name of a message, and parameters are spe-ci®c to a given message. Request typically repre-sents a request for a speci®c service, while replyis the response corresponding to a given re-quest. The Source/Destination part of a mes-sage, or the request/reply part is omitted whenthere is no ambiguity based on the context.For instance, if the message is A_GA_Join_sparse (Aj, DRj ), then the message is from anapplication A (source) to the Gatekeeper Appli-cation GA (destination), the name of the mes-sage is Join_sparse and the parameters are Aj

and DRj.The notation discussed above is summarized in

Table 1.

3.1. Registration

Before a group can become operational, it isnecessary for information about the group sessionto become generally available. As discussed pre-viously, we assume this is done through an MSD

Table 1

Notation

Ai ith application belonging to group G

Hi Host machine running Ai

DRi Designated Router for Hi

Ti Tra�c descriptor corresponding to Ai

GA Application registering a session for group G

GR Multicast router serving GA

MSD Multicast Session Directory

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 43

Page 10: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

server that can be queried (using the interface toolumsd) to determine the available multicast sessionsand their characteristics. We shall further assumethat there is an MSD server in each domain (calledarea MSD) and that these servers communicatewith one another to insure that information aboutmulticast sessions is current and available to allpotential participants.

Assume that an application wishes to register apotential group G with its area MSD. Because it isthe initiator of a discussion group, this applicationis GA and its designated router is GR. GR plays acrucial role in the CSM protocol. The gatekeeperapplication contacts MSD by sending aGA_MSD_Register request message containingthe IP address of GR and also containing the fol-lowing information:1. Type(G): Type of group. A group can be either

open or closed. An open group is one whereany one can join without any permissionsfrom the gatekeeper. A closed group is onewhere the GA is responsible for admittingnew participants. A group can also be sparseor dense.

2. Membership(G): Membership of the group. Themembership of the group consists of the initialset of applications that will be involved in thediscussion group.

3. Description(G): Description of group. This infor-mation might be needed to allow potential par-ticipants to decide whether or not they want tojoin a session.On receiving the Register request message,

MSD replies by adding the group to its list ofdiscussion groups (creating an Entry) and sendinga MSD GA Register reply message back to thegatekeeper application. The MSD server also de-termines Address(G) which is a multicast IP ad-dress and port at which the group will operate.Thus, an Entry contains at least the following in-formation: Address(G), Type(G), Membership(G),Description(G), GA(G), and GR(G). An entry in atypical MSD table is shown in Table 2.

Once the GA receives the Register reply messagefrom MSD, it sends a GA_GR_Register requestmessage to GR indicating that GR will be re-sponsible for computing a CO Steiner tree for thegroup.

Note that using application assisted routing,GA could also send a code-number (the AAR in-formation provided by the gatekeeper application)to GR indicating the type of shared tree to create.This code-number will be used by GR to executethe appropriate routing algorithm for the group. Aspeci®c AAR code-number might also includeadditional information such as expected tra�c orbandwidth limitations.

To review then, the message primitives forregistration are:1. Register request/reply (between GA and MSD)

and2. Register request/reply (between GA and GR).

3.2. The multicast CO Steiner tree

Given the set of DRs associated with the ap-plications of a group, the gatekeeper router willgenerate the corresponding CO Steiner tree.

As discussed previously, we assume that the GRhas the complete topology of its domain and isthus able to compute a CO Steiner tree using aheuristic such as Takahashi Matsuyama. The in-put to this algorithm is simply all the multicastrouters in the domain and the associated link in-formation, similar to the type of information thata protocol such as OSPF requires, plus the set ofDRs.

Using the AAR code, an approximate Steinerminimal tree heuristic might also use the tra�cinformation previously de®ned to help in de®ningan even more appropriate tree. We have not donea detailed study of the use of such tra�c infor-mation, but we believe that any of the heuristicscan be improved by using this information. Theresulting spanning tree Tree(G) (when using thedefault shared tree algorithm) is a CO Steiner tree

Table 2

MSD table

Entry[i] Address�G� � hClass-D IP Address; Port

NumberiType�G�� Closed, Sparse

Membership�G� � fAigDescription(G)� ``Discussion on Stock Prices''

GA�G� � hHost IP Address; port numberiGR�G� � hRouter IP Address; port numberi

44 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 11: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

that uses a common set of links over which totransmit packets. See Fig. 9.

In our implementation, the Takahashi Matsu-yama heuristic is used to compute the CO Steinertree. The Takahashi Matsuyama heuristic worksas follows. Start with an arbitrary DR and itstrivial (partial) spanning tree. Iteratively choosethe closest DR to the current tree from the re-maining set of DRs and update the partial span-ning tree by adding the path from this DR to thetree. Note that the path may contain routers whichare not DRs. These routers are needed to inter-connect the DRs in the Steiner tree. The set ofrouters that constitute the CO Steiner tree forgroup G is referred to as Routers(G).

The fact that only the GR computes the COSteiner tree and is responsible for subsequent up-dates is one important element in the scalability ofour protocol. For each group, there will be a GR,but any multicast router can be a GR. Thus, for alarge number of discussion groups, it is expectedthat the function of the GR would be fairly uni-formly spread among the multicast routers. Wehave proposed that the GR for a group would bedetermined as being the closest router to the ap-plication that initially set up the group. However,it is possible that this function of load maintenancemight not be spread out su�ciently among avail-able routers. In such a case, it would certainly bepossible to implement a slightly more sophisticatedalgorithm at the GA to choose some other GR if

necessary. We have not investigated this loadbalancing issue further.

It should also be observed that calculating theCO Steiner tree is at worst similar in polynomialtime complexity to ®nding all shortest paths in anetwork and is done infrequently as compared tothe ongoing operational aspects of the protocol.

In our simulation study, we computed both COSteiner trees and shortest path trees and thecomputation times were comparable.

The GR now sends io-set information 1 (dis-cussed in the next section) to each router inRouters(G) using the GR_R_Ioset_distributionprimitive. This io-set information is di�erent foreach router in Routers(G) and is unicast to Rout-ers(G). We assume that this control informationcan be sent in a reliable manner, either by usingTCP (as in our implementation) or by sending theinformation repeatedly.

Since the io-set information will be sent out inparallel to all the routers in Routers(G), the time todo this distribution will be the maximum one-waydelay between the GR and a router in Routers(G).Typical one-way coast-to-coast delay is on theorder of hundreds of milliseconds. Thus we believethat the io-set distribution time is not something tobe concerned about.

3.3. Steady state CSM protocol operation

The steady state CSM protocol operation fromthe perspective of the Routers(G) is quite simple.Each router in Routers(G) has an entry termed theRentry corresponding to each discussion group G.Rentry contains the IP multicast address for G, anio-set, and a delivery-set. The io-set (for G) is theset of the incoming/outgoing links (called virtuallinks) to other adjacent routers in Tree(G). Thedelivery-set is the set of links (called physical links)to hosts containing applications being servicedby the router. If there is a common interface be-tween the io-set and the delivery-set, it is removedfrom the io-set to prevent looping. See Table 3.

Fig. 9. Steiner tree and related terms. 1 Can be thought of as routing table entries.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 45

Page 12: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

As an example, the io-set for router R1 in Fig. 9is {2}, while its delivery-set is {1}. Similarly, the io-set for router R6 in Fig. 9 is {1, 2, 3, 4}, while itsdelivery-set is empty.

Any IP packet arriving on an io link with the IPmulticast address of G is simply forwarded on allthe other io links. For instance, if a packet arrivesfor group G at router R6 through io link 1, it willforward the packet on io links 2, 3 and 4 (Fig. 9).Note that if the router is a leaf node of the Steinertree (that is, the io-set has only one element), thepacket is no longer forwarded to other routers.

If the router is a DR (that is, the delivery-set isnot empty), then it must deliver the packet to oneor more hosts connected to this router by using thedelivery set information, for ®nal delivery to theapplications it is directly serving (i.e., for which itis the closest multicast router). For example, whenR2 receives a packet through io link 2, it forwardsthe packet to application A2 running on host H9

(Fig. 9). We reiterate that only GR is responsiblefor computing the CO Steiner tree for G and forperforming any updates. The Routers(G) operatein an extremely simple manner and maintain onlyminimal information. This contributes to thescalability of CSM.

3.4. Joining the discussion session

Suppose a new application, Aj, on host Hj

wishes to join a discussion group. Aj ®rst queriesits area MSD server to determine if there is adiscussion group of interest by scanning the entriesin MSD. Assume that Aj is interested in joiningdiscussion group G. Aj then obtains the relevantEntry from MSD. It is now necessary for Aj to jointhe discussion session G by having its nearestmulticast router, DRj, join the current shared treevia a path to a router, Rg, that is already a memberof the Tree(G).

We outline two methods, Sparse Set Join andDense Set Join, that will allow Aj to have DRj jointhe shared tree for G by connecting to Rg. We shallassume that the join method is indicated in theEntry of MSD (for example, through the type®eld) and depends on the membership and type ofthe discussion group. Assume furthermore, thatDense Set Join is only used for Open type groups.We address the issue of how this method is de-termined later.

Our Sparse Set Join proposes that GR simplycompute the shortest path from the new joiningnode to any node in the current shared tree. Notethat this computation is also used in the TakahashiMatsuyama heuristic and in fact would generallyrequire only a lookup as GR keeps track of allshortest paths between nodes. In [43], Waxmanproposed a parametrized weighted greedy algo-rithm to determine a path for joining a new nodeto an existing source based tree. The algorithmminimized the following function over all nodes vin the existing tree:

W �v� � �1ÿ w� � distance�newnode; v�� w � distance�v; source�

where the parameter w ranges from 0 to 0.5. OurSparse Set Join heuristic is identical to this withw � 0.

Our Dense Set Join simply connects to the ex-isting tree by trying the shortest path to the GRuntil a node of the current tree is hit. Note that thisis like the other extreme of the weighted greedyheuristic �w � 0:5� or joining in naive multicastrouting [14] (where one determines the shortestpath from the source) with GR playing the role ofsource, but is actually more similar to the join inCBT (®nd the shortest path to the core router astarget) because it is receiver initiated with GR asthe target.

There are many interesting join mechanismsthat have been proposed for dynamically addingnodes to existing trees or for dynamically modi-fying Steiner trees on-line. See for instance Ka-direÕs paper [25] on his Geographic SpreadDynamic Multicast, or work on dynamic Steinertree problems [8,22]. Using AAR, we could, infact, choose any of these algorithms for dynami-

Table 3

Rentry in the routing tables

Rentry [1] Class-D address io-set delivery-set

. . . .

. . . .

. . . .

Rentry [n] Class-D Address io-set delivery-set

46 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 13: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

cally adding (or deleting) nodes to the currenttree.

Sparse Set Join (Fig. 10)· Aj sends a A_GA_Join_sparse (Aj, DRj) request

message to GA.· GA checks whether Aj is authorized to join

group G, and if so, it sends a GA_GR_Com-pute_src_route(DRj) request message to GR.

· GA sends back a response to Aj using a GA_A_Join_sparse (yes/no) reply message.

· If the reply is `yes', GR computes the shortestpath to a current multicast router in the Tree(G),and updates the io-sets of the relevant routers.For example, in Fig. 10, the GR (router R2), in-forms router R8 that its neighboring routers inTree(G) are R7 and R3. Router R3 will translatethis information into its local io-set link infor-mation {1, 3}.The primitive messages that need to be imple-

mented for a Sparse Set Join are:1. Join_sparse request/reply (between an applica-

tion and GA),2. Compute_src_route request/reply (between GA

and GR), and3. Ioset_distribution (from GR to R, no reply nec-

essary).

Dense Set Join. We do not present the Dense SetJoin algorithm in detail as it is straightforward.The primitive messages that need to be imple-mented for a Dense Set Join are:

1. Join_dense_request (from an application to itsdesignated router, no reply necessary),

2. Join_dense_request (from router to router, noreply necessary), and

3. Application_joining_dense (from router to rout-er, no reply necessary).

3.5. Leaving the discussion session

The method of leaving a discussion group candepend on the exact information that GA ismaintaining about the discussion group. The exactalgorithms for the leave could also depend on thedynamic tree modi®cation algorithm being used.Although we have not discussed the details, forsparse set operation, GA would maintain full in-formation on applications and their closest multi-cast routers. For dense mode operation, however,GA need not maintain information on applica-tions. We assume that routers would poll to de-termine if members on their directly connectedLAN have dropped out.

For the sake of brevity, we only describe theprimitive messages we have used in implementingleave.

Sparse Set Leave. The primitive messages thatneed to be implemented for a Sparse Set Leave are:1. Leave_sparse request (from an application or

DR to the GA, no reply necessary),2. Delete_from_tree sparse request (from GA to

GR, no reply necessary), and3. Ioset_distribution (from GR to R, no reply nec-

essary).

Dense Set Leave. The message primitives needed toimplement the Dense Set Leave are:1. Leave_dense request message (between an appli-

cation and its designated router),2. Leave_dense poll message (between DR and the

corresponding application), and3. Node_leaving_dense request message (between

routers).

3.6. Computing an updated CO Steiner tree

So far, ongoing operations and joining andleaving the discussion group have been described.It is clear that as this continues, the resultingFig. 10. Sparse join.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 47

Page 14: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

shared tree will perhaps no longer be a `good' treefor the multicast operations.

Recall that the initial Entry in the MSD servercould have contained expected tra�c informationfor the group G. The GR, however, is clearlyequipped to monitor this tra�c and could have amuch better idea of the tra�c situation on anongoing basis. GR can use this and any informa-tion from the application using AAR to determinewhether the current tree still meets performancerequirements.

We did a simple study that shows the bene®t ofrecomputing the CO Steiner tree periodically aftera certain number of join/leave events. We assumedthat a new member was added to the tree at thenearest node to the current tree. Group member-ship was modi®ed (a member was added or re-moved) randomly but so as to approximatelymaintain the size of the group. We compared thecase in which the shared tree was then reset to theCO Steiner tree (based on the heuristic algorithm)after some number of events as compared with atree that was not reset. The ine�ciency measuresthe percent increased weight of the current sharedtree vs. a CO Steiner tree found from the heuristic.In Fig. 11, the upper graph is the result of notresetting the tree (naive), and the lower graph

(recon®gured) resets every 10 modi®cations to theCO Steiner tree. The graphs are averages of 200runs. Note that periodic transitions to a new treereduces the average ine�ciency from around 24percent to about 3%. These results are similar towhat others have reported [8,9]. In [9], using oursparse join and leave algorithm for the 800-nodegraph discussed in Section 2 with a conference sizeof 10, the ine�ciency approaches 20% only afterabout 50 join/leave events. Thus, resetting to theCO Steiner tree need not be an extremely frequentoperation.

Based on the fact that computing the COSteiner tree can be done fairly fast and that dis-tributing the io-set is also fairly e�cient, we believethat switching to a more optimal tree periodicallyis feasible.

We thus propose that after a certain number ofevents, GR recompute the CO Steiner tree. We callthis the proposed Tree(G) tree, as compared withthe current Tree(G). If the di�erence between thetwo is above a threshold (ine�ciency of 20% forexample), then the gatekeeper initiates an algo-rithm SWITCH to switch from the current tree tothe proposed tree. Notice that the computation ofproposed trees is done asynchronously with otheractivities, and can be done in the background.Although joins and leaves could be acknowledgedduring a switch, we only discuss the case wherejoins and leaves are queued until the SWITCH iscompleted.

Assume therefore that the proposed Tree(G) issu�ciently better than the current Tree(G). TheGR then initiates the SWITCH tree algorithm de-scribed in Section 3.7.

3.7. The Switch Tree algorithm

The Switch Tree algorithm is intended to switchover from the current shared tree to the proposedCO Steiner tree.

The basis of the SWITCH Tree algorithm isthat during the transition to the new tree, routerswill be forwarding messages on both the currentand the proposed tree. This provides a fault-tol-erance to packets not being propagated correctlyfor some reason during the transition period. SeeFig. 12.Fig. 11. E�ect of periodically computing CO Steiner tree.

48 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 15: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

Routers on the current tree will be termedCurrent_G routers, and routers on the proposedtree will be termed Proposed_G routers. Of course,this set may have a large intersection. Notice that aproposed router that is also a current router mighthave a di�erent io-set in the proposed tree. Call arouter that is a current router but will not be aproposed router a Conly router. A router that is nota current router but is a proposed router is a Ponly

router. A router that is both a current and a pro-posed router is a C and P router. For example, inFig. 12, R3 and R4 are Conly routers, R5 is a Ponly

router, and R1, R2, R6 are C and P routers. TheSWITCH algorithm operates as follows.1. GR sends io-set information to all proposed

routers using GR_R_Ioset_distribution mes-sage as discussed previously in Section 3.2.Ponly routers create a new Rentry , whereas Cand P routers maintain two entries, one re¯ect-ing the current io-set and the other re¯ecting theproposed io-set. For instance, the io-set for R5,which is a Ponly router, is {1}. The io-set for R4,which is a Conly router, is {1}. R6, which is a Cand P router, has two io-sets: {1,2,3,4} for thecurrent tree and {1,2} for the proposed tree.

2. GR sends a GR_R_Start_switch message torouters associated with group G. Note thatthese will be both current and proposed routers.For reliability, this message is sent via TCP.

3. On receiving a GR_R_Start_switch message forG over an incoming link, a Conly and a Ponly rout-er propagates the message as usual on all otherio links. The ®rst C and P router getting a datapacket needs to duplicate the packet and propa-gate it on each of the two trees by checking eachof the two current and proposed Rentry Õs and act-ing on each. It will also mark the data packetwith an indication that it should go on the cur-rent or proposed tree. Subsequent C and P rout-ers will simply propagate the packet on thecorrect tree based on the data packet marking.A simpler scheme, which we implement, doesnot require data packets to be identi®ed andmarked but rather has each C and P router sim-ply forwarding the packets using both Rentry Õs.This could potentially create some cycles in thegraph during the switching time period but thecycles will be broken anyway after the switchingis over and the circulating packets will soon beeliminated using the TTL value of IP packets.

4. After a su�cient time has elapsed such that allrouters have received the message, GR sends anGR_R_End_switch message to routers ofgroup G.

5. On receiving an GR_R_End_switch message, aConly router deletes its Entry. A C and P routerdeletes the Entry corresponding to current tree.Basically, the period between GR_R_Start_

switch and GR_R_End_switch is when multicastsare being carried by both the current tree as well asthe proposed tree. Other algorithms are also pos-sible for synchronizing the switching period moreclosely if accurate global clocks at each router areassumed to exist.

The message primitives needed to implementthe switch tree algorithm are:1. Ioset_distribution message (from GR to Rout-

ers(G))2. Start_switch request message (from GR to Rout-

ers(G))3. End_switch request message (from GR to Rout-

ers(G))

4. Gatekeeper failures and inter-domain operations

In this section we brie¯y outline our approachto handling failures during the operation of the

Fig. 12. Switching from current to proposed tree.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 49

Page 16: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

CSM protocol. We also discuss extensions that areneeded to the protocol for inter-domain opera-tions.

4.1. Gatekeeper failures

In order to make the CSM protocol tolerant tofaults, we follow the methodology discussed in [19]in which a set of software facilities are provided tohelp an application raise its level of fault tolerancein terms of the availability and consistency of thedata. This approach to fault tolerance is di�erentthan simply providing, for instance, a primary andsecondary domain name server. In CSM, the roleof the gatekeeper (GA and GR) is very important.We advocate using a modi®ed primary site ap-proach to sofware fault tolerance in which theservice that is to be made fault tolerant is repli-cated on one or more backup nodes. Thus wewould implement a primary and backups for bothGA and GR. The primary periodically check-points its state on the backups, and when theprimary fails, a backup takes over as the primary.This approach has been successfully used in manypractical distributed computing environmentsusing reusable components [20] that provide theneeded fault tolerance without extensive progam-ming e�ort. The computational overhead forproviding this kind of fault-tolerance varies from1% to 7% [20]. More details on our approach canbe found in [2]. Note that the backup, by watchingits primary, relieves an application from having toitself try the secondary when the primary fails. Theburden of implementing the fault tolerance is thusnot on the application.

4.2. Inter-domain operations

Our approach to extending CSM to operate inan inter-domain environment is to duplicate andcoordinate all the basic functions on an as neededbasis in all the domains that have members ofgroup G. There is a primary domain and a numberof secondary domains each with a MSD, GA andGR. The entities in the secondary domains arecreated on demand and they go away if the nomembers remain in the domain. GR in each sec-ondary domain computes its own tree and grafts

that tree to the primary tree through the borderrouter of the domain. Due to space considerations,we refer the interested reader to [2] for additionaldetails.

5. Mobile CSM

As discussed in the introductory sections, one ofour goal in designing CSM was to make it fairlyeasy to develop a multicast protocol that involvesapplications that are mobile. See also [34]. We nowshow how our CSM architecture and the primitivemessages we have previously de®ned can be ex-tended to an architecture for mobile CSM (calledMCSM in this section). For concreteness of thediscussion, imagine a mobile application to be anapplication that is running on a computer in a caror running on a portable pc that an individual iscarrying around.

The logical distinguishable entities that wereable to send and receive messages in CSM wereApplication, Router, MSD, Designated Router,Gatekeeper Application, and Gatekeeper Router.We have one new logical entity in CSM that weterm MobileSupportStation (Mss). An Mss is a®xed entity on the network that provides necessarysupport to a mobile application. An Mss can beequated to a Base Station in the cellular architec-ture. Our protocol related to hando� is based onthe assumption that there is substantial overlap-ping between adjacent cells [16], such that a mobilemay be assumed to have adequate communicationwith more than one base station during transitionfrom an old to a new base station.

The Mobile CSM protocol will be described inreference to the architecture given in Fig. 13, inwhich there are several mobile support stations,connected to a ®xed packet-switched backbonenetwork. Each cell (represented by a hexagon) hasa Mss which acts as the access point for the mo-biles in the cell. In Fig. 13, Mj is a mobile hostrunning an application Aj and the mobile hostaccesses the backbone network through the mobilesupport station Mss1. As long as the mobile hostMj remains in the cell of Mss1, it interacts withMss1 exactly in the same way as host Hj does withits designated router DRj in the non-mobile con-

50 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 17: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

ferencing mode. As the mobile Mj starts to moveaway from Mss1, there is a period, when it hasadequate communication [16] with both Mss1 andsay, Mss2, which is a neighbor of Mss1 (Fig. 14).

During this hando� period, Mj will inform thenew mobile support station Mss2 that it wants tojoin the multicast session, and that control infor-mation will be propagated through the routersresulting in the addition of a new branch from R5

to Mss2 in the existing shared tree (Fig. 14). Notethat during this hando� period, the mobile Mj hastwo paths, one through Mss1 and the otherthrough Mss2, for receiving (sending) packets from(to) the multicast session in progress. Thus the

notion of redundancy is used for reliable deliveryof packets to the mobile station.

After the mobile Mj moves away from Mss1,and starts to communicate only through Mss2, thecontrol information is propagated through therouters, resulting in the removal of the branchR5±R6±Mss1 (Fig. 15).

Thus the multicast tree gets altered dynamicallyas the mobiles move in and out of cells. In par-ticular during hando�, the existing tree is aug-mented with new branches to provide redundancy,and after hando� is over, unnecessary branches arepruned o� the multicast tree.

Additional details of the Mobile CSM archi-tectue can be found in [2]. Although the logicaldesign is straightforward using our approach,without an implementation and experimentation,it is not possible to assess overhead and e�ciencyissues. We expect that the level of dynamicity ofmembers leaving and joining will a�ect the e�-ciency of Mobile CSM. For example, if mobilesare very highly mobile, it could be wise to antici-pate this and prepare all neighboring cells for re-ceiving the multicast by setting up a tree thatpretends that there are mobiles in all adjacent cells.

6. Implementation

Using the CSM design, a virtual network calledMsbone was developed for use in the Internet. In

Fig. 13. Architecture for mobile CSM.

Fig. 14. Redundant connections during hando�. Fig. 15. Switching to a new cell after hando�.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 51

Page 18: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

the Msbone, group admission and routing ishandled in a manner described by the CSM pro-tocol. The Msbone supports both CSM style andtraditional multicast applications and can inter-operate with the Mbone. In our implementation,we developed two major software components: (1)umsd (user multicast session directory) and (2)msrouted (multicast Steiner routing daemon). Theumsd is the user's interface to CSM and permitsusers to create (and delete) group sessions and joinand leave sessions, and to launch (multicast) ap-plications such as audio/video conferencing tools.The msrouted implements the routing, communi-cations, and GR functions of CSM. In this section,we discuss our CSM implementation and discusshow multicast applications can be implementedover CSM. Since Msbone has been implementedusing UDP tunnels and corporate ®rewalls do notlet UDP tra�c in or out, our experimentation hasso far been limited to within the Lucent intranet.Hence performance comparison with MBone hasnot yet been done.

To form the virtual network, each LAN selectsone host to serve as the designated ``multicastrouter''. This host runs msrouted. Msrouted isresponsible for forwarding multicast packets toand from the LAN. MsroutedÕs exchange multi-cast packets by encapsulating them within unicastpackets. Two msroutedÕs which exchange packetvia this encapsulation method are said to be con-nected by a tunnel. The collection of LANs,msrouted's, and tunnels constitute the Msbone.Note that in our implementation, there is a one-to-one identi®cation with a CSM capable LAN and amsrouted. See Fig. 16.

6.1. Msbone applications

Unlike other multicast schemes, the CSM de-sign allows a multicast application to assert a greatdeal of in¯uence over the group via the GatekeeperApplication concept. Applications designed forCSM can utilize this additional in¯uence for im-proved operation. However, a large base of ap-plications already exist that do not incorporateCSM style concepts. Msbone was designed tosupport both types of applications.

In the CSM model, there is the notion of aGatekeeper Application, GA, that controls ad-mission to the group and in¯uences the group'srouting. Typically the group's initiator acts as orinitiates the GA that registers the group withMSD. The GAÕs (unicast) address is advertisedalong with the group address and to join a group,a user sends a request (via unicast) to the GA (seeFig. 17). If the request is approved, the GA mustinform the Gatekeeper Router so a route will beestablished to the new member (see Fig. 18).

While the gatekeeper concept adds to thecomplexity of creating and joining a group, it alsoprovides substantial advantages. The gatekeeper

Fig. 16. Msbone architecture.

Fig. 17. Registration of a new session.

52 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 19: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

application can assert control over group mem-bership. In many applications, there is a desire toknow and perhaps restrict the group membership.With the GA concept, membership control ismanaged cooperatively within the network and theapplication can be assured that the routes are be-ing computed based only on the valid members.The GA can also control when new members areallowed to join the group by delaying when thejoin is sent to the network. This allows the multi-cast application to enforce a rule that only allowsnew members to enter at speci®ed times, if so de-sired. In addition to membership control, the GAcan in¯uence routing decisions. At any time theGA can instruct the network to recompute theroutes or even switch to an entirely new algorithmfor computing routes. In a multicast for twospeakers, the GA could ®rst optimize the routesfor the initial speaker and later change the routesto optimize for the second speaker using AAR.

This GA functionality can be incorporated intothe multicast application itself, it can be handledby a separate GA process, or it can be assigned toone of several default Gatekeeper Applicationsthat run at well known addresses throughout thenetwork.

6.1.1. Applications with built-in gatekeepersAn application designer could chose to build

the Gatekeeper Application functionality directly

into the multicast application code. A simplecommand line option could indicate the applica-tion is to be run in `gatekeeper mode'. When theapplication is run in non-gatekeeper mode, it couldautomatically generate a join request and send it tothe `gatekeeper mode' application. This join pro-cedure can occur without the user's knowledge orcould pop up windows asking for informationsuch as passwords to complete the join. Someapplications naturally have the concept of a gate-keeper and these concepts are not only easy for theapplication designer to implement, but also highlydesirable. For example, an application to holdcommittee meetings may naturally want to desig-nate one user as the chairperson and the chair-person's application should allow control overwho attends the meeting.

6.1.2. Separate gatekeeper application processesSome application designers may ®nd it desirable

to locate the GA functions in a separate process orprogram. Typically the group's initiator wouldstart this GA process. The application could bedesigned so it contacts the GA process automati-cally or the user might ®rst have to run a joinprocess or wrapper that only starts the main ap-plication after the join has succeeded. The GAprocess can de®ne its own packet formats andcontrol scheme or it could make use of MsboneGA control packet formats. The Msbone de®nespacket formats for an open, closed, and secureadmission control scheme. Open format allowsany host to join, closed restricts membership to aset of hosts, and secure requires a password to jointhe group. No restrictions are placed on how ad-mission control is to be handled. The only re-quirement is that the GA and potential membershave some agreed upon message format for joiningthe group.

6.1.3. Support for standard applicationsApplications such as vat, vic, wb, and nevot

[23] have become network standards for multicastconferences. These type of applications do notsupport the concept of a GA, but can still be usedon the Msbone. Several default Gatekeeper Ap-plications can run at well known sites and can

Fig. 18. Joining an existing session.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 53

Page 20: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

serve as the GA for any application that does notwish to provide its own GA. This allows non-CSM applications to operate on the Msbone andbene®t from Msbone admission control. An ap-plication launching program is used to ®rst con-tact a default GA (or ®rst create the groupthrough the GA). Once the GA has been con-tacted and a join succeeds, the launching programcan exit and the application is run as it normallywould be. When creating the group, the defaultGA can be instructed to use open, closed, or secureadmission control. This allows tools such as vat tomake use of the Msbone's admission control ifdesired.

Mbone users will be accustomed to launchingmulticast applications via the sd or sdr tool.Msbone has a similar tool, umsd for User Multi-cast Session Directory (see Fig. 17). With umsd auser can create a group and join a group in afashion nearly identical to that of Mbone and sd.The user need not even be aware of the GA con-cept. The umsd tool handles all the details relatingto the GA and then starts the multicast applica-tion.

6.1.4. Designing a gatekeeper applicationApplication designers that want to design their

own GA may be uncomfortable having to dealwith the network level multicast issues. Admissioncontrol is something the application designer islikely to be aware of, but multicast routing is oftennot an area of expertise for the application de-signer. The GA is responsible for communicatingmembership to the network (via the GatekeeperRouter concept of CSM) and application designersmay ®nd this troublesome.

To solve this problem, Msbone provides de-signers with a network interface library. Adding amember to the group is as simple as calling afunction called `addmember'. Various routing al-gorithms are listed by number and selected via a`recomputetree' function. Much the way a socket iscontrolled by simple ¯ags, multicast routing canalso be controlled by well-de®ned functions andoption ¯ags. The application designer need notunderstand the complexities of the multicastrouting problems and need only call functions

from the multicast control library. We omit thedetails of this control library.

6.2. Msbone routing

Since modifying existing routers is often notpractical, Msbone uses host software to performrouting tasks. A LAN selects one host to runmsrouted. Each msrouted is con®gured to ex-change packets with a ®xed set of remotemsrouted's. These remote sites are speci®ed in acon®guration ®le that is read when msrouted isstarted.

The current release of msrouted runs in userlevel space without requiring any special permis-sions. Better performance can be achieved byrunning with superuser privileges, but the proto-type was chosen to run without this for easierinitial deployment. Readers familiar with encap-sulation methods should note that this requiresthat the msrouted tunnels encapsulate multicastpackets in IP and UDP headers rather than just IPheaders. Changing to only IP encapsulation (andthus also superuser privileges) is primarily a tech-nicality and presents no fundamental changes tothe Msbone design.

Msrouted is logically divided into four modules:(1) Tunnel Message Handler, (2) Control MessageHandler, (3) Gatekeeper Router Module, and (4)Topology Learning Module. See Fig. 19.

Fig. 19. Msrouted architecture.

54 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 21: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

6.2.1. Tunnel message handlerThe Tunnel Message Handler focuses on

reading and writing local multicast packets andexchanging packets with other msroutedÕs. Oncean application has joined a group via the GA anda route has been constructed, the applicationsimply writes a multicast packet onto the LAN.The multicast packet is received by all local groupmembers as well as msrouted. Msrouted looks upthe groupÕs entry in the route table and deter-mines which adjacent msrouted's should receivethe packet. The multicast packet is encapsulatedwithin an IP and UDP header and sent (unicast)to the appropriate adjacent msrouteds. When anencapsulated packet arrives at an msrouted,msrouted checks the multicast address of the en-capsulated packet and forwards it to the appro-priate (adjacent) msrouteds. If there are localgroup members, msrouted decapsulates the packetand multicasts it locally on the network. Any lo-cal members receive this multicast as if they wereon the same LAN as the original sender. SeeFig. 20.

6.2.2. Control message handlerThe Control Message Handler module of

msrouted will receive route updates from Gate-keeper Routers. In CSM terminology, these routeupdates are referred to as io-sets. An io-set is as-

sociated with a particular group and tells themsrouted which neighboring msrouteds are part ofa groupÕs multicast routing tree. Route entries maybe explicitly cleared by receiving a NULL io-setfor a group or the entry will expire if refreshmessages are not periodically received. The controlunit is responsible for processing io-sets, main-taining the route table, and periodically expiringold entries.

6.2.3. Gatekeeper router moduleRecall that CSM designates a speci®c router

(per group) known as the Gatekeeper Router (GR)that computes and distributes the route entries (io-sets) for that group. As discussed previously, forscalability, any msrouted may act as a GR andmay serve many groups simultaneously. Thechoice of gatekeeper router is typically the nearestrouter to the GA that registers the group, but canreally be any router in the Msbone. The Gate-keeper Router Module performs these functions. Itreceives group membership updates from the GAand computes a route accordingly. This moduleimplements application assisted routing and anumber of shared tree heuristics can be used tocompute the tree. Each algorithm is associatedwith a code-number and the GA decides whichalgorithm number to use. These modules are de-signed so that additional algorithms can easily beadded to msrouted. Adding a new algorithm is assimple as adding the algorithm function name toan included ®le, recompiling, and linking in theobject code for the function.

6.2.4. Topology learning moduleSince CSM is based on computing CO Steiner

trees, topology information about the network isessential. This fourth msrouted unit is responsiblefor maintaining topology information. The precisealgorithm for doing this is not fundamental to thedesign and could be replaced by OSPF. All that isrequired is that the Topology Learning Moduleprovide up-to-date topology information to theGatekeeper Router Module for computing theshared trees. The Topology Learning Module alsonoti®es the Gatekeeper Router Module in theevent of topology changes.Fig. 20. Multicast packet forwarding in Msbone.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 55

Page 22: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

6.3. Interoperation

It is increasingly unlikely that any one multicastalgorithm/architecture will gain complete accep-tance. Interoperation between algorithms is es-sential. The current Msbone can interoperate withany other system such as Mbone in the followingbasic manner. An Msbone `gateway' runs on aLAN with a non-Msbone router. For example, theMsbone gateway can run on any LAN in theMbone. The gateway acts as a proxy groupmember. If any Msbone host joins a Mbonegroup, the gateway also joins the same group.The non-Msbone algorithm will deliver packets tothe gateway just as it would any other host. Thegateway then routes the packets onto the Msboneas if it had generated the packet. Packets leave theMsbone by arriving at the Msbone gateway andthen being multicast onto the gatewayÕs LANwhere the other (non-Msbone) multicast routerwill see the packets and distribute them as if theyhad been issued by any local group member. In asimilar way, Msbone applications could allownon-Msbone participants to join Msbone sessions.

An initial version of Msbone is operational inthe Bell Labs research network, and can be used tocreate new multicast sessions or join existing ses-sions in a manner similar to Mbone. Each sessioncan run Mbone applications, namely, vat, vic, andwb [23] at present. Work is ongoing to develop amore complete prototype that can be used forexperimental studies.

7. Related protocols

Two main protocol architectures have beendeveloped for dense multicast groups. DVMRP isan extension of a distance vector unicast protocoland MOSPF is an extension of a link state uni-cast protocol. DVMRP has been successfullydeployed over the Mbone. Both of these proto-cols were designed for multicast routing usingsource-based trees where the unique sender is thesource. The problems with these protocols withrespect to scaling and use for sparse groups arewell known and have been discussed by severalauthors. In particular, the mechanism that

DVMRP employs to prune and regrow trees isnot practical for sparse groups. MOSPF requiresthat each router must receive and store mem-bership information for each multicast group andrequires that each router compute the deliverytree for each group. Thus, the information stor-age and processing requirements for MOSPFcause di�culty when scaling to a large number ofmulticast groups. It can be seen that CSM doesnot have these scaling problems as it uses ashared tree and handles joins and leaves di�er-ently from these protocols.

Two major protocol architectures have beenproposed for sparse multicast groups. CBT pro-poses the use of a shared tree centered at a core[5]. CSM is similar to CBT in that it scales in thesame manner as CBT and also uses a shared treefor the group multicast. However, CSM preciselyde®nes how the shared tree is set up and modi®edas necessary by the gatekeeper for the group. Incontrast, CBT does not discuss the placement ofthe core and initially suggests to either use stati-cally con®gured cores or to have the nearest routerto a host act as a core. It should be clear thatalthough the notion of a gatekeeper in CSM issomewhat akin to a core router, the gatekeeperplays the crucial role in CSM of computing theclose-to-optimal shared tree and is unlikely to be a`core router' in the sense of CBT. Furthermore,the fact that the initiator of a group is usually theGA for the group and its DR is the GR for thegroup implies that any router can become a GR.Therefore, for a large number of multicast groups,the gatekeepers would likely be distributed uni-formly across the network of multicast routers,and there is no single router which will become thebottleneck. This is the key to the scalability ofCSM.

A second major di�erence in CSM as comparedwith CBT is that CSM discusses in detail how toswitch to another more optimal tree and providesa detailed method as to how to do so reliably.Other di�erences include details on how joins andleaves are implemented.

CSM also di�ers from CBT in its handling offailure recovery. CBT attempts to integrate failurerecovery directly with the CBT protocol. In con-trast, CSM separates the failure recovery issue into

56 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 23: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

two parts. Failures of gatekeepers are handled by amechanism discussed in this paper. Failures oflinks and nodes, on the other hand, are assumed tobe handled by the network of multicast routers in amanner similar to the way failures are currentlyhandled by the Internet routers. (See for instanceour implementation of the msrouted TopologyLearning Module.) Thus, multicast routers wouldexchange link state information assuming anOSPF type of protocol was implemented at thatlevel. Finally, CSM de®nes extensions to multipleadministrative domains.

The second major architecture that has beenproposed for sparse multicast groups is PIM [12].PIM describes the use of rendezvous points (RP's)that initially allow the creation of a shared tree.PIM then discusses how to move to a shortest pathsource based tree if desired. In contrast, CSMstarts out with a CO Steiner tree (if an initial set ofparticipants is known) and then continuouslyevaluates the membership set to see if it is wise tomove to another shared tree. CSM precisely de-®nes how the shared tree is initially setup by thegatekeeper and CSM also clearly allows for achoice of shared tree algorithms to be used by agatekeeper. CSM also di�ers from PIM in that themulticast routers in CSM have a simpler archi-tecture for ongoing multicast sessions. CSM onlyuses the multicast address and the ioset to deter-mine how to route a packet (it uses a delivery set ifa locally attached host wants that packet). Incontrast PIM requires more processing by therouters, requiring them to maintain informationon incoming and outgoing links for each group aswell as the RP addresses. Furthermore, additionalcomplexity is entailed by having the routers re-sponsible for switching from a shared tree to asource-based tree. Although PIM describes the useof a shared tree, it is clearly designed to quicklyswitch to a source-based tree. The premise of CSMis to always be operating on an appropriate sharedtree.

PIM is also di�erent from CSM in separatingthe senders from receivers. Although for ®xedsender sources this may be advantageous, there isno simple way in PIM of converting a sender into areceiver in an existing multicast tree. Therefore,PIM may have trouble with applications such as

conferencing and chat groups, in which the sendersand receivers interchange their roles very often inthe middle of a session.

CSM, as contrasted with PIM and CBT sup-ports application assisted routing and clearly dis-cusses extensions that allow it to be used in amobile environment.

8. Conclusion and future work

In this paper we presented a ¯exible multicastrouting protocol targeted towards conferencingand online discussion groups. We have discussedhow this new protocol architecture di�ers fromother architectures previously proposed. Some ofthe key ideas of the paper are the distributed dy-namic computations of the CO Steiner trees forthe multicast groups and the support for applica-tion assisted routing, the fault-tolerant transitionfrom an old tree to a new tree, extension of thebasic protocol to an inter-domain environment,and the introduction of the watchdog concept tomake the basic protocol robust in a hostile net-work environment. The basic protocol primitivescan be also be used to support mobile multicast-ing. Hando�s in mobile CSM are also handled in anovel way as compared to other mobile protocolapproaches.

Our current implementation of CSM o�ers thefollowing features:· Group admission control in cooperation with

the network.· Application dependent group admission

schemes.· Ability for each group to select its own routing

algorithm.· Application driven recomputation of routes.· Application control over adjusting routes for

new members.· Ability to easily add new routing algorithms to

the system.The current implementation is local to the Lu-

cent intranet. In future work, we intend to deployMsbone more broadly and do a detailed perfor-mance analysis.

We believe a key lesson learned from the con-struction of the Msbone is that the application and

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 57

Page 24: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

network can be integrated in a simple way. Thisallows the application to gain better performanceand prevents situations where the network mustguess the application's desires.

Acknowledgements

The authors are grateful to Amritansh Raghavfor several early discussions on multicasting.

Appendix A. List of abbreviations used in the paper

References

[1] S. Aggarwal, S. Paul, A ¯exible protocol architecture for

multi-party conferencing, in: Proceedings of the

ICCCNÕ96, October 1996, pp. 81±91.

[2] S. Aggarwal, S. Paul, D. Massey, D. Calderaru, A ¯exible

protocol architecture for multi-party conferencing: from

design to implementation, Technical Memorandum, Lu-

cent Technologies, April 1998.

[3] S. Aggarwal, A. Raghav, DUALCAST: a scheme for

reliable multicasting, in: Proceedings of the ICNPÕ94,

October 1994, pp. 15±21.

[4] T. Billhartz, J.B. Cain, E. Farrey-Goudreau, D. Fieg, S.G.

Batsell, Performance and resource cost comparisons for the

CBT and the PIM multicast routing protocols, IEEE

Journal on Selected Areas in Communications 15 (3) (1997)

304±315.

[5] A.J. Ballardie, P.F. Francis, J. Crowcroft, Core based

trees, in: Proceedings of the ACM SIGCOMM, 1993.

[6] K. Bharat-Kumar, J.M. Ja�e, Routing to multiple desti-

nations in computer networks, IEEE Transactions on

Communications COM-31 (3) (1983) 343±351.

[7] T. Ballardie, R. Perlman, C. Lee, J. Crowcroft, Simple

scalable internet multicast, Technical Report, University

College, London, April 1999.

[8] F. Bauer, A. Varma, ARIES: A rearrangable inexpensive

edge-based on-line Steiner algorithm, IEEE Journal on

Selected Areas in Communications 15 (3) (1997) 382±

397.

[9] D. Caldararu, Heuristic Steiner trees for networks with

large numbers of multicast conferences, M.S. Thesis,

Department of Computer Science, SUNY, Binghamton,

August 1997.

[10] J. Donahoo, K.L. Calvert, E.W. Zegura, Center selection

and migration for wide-area multicast routing, Journal of

High-Speed Networks 6 (2) (1997).

[11] S. Deering, Host Extensions for IP multicasting, RFC

1112, May 1988.

[12] S. Deering, D. Estrin, D. Farinacci, V. Jacobson, C. Liu,

L. Wei, An architecture for wide-area multicast routing,

in: Proceedings of the ACM SIGCOMM, August 1994,

pp. 126±135.

[13] S. Deering, D. Cheriton, Multicast routing in datagram

internetworks and extended LANs, ACM Transactions on

Computer Systems (May 1990) 85±111.

[14] M. Doar, I. Leslie, How bad is naive multicasting, in:

Proceedings of the IEEE INFOCOM, 1993, pp. 82±89.

[15] H. Eriksson, Mbone: the multicast backbone, Communi-

cations of the ACM (August 1994) 54±60.

[16] L.G. de R. Guedes, M.D. Yacoub, Overlapping cell area in

di�erent fading conditions, in: Proceedings of the 1995

IEEE 45th Vehicular Technology Conference, July 1995,

pp. 380±383.

[17] S. Hakimi, Steiner's problem in graphs and its applications,

Networks 1 (1971) 113±133.

[18] H. Holbrook, D. Cheriton, IP multicast channels: EX-

PRESS support for large-scale single-source applications,

in: Proceedings of the ACM SIGCOMM, August 1999,

pp. 65±78.

[19] Y. Huang, C. Kintala, Software implemented fault toler-

ance: technologies and experience, in: Proceedings of the

23rd International Symposium on Fault Tolerant Com-

puting (FTCS-23), June 1993, pp. 2±9.

[20] Y. Huang, C. Kintala, Software fault tolerance in appli-

cation layer, in: M. Lyu (Ed.), Software Fault Tolerance,

Wiley, New York, 1995, Ch. 10.

Abbreviation Expansion

AAR Application-Assisted RoutingBGP Border Gateway ProtocolCBT Core-Based TreeCO Close-to-optimalCSM Conference Steiner MulticastDR Designated RouterDVMRP Distance Vector Multicast

Routing ProtocolGA Gatekeeper ApplicationGR Gatekeeper RouterMbone Multicast BackboneMOSPF Multicast Open Shortest Path

FirstMSBone Multicast Steiner BackboneMSD Multicast Sessions DirectoryMSS Mobile Support StationOSPF Open Shortest Path FirstPIM Protocol Independent MulticastRTP Real-Time Protocol

58 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60

Page 25: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

[21] F.K. Hwang, D.S. Richards, Steiner tree problems, Net-

works 22 (1992) 55±89.

[22] M. Imase, B.M. Waxman, Dynamic Steiner tree problem,

SIAM Journal of Discrete Math 4 (3) (1991) 369±384.

[23] V. Jacobson. Multimedia conferencing on the internet,

Tutorial 4, ACM SIG-COMM 94, August 1994.

[24] R.M. Karp, Reducibility among combinatorial problems,

in: R.E. Miller, J.W. Thatcher (Eds.), Complexity of

Computer Computations, Plenum, New York, 1972.

[25] J. Kadirire, G. Knight, Comparison of dynamic multicast

routing algorithms for wide-area packet-switched (ATM)

networks. in: Proceedings of the IEEE INFOCOM, 1995,

pp. 212±219.

[26] L. Kou, G. Markowski, L. Berman, A fast algorithm for

Steiner trees, Acta Informatica 15 (1981) 141±145.

[27] V.P. Kompella, J.C. Pasquale, G.C. Polyzos, Multicasting

for multimedia applications. in: Proceedings of the IEEE

INFOCOM, 1992, pp. 2078±2085.

[28] V.P. Kompella, J.C. Pasquale, G.C. Polyzos, Multicast

routing for multimedia communication, IEEE/ACM

Transactions on Networking 1 (3) (1993) 286±292.

[29] S. Kumar, P. Radoslavov, D. Thaler, C. Alaettinoglu, D.

Estrin, M. Handley. The MASC/BGMP architecture for

inter-domain multicast routing, in: Proceedings of the

ACM SIGCOMM, August 1998, pp. 93±104.

[30] J.J. Garcia-Luna-Aceves, J. Behrens, Distributed, scalable

routing based on vectors of link states, IEEE Journal on

Selected Areas in Communications 13 (8) (1995) 1383±

1395.

[31] L. Lou, K. Makki, An even faster approximation algo-

rithm for the Steiner tree problem in graphs, Congressus

Numerantium 59 (1987) 149±154.

[32] J. Moy, Multicast routing extensions for OSPF, Comm.

ACM 37 (8) (1994) 61±66.

[33] K. Makki, N. Pissinou, O. Frieder, E�cient solutions to

multicast routing in communications networks, Mobile

Networks and Applications 1 (1996) 221±232.

[34] C. Perkins, IP Mobility Support', RFC-2002.

[35] V. Rayward-Smith, The computation of nearly minimal

Steiner trees in graphs, International Journal of Mathe-

matics and Educational Science and Technology 14 (1983)

15±23.

[36] RTP: A Transport Protocol for Real-Time Applications,''

Internet Draft, March 1995, hdraft-ietf-avc-svc-rtp-07.txti,October, 1995.

[37] H.F. Salama, D.S. Reeves, I. Viniotis, Comparison of

multicast routing algorithms for high-speed networks, IBM

Technical Report, September 1994.

[38] H. Takahashi, A. Matsuyama, An approximate solution

for the Steiner problem in graphs, Mathematics Japonica 6

(1980) 573±577.

[39] D.W. Wall, Mechanisms for broadcast and selective

broadcast, Ph.D. Thesis, Stanford University, June

1980.

[40] P. Winter, Steiner problem in networks: a survey, Net-

works 17 (1987) 129±167.

[41] D.W. Wall, S. Owicki, Construction of centered shortest

path trees, Networks 13 (1983) 207±231.

[42] D. Waitzman, C. Partridge, S. Deering, Distance Vector

Multicast Routing Protocol, RFC 1075, 1988.

[43] B.M. Waxman, Routing of multipoint connections, IEEE

Journal on Selected Areas in Communication, 6±9 Decem-

ber 1988, 1617±1622.

[44] L. Wei, D. Estrin, A comparison of multicast trees and

algorithms, CS Department Technical Report, University

of Southern California, September 1993.

Sudhir Aggarwal is Head of the Inter-net Systems Research Department atBell Laboratories, Lucent Technolo-gies, in Murray Hill, NJ. Prior to hiscurrent position, he was professor andchairman of the Computer ScienceDepartment at the State University ofNew York, Binghamton, NY. His in-terests are in computer networks, de-sign and development of distributedsoftware systems and communicationprotocols, real-time systems, and dis-tributed interactive simulation. He re-ceived the Ph.D. degree from the

University of Michigan in Computer and Communication Sci-ences in 1975, and M.S. and B.S degrees in Mathematics fromthe University of Michigan and Stanford University in 1971and 1969, respectively.

Sanjoy Paul (M'92-SM'97) received aB.Tech. degree from Indian Instituteof Technology, Kharagpur, India,followed by M.S and Ph.D. degreesfrom University of Maryland, CollegePark, in 1988 and 1992 respectively.He joined AT&T Bell Laboratoriesafter graduating from University ofMaryland. He is currently a principalinvestigator and a distinguished mem-ber of technical sta� in networkingsoftware research department in BellLaboratories. His research interestsinclude caching; multimedia stream-

ing; multicasting; transport, network, and link-layer protocolissues; formal methods; security; and mobile networking. Dr.Paul holds nine US patents, has published widely in interna-tional conferences and journals, and has been on the programcommittees of several IEEE conferences. Dr. Paul served as theguest editor of IEEE Network Magazine Special Issue onMulticasting. He is the co-recipient of 1997 William R. Bennettaward from IEEE Communications Society for the best originalpaper in IEEE/ACM Transactions on Networking in 1996. Dr.Paul is author of the book ``Multicasting on the Internet and itsApplications'', which was published by Kluwer Academic Pressin May 1998. He is an adjunct faculty of the Computer ScienceDepartment and the Internet Technical Institute of RutgersUniversity. Dr. Paul is a senior member of IEEE and a votingmember of the ACM.

S. Aggarwal et al. / Computer Networks 32 (2000) 35±60 59

Page 26: A flexible multicast routing protocol for group communicationmassey/pubs/journal/massey_networ… · A flexible multicast routing protocol for group communication Sudhir Aggarwal

Daniel Massey received a B.A. inMathematics/Computer Science andan M.A. in applied mathematics fromU.C. San Diego. He expects to com-plete a Ph.D. in Computer Sciencefrom UCLA in summer 2000. He pre-viously worked at the San Diego Su-percomputer Center and has been asummer researcher at Bell Labs andXerox PARC. His research interestsinclude routing protocols, infrastruc-ture fault tolerance, and network se-curity.

Daniela Caldararu was born in Bu-charest, Romania. She holds a B.Sc. inMathematics and Computer Sciencefrom Bucharest University, Romania.In 1997 she received her M.Sc. inComputer Science from SUNY Bing-hamton, with a thesis in DistributedSystems. She is currently working as aconsultant for New Era of Networks.

60 S. Aggarwal et al. / Computer Networks 32 (2000) 35±60