Exploring Efficient and Scalable Multicast Routing in Future Data Center Networks
Dan Li, Jiangwei Yu, Junbiao Yu, Jianping Wu (Tsinghua University)
Presented by DENG Xiang
Outline
I Introduction and background
II Build an efficient Multicast tree
III Make Multicast routing scalable
IV Evaluation
V Conclusion
Data Centers: the core of cloud services
- online cloud applications
- back-end infrastructural computations
- servers and switches
- popularity of group communication
Introduction and background
When Multicast meets data center networks...
Problem A: Data center topologies usually expose high link density, and traditional technologies can result in severe link waste.
Problem B: Low-end commodity switches are largely used in most data center designs for economic and scalability considerations.
Data Center Network Architecture
BCube, PortLand, VL2 (similar to PortLand)
Build an efficient Multicast tree
BCube
- constructed recursively: BCube(n,0), BCube(n,1), ..., BCube(n,k)
- each server has k+1 ports
- each switch has n ports
- number of servers: n^(k+1)
PortLand
- three levels and n pods
- aggregation level and edge level: n/2 switches with n ports each (per pod)
- core level: (n/2)^2 switches with n ports
- number of servers: n^3/4
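For example (plugging in the parameters used later in the evaluation): BCube(8,3) connects 8^4 = 4096 servers, while PortLand built from 48-port switches connects 48^3/4 = 27648 servers.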
Consistent themes among them:
- low-end switches are used to reduce expense
- high link density exists
- the data center structure is built in a hierarchical and regular way
In order to save network traffic, how do we build an efficient Multicast tree?
- traditional receiver-driven Multicast routing protocols, originally designed for the Internet, such as PIM
- approximation algorithms for the Steiner tree problem: build a Multicast tree with the lowest cost covering the given nodes
- a source-driven tree building algorithm (the approach taken here)
The proposed algorithm
Group spanning graph:
- each hop is a stage
- stage 0 includes the sender only
- stage d is composed of receivers, where d is the diameter of the data center topology
Build the Multicast tree in a source-to-receiver expansion over the group spanning graph, with the tree node set of each stage strictly covering the downstream receivers.
Definition of cover: A covers B if and only if, for each node in B, there exists a directed path from some node in A. A strictly covers B when A covers B and no proper subset of A covers B.
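The cover test is plain directed reachability, so it is easy to make concrete. Below is a minimal sketch (not from the paper) using BFS; adj maps each node to its successors, and because coverage is monotone, checking single-node removals suffices for the strict-cover test.

    from collections import deque

    def reachable(adj, src):
        """Return the set of nodes reachable from src via directed edges."""
        seen, queue = {src}, deque([src])
        while queue:
            u = queue.popleft()
            for v in adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
        return seen

    def covers(adj, A, B):
        """A covers B iff every node of B is reachable from some node of A."""
        reach = set()
        for a in A:
            reach |= reachable(adj, a)
        return set(B) <= reach

    def strictly_covers(adj, A, B):
        """A strictly covers B iff A covers B and no proper subset does.
        Coverage is monotone, so single-node removals are enough to check."""
        A = set(A)
        return covers(adj, A, B) and all(not covers(adj, A - {a}, B) for a in A)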
Algorithm details in BCube:
a) select the set of servers E from stage 2 which are covered by the sender s and a single stage-1 switch W
b) |E| of the BCube(n,k-1) sub-cubes each take a server p in E as the source, with the receivers in stage 2*(k+1) covered by p as the receiver set
c) the remaining BCube(n,k-1) takes s as the source, with the receivers in stage 2*k covered by s but not by W as the receiver set
Algorithm details in PortLand:
a) From the first stage to the stage of core-level switches, any single path can be chosen, because any single core-level switch can cover the downstream receivers.
b) From the stage of core-level switches down to the final stage of receivers, the paths are fixed due to the interconnection rule in PortLand.
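The topology-specific rules above are instances of one generic idea: expand stage by stage, keeping only nodes needed to cover the receivers. A simplified, hypothetical sketch of that idea follows, reusing covers() from the earlier sketch; the paper's algorithm instead exploits the regular structure of BCube and PortLand to pick covering nodes directly, without search.

    def build_tree_stages(stages, adj, receivers):
        """stages: list of node sets, stages[0] = {sender}; adj: node -> successors."""
        chosen_per_stage = [set(stages[0])]
        for stage in stages[1:-1]:
            # one-hop candidates inside this stage
            cands = {v for u in chosen_per_stage[-1]
                     for v in adj.get(u, ()) if v in stage}
            chosen = set(cands)
            for c in sorted(cands):  # deterministic pruning order
                # simplification: reachability is tested in the full graph,
                # not only through already-chosen downstream nodes
                if covers(adj, chosen - {c}, receivers):
                    chosen.discard(c)  # redundant for covering the receivers
            chosen_per_stage.append(chosen)
        chosen_per_stage.append(set(receivers))
        return chosen_per_stage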
A packet forwarding mechanism that supports a massive number of Multicast groups is necessary:
- in-packet Bloom Filter: used alone, bandwidth waste is significant for large groups
- in-switch forwarding table: used alone, very large memory space is needed
Make Multicast routing scalable
The bandwidth waste of the in-packet Bloom Filter comes from:
- the Bloom Filter field in the packet brings network bandwidth cost
- false-positive forwarding by the Bloom Filter causes traffic leakage
- switches receiving packets by false-positive forwarding may further forward packets to other switches, incurring not only additional traffic leakage but also possible loops
We define the Bandwidth Overhead Ratio r to describe the in-packet Bloom Filter:
- p: the packet length (including the Bloom Filter field)
- f: the length of the in-packet Bloom Filter field
- t: the number of links in the Multicast tree
- c: the number of actual links covered by Bloom Filter based forwarding
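The slides omit the exact formula for r. One plausible reconstruction from the variable definitions, treating c*p as the bytes actually transmitted and t*(p-f) as the ideal tree traffic without the Bloom Filter field (an assumption, not necessarily the paper's verbatim definition):

    def bandwidth_overhead_ratio(p, f, t, c):
        # assumed reconstruction of r, not quoted from the paper
        ideal = t * (p - f)   # traffic if exactly the tree links carried payload
        actual = c * p        # traffic with the filter field and leaked links
        return (actual - ideal) / ideal

    # Example with hypothetical numbers: 1500-byte packets, a 96-byte filter
    # field, a 50-link tree inflated to 60 covered links by false positives.
    print(bandwidth_overhead_ratio(p=1500, f=96, t=50, c=60))  # ~0.28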
[Figure: with a packet size of 1500 bytes, the relation among r, f, and group size, for BCube(8,3) and for PortLand with 48-port switches]
The in-packet Bloom Filter does not accommodate large groups, so a combination routing scheme is proposed:
a) In-packet Bloom Filters are used for small groups to save routing space in switches, while routing entries are installed into switches for large groups to alleviate bandwidth overhead.
b) Intermediate switches/servers receiving a Multicast packet check a special TAG in the packet to determine whether to forward it via the in-packet Bloom Filter or by looking up the in-switch forwarding table.
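A minimal sketch of that forwarding decision, assuming a one-bit TAG and hypothetical field names (none of these identifiers come from the paper):

    BLOOM_MODE, TABLE_MODE = 0, 1

    def next_hops(packet, neighbors, fwd_table, in_bloom):
        """packet: dict with 'tag', 'group', 'bloom' fields (hypothetical layout);
        neighbors: this node's neighbor IDs; fwd_table: group -> neighbor IDs;
        in_bloom(bits, node): Bloom Filter membership test."""
        if packet["tag"] == BLOOM_MODE:
            # small group: match neighbors against the in-packet Bloom Filter
            return [n for n in neighbors if in_bloom(packet["bloom"], n)]
        # large group: look up the locally installed forwarding entry
        return fwd_table.get(packet["group"], [])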
Two encodings for the in-packet Bloom Filter:
- node-based encoding: elements are the tree nodes, including switches and servers (this encoding is chosen)
- link-based encoding: elements are the directed physical links
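A minimal node-based encoding sketch: hash each tree-node ID into an m-bit filter (the hash choice, m, and k are illustrative assumptions, not the paper's parameters):

    import hashlib

    def _hashes(item, m, k):
        """k bit positions in [0, m) derived from SHA-256 of the item."""
        digest = hashlib.sha256(str(item).encode()).digest()
        return [int.from_bytes(digest[4*i:4*i+4], "big") % m for i in range(k)]

    def encode_tree(nodes, m=256, k=4):
        """Node-based encoding: set the bits of every tree node (switch/server)."""
        bits = 0
        for node in nodes:
            for pos in _hashes(node, m, k):
                bits |= 1 << pos
        return bits

    def in_bloom(bits, node, m=256, k=4):
        """Membership test; false positives are possible, false negatives are not."""
        return all(bits >> pos & 1 for pos in _hashes(node, m, k))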
False-positive forwarding caused by the in-packet Bloom Filter may result in loops.
The solution: a node forwards the packet only to those neighboring nodes (matched by the Bloom Filter) whose distance to the source is larger than its own.
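A sketch of this rule combined with the membership test (reusing in_bloom() from the sketch above); distance-to-source values are assumed to be available, e.g., derivable from the regular topology:

    def safe_next_hops(bloom_bits, neighbors, dist, my_dist):
        # forward only to matched neighbors strictly farther from the source,
        # so a false positive can never send the packet back toward the sender
        return [n for n in neighbors
                if in_bloom(bloom_bits, n) and dist[n] > my_dist]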
Evaluation
Evaluation of the source-driven tree building algorithm:
- topologies: BCube(8,3) and PortLand with 48-port switches
- link speed: 1 Gbps
- 200 random-sized groups
- metrics: number of links in the tree; computation time
Conclusion
Efficient and Scalable Multicast Routing in Future Data Center Networks:
- an efficient Multicast tree building algorithm
- a combination forwarding scheme for scalable Multicast routing