Exploiting Route Redundancy via Structured Peer to Peer Overlays
Exploiting Route Redundancy via Structured Peer to Peer Overlays
Ben Y. Zhao, Ling Huang, Jeremy Stribling, Anthony D. Joseph, and John D. Kubiatowicz
University of California, Berkeley
ICNP 2003, November 7, 2003
Challenges Facing Network Applications
Network connectivity is not reliable: disconnections are frequent in the wide-area Internet, and IP-level repair is slow
Wide-area: BGP takes on the order of 3 minutes; local-area: IS-IS takes around 5 seconds
Next-generation network applications are mostly wide-area (streaming media, VoIP, B2B transactions) with low tolerance for delay, jitter, and faults
Our work: a transparent, resilient routing infrastructure that adapts to faults in milliseconds rather than seconds
Talk Overview: Motivation; Why structured routing; Structured peer-to-peer overlays; Mechanisms and policy; Evaluation; Summary
Routing in “Mesh-like” Networks
Previous work has shown reasons for long convergence [Labovitz00, Labovitz01]
MinRouteAdver timer: necessary to aggregate updates from all neighbors; commonly set to 30 seconds; contributes to the lower bound of BGP convergence time
The Internet is becoming more mesh-like [Kaat99, Labovitz99], which worsens BGP convergence behavior
Question: can convergence be faster in the context of structured routing?
Resilient Overlay Networks (MIT)
A fully connected mesh gives each node full knowledge of the network
Fast, independent calculation of routes; nodes can construct any path, giving maximum flexibility
Cost of flexibility: the protocol needs to choose the “right” route/nodes
Per-node O(n) state; each node monitors n - 1 paths, so O(n^2) total path monitoring is expensive
[Figure: source S routing to destination D over the fully connected mesh]
Leveraging Structured Peer-to-Peer Overlays
Key-based routing (IPTPS 03)
Large sparse ID space N (160 bits: 0 – 2^160)
Nodes in the overlay network have nodeIDs ∈ N
Given some key k ∈ N, the overlay deterministically maps k to its root node (a live node in the network): route the message to root(k)
[Figure: a source node routing a message through the overlay ID space to root(k)]
Distributed hash tables (DHTs) are an interface on top of KBR; the key is leveraging the underlying routing mesh (a sketch of the key-to-root mapping follows)
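To make the KBR abstraction concrete, here is a minimal sketch of the key-to-root mapping, assuming a Chord-like "closest live ID at or after the key" rule in a circular 160-bit ID space; Tapestry's actual prefix-based routing differs, so this is only an illustration of the interface.

```python
# Minimal KBR sketch: deterministically map a key to its root node,
# here using a Chord-like successor rule (illustration only; Tapestry
# reaches the same abstraction via prefix routing).
import hashlib

ID_BITS = 160
ID_SPACE = 1 << ID_BITS  # IDs live in 0 .. 2^160 - 1

def make_id(name: str) -> int:
    """Hash an arbitrary name (node address or key) into the ID space."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

def root(key: int, live_nodes: list) -> int:
    """Return root(key): the first live nodeID at or after key,
    wrapping around the circular ID space."""
    ordered = sorted(live_nodes)
    for nid in ordered:
        if nid >= key:
            return nid
    return ordered[0]  # wrap around

# route(key, msg) in a KBR overlay delivers msg to root(key).
nodes = [make_id("node-%d" % i) for i in range(8)]
print(hex(root(make_id("some-key"), nodes)))
```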
Proximity Neighbor Selection
PNS = network-aware overlay construction: within routing constraints, choose the neighbors closest in network distance (latency); a minimal sketch follows this slide
Generally reduces the number of IP hops, which matters for routing:
Reduces latency
Reduces susceptibility to faults: fewer IP links means a smaller chance of link/router failure
Reduces overall network bandwidth utilization
We use Tapestry to demonstrate our design, a P2P protocol with PNS overlay construction; topology-unaware P2P protocols will likely perform worse
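The following is a minimal sketch of proximity neighbor selection under stated assumptions: satisfies_constraint stands in for the overlay's routing constraint (e.g., a required ID prefix) and measure_latency for a network probe; neither is a real Tapestry API.

```python
# Minimal PNS sketch: among candidate neighbors that satisfy the
# overlay's routing constraint, keep the one closest in network
# distance (latency). Both callables are assumed helpers.
from typing import Callable, Iterable, Optional

def select_neighbor(candidates: Iterable[str],
                    satisfies_constraint: Callable[[str], bool],
                    measure_latency: Callable[[str], float]) -> Optional[str]:
    """Return the constraint-satisfying candidate with the lowest latency."""
    eligible = [c for c in candidates if satisfies_constraint(c)]
    if not eligible:
        return None
    return min(eligible, key=measure_latency)
```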
System Architecture
Locate a nearby overlay proxy
Establish an overlay path to the destination host
The overlay routes traffic resiliently
[Figure: an overlay layer of nodes above the Internet; end-host traffic is tunneled through the overlay]
Traffic Tunneling
Legacy nodes A and B (A, B are IP addresses) each register with a nearby overlay proxy
The proxies store mappings in the structured peer-to-peer overlay: put(hash(A), P’(A)) and put(hash(B), P’(B))
To reach B, the source side performs get(hash(B)), which returns P’(B), and tunnels traffic toward it (sketched below)
Store mapping from each end host IP to its proxy’s overlay ID; similar to the approach in Internet Indirection Infrastructure (I3)
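Below is a minimal sketch of the registration and lookup steps, with the overlay's DHT modeled as a plain dictionary; dht_put, dht_get, and the example overlay ID are illustrative assumptions, not the actual Tapestry or I3 interface.

```python
# Minimal tunneling-registration sketch: a proxy publishes the mapping
# hash(legacy IP) -> overlay ID, and a source-side proxy later looks it
# up. The dict stands in for the structured overlay's DHT.
import hashlib
from typing import Dict, Optional

dht: Dict[bytes, bytes] = {}

def dht_put(key: bytes, value: bytes) -> None:
    dht[key] = value

def dht_get(key: bytes) -> Optional[bytes]:
    return dht.get(key)

def h(ip: str) -> bytes:
    return hashlib.sha1(ip.encode()).digest()

def register(legacy_ip: str, overlay_id: bytes) -> None:
    """Proxy registers its legacy node: put(hash(IP), P'(IP))."""
    dht_put(h(legacy_ip), overlay_id)

def lookup(dest_ip: str) -> Optional[bytes]:
    """Source proxy resolves a destination: get(hash(IP)) -> P'(IP)."""
    return dht_get(h(dest_ip))

# B's proxy registers B; A's side later resolves B and tunnels traffic
# through the overlay toward the returned overlay ID.
register("10.0.0.2", b"overlay-id-near-B-proxy")  # hypothetical ID
print(lookup("10.0.0.2"))
```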
Tradeoffs of Tunneling via P2P
Fewer neighbor paths to monitor per node: O(log n)
Large reduction in probing bandwidth, O(n) → O(log n), allows increased probing frequency: faster fault detection with low bandwidth consumption
Actively maintain path redundancy: manageable for a “small” number of paths; redirect traffic immediately when a failure is detected; eliminates on-the-fly calculation of new routes; restore redundancy when a path fails
End result: fast fault detection + precomputed paths = increased responsiveness to faults
Cons: the overlay imposes routing stretch (more IP hops), generally < 2
Some Details
Efficient fault detection: use soft-state to periodically probe the log(n) neighbor paths; the “small” number of routes keeps probing bandwidth low
Exponentially weighted moving average for link quality estimation avoids route flapping due to short-term loss artifacts (sketched after this slide):
Loss rate L_n = (1 - α) · L_{n-1} + α · p, where p = instantaneous loss rate and α = hysteresis factor
Maintaining backup paths: each hop has a flexible routing constraint; create and store backup routes at node insertion; restore redundancy via “intelligent” gossip after failures; simple policies choose among redundant paths
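A minimal sketch of the EWMA link-quality estimator above; the class name, probe-interval bookkeeping, and default alpha = 0.2 (one of the settings used in the PlanetLab runs) are illustrative choices rather than the paper's implementation.

```python
# Minimal EWMA loss estimator: L_n = (1 - alpha) * L_{n-1} + alpha * p,
# where p is the loss rate observed in the latest probe interval and
# alpha is the hysteresis factor.
class LinkQuality:
    def __init__(self, alpha: float = 0.2):
        self.alpha = alpha
        self.loss = 0.0  # smoothed loss rate L_n

    def update(self, instantaneous_loss: float) -> float:
        """Fold one probe interval's observed loss into the moving average."""
        self.loss = (1 - self.alpha) * self.loss + self.alpha * instantaneous_loss
        return self.loss

    def usable(self, threshold: float = 0.70) -> bool:
        """Link quality (1 - loss) must clear the minimum threshold T."""
        return (1 - self.loss) >= threshold

# A single fully lost probe interval only nudges the smoothed estimate,
# so short-term loss artifacts do not trigger route flapping.
lq = LinkQuality(alpha=0.2)
print(lq.update(1.0), lq.usable())  # 0.2 True
```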
First Reachable Link Selection (FRLS)
Use estimated loss results to choose the shortest “usable” path: sort next-hop paths by latency and use the shortest path whose estimated quality exceeds the minimum threshold T (sketched below)
Correlated failures: reduce with intelligent topology construction
Key is to leverage the redundancy available
[Figure: an overlay node choosing among redundant next hops toward a destination; node IDs 2046, 1111, 2281, 2530, 2299, 2274, 2286, 2225]
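A minimal sketch of the FRLS policy, assuming each candidate next hop carries a measured latency and an EWMA-derived quality estimate; the Path record and the example node IDs are illustrative.

```python
# Minimal FRLS sketch: sort next-hop paths by latency and take the
# first one whose estimated quality exceeds the minimum threshold T.
from dataclasses import dataclass
from typing import Iterable, Optional

@dataclass
class Path:
    next_hop: str
    latency_ms: float
    quality: float  # 1 - EWMA loss rate

def frls(paths: Iterable[Path], threshold: float = 0.70) -> Optional[Path]:
    """Return the lowest-latency path with quality above T, if any exists."""
    for p in sorted(paths, key=lambda path: path.latency_ms):
        if p.quality > threshold:
            return p
    return None  # all paths below threshold; fall back (e.g. constrained multicast)

# The 12 ms path is too lossy, so FRLS falls through to the 20 ms one.
routes = [Path("2281", 12.0, 0.55), Path("2530", 20.0, 0.93), Path("2225", 35.0, 0.99)]
print(frls(routes).next_hop)  # 2530
```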
Evaluation
Metrics for evaluation: how much routing resiliency can we exploit? how fast can we adapt to faults? what is the overhead of routing around a failure (proportional increase in end-to-end latency and in end-to-end bandwidth used)?
Experimental platforms
Event-based simulations on transit-stub topologies; data collected over different 5000-node topologies
PlanetLab measurements: microbenchmarks on responsiveness; bandwidth measurements from 200+ node overlays; multiple virtual nodes run per physical machine
Exploiting Route Redundancy (Sim)
Simulation of Tapestry with 2 backup paths per routing entry; transit-stub topology shown, results from TIER and AS graphs are similar
[Figure: proportion of all pairs reachable (0–1) vs. proportion of IP links broken (0–0.2), comparing instantaneous IP routing against Tapestry/FRLS]
Responsiveness to Faults (PlanetLab)
Response time increases linearly with probe period; minimum link quality threshold T = 70%, 20 runs per data point
[Figure: time to switch routes (ms, 0–2500) vs. link probe period (ms, 0–1200), for alpha = 0.2 and alpha = 0.4; annotated point: a 300 ms probe period gives a 660 ms switch time]
Link Probing Bandwidth (PlanetLab)
[Figure: bandwidth per node (KB/s, 0–7) vs. size of overlay (1–1000), for probe periods PR = 300 ms and PR = 600 ms]
Medium-sized routing overlays incur low probing bandwidth; bandwidth increases logarithmically with overlay size
Related Work
Redirection overlays: Detour (IEEE Micro 99), Resilient Overlay Networks (SOSP 01), Internet Indirection Infrastructure (SIGCOMM 02), Secure Overlay Services (SIGCOMM 02)
Topology estimation techniques: adaptive probing (IPTPS 03), peer-based shared estimation (Zhuang 03), Internet tomography (Chen 03), routing underlay (SIGCOMM 03)
Structured peer-to-peer overlays: Tapestry, Pastry, Chord, CAN, Kademlia, SkipNet, Viceroy, Symphony, Koorde, Bamboo, X-Ring…
Conclusion
Benefits of structure outweigh costs: structured routing lowers path maintenance costs and allows “caching” of backup paths for quick failover, though we can no longer construct arbitrary paths
Structured routing with low redundancy gets very close to ideal connectivity and incurs low routing stretch
Fast enough for highly interactive applications: with a 300 ms beacon period, response time is < 700 ms; on overlay networks of 300 nodes, the bandwidth cost is 7 KB/s
Future work: deploying a public routing and proxy service on PlanetLab; examining the impact of network-aware topology construction and loss-sensitive probing techniques
Questions…
Related websites:
Tapestry: http://www.cs.berkeley.edu/~ravenben/tapestry
Pastry: http://research.microsoft.com/~antr/pastry
Chord: http://lcs.mit.edu/chord
Acknowledgements: thanks to Dennis Geels and Sean Rhea for their work on the BMark benchmark suite
Backup Slides
Another Perspective on Reachability
[Figure: proportion of all paths (0–1) vs. proportion of IP links broken (0–0.2), divided into four regions:
portion of all pair-wise paths where no failure-free paths remain;
portion of all paths where IP and FRLS both route successfully;
paths that exist but that neither IP nor FRLS can locate;
paths that FRLS finds where short-term IP routing fails]
Constrained Multicast
Used only when all paths are below the quality threshold: send duplicate messages on multiple paths and leverage route convergence
Assign unique messageIDs to mark duplicates; keep a moving window of IDs to recognize and drop duplicates (sketched below)
Limitations: assumes loss is not from congestion; ideal for local-area routing
[Figure: an overlay node duplicating a message across multiple candidate next hops; node IDs 2046, 1111, 2281, 2530, 2299, 2274, 2286, 2225]
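A minimal sketch of the duplicate recognition described above, using a bounded moving window of recently seen messageIDs; the window size is an arbitrary illustrative value.

```python
# Minimal duplicate-suppression sketch for constrained multicast:
# remember a bounded moving window of messageIDs and drop any message
# whose ID has already been seen.
from collections import OrderedDict

class DuplicateFilter:
    def __init__(self, window_size: int = 1024):
        self.window_size = window_size
        self.seen = OrderedDict()  # messageID -> None, oldest first

    def accept(self, message_id: int) -> bool:
        """Return True for the first copy of a message, False for duplicates."""
        if message_id in self.seen:
            return False
        self.seen[message_id] = None
        if len(self.seen) > self.window_size:
            self.seen.popitem(last=False)  # evict the oldest ID
        return True

# The second copy arriving over a redundant path is recognized and dropped.
dup = DuplicateFilter()
print(dup.accept(42), dup.accept(42))  # True False
```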
Latency Overhead of Misrouting
[Figure: proportional latency overhead (0–1) vs. latency from source to destination (20–180 ms), with separate curves for Hop 0 through Hop 4]
Bandwidth Cost of Constrained Multicast
[Figure: proportional bandwidth overhead (0–1) vs. latency from source to destination (20–160 ms), with separate curves for Hop 0 through Hop 4]
Challenge 2: Tunneling Application Traffic
Basic idea: tunnel “legacy” traffic via overlay proxies; should be protocol independent
Desired properties: traffic redirection transparent to end hosts; a stable mapping between a proxy and its legacy nodes; incremental deployment
Details: making a connection to D
Determine if D is reachable from the overlay; if so, retrieve D’s overlay ID from storage in the overlay and redirect traffic through the overlay to D’s overlay ID
Assign legacy nodes IDs in the overlay, so that routing to those IDs automatically reaches the respective proxies
Assigning Overlay Identifiers
A nearby proxy assigns a legacy node a non-random overlay ID: the closest possible ID to the proxy’s own ID such that root(ID) = proxy (Chord/Tapestry: ID = proxyID – 1, sketched below)
Stable ID mapping: the probability of a new node “hijacking” a legacy node’s ID is very low (1 million nodes, 1000 nodes per proxy, 160-bit names: P_h < 10^-40)
[Figure: source and destination legacy node IDs placed next to their proxies in the overlay ID space]
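A minimal sketch of the ID-assignment rule above in a 160-bit circular ID space; extending ID = proxyID - 1 to successive offsets for additional legacy nodes behind the same proxy is an assumption beyond what the slide states.

```python
# Minimal sketch of non-random overlay ID assignment: give a legacy
# node the closest possible ID below its proxy's ID, so that
# root(ID) = proxy. Successive per-node offsets are assumed.
ID_SPACE = 1 << 160  # 160-bit circular ID space

def assign_legacy_id(proxy_id: int, legacy_index: int = 0) -> int:
    """Return an overlay ID just below the proxy's ID (mod 2^160)."""
    return (proxy_id - 1 - legacy_index) % ID_SPACE

# The first legacy node registered at a proxy gets proxyID - 1.
print(hex(assign_legacy_id(0x2046)))  # 0x2045
```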
Unused Slides
Summary
Highly adaptive to link failures: fast adaptation time with low monitoring bandwidth cost (300 ms beacon period, 7 KB/s for 300 overlay nodes, response time < 700 ms); low latency and bandwidth overhead for misrouting
Limitation: less useful for congested IP links, since constrained multicast can exacerbate congestion; can utilize more physically-aware overlay construction
Issue: using loss as a congestion indicator; can be fixed with more intelligent protocols such as XCP
Possible Approaches
Modify basic IP protocols (BGP, OSPF, IS-IS): millions of deployed routers make this a significant deployment challenge
“Helper” infrastructures to assist protocols: manipulate their input to improve performance without modification
Deploy intelligent protocols above the IP layer: the overlay approach taken by many; relies on basic point-to-point IP routing