Virtual ROuters On the Move (VROOM): Live Router Migration as a Network-Management Primitive
Yi Wang, Eric Keller, Brian Biskeborn, Kobus van der Merwe, Jennifer Rexford
Virtual ROuters On the Move (VROOM)
• Key idea
  – Routers should be free to roam around
• Useful for many different applications
  – Simplify network maintenance
  – Simplify service deployment and evolution
  – Reduce power consumption
  – …
• Feasible in practice
  – No performance impact on data traffic
  – No visible impact on control-plane protocols
The Two Notions of “Router”
• The IP-layer logical functionality, and the physical equipment
[Figure: the logical (IP-layer) router vs. the physical router]
The Tight Coupling of Physical & Logical
• Root of many network-management challenges (and “point solutions”)
[Figure: logical (IP-layer) routers bound one-to-one to physical routers]
VROOM: Breaking the Coupling
• Re-mapping the logical node to another physical node
[Figure: a logical (IP-layer) node re-mapped to a different physical node]
VROOM enables this re-mapping of logical to physical through virtual router migration.
Case 2: Service Deployment & Evolution
• VROOM guarantees seamless service to existing customers during the migration
Virtual Router Migration: the Challenges
1. Migrate an entire virtual router instance
   • All control-plane & data-plane processes / states
2. Minimize disruption
   • Data plane: millions of packets/second on a 10 Gbps link
   • Control plane: less strict (with routing-message retransmission)
3. Link migration
VROOM's Migration Process
• Key idea: separate the migration of the control and data planes
  1. Migrate the control plane
  2. Clone the data plane
  3. Migrate the links
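The three-step process above can be sketched as a small orchestration routine. This is a minimal illustration with made-up dict-based router state, not the paper's actual implementation:

```python
# Sketch of VROOM's three-step migration: control plane first,
# then data-plane cloning, then link migration. Router state is
# modeled as plain dicts; all names here are illustrative.

def migrate_virtual_router(src, dst):
    """Move a virtual router from physical node `src` to `dst`."""
    # Step 1: migrate the control plane (binaries, config, memory).
    dst["control_plane"] = src.pop("control_plane")

    # Step 2: clone the data plane by repopulation -- the migrated
    # control plane reinstalls every route into dst's new FIB while
    # src's old FIB stays online and keeps forwarding.
    dst["fib"] = dict(dst["control_plane"]["rib"])

    # Step 3: migrate the links one by one; with both data planes
    # ready, each link can move independently.
    while src["links"]:
        dst["links"].append(src["links"].pop())

    # The old data plane is retired only after all links have moved.
    src["fib"] = {}
    return dst

src = {"control_plane": {"rib": {"10.0.0.0/8": "eth0"}},
       "fib": {"10.0.0.0/8": "eth0"},
       "links": ["link-A", "link-B"]}
dst = {"links": []}
migrate_virtual_router(src, dst)
```

The ordering is the point: because the data plane is cloned before any link moves, packets are never routed through a half-migrated instance.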
Control-Plane Migration
• Leverage virtual server migration techniques
• Router image
  – Binaries, configuration files, etc.
• Memory
  – 1st stage: iterative pre-copy
  – 2nd stage: stall-and-copy (when the control plane is “frozen”)
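The two-stage memory transfer can be sketched as follows. Dirty-page tracking is simulated by a per-round list of dirtied page IDs (an assumption standing in for the real write-tracking mechanism):

```python
# Sketch of two-stage memory migration: iterative pre-copy while
# the control plane keeps running, then stall-and-copy for the
# remaining dirty pages once the control plane is frozen.

def migrate_memory(pages, dirty_per_round, max_rounds=3):
    """`pages`: page_id -> contents at the source.
    `dirty_per_round`: sets of page_ids dirtied during each round
    (simulated write tracking)."""
    copied = {}
    to_send = set(pages)                  # first round: all pages
    for rnd in range(max_rounds):
        for pid in to_send:               # pre-copy: CP still running
            copied[pid] = pages[pid]
        to_send = (dirty_per_round[rnd]
                   if rnd < len(dirty_per_round) else set())
        if not to_send:                   # converged: nothing dirtied
            break
    # Stage 2: stall-and-copy -- freeze the CP, copy what remains.
    for pid in to_send:
        copied[pid] = pages[pid]
    return copied

mem = {i: f"page-{i}" for i in range(8)}
final = migrate_memory(mem, dirty_per_round=[{1, 2}, {2}, {2}])
```

Pre-copy shrinks the final stall: only pages still dirty after the last round are copied while the control plane is frozen.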
[Figure: the control plane (CP) migrates from physical router A to physical router B; the data plane (DP) stays on A]
Data-Plane Cloning
• Clone the data plane by repopulation
  – Enables migration across different data planes
  – Eliminates the synchronization issue between the control & data planes
[Figure: the migrated control plane (CP) on physical router B repopulates DP-new while DP-old on physical router A keeps forwarding]
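Cloning by repopulation, rather than by copying FIB state, is what makes migration across heterogeneous data planes possible: the new FIB is re-derived from the control plane's RIB in whatever format the target platform needs. A minimal sketch, with `encode` standing in for the platform-specific entry format:

```python
# Sketch of data-plane cloning by repopulation: the new FIB is
# rebuilt from the RIB rather than copied from the old FIB, so the
# source and target data planes need not share an internal format.

def clone_data_plane(rib, encode):
    """Build a fresh FIB for the target platform from the RIB.
    `encode` is a stand-in for platform-specific entry encoding."""
    return {prefix: encode(next_hop) for prefix, next_hop in rib.items()}

rib = {"10.0.0.0/8": "ge-0/0/1", "192.168.0.0/16": "ge-0/0/2"}
fib_sd = clone_data_plane(rib, encode=str)                     # software DP
fib_hd = clone_data_plane(rib, encode=lambda nh: nh.encode())  # hardware DP
```

Both FIBs cover the same prefixes, but each stores entries in its own representation, which is why the prototype can migrate between the Linux-kernel and NetFPGA data planes.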
Remote Control Plane
• Data-plane cloning takes time
  – Installing 250k routes takes over 20 seconds*
• The control & old data planes need to be kept “online”
• Solution: redirect routing messages through tunnels
*: P. Francois et al., “Achieving sub-second IGP convergence in large IP networks,” ACM SIGCOMM CCR, no. 3, 2005.
[Figure: the control plane (CP) on physical router B reaches DP-old on physical router A through tunnels while DP-new is being populated]
Double Data Planes
• At the end of data-plane cloning, both data planes are ready to forward traffic
[Figure: the control plane (CP) with both DP-old and DP-new active]
Asynchronous Link Migration
• With the double data planes, links can be migrated independently
[Figure: the link between neighbors A and B moves from DP-old to DP-new]
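The reason the double data planes permit independent link moves can be shown in a few lines: each packet is handled by whichever data plane its ingress link is currently attached to, so no link ever has to wait for another. A toy sketch (names illustrative):

```python
# Sketch of asynchronous link migration: during the transition,
# a packet is forwarded by DP-old if its ingress link has not yet
# migrated, and by DP-new once it has -- so links move one at a
# time with no coordinated cutover.

def forward(packet, link, migrated):
    """Return which data plane handles a packet during migration."""
    return "DP-new" if link in migrated else "DP-old"

links = ["link-1", "link-2", "link-3"]
migrated = set()
planes_used = []
for link in links:                        # migrate links independently
    planes_used.append(forward("pkt", link, migrated))  # before the move
    migrated.add(link)
    planes_used.append(forward("pkt", link, migrated))  # after the move
```

At every point in the loop some data plane owns the link, which is why no packets are dropped during link migration.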
Prototype Implementation
• Control plane: OpenVZ + Quagga
• Data plane: two prototypes
  – Software-based data plane (SD): Linux kernel
  – Hardware-based data plane (HD): NetFPGA
• Why two prototypes?
  – To validate the data-plane hypervisor design (e.g., migration between SD and HD)
Evaluation
• Performance of individual migration steps
• Impact on data traffic
• Impact on routing protocols
• Experiments on Emulab
Impact on Data Traffic
• SD router w/ separate migration bandwidth
  – Slight delay increase due to CPU contention
• HD router w/ separate migration bandwidth
  – No delay increase or packet loss
Core Router Migration: OSPF Only
• Introduce an LSA by flapping link VR2–VR3
  – Miss at most one LSA
  – Get a retransmission 5 seconds later (the default LSA retransmission timer)
  – Can use a smaller LSA retransmission interval (e.g., 1 second)
Edge Router Migration: OSPF + BGP
• Average control-plane downtime: 3.56 seconds
  – Performance lower bound
• OSPF and BGP adjacencies stay up
• Default timer values
  – OSPF hello interval: 10 seconds
  – BGP keep-alive interval: 60 seconds
Where To Migrate
• Physical constraints
  – Latency
    • E.g., NYC to Washington D.C.: 2 msec
  – Link capacity
    • Enough remaining capacity for the extra traffic
  – Platform compatibility
    • Routers from different vendors
  – Router capability
    • E.g., number of access control lists (ACLs) supported
• The constraints simplify the placement problem
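The constraints above act as filters, which is exactly why they simplify placement: most candidates are eliminated before any optimization is needed. A sketch with hypothetical candidate fields and thresholds (none of these values come from the paper):

```python
# Sketch of constraint-based filtering of candidate physical
# routers for migration. Field names and thresholds are
# illustrative, not from the VROOM paper.

def feasible_targets(candidates, needed_capacity, needed_acls,
                     platform, max_latency_ms=2.0):
    """Keep only candidates satisfying all four constraints:
    latency, spare link capacity, platform, and ACL capability."""
    return [c["name"] for c in candidates
            if c["latency_ms"] <= max_latency_ms
            and c["spare_capacity"] >= needed_capacity
            and c["platform"] == platform
            and c["max_acls"] >= needed_acls]

candidates = [
    {"name": "dc1",  "latency_ms": 1.8,  "spare_capacity": 10,
     "platform": "vendor-X", "max_acls": 500},
    {"name": "chi1", "latency_ms": 14.0, "spare_capacity": 40,
     "platform": "vendor-X", "max_acls": 500},   # too far away
    {"name": "nyc2", "latency_ms": 0.4,  "spare_capacity": 2,
     "platform": "vendor-Y", "max_acls": 100},   # wrong platform
]
ok = feasible_targets(candidates, needed_capacity=5,
                      needed_acls=200, platform="vendor-X")
```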
Conclusions & Future Work
• VROOM: a useful network-management primitive
  – Separates the tight coupling between the physical and the logical
  – Simplifies network management, enables new applications
  – No data-plane and no visible control-plane disruption
• Future work
  – Migration scheduling as an optimization problem
  – Other applications of router migration
    • Handle unplanned failures
    • Traffic engineering
Packet-aware Access Network
• Pseudo-wires (virtual circuits) from CE to PE
[Figure: customer edge (CE) devices connect to provider edge (PE) routers across the access network]
P/G-MSS: Packet-aware/Gateway Multi-Service Switch; MSE: Multi-Service Edge
Events During Migration
• Network failure during migration
  – The old VR image is not deleted until the migration is confirmed successful
• Routing messages arrive during the migration of the control plane
  – BGP: TCP retransmission
  – OSPF: LSA retransmission
Requirements & Enabling Technologies
3. Migrate links affixed to the virtual routers
   • Enabled by: programmable transport networks
     – Long-haul links are reconfigurable
     – Layer-3 point-to-point links are multi-hop at layer 1/2
[Figure: a programmable transport network spanning Chicago, New York, and Washington D.C.; nodes are multi-service optical switches (e.g., Ciena CoreDirector)]
4. Enable edge router migration
   • Enabled by: packet-aware access networks
     – Access links are becoming inherently virtualized
       • Customers connect to provider edge (PE) routers via pseudo-wires (virtual circuits)
       • Physical interfaces on PE routers can be shared by multiple customers
[Figure: a dedicated physical interface per customer vs. a shared physical interface]
Link Migration in Transport Networks
• With programmable transport networks, long-haul links are reconfigurable
  – IP-layer point-to-point links are multi-hop at the transport layer
• VROOM leverages this capability in a new way to enable link migration
Link Migration in Flexible Transport Networks
2. With packet-aware transport networks
   – Logical links share the same physical port
     • Packet-aware access network (pseudo-wires)
     • Packet-aware IP transport network (tunnels)
The Out-of-the-box OpenVZ Approach
• Packets are forwarded inside each VE
• When a VE is being migrated, packets are dropped
Putting It All Together: Realizing Migration
1. The migration program notifies shadowd about the completion of the control-plane migration
2. shadowd requests zebra to resend all the routes, and pushes them down to virtd
3. virtd installs the routes into the new FIB, while continuing to update the old FIB
4. After finishing populating the new FIB, virtd notifies the migration program to start link migration
5. After link migration is completed, the migration program notifies virtd to stop updating the old FIB
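The five steps above can be sketched as a single handshake. shadowd, zebra, and virtd are the prototype's real components, but here their message passing is collapsed into one function with a log, purely for illustration:

```python
# Sketch of the five-step FIB-repopulation handshake between the
# migration program, shadowd, zebra, and virtd. The real components
# are separate daemons; message passing is reduced to log entries.

def realize_migration(zebra_routes):
    log = []
    old_fib, new_fib = dict(zebra_routes), {}

    # 1. Migration program tells shadowd the control plane has moved.
    log.append("cp-migration-done -> shadowd")

    # 2. shadowd asks zebra to resend all routes, pushes them to virtd.
    log.append("shadowd: zebra resend-all -> virtd")

    # 3. virtd installs routes into the new FIB while still updating
    #    the old one, so forwarding never goes stale.
    for prefix, nh in zebra_routes.items():
        new_fib[prefix] = nh
        old_fib[prefix] = nh

    # 4. New FIB fully populated: virtd signals link migration to start.
    log.append("virtd: new FIB ready -> start link migration")

    # 5. Links migrated: migration program tells virtd to stop
    #    updating the old FIB.
    log.append("links-migrated -> stop updating old FIB")
    return new_fib, log

new_fib, log = realize_migration({"10.0.0.0/8": "if0"})
```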
Power Consumption of Routers

Vendor   Model   Power (watts)
Cisco    CRS-1   10,920
Cisco    12416    4,212
Cisco    7613     4,000
Juniper  T1600    9,100
Juniper  T640     6,500
Juniper  M320     3,150

A synthetic large tier-1 ISP backbone
• 50 POPs (Points-of-Presence)
• 20 major POPs, each with 6 backbone routers, 6 peering routers, and 30 access routers
• 30 smaller POPs, each with 6 access routers
Future Work
• Algorithms that solve the constrained optimization problems
• Control-plane hypervisor to enable cross-vendor migration
Performance of Migration Steps
• Memory copy time, with different numbers of routes (dump-file sizes)
[Figure: time (seconds, 0–6) vs. number of routes (0 to 500k), broken into suspend + dump, copy dump file, undump + resume, and bridging setup]
Performance of Migration Steps
• FIB population time
  – Grows linearly w.r.t. the number of route entries
  – Installing a FIB entry into NetFPGA: 7.4 microseconds
  – Installing a FIB entry into the Linux kernel: 1.94 milliseconds
• FIB update time: time for virtd to install entries into the FIB
• Total time: FIB update time + time for shadowd to send routes to virtd
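Since FIB population time grows linearly in the number of entries, the per-entry costs quoted above give a simple back-of-envelope estimate (this ignores the shadowd-to-virtd transfer time that the total time also includes):

```python
# Back-of-envelope FIB population time from the per-entry install
# costs quoted above: 7.4 microseconds per entry for NetFPGA,
# 1.94 milliseconds per entry for the Linux kernel.

def fib_population_seconds(n_routes, per_entry_seconds):
    return n_routes * per_entry_seconds   # linear in route count

netfpga = fib_population_seconds(250_000, 7.4e-6)   # ~1.85 s
linux   = fib_population_seconds(250_000, 1.94e-3)  # ~485 s
```

The roughly 260x gap between the two illustrates why the hardware data plane can be repopulated in seconds while a kernel-based FIB of the same size takes minutes.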