Microsoft Office SharePoint Server(Moss) (Basics) Ashok Kandula, Pune.
Responsive Yet Stable Traffic Engineering Srikanth Kandula Dina Katabi, Bruce Davie, and Anna...
-
Upload
brooke-fitzgerald -
Category
Documents
-
view
219 -
download
0
Transcript of Responsive Yet Stable Traffic Engineering Srikanth Kandula Dina Katabi, Bruce Davie, and Anna...
High PerformanceSwitching and RoutingTelecom Center Workshop: Sept 4, 1997.
Responsive Yet Stable Traffic Engineering
Srikanth Kandula
Dina Katabi, Bruce Davie, and Anna Charny
• Good mapping requires Load Balancing
• Good Mapping Good Performance & Low Cost
• ISPs needs to map traffic to underlying topology
100%
Ingress
Egress
100%
Ingress
Egress
More, ISPs want to re-balance load when an unexpected event causes congestion
100%
Ingress
EgressTraffic Change
More, ISPs want to re-balance load when an unexpected event causes congestion failure, BGP reroute, flash crowd, or attack
More, ISPs want to re-balance load when an unexpected event causes congestion failure, BGP reroute, flash crowd, or attack
100%
Ingress
EgressTraffic Change
Move Traffic
• Need to rebalance load ASAP Remove congestion before it affects user’s performance
• But, moving quickly may overshoot congestion on a different path more drops …
But, rebalancing load in realtime is risky
Congestion!
Ingress1
Ingress2Traffic Change
• Need to rebalance load ASAP Remove congestion before it affects user’s performance
• But, moving quickly may overshoot congestion on a different path more drops …
But, rebalancing load in realtime is risky
Ingress1
Ingress2Traffic Change
Congestion!
Problem: How to make Traffic Engineering:
• Responsive: reacts ASAP
• Stable: converges to balanced load without overshooting or generating new congestion
Problem: How to make Traffic Engineering:
• Responsive: reacts ASAP
• Stable: converges to balanced load without overshooting or generating new congestion
• Need to rebalance load ASAP Remove congestion before it affects user’s performance
• But, moving quickly may overshoot congestion on a different path more drops …
But, rebalancing load in realtime is risky
Current ApproachesOffline TE (e.g., OSPF-TE)• Avoids the risk of instability caused
by realtime adaptation, but also misses the benefits
• Balances the load in steady state• Deal with failures and change in
demands by computing routes that work under most conditions
• Overprovision for unanticipated events
Online TE (e.g., MATE)• Try to adapt to unanticipated events• But, can overshoot causing drops and instability
Long-Term
Demands
Long-Term
Demands
Link WeightsLink Weights
OSPF-TE
This Talk• TeXCP: Responsive & Stable Online TE
• Idea: Use adaptive load balancing But add explicit-feedback congestion
control to prevent overshoot and drops
• TeXCP keeps utilization always within a few percent of optimal
• Compare to MATE and OSPF-TE, showing that TeXCP outperforms both
Typical Formalization of the TE Problem
Find a routing that:
Min Max-Utilization
• Removes hot spots and balances load
• High Max-Utilization is an indicator that the ISP should upgrade its infrastructure
Online TE involves solving 2 sub-problems
1. Find the traffic split that minimizes the Max-Utilization
2. Converge to the balanced traffic splits in a stable manner
Also, an implementation mechanism
to force traffic to follow the desired splits
• A TeXCP agent per IE, at ingress node
• ISP configures each TeXCP agent with paths between IE
• Paths are pinned (e.g., MPLS tunnels)
Force traffic along the right pathsImplementation:
IngressEgress
TeXCP Agent
Solution:
• Periodically, TeXCP agent probes a path for its utilization
Distributedly, TeXCP agents find balanced traffic splits
Sub-Problem:
Solution: TeXCP Load Balancer
Ingress
Egress
x
U1 = 0.4
U2 = 0.7
U1 = 0.4
U2 = 0.7
Probes follow the slow path like ICMP messages
• Periodically, TeXCP agent probes a path for its utilization
Sub-Problem:
Solution:
)()( tututrr ppp
• A TeXCP agent iteratively moves traffic from over-utilized paths to under-utilized paths rp is this agent’s traffic on path p
• Deal with different path capacity
• Deal with inactive paths (rp =0)
TeXCP Load Balancer
Distributedly, TeXCP agents find balanced traffic splits
• Periodically, TeXCP agent probes a path for its utilization
Sub-Problem:
Solution:
• A TeXCP agent iteratively moves traffic from over-utilized paths to under-utilized paths rp is this agent’s traffic on path p
• Deal with different path capacity
• Deal with inactive paths (rp =0)
TeXCP Load Balancer
Proof in paper
i
ii
r
uru )()(ˆ tututrr ppp
Distributedly, TeXCP agents find balanced traffic splits
Converge to balanced load in a stable way
Sub-Problem:
Solution: Use Experience from Congestion Control (XCP)
Congestion Control• Flow from sender to
receiver
• Senders share the bottleneck; need coordination to prevent oscillations
Online TE• Flow from ingress to
egress
• TeXCP agents share physical link; need coordination to prevent oscillations
Move in really small increments No Overshoot! Challenge is to move traffic quickly
w/o overshoot
Move in really small increments No Overshoot! Challenge is to move traffic quickly
w/o overshoot
Load
Balancer
Load
Balancer
Path
Controller
Path
ControllerPath
Controller
Path
Controller
Ingress-Egress Demands
Congestion Management layer
Analogous to an Application
Congestion Management Layer between Load Balancer and Data Plane Set of light-weight per-path congestion controllers
Unlike prior online TE, Load Balancer can push a decision to the data plane only as fast as the Congestion Management Layer allows
it
Unlike prior online TE, Load Balancer can push a decision to the data plane only as fast as the Congestion Management Layer allows
it
• Explicit feedback from core routers (like XCP)
• Periodically, collects feedback in ICMP-like probes
Per-Path Light-Weight Congestion Controller
Ingress Egress
Utilization
Feedback
Probe
• Explicit feedback from core routers (like XCP)
• Periodically, collects feedback in ICMP-like probes
Per-Path Light-Weight Congestion Controller
Ingress Egress
U = .8
F = 100kbps
Probe
• Explicit feedback from core routers (like XCP)
• Periodically, collects feedback in ICMP-like probes
Per-Path Light-Weight Congestion Controller
Ingress Egress
U = .8
F = 100kbps
Probe
U = .2
F = 500kbps
U = .2
F = 500kbps
• Explicit feedback from core routers (like XCP)
• Periodically, collects feedback in ICMP-like probes
Per-Path Light-Weight Congestion Controller
Ingress Egress
• Core router computes aggregate feedback Δ = Spare BW – Queue / Max-RTT
• Estimates number of IE-flows by counting probes, and divides feedback between them
Occasional explicit feedback in probes… Need software changes only
Occasional explicit feedback in probes… Need software changes only
Stability Idea
Theorem 1: Given a particular load split, the path controller stabilizes the traffic on each link
Theorem 2: Given stable path controllers, Every TeXCP agent sees balanced load on all paths Unused paths have higher utilization than used paths
Load Balancer
Path Controller
Path Controller
Ingress
Egress
Informally stated:
Per-path controller works at a faster timescale than load balancer Can decouple components Stabilize separately
Performance
Simulation Setup
Standard for TE• Rocketfuel topologies • Average demands follow gravity model• IE-traffic consists of large # of Pareto on-off
sources
TeXCP Parameters: Each agent is configured with 10 shortest paths Probe for explicit feedback every 0.1s Load balancer re-computes a split every 0.5s
Compare to Optimal Max-Utilization Obtained with a centralized oracle that has
Immediate and exact demands info, and uses as many paths as necessary
TeXCP Balances Load Without Oscillations
TeXCP converges to a few percent of optimal
TeXCP converges to a few percent of optimal
Time (s)
Maxim
um
Lin
k U
tiliz
ati
on
TeXCP Balances Load Without Oscillations
Time (s)
Utilizations of all links in the network change without oscillations
Utilizations of all links in the network change without oscillations
Lin
k U
tiliz
ati
on
Comparison with MATE
• MATE is the state-of-the-art in online TE
• All simulation parameters are from the MATE paper
Ingress 1
Ingress 2
Ingress 3
Egress 1
Egress 2
Egress 3
L1
L2
L3
Time (s) Time (s)
Avg. drop rate in MATE is 20% during convergence
decrease in cross traffic
decrease in cross traffic
TeXCP MATE
Explicit feedback allows TeXCP to react faster and without oscillations
Explicit feedback allows TeXCP to react faster and without oscillations
TeXCP balances load better than MATEO
ffere
d Li
nk L
oad
Comparison with OSPF-TE
• OSPF-TE is the most-studied offline TE scheme
• It computes link weights, which when used in OSPF balance the load
• OSPF-TE-FAIL is an extension that optimizes for failures
• OSPF-TE-Multi-TM is an extension that optimizes for variations in traffic demands
1
1.2
1.4
1.6
Abovenet Genuity Sprint Tiscali AT&T
OSPF-TE TeXCP
Comparison with OSPF-TE under Static Load
Abovenet Genuity Sprint Tiscali AT&T
1.6
1.4
1.2
1
Optimal
Rati
o o
f M
ax-U
to O
pt.
TeXCP is within a few percent of optimal, outperforming OSPF-TE
TeXCP is within a few percent of optimal, outperforming OSPF-TE
Comparison with OSPF-TE under Static Load
1
1.2
1.4
1.6
Abovenet Genuity Sprint Tiscali AT&T
OSPF-TE TeXCP
Abovenet Genuity Sprint Tiscali AT&T
1.6
1.4
1.2
1
Rati
o o
f M
ax-U
to O
pt.
OSPF-TE TeXCP
Comparison with OSPF-TE-Fail
1
1.2
1.4
1.6
1.8
2
2.2
2.4
2.6
2.8
3
Abovenet Genuity Sprint Tiscali AT&T
OSPF-TE-Fail TeXCP
Abovenet Genuity Sprint Tiscali AT&T
3
2.8
2.6
2.4
2.2
2
1.8
1.6
1.4
1.2
1Rati
o o
f M
ax-U
to O
pt.
TeXCP allows an ISP to support same failure resilience with about ½ the capacity !
TeXCP allows an ISP to support same failure resilience with about ½ the capacity !
OSPF-TE-Fail TeXCP
Performance When Traffic Deviates From Long-term Averages
TeXCP reacts better to realtime demands! TeXCP reacts better to realtime demands!
1
1.1
1.2
1.3
1.41.5
1.6
1.7
1.8
1.9
1 1.5 2 2.5 3 3.5 4 4.5 5
Deviation from Long-term Average
OSPF-TE-Multi-TM TeXCP
1.8
1.6
1.4
1.2
1
Rati
o o
f M
ax-U
to O
pt.
Deviation from Long-term Average Demands 1 1.5 2 2.5 3 3.5 4 4.5 5
OSPF-TE-Multi-TM TeXCP
Conclusion• TeXCP: Responsive & Stable Online TE
• Combines load balancing with a Cong. Mngt. Layer to prevent overshoot and drops
• TeXCP keeps utilization always within a few percent of optimal
• Compared to MATE, it is faster and does not overshoot
• Compared to OSPF-TE it keeps utilization 20% to 100% lower it supports the same failure resilience with ½
the capacity major savings for the ISP
http://nms.lcs.mit.edu/projects/texcp/