Understanding and Limiting BGP Instabilities
Zhi-Li Zhang
Jaideep Chandrashekar
Kuai Xu
BGP: Internet Glue
“Path-vector” routing protocol.
Allows networks to tell other networks about the destinations they are “responsible” for and how to reach them.
Uses “route advertisements”, also called “NLRI” (network-layer reachability information).
BGP: Internet Glue (cont’d)
Policy-based: allows ISPs to richly express their routing policy, both in selecting outbound paths and in announcing internal routes.
Relatively “simple” protocol, but configuration is complex, and the entire world can see, and be impacted by, mis-configurations.
ASes & AS Numbers (ASNs)
• An autonomous system is an independent routing domain that has been assigned an Autonomous System Number (ASN).
• Currently over 15,000 in use.
• 64512 through 65535 are “private”.
• Examples:
  • AS57 U of Minnesota GigaPoP
  • AS217 U of Minnesota
  • AS701 UUNET
  • AS1239 Sprint
• ASNs represent atoms of BGP routing policy.
Internet Connectivity of University of Minnesota
[Figure: AS-level connectivity diagram. Labels: AS 217 UMN, AS 57 UMN GigaPoP, AS 1 Genuity, AS 7911 Wiltel, AS 11537 Internet2, AS 1998 State of Minnesota; prefix 128.101.0.0/16.]
Architecture of Internet Routing
EGP = Exterior Gateway Protocol
IGP = Interior Gateway Protocol
Metric based: OSPF, IS-IS, RIP
Policy based: BGP
[Figure: AS 1 and AS 2. Labels: BGP, IS-IS, OSPF.]
Simplified BGP Operations
Establish session on TCP port 179
Exchange all active routes
Exchange incremental updates
While connection is ALIVE, exchange route UPDATE messages
[Figure: BGP session between AS1 and AS2.]
Types of BGP Messages
Open: establishes a peering session.
Keep Alive: handshake at regular intervals.
Notification: shuts down a peering session.
Update: announces new routes or withdraws previously announced routes.
Announcement: prefix + attribute values
Withdrawal: prefix only
BGP Attributes
Value  Code                     Reference
-----  -----------------------  ---------
1      ORIGIN                   [RFC1771]
2      AS_PATH                  [RFC1771]
3      NEXT_HOP                 [RFC1771]
4      MULTI_EXIT_DISC          [RFC1771]
5      LOCAL_PREF               [RFC1771]
6      ATOMIC_AGGREGATE         [RFC1771]
7      AGGREGATOR               [RFC1771]
8      COMMUNITY                [RFC1997]
9      ORIGINATOR_ID            [RFC2796]
10     CLUSTER_LIST             [RFC2796]
11     DPA                      [Chen]
12     ADVERTISER               [RFC1863]
13     RCID_PATH / CLUSTER_ID   [RFC1863]
14     MP_REACH_NLRI            [RFC2283]
15     MP_UNREACH_NLRI          [RFC2283]
16     EXTENDED COMMUNITIES     [Rosen]
...
255    reserved for development
Not all attributes need to be present in every announcement
Two Types of BGP Neighbor Relationships
• External Neighbor (eBGP): in a different Autonomous System
• Internal Neighbor (iBGP): in the same Autonomous System
iBGP is routed (using IGP!)
[Figure: AS1 and AS2 with sessions labeled eBGP and iBGP.]
iBGP Peers Must be Fully Meshed
iBGP neighbors do not announce routes received via iBGP to other iBGP neighbors.
[Figure labels: eBGP update, iBGP updates.]
• iBGP is needed to avoid routing loops within an AS
• Injecting external routes into IGP does not scale and causes BGP policy information to be lost
• BGP does not provide “shortest path” routing
AS PATH Attribute
[Figure: propagation of prefix 135.207.0.0/16, originated by AS 6341 (AT&T Research).]
ASes shown: AS 6341 AT&T Research, AS 7018 AT&T, AS 1239 Sprint, AS 1755 Ebone, AS 3549 Global Crossing, AS 1129 Global Access, AS 12654 RIPE NCC RIS project.
Advertisements shown:
135.207.0.0/16 AS Path = 6341
135.207.0.0/16 AS Path = 7018 6341
135.207.0.0/16 AS Path = 3549 7018 6341
135.207.0.0/16 AS Path = 1239 7018 6341
135.207.0.0/16 AS Path = 1755 1239 7018 6341
135.207.0.0/16 AS Path = 1129 1755 1239 7018 6341
Inter-domain Loop Prevention
BGP at AS YYY will never accept a route with ASPATH containing YYY.
[Figure: AS 7018 is offered 12.22.0.0/16 with ASPATH = 1 333 7018 877 by AS 1. Don’t Accept!]
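The loop-prevention rule can be sketched in a few lines of Python. This is a minimal illustration, not router code; the function name accept_route is hypothetical, and the AS path is modeled as a plain list of AS numbers.

```python
def accept_route(local_asn: int, as_path: list[int]) -> bool:
    """Reject any route whose AS path already contains the local AS number."""
    return local_asn not in as_path


# The route from the slide: 12.22.0.0/16 with ASPATH = 1 333 7018 877.
slide_path = [1, 333, 7018, 877]

# AS 7018 finds itself in the path, so it must not accept the route.
print(accept_route(7018, slide_path))

# An AS that does not appear in the path (AS 217 is used here only as an
# illustration) may accept it.
print(accept_route(217, slide_path))
```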
BGP Best Path Selection
Ignore if exit point unreachable
Highest local preference
Lowest AS path length
Lowest origin type
Lowest MED (with same next hop AS)
Lowest IGP cost to next hop
Lowest router ID of BGP speaker
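One way to read the selection steps is as a lexicographic comparison, sketched below. The Route class and its field names are hypothetical. As a simplification, MED is compared across all routes here, whereas the rule above compares it only among routes with the same next-hop AS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    """Hypothetical route record holding only the fields used for selection."""
    as_path: tuple
    local_pref: int = 100
    origin: int = 0          # lower origin type is preferred
    med: int = 0
    igp_cost: int = 0
    router_id: int = 0
    exit_reachable: bool = True


def preference_key(route):
    # Tuples compare element by element, so earlier fields dominate later ones.
    # local_pref is negated because a higher value is preferred.
    return (-route.local_pref, len(route.as_path), route.origin,
            route.med, route.igp_cost, route.router_id)


def best_path(routes):
    """Return the preferred route, or None if no exit point is reachable."""
    usable = [r for r in routes if r.exit_reachable]
    return min(usable, key=preference_key) if usable else None


short = Route(as_path=(7018, 6341))
longer = Route(as_path=(3549, 7018, 6341))
preferred = Route(as_path=(1129, 1755, 1239, 7018, 6341), local_pref=200)

print(best_path([longer, short]).as_path)             # shorter AS path wins
print(best_path([longer, short, preferred]).as_path)  # local preference wins first
```

The second call illustrates why BGP does not give shortest-path routing: local preference is a policy knob that is evaluated before AS path length.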
In a nutshell
BGP = Path Vector Protocol + Policies.
The path vector protocol is very simple:
Distribute reachability.
Prevent loops.
All the complexity is introduced by locally administered policies:
Determine which paths are selected.
And which neighbors they are exported to.
Path Exploration and Slow Convergence
What is Path Exploration?
When a link fails (or is repaired), routers “go through” a sequence of paths before selecting a “converged” path.
Results from dependencies in advertised “path vectors”.
A router’s best path is an extension of a neighbor’s best path.
Which extends a best path from one of its own neighbors.
And so on……
What is Path Exploration (cont’d)
When a link fails, a set of dependent paths becomes invalid (or obsolete).
These are removed one by one from the system:
Router selects a path and propagates it.
Receives a withdrawal.
Selects the next best path (possibly invalid).
Receives a withdrawal; repeats until no invalid paths remain.
Path Exploration Example
[Figure: example topology with nodes 0 to 9. Network in a steady state. Path labels shown: 10, 310, 210, 4210, 6310, 74210, 75210, 76310.]
Path Exploration Example (cont’d)
[Figure: the same topology with nodes 0 to 9. Path labels shown: 10, 310, 210, 4210, 6310, 74210, 75210, 76310; an additional label reads “75210 76310 W”.]
Path Exploration Example (cont’d)
Paths 75210 and 76310 both contain the “problem edge” 10.
2 additional messages to force 7 to flush “bad paths”.
Number of “spurious messages” increases with the “richness” of connectivity.
[Figure: nodes 0 to 9 with richer connectivity. Path labels shown at 7: 7210, 74210, 7510, 75210, 72510, ……]
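The example can be replayed with a small sketch. It assumes that node 7 holds its candidate paths in a fixed preference order and that the current best path is withdrawn before the next one is tried; the function names are hypothetical, and timers and message exchange are not modeled.

```python
def contains_edge(path, edge):
    """True if `edge` (a pair of adjacent nodes) appears in `path`."""
    return edge in zip(path, path[1:])


def explore(candidates, failed_edge):
    """Step through candidate paths in preference order.

    Returns (explored, final): the paths the router selected along the way,
    and the valid path it converged on (None if every candidate was invalid).
    """
    explored = []
    for path in candidates:
        explored.append(path)
        if not contains_edge(path, failed_edge):
            return explored, path   # a valid path: exploration stops here
        # Otherwise the path is invalid and is eventually withdrawn.
    return explored, None


# Candidate paths at node 7, taken from the slides.
at_node_7 = [(7, 4, 2, 1, 0), (7, 5, 2, 1, 0), (7, 6, 3, 1, 0)]

# Edge 1-0 fails: all three paths contain it, so node 7 walks through
# every one of them before giving up.
print(explore(at_node_7, (1, 0)))

# Edge 2-1 fails: node 7 still tries 75210 before reaching the valid 76310.
print(explore(at_node_7, (2, 1)))
```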
Impact of Path Exploration
In general, convergence time is O(LΔ).
‘L’ is the length of the longest simple path in the network.
‘Δ’ is the time between successive announcements.
From measurements: up to 15 minutes to converge (after link failure).
Impact of Path Exploration (cont’d)
Delays a router from picking valid, alternate paths.
Have to first go through all the invalid paths.
Large-scale packet losses in a short duration.
Core routers process millions of packets a second.
In the absence of path exploration, convergence time is Ω(Dh).
‘D’ is the “diameter” of the network (D << L).
‘h’ is the message processing time at a node.
Causes for Path Exploration
Invalid paths are selected, propagated, then withdrawn.
Routers waste time processing “stale information”.
Delays convergence to valid, perhaps less preferred, alternate paths.
Key Issue: how to distinguish invalid paths from “valid” paths.
Difficult in BGP: AS Paths are high level and abstract.
AS PATHS: High Level Connectivity
[Figure: AS 81, AS 1239, AS 11536, AS 217 and AS 3.]
AS 217 and AS 3 receive the same AS PATH [11536 1239 81]
Underlying physical paths are disjoint.
Naïve Solutions Fail
TAG withdrawals:
When router generates withdrawal, tag it with cause/location.
[Figure: nodes 0 to 7; an edge is marked “failed” and the withdrawal is tagged “WDRAW: (2,1)”. Path labels shown: 74210, 75210, 76310.]
Naïve Solutions Fail (cont’d)
AS Paths do not describe (or reflect) internal AS topology.
When an internal edge fails, which AS Path is affected: [10] or [210]?
[Figure: nodes 0 to 7.]
Naïve Solutions Fail (cont’d)
Link between 3.2 and 6.1 fails.
6.1 generates a withdrawal and tags it with <3,6>.
Should 6.3 remove all paths containing <3,6>?
[Figure: nodes 0, 1, 2, 4, 5, 7, with AS 3 (routers 3.2, 3.3) and AS 6 (routers 6.1, 6.2, 6.3).]
EPIC --- A Simple Solution
Exploit Path dependencies to Invalidate Paths.
To avoid Path Exploration:
When link fails, a set of dependent paths becomes invalid.
All the dependent paths must be removed from the system.
Dependent paths cannot be described using only AS Paths.
AS Paths are annotated with additional information (forward edge sequence numbers).
Can capture path dependencies.
Can distinguish valid and invalid paths.
Forward Edge Sequence Numbers
When an AS Path is advertised to an external AS neighbor, include the fesn of the “forward” external edge.
fesn = edge identifier + sequence number
[Figure: AS X and AS Y joined by edge <X,Y>.]
Forward Edge Sequence Numbers (cont’d)
Defined per destination, for every AS-AS edge.
When AS X sends a route to AS Y, the fesn (X:Y, n) is attached.
If the route already has a previously attached fesn, the new fesn is prepended to it, forming the fesnList.
[Figure: AS X, AS Y, AS Z and AS W. Labels shown: (X:Y, n), (X:Z, m), (X:Y, n).]
fesn Management
When a link fails, its fesn does not change.
Same value carried in withdrawals.
When <X,Y> is repaired:
AS X increments the sequence number.
Subsequent route announcements carry the “updated” fesn.
So a larger fesn always corresponds to “newer” information.
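The bookkeeping described above might look like the following sketch. The class EdgeSequence, its method names, and the starting sequence number of 0 are assumptions made for illustration; the slides do not specify an initial value or a data layout.

```python
class EdgeSequence:
    """Hypothetical per-destination store of sequence numbers for AS-AS edges."""

    def __init__(self):
        self._seq = {}

    def current(self, edge):
        """Return the fesn (edge, sequence number) for an edge."""
        return (edge, self._seq.setdefault(edge, 0))  # starting at 0 is assumed

    def on_failure(self, edge):
        # A failure does not change the fesn; withdrawals carry the same value.
        return self.current(edge)

    def on_repair(self, edge):
        # A repair increments the sequence number, so later announcements
        # carry the updated fesn.
        self._seq[edge] = self._seq.get(edge, 0) + 1
        return self.current(edge)


def is_newer(fesn_a, fesn_b):
    """True if both fesns name the same edge and fesn_a has the larger number."""
    (edge_a, n_a), (edge_b, n_b) = fesn_a, fesn_b
    return edge_a == edge_b and n_a > n_b


seqs = EdgeSequence()
edge = ("X", "Y")
before = seqs.current(edge)
withdrawn = seqs.on_failure(edge)   # same value as `before`
after = seqs.on_repair(edge)        # sequence number goes up by one
print(before, withdrawn, after, is_newer(after, before))
```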
fesnList Propagation
[Figure: nodes 0 to 7, with sequence numbers on the edges and the routes seen along the way.]
[0] {(0:1, 7)}
[10] {(0:1, 7)(1:3, 3)}
[10] {(0:1, 7)(1:2, 7)}
[4210] {(0:1, 7)(1:2, 7)(2:4, 14)(4:7, 11)}
Same AS Path, distinct fesnLists
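A sketch of how a route picks up its fesnList hop by hop follows. The function export_route is hypothetical, and the list is kept newest-first, following the “prepended” wording on the earlier slide. The sequence numbers are the ones shown in the figure.

```python
def export_route(as_path, fesn_list, sender, receiver, seq):
    """Return the (as_path, fesn_list) that `receiver` sees after `sender`
    exports the route: the sender's AS number and the fesn of the forward
    edge (sender:receiver, seq) are both prepended."""
    return (sender,) + as_path, ((sender, receiver, seq),) + fesn_list


# Build the route [4210] as it arrives at AS 7, starting from destination AS 0.
route = ((), ())
for sender, receiver, seq in [(0, 1, 7), (1, 2, 7), (2, 4, 14), (4, 7, 11)]:
    route = export_route(*route, sender, receiver, seq)
print(route)

# AS 1 exports the same AS path [10] to AS 2 and to AS 3, but over
# different edges, so the two fesnLists differ.
at_as_1 = export_route((), (), 0, 1, 7)
seen_by_2 = export_route(*at_as_1, 1, 2, 7)
seen_by_3 = export_route(*at_as_1, 1, 3, 3)
print(seen_by_2[0] == seen_by_3[0], seen_by_2[1] == seen_by_3[1])
```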
fesnList Propagation
After the routes are processed at all nodes:
Routing Table at AS 7
AS Path  fesnList
4210     (4:7, 11) (2:4, 14) (1:2, 7) (0:1, 7)
5210     (5:7, 10) (2:5, 14) (1:2, 7) (0:1, 7)
6310     (6:7, 3) (3:6, 7) (1:3, 7) (0:1, 7)
[Figure: nodes 0 to 7.]
Invalidating Paths upon Failure
When router generates a withdrawal:
The fesnList of withdrawn route (“path stem”) is attached to the withdrawal.
When router receives a withdrawal:
1. Invalidates all routes containing the fesnList
2. Selects a new best path
3. If best path has changed, it sends new best route to its neighbors, and the withdrawal is piggybacked.
4. If no valid path, only withdrawal is forwarded.
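Steps 1 and 2 can be sketched as below, using the routing table at AS 7 from the earlier slide. Two assumptions are made here: “containing the fesnList” is read as the withdrawn path stem being a suffix of the route's newest-first fesnList with identical sequence numbers, and the table is held in preference order. The function names are hypothetical, and forwarding to neighbors (steps 3 and 4) is not modeled.

```python
def has_stem(fesn_list, stem):
    """True if `stem` is a non-empty suffix of `fesn_list` (newest-first order)."""
    n = len(stem)
    return 0 < n <= len(fesn_list) and tuple(fesn_list[-n:]) == tuple(stem)


def process_withdrawal(table, stem):
    """Drop every route that depends on the withdrawn stem, then pick a best path.

    `table` maps AS path -> fesnList and is assumed to be in preference order.
    Returns (remaining_table, best_as_path); best_as_path is None if no valid
    route is left.
    """
    valid = {path: fesns for path, fesns in table.items()
             if not has_stem(fesns, stem)}
    best = next(iter(valid), None)
    return valid, best


# Routing table at AS 7; each fesn is written as (from_as, to_as, sequence number).
table_at_7 = {
    (4, 2, 1, 0): ((4, 7, 11), (2, 4, 14), (1, 2, 7), (0, 1, 7)),
    (5, 2, 1, 0): ((5, 7, 10), (2, 5, 14), (1, 2, 7), (0, 1, 7)),
    (6, 3, 1, 0): ((6, 7, 3), (3, 6, 7), (1, 3, 7), (0, 1, 7)),
}

# A withdrawal carrying the stem {(1:2, 7), (0:1, 7)} removes both routes
# through edge 1:2 in a single step, leaving only the route through AS 6.
remaining, best = process_withdrawal(table_at_7, ((1, 2, 7), (0, 1, 7)))
print(best)
```

Because both invalid routes are removed by one withdrawal, AS 7 never selects 5210 as an intermediate best path, which is the path exploration that plain BGP would go through.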
Invalidating Paths: Example
[Figure: nodes 0 to 7; the withdrawal W: {(1:2, 7), (0:1, 7)} is shown on two edges; label shown: 76310.]
Routing table at AS 7:
4210 (4:7, 11) (2:4, 14) (1:2, 7) (0:1, 7)
5210 (5:7, 10) (2:5, 14) (1:2, 7) (0:1, 7)
6310 (6:7, 3) (3:6, 7) (1:3, 7) (0:1, 7)
Handling Link Repairs
When <X,Y> is repaired:
1. AS X increments the fesn for the edge
2. Generates a new route announcement to send to AS Y (reflects updated fesn)
3. At AS Y, the route is installed into routing table and a subsequent route update may be generated.
4. After all updates have been processed, every fesnList containing (X:Y, n) will reflect the updated value.
What about Multiple Edges?
Each edge is associated with a minor fesn.
Contrast with major fesn for “logical” AS-AS edge.
All edges between ASes share the same major fesn, but have distinct minor fesn’s.
Minor fesn is incremented with corresponding edge.
Major fesn is incremented only if all edges are affected.
Minor fesn’s
Minor fesn’s are only used between adjacent ASes.
All routers in AS 6 include minor fesn in route updates.
When the updates are exported externally (to AS 7), the minor fesn is removed.
[Figure: nodes 0, 1, 2, 4, 5, 7, with AS 3 (routers 3.2, 3.3) and AS 6 (routers 6.1, 6.2, 6.3). Edge labels 7 (11) and 7 (13): common major fesn, distinct minor fesn’s.]
fesn – Key Properties
Sequence number is monotonic --- new events will have higher values.
Imposes a partial ordering on the fesnLists.
Old information can be easily detected, and discarded.
Allows compact, correct description of invalid paths, i.e., the fesnList in a withdrawal captures all obsolete paths.
EPIC Properties
No router will select an invalid path after receiving any update triggered by a single failure event.
No router will select an invalid path after receiving at least one update triggered by each of a set of multiple failure events.
Achieves optimal bounds for a path vector protocol.
Routers may still explore paths.
But these paths are all valid.
EPIC Performance (vs BGP)
                      BGP                  EPIC
Fail Down  Time       (L-2)Δ               (D-1)h
Fail Down  Messages   (L-2)(|E|-1)         |E|-1
Fail Over  Time       (L’+D’-1)Δ           D’(h+Δ)
Fail Over  Messages   (L’+D’-1)(|E’|-1)    (|E’|-1)D’(h+Δ)/Δ
Root Cause Analysis of BGP Events
BGP Routing Dynamics
BGP routing instabilities
BGP routing suffers from many problems, e.g., mis-configurations, link failures, policy changes, slow convergence, etc.
BGP update streams are visible from all BGP-monitoring vantage points.
Open research problems
What are the common characteristics of BGP dynamics?
What are the primary causes of BGP routing dynamics?
How to visualize BGP dynamics?
BGP Routing Update (per second)
View: UMN. Time: 2003/12/07 – 2003/12/14
[Figure: time vs. number of BGP updates at prefix level. Annotations: BGP Update Burst, BGP Update Noise.]
BGP Routing Update (per second) (cont.)
View: UMN. Time: 2003/12/07 – 2003/12/14
[Figure: time vs. number of BGP updates at AS level. Annotations: BGP Update Burst, BGP Update Noise.]
Modeling BGP Routing Dynamics
Modeling BGP dynamics on all prefixes/ASes is challenging.
~120,000 prefixes, ~16,000 ASes
High-dimensional time-series
BGP updates are temporally and spatially correlated