MPLS, RSVP, etc Olof Hagsand KTH/CSC · networks The generalized form of MPLS: ... OSPF IS-IS BGP...
Transcript of MPLS, RSVP, etc Olof Hagsand KTH/CSC · networks The generalized form of MPLS: ... OSPF IS-IS BGP...
Background● MPLS - Multiprotocol Label Switching● Originally thought to simplify IP forwarding
– Small label lookup instead of longest prefix match
● Roots in ATM (Asynchronous Transfer Mode)– Early 90s, most telecoms thought ATM would take over all
data- and tele-communication
● MPLS was standardized in IETF in mid 90s.
MPLS Advantages
Originally, the motivation was speed and cost.
But routers does IP lookup in hardware at very high speeds.
Current advantages:● Label switching can be used for traffic engineering
– Aggregating a class of traffic and treating it in a specific way
– Control of traffic in a network
● Labels can be used to forward using other fields than destination address
● Label switching can be used to support VPNs – virtual private networks
● The generalized form of MPLS: GMPLS is ideal for optical network management
MPLS Terminology
MPLS uses some new terminology (MPLS ~ IP)● LSR: Label Switching Router ~ Router
● LER: Label Edge Router ~ Border router– Alternative: PE - Provider Edge / CE – Customer Edge
– Also: Egress/Ingress/Transit LSR
● LSP: Label Switched Path ~ Tunnel● FEC: Forwarding Equivalence Class ~ Flow● Label distribution ~ Routing● LFIB: Label Forwarding Information Base ~ FIB
Control and Data Planes
Control Plane(Binding Layer)
Data Plane(Forwarding Layer)
LFIB
LSR
MPLSpacket
MPLSpacket
Control Plane(Routing Layer)
Data Plane(Forwarding Layer)
FIB
Router
IPpacket
IPpacket
Routing:OSPFIS-ISBGP
LabelDistribution:LDPRSVP-TEMBGP
Routing:OSPFIS-ISBGP
LabelDistribution:LDPRSVP-TEMBGP
Labels● A label is an integer identifying a FEC (or a flow).● You cannot have globally or network- unique labels
– Too complex to negotiate
– Too large labels
● Labels are unique only between two nodes● Labels 0-1048575.
– 0-15 reserved by the IETF.
● Labels change at each node as a packet traverses a path● You can set labels manually(worse than static routing), or use
label distribution● Example of Label Forwarding Information Base LFIB (MPLS
forwarding table):
20bit label
In-Interface in-label op out-label out-interface
if 3 99 swap 203 if1if 3 333 swap 978 if2
Encapsulation
● MPLS uses a 32-bit shim header:– Label: Value for table lookup in router
– TTL: Time To Live (resembles IP TTL)
– Exp: Experimental, several ways to support QoS has been defined
– Stack: Indicates that the bottom of a stack of labels has been reached
● Shim headers may be concatenated into stacks
Label hdrEth hdr IP hdr IP payload
20bit label8-bitTTL
1-bitstack
3-bitexp
Label hdrEth hdr IP hdr IP payloadLabel hdr
stack=0 stack = 1
Forwarding Equivalence Class (FEC)
Sort packets into different classes – classify them.Classification is a more general form of lookup
Example1: All packets to one destination.
Example2: All UDP packets with the ToS field set to 0x42 from sub-network 192.168.20.0/24
Such a subset is called a Forwarding Equivalence Class
MPLS binds labels to different FECs – Labelling
FECs granularityCoarse – good for scalability
Fine – good for expressibility
What defines FECs?
“Something else”: eg BGP routing table, VPN, packet filters, etc.
FEC label
FEC1 99FEC2 2345... ...
Label operations
● Label operations are interface-specific● Push a label (typically at ingress)
– Double push (label stacking)
● Swap a label (internal LSRs / P routers)● Pop a label (typically at egress or pen-ultimate)
Label assignment
● The border router (LER) classifies packets into FECs● Binds a label to the packet
– Or to an LSP, which in turn is mapped to a label
● Pushes an MPLS header on the packet● And sends it on
pkt 99 pkt LSP
LER
FEC op out-label interface
F1 push 99 if1F2 push 300 if2
if1
Label swapping
● A router (LSR) makes a label lookup and swaps the label● Rewrites the MPLS header● And sends it further on the LSP
99 pktLSPLSP
203 pkt
In-Interface in-label op out-label out-interface
if 3 99 swap 203 if1if 3 333 swap 978 if2
if1if3
Label popping
● The border router (LER) pops the MPLS packet● And then forwards it as usual depending on the packets
protocol● Example: pkt is an IP packet --> pkt is sent to IP forwarding
– Thus both MPLS and IP forwarding on same node!
pktLSP
LER
203 pkt
In-Interface in-label op out-label out-interface
if 3 203 pop (IP lookup) if 3 333 swap 978 if2
if1if3
Pen-ultimate popping
● To make it easier for the border router, pop the label on the previous router (pen-ultimate)
● The pen-ultimate LSR does MPLS pop● The LER does only IP routing
pktLSP
Pen-ultimateLSR
203 pkt
LER
MPLS pop IP routing
if1if3
In-Interface in-label op out-label out-interface
if 3 203 pop - if1 if 3 333 swap 978 if2
MPLS Label Switched Paths
LSP
IP pktxIP pkt
yIP pktzIP pkt
IP pkt
IP pkt
Label hdrEth hdr IP hdr IP payload
Push label
Pop labelSwap label
20bit label 3bitexp
1bitstack
8bitTTL
Route packet
Penultimate popping
Label Stacking
● Used when e.g. packet forwarded through transit routing domain● An inner label is transferred through the domain● Push outer label onto stack at ingress LSR, ● swap outer label in domain● pop outer label at egress LSR
in=3IP pkt
in=5IP pkt
in=3IP pkt out=6
out=6in=5IP pkt
in=3IP pkt out=7
out=7in=5IP pkt
in=3IP pkt
in=5IP pkt
Transit Routing Domain
Typical use: MPLS for transit (lab scenario)
● Use an IGP to compute internal routes● Setup LSPs between border routers using the IGP
– Eg border routers may set up a full-mesh of LSPs
● Send transit traffic via LSPs (src and dst outside the AS)● But still send internal traffic via IP (src or dst inside the AS)● External routes need not be distributed to non-border routers, so we
do not need IBGP there– A BGP-free core
● Only the border routers need to speak BGP– This is considered an advantage
BGP
IGP + MPLS BGP BGP
AS
Special labels operations
● 0: IPv4 explicit NULL
– Downstream LSR should pop label unconditionally
– Popped packet is an IPv4 datagram
● 1: Router alert
– Deliver to control plane – do not forward
● 2: IPv6 explicit-NULL
– Downstream LSR should pop unconditionally
– Popped packet is an IPv6 datagram
● 3: Implicit-NULL
– Pop immediately and treat as IPv4 packet
– does not appear on link
– Pen-ultimate popping!
Label Distribution ● Labels need to be assigned● And LFIBs programmed● A signaling protocol distributes labels
– Creates an LSP through an MPLS network
● Make a new protocol– LDP – Label Distribution Protocol
● Or extend existing protocols– BGP – Border Gateway Protocol
– RSVP – Resource Reservation Protocol● These protocols all distributes labels
– But they are somewhat different and can be combined to transfer different labels, eg BGP+RSVP, where BGP transfers inner labels and RSVP negotiate outer labels.
Label distribution and IGP
● Label distribution protocols typically rely on an IGP (eg OSPF) to find the shortest path of their LSP.– Although RSVP may use source-routing as well
● They then assign labels to the LSP on the shortest path.● Labels are assigned in upstream direction normally ● During setup of LSPs, traffic may be black-holed
– IGP and label distribution
– Same is true during reconvergence
Resource Reservation Protocol
● RSVP is a network control protocol used to express quality of service (QoS).– Binds a QoS request to a flow
● RSVP is a receiver oriented protocol – Similar to IP multicast
● RSVP delivers QoS reservations along a path from source to destination(s).– no routing – IGP computes path
– uses soft state – periodically refresh● With MPLS, RSVP-TE is used for traffic engineering
– To guarantee a quality for a FEC
– Aggregated flows
– Extended to carry MPLS labels
RSVP Building Blocks
RSVP carries the following information:● Classification information
– Flows with QoS requirements must be allocated within the network
– Includes src and dst IP addr and UDP/TCP port numbers● Or MPLS labels
● Traffic specification (TSpecs)● QoS requirements (FlowSpecs)
RSVP is explicitly designed to support multicastbut MPLS is not,...
RSVP Model
Two types of messages to set up reservations:● PATH message
– From a sender to one or several receivers, carrying TSpec and classification information provided by sender
● RESV message– From receiver, carrying FlowSpec indicating QoS required by
receiver
Sender
Receiver
Receiver
PATH msg (TSpec)
RESV msg (FlowSpec)
RSVP and Router Operation
● At each router, RSVP applies admission control
● If the router cannot reserve resources the traffic an error message is returned.
● PATH messages flowing downstreams indicate how the traffic will flow.
● RESV messages flowing upstreams negotiate the actual reservation
– And MPLS labels
MPLS Support for RSVP
● RSVP has been successfully adapted to support MPLS● New objects defined carrying labels● RSVP can express QoS (Tspec, FlowSpec)
– Each LSR associates QoS resources with an LSP
– The ingress router needs to classify and bind traffic to an LSP
– Network resources are then bound to that LSP throughout the network.
Ingess EgressPATH
RESVRESV lbl=5RESV lbl=9
Objects in PATH messages
● Label Request Object– Request an MPLS label for a path
● Explicit Route Object (ERO)– Addresses through which the LSP must pass.
– Used for source routing
● Record Route Object (RRO)– Record the path of the PATH message
● Sender Traffic Specification (Tspec)– A specification of the traffic of this LSP
– Eg: 2Mbps
Objects in RESV messages
● Label Object– Contains a label that was requested
– Labels are thus allocated upstreams
● Record Route Object (RRO)– Record the path pf the RESV message
● Flow Specification (Flowspec)– What was actually reserved
Traffic engineering with RSVP-TE/MPLS● RSVP can reserve resources in the network● RSVP can signal alternative paths using
– Strict source routing
– Loose source routing
● Why? Traffic distribution, alternate paths– Difficult to do with regular routing protocols.
● Example:– LSP loose source routing: via D (20Mbps)
– LSP strict source routing: A,B,C,D,E (15 Mbps)
A
B D
E
C
Fast reroute and BFD
● With alternative paths already computed, switchover can be fast
● No need to recompute a new best path● Link failures can be detected in many ways
– Loss of signal
– RSVP Path/Resv loss
– BFD – Bidirectional Forwarding Detection
● BFD can detect link failures in ~100ms– Based on a simple ”ping” method
– Is also used for other protocols: OSPF, BGP, etc
● But it is complex to administrate several alternate paths in a large network
Traffic engineering in a transit network
● Putting it all together:● IGP is used to routes between all routers in the network● MPLS is used to carry transit traffic between border
routers (PE)● RSVP-TE (or LDP-TE) is used to signal MPLS LSPs between
border routers.● RSVP-TE can be used to reserve bandwidth in LSPs,
compute alternative paths for switchover using source routing. – In this way an operator can control all transit traffic
● BGP is used for all external routes and to provide nexthops for transit routes to RSVP.
● This is the lab scenario.
Labs: Issues
● There are routes both from IGP and RSVP. Internal traffic should use the IGP, external should use RSVP.– RSVP adds end-points in a separate inet table which BGP
uses: inet.3. RSVP has lower precedence than the IGP
– BGP looks in both inet.0 and inet.3. IGP does not. This is how transit traffic uses the LSP tunnels.
● You need to cooperate between the groups to set up EBGP peering
● Recapitulate BGP configurations● BGP should set next-hop self.● RSVP is used to traffic engineer using source routing.
L3VPN● You can use BGP + MPLS in other scenarios.● One is building VPNs. Many variants exists: L3VPN, pseudo-wire
(L2VPN), VPLS, etc.● L3VPN is a dynamic VPN using BGP and MPLS● Each customer site is modelled as a separate AS – customer
interior routing runs independently at each site● An address conversion scheme makes each customer VPN
route unique within the provider's network:
– New address family VPN+IPv4
● Multiple routing and forwarding tables are supported on each PE separating different customer routing information
● BGP is used as a signalling protocol to setup VPN connections between customer sites.
● RSVP (or LDP) is used to setup the MPLS paths● MPLS multistacking is used to keep provider's network free of
customer routing information
– Outer tag is used by RSVP, Inner tag identifies VPN
● Encryption by other means, security by trusting the provider
L3VPN
PE
CE
CEPE PE
P
PE
CE
CE
CE
P
AS 65100
192.16.100.0/24192.16.100.0/24
10.1.1.0/2410.1.1.0/24
10.2.1.0/24
MPLS LSPs
Routing between LAN islandsCE/PEs exchange routing IP: dst192.16.100.1MPLS: 10MPLS: 8
Double MPLS headers are used. Outer label for IGP routing. Inner label to identify VPN.
192.16.100.0/24
View from one customer
Provider network acts as a ”distributed router”
10.2.1.0/24
10.1.1.0/24
AS 65100
BGP repetition: Peering sessions● Neighbor negotiation of IBGP and EBGP are the same
– IBGP peering: within an AS
– EBGP peering: between AS:s
● Two peers must have IP connectivity– Simple check: they should be able to ping each other
● If EBGP peers are not physically connected: Multihop EBGP
IBGPEBGP
AS2AS1
RTF
RTC
RTD
RTEAS3
RTA
RTB
EBGP Multihop
Physical
Logical
This router does not run BGP
IBGP loopback peering
● IBGP peering is usually done with loopbacks. Why?– More stable: Not tied to single physical path, if a link/interface goes
down, another route may be chosen.
● But IBGP needs an IGP so that the loopbacks can be reached– And TCP connections can be established
AS2
RTCRTB
RTD
RTEeth1
loeth0
EBGP peering
● EBGP peering is typically made over a physical directly connected link
– Demilitarized zone (DMZ) that does not “belong” to either AS
● But if routes learned via EBGP uses the DMZ address as nexthop
– The DMZ must be redistributed via IGP!
– But the DMZ is not really part of the AS,...
● Although the DMZ is usually a part of one of the AS:s address ranges
EBGP
AS2
RTB
DMZ:
192.168.200.0/30
AS1
RTA
.2.1
NEXTHOP is 192.168.200.2How do I reach it?
RTC
IGPIBGP
EBGP: Next-hop self
● Alternative:
– Set next-hop-self
– Announce routes using the loopback address of the border router as next-hop
– DMZ does not need to be distributed within the AS
● But RTA still uses the directly connected DMZ address
EBGP
AS2
RTB
DMZ:
192.168.200.0/30
AS1
RTA
.2.1
RTC
IGPIBGP
lo: 10.0.0.1
NEXTHOP is 10.0.0.1 NEXTHOP is
192.168.200.2
EBGP nexthop: recursive lookup
EBGP
AS2
RTB
DMZ:
192.168.200.0/30
AS1
RTA.2.1
RTC
IGPIBGP
lo: 10.0.0.1
Route Nexthop Protocol130.2.3.4/24 10.0.0.1 IBGP10.0.0.1/32 12.0.0.1 IGP12.0.0.0/30 - direct
Route Nexthop Protocol130.2.3.4/24 192.168.200.2 IBGP192.168.200.0/30 12.0.0.1 IGP12.0.0.0/30 - direct
12.0.0.0/30.1
RTD
RTC:s routing table alternatives:
DMZ nexthop
Next-hop self