Oam Best Practices
8/3/2019 Oam Best Practices
1/26
SERVICE PROVIDER
OAM Best Practices in Mission-Critical
MPLS, IP, and Carrier Ethernet Networks
A variety of Operations, Administration, and Management (OAM)
protocols and tools have been developed recently for MPLS, IP, and
Ethernet networks, which provide the unparalleled power to proactively
manage networks and customer Service-Level Agreements (SLAs).
This paper reviews the OAM tools available in MPLS, IP, and Ethernet
networks at various layers and describes best practices for choosing
the right OAM tool to use for particular network deployments.
SERVICE PROVIDER BEST PRACTICES GUIDE
OAM Best Practices in Mission Critical MPLS, IP, and Carrier Ethernet Networks 2 of 26
CONTENTS
Overview
  OAM Layering
  OAM Tools and Network Layers
Layer 2 OAM Tools
  Layer 2 Trace
  Port Loop Detection
  Unidirectional Link Detection
  Single-Link LACP Keep-Alive
  IEEE 802.1ag CFM
    Continuity Check Messages (CCM)
    Loopback Messages (LBM)
    Linktrace Messages (LTM)
    Brocade Implementation of 802.1ag
    Hierarchical Fault Detection using 802.1ag
    IEEE 802.1ag Configuration Example
    IEEE 802.1ag CFM versus ITU-T Y.1731 OAM
  ITU-T Y.1731 Performance Management
  IEEE 802.3ah Ethernet First Mile (EFM) Link OAM
  Layer 2 OAM Summary
MPLS OAM Tools
  LSP Ping
  LSP Traceroute
  LSP Ping and LSP Traceroute Considerations
  BFD for RSVP-TE LSPs
  MPLS OAM Summary
IP and VRF OAM Tools
  IP and VRF Ping
  IP and VRF Traceroute
  BFD for OSPFv2, OSPFv3, IS-IS, and BGP4
  IP and VRF OAM Summary
Summary
OVERVIEW
A variety of OAM tools have been developed in recent years for MPLS, IP, and Ethernet networks. These
tools provide unparalleled power for an operator to proactively manage networks and customer Service-
Level Agreements (SLAs). These OAM tools address fault detection, fault verification, and fault isolation and
provide proactive detection of service degradation, service performance monitoring, and SLA verification.
In MPLS, IP, and Ethernet networks, Operations, Administration, Management, and Provisioning (OAM&P)
spans two planes (see Figure 1): the Management Plane, represented by Network Management
Systems (NMS) and Element Management Systems (EMS), and the Network Plane, represented by Network
Elements (NE) and the OAM tools that run across NEs.
This white paper reviews the OAM tools available in MPLS, IP, and Ethernet networks at various layers of the
networking stack and recommends and reviews best practices for choosing the right OAM tool to use for a
particular network deployment.
Figure 1. OAM tools
OAM Layering
OAM tools can be classified into three main types based on the OAM layer (Figure 2):
Service Layer OAM. Tools applicable to services on an end-to-end basis
Network Layer OAM. Tools applicable to services over a particular network
Transport Layer OAM. Tools applicable to the transport layer of the network
Figure 2. OAM layers
These OAM layers are hierarchical in nature. For example, in Figure 3 the Service Layer OAM for Operator A
can be seen as a Transport Layer OAM for the service provider, who sees the service provided by Operator A
as a transport tunnel for the customer.
[Figure 1 labels: Management Plane (NMS, EMS); OAM&P; Network Plane (Network Elements). The scope of this paper is OAM tools across Network Elements.]
[Figure 2 labels: Service Layer OAM; Network Layer OAM; Transport Layer OAM.]
NOTE: The terms customer, service provider, and operator are commonly used to reflect the
business relationships that often exist among organizations and individuals. An operator provides a
single Layer 2 or Layer 3 backbone network to a service provider. An operator can be identical to, or a
part of the same organization as, a service provider.
The best OAM tools to use at a particular network layer depend on the type of network. For example, in
Figure 3, Operator A has an MPLS network and uses MPLS OAM tools, while Operator B has an Ethernet
network and uses Ethernet OAM tools.
Figure 3. Customer, operator, and service provider views of OAM layering
OAM Tools and Network Layers
Each network layer has its own best-suited OAM tools. Figure 4 lists common OAM tools applicable to
Layer 2, MPLS, IP (Layer 3), and the Virtual Private Network (VPN), which includes Layer 2 and Layer 3 VPNs.
Note that certain OAM tools, for example, 802.1ag CFM and Y.1731 PM, are applicable to Layer 2 networks
and also to Layer 2 VPN services, as shown in Figure 4.
The following sections address the OAM tools shown in Figure 4.
Figure 4. Each network layer has its own best-suited OAM tools
[Figure 3 labels: customer networks at Site 1 and Site 2; Operator A network (MPLS); Operator B network (Ethernet); Service Provider; Link OAM on each access and inter-operator link; MPLS OAM (Operator A); Ethernet OAM (Operator B); Service OAM end to end.]
[Figure 4 contents, by layer:
Layer 2: Layer 2 trace, port loop detection, UDLD, single-link LACP keep-alive, 802.1ag CFM/Y.1731 PM, 802.3ah EFM OAM
MPLS: BFD for RSVP-TE LSPs, LSP Ping and Traceroute
IP: Ping and Traceroute, BFD for OSPF and IS-IS
VPN: VRF Ping and Traceroute (L3VPN); 802.1ag CFM for VPLS/VLL and Y.1731 PM for VPLS/VLL (L2VPN)]
LAYER 2 OAM TOOLS
This section addresses the Layer 2 OAM tools listed in Figure 4. These tools function in Layer 2 networks to
monitor:
Layer 2 services and connectivity (VLANs): Layer 2 Trace, Port Loop Detection, 802.1ag CFM, and Y.1731 PM
Layer 2 links: UDLD, single-link keep-alive, and 802.3ah EFM OAM
Layer 2 Trace
Layer 2 Trace is a Brocade proprietary OAM tool that traces the traffic path in a VLAN. Layer 2 Trace is run
on demand using a CLI command. Layer 2 Trace can be used to trace a particular IP, MAC, or hostname in a
given VLAN. The Layer 2 Trace command (trace-l2) probes the entire Layer 2 topology and displays the
input or output ports of each hop in the path, the round trip travel time of each hop, and each hop's Layer 2
protocol (such as STP, RSTP, 802.1w, SSTP, metro ring, or route-only).
Figure 5 shows an example of Layer 2 Trace command (trace-l2) executed for the given network
configuration. The probed Layer 2 information is discarded after 10 minutes or when a new trace-l2
command is issued.
Layer 2 Trace can also display hops that form a forwarding loop in a VLAN. Figure 6 is an example in which
the active topology for VLAN 2 forms a forwarding loop. In this case, Layer 2 Trace on VLAN 2 detects the
forwarding loop and issues the indicated warning message.
Layer 2 Trace configuration considerations:
The devices that will participate in the Layer 2 Trace protocol must be assigned to a VLAN, and all devices on that VLAN must be Brocade devices that support the Layer 2 Trace protocol.
Devices that do not support the Layer 2 Trace protocol simply forward Layer 2 Trace packets without a reply and are transparent to the Layer 2 Trace protocol.
The destination for the packet with the trace-l2 protocol must be a device that supports the Layer 2 Trace protocol.
The destination cannot be a client, such as a personal computer, or a device from another vendor.
Figure 5. Layer 2 Trace example
Figure 6. Layer 2 trace in a loop topology
Port Loop Detection
Port Loop Detection is a Brocade proprietary OAM tool used to detect Layer 2 forwarding loops. Upon
detecting a Layer 2 forwarding loop, the Port Loop Detection tool disables the errant port(s). The device can
be configured to automatically re-enable ports after a timeout period.
This OAM tool sends special protocol packets from the device and detects Layer 2 forwarding loops when
these packets are received on ports on the same device.
Layer 2 Trace can also detect forwarding loops. However, the difference is that Port Loop Detection does
not require manual interaction to detect loops. That is, Layer 2 Trace is run on demand using a CLI
command, while Port Loop Detection runs continuously to provide automatic detection and reduce
downtime due to misconfigurations.
Port Loop Detection supports two modes of operation:
Strict mode. Detects a Layer 2 forwarding loop where packets loop back to the same physical port,
that is, a hairpin loop.
NetIron(config)#interface ethernet 1/1
NetIron(config-if-e1000-1/1)#loop-detection
Loose mode. Detects Layer 2 forwarding loops for a given VLAN or a VLAN group. Loose mode floods
test packets to the entire VLAN or VLAN group. See Figure 7.
NetIron(config)#vlan 20
NetIron(config-vlan-20)#loop-detection
NetIron(config)#vlan-group 10
NetIron(config-vlan-group-10)#add-vlan 1 to 100
NetIron(config-vlan-group-10)#loop-detection
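The difference between the two modes can be illustrated with a small Python model. This is a sketch under stated assumptions, not Brocade's implementation: the probe fields (origin_device, origin_port, vlan) and the functions classify_probe and should_disable_port are hypothetical names chosen for illustration, and the actual probe format is not described in this paper.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """Hypothetical loop-detection probe; field names are illustrative only."""
    origin_device: str
    origin_port: str
    vlan: Optional[int] = None  # set only for loose-mode (VLAN-wide) probes


def classify_probe(device: str, rx_port: str, probe: Probe) -> str:
    """Classify a probe received on rx_port of device.

    Returns "no-loop" when the probe was sent by another device,
    "hairpin" when it came back on the port that sent it, and
    "vlan-loop" when it came back on a different port of the sender.
    """
    if probe.origin_device != device:
        return "no-loop"
    if probe.origin_port == rx_port:
        return "hairpin"
    return "vlan-loop"


def should_disable_port(mode: str, result: str) -> bool:
    """Assumed policy: strict mode reacts only to hairpin loops,
    loose mode reacts to any probe that returns to the sending device."""
    if mode == "strict":
        return result == "hairpin"
    if mode == "loose":
        return result in ("hairpin", "vlan-loop")
    raise ValueError(f"unknown mode: {mode}")
```

For example, a probe sent from port 1/1 that returns on port 1/2 of the same device is classified as "vlan-loop", which would trigger an action in loose mode but not in strict mode.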
Figure 7. Port Loop Detection example (loose mode)
Unidirectional Link Detection
Unidirectional Link Detection (UDLD) is a Brocade proprietary OAM tool used to monitor an Ethernet link
between two Brocade NetIron devices and to provide fast detection of link failures.
Ports enabled for UDLD exchange proprietary health-check packets once every keep-alive interval. The
keep-alive interval can be configured between 100 ms and 6000 ms in increments of 100 ms. The default
keep-alive interval is 500 ms.
If a port does not receive a health-check packet from the port at the other end of the link after a number of
keep-alive retry intervals, UDLD brings the port down. As a consequence, UDLD brings the ports on both
ends of the link down if the link goes down in one direction. The number of keep-alive retries can be configured
from 3 to 10, and the default is 5.
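The timer ranges above determine how quickly UDLD can react. The following Python sketch computes the approximate detection time as the keep-alive interval multiplied by the retry count. The function name udld_detection_time_ms is hypothetical, and the calculation assumes the port is brought down after exactly the configured number of missed intervals.

```python
def udld_detection_time_ms(keepalive_ms: int = 500, retries: int = 5) -> int:
    """Approximate UDLD failure detection time in milliseconds.

    Validates the ranges given in the text: keep-alive interval of
    100 to 6000 ms in 100 ms steps, and 3 to 10 retries.
    """
    if not 100 <= keepalive_ms <= 6000 or keepalive_ms % 100 != 0:
        raise ValueError("keep-alive interval must be 100..6000 ms in 100 ms steps")
    if not 3 <= retries <= 10:
        raise ValueError("retries must be between 3 and 10")
    return keepalive_ms * retries


if __name__ == "__main__":
    print(udld_detection_time_ms())          # defaults: 500 ms x 5
    print(udld_detection_time_ms(100, 3))    # fastest configurable setting
    print(udld_detection_time_ms(6000, 10))  # slowest configurable setting
```

Under these assumptions, the defaults give roughly 2.5 seconds, the fastest setting roughly 300 ms, and the slowest roughly 60 seconds.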
When UDLD is enabled on a port, the port transitions into an init state to detect if the other end supports
UDLD. The port does not go down if the other end is not UDLD-enabled.
Figure 8 illustrates UDLD used to monitor a link between two nodes. Figure 9 is an example of a global
show UDLD command. The show command also supports showing information for a specific port (not
shown in the figure).
Configuration considerations include the following:
UDLD is supported only on Ethernet ports.
To configure UDLD on a LAG group, you must configure the feature on each port of the group individually. Configuring UDLD on a LAG group's primary port enables the feature on that port only.
Dynamic LAG is not supported. If you want to configure a LAG group that contains ports on which UDLD is enabled, you must remove the UDLD configuration from the ports. After you create the LAG group, you can add the UDLD configuration back.
Tagged UDLD is also supported:
NetIron(config)# link-keepalive ethernet 1/18 vlan 22
Figure 8. UDLD configuration example
Figure 9. Displaying UDLD information
Single-Link LACP Keep-Alive
The Single-Link Link Aggregation Control Protocol (LACP) Keep-Alive OAM tool supports a single-port Link
Aggregation Group (LAG). Single-Link LACP Keep-Alive is used to monitor an Ethernet link between two
devices and to provide for fast detection of link failures. This is similar to the UDLD OAM tool, except that
the Single-Link LACP Keep-Alive OAM tool uses LACP, which is a standard protocol, instead of a proprietary
protocol between nodes.
When should you use Single-link LACP Keep-Alive instead of UDLD?
UDLD is a proprietary protocol. Single-link LACP Keep-Alive can be used to interoperate with third-party
equipment that also supports this feature.
With Single-Link LACP Keep-Alive, LACP PDUs are exchanged between the two nodes to determine if the
connection between the devices is still active. If no LACP PDUs are received from the other node after 3
lacp-timeout periods, a timeout event occurs and the port is blocked.
The LACP keep-alive PDUs can be sent every 1 second (lacp-timeout short) or every 30 seconds (lacp-
timeout long). Since a timeout is declared after missing 3 consecutive LACP keep-alive PDUs, a timeout is
declared after 3 seconds or 90 seconds, depending on the selected LACP keep-alive PDU interval.
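The timeout arithmetic can be written out as a short Python sketch. The constant and function names are illustrative, not part of any CLI or API; only the intervals and the three-PDU rule come from the text.

```python
# PDU intervals for the two lacp-timeout settings described in the text.
LACP_PDU_INTERVAL_S = {"short": 1, "long": 30}

# A timeout is declared after this many consecutive missed PDUs.
MISSED_PDUS_BEFORE_TIMEOUT = 3


def lacp_timeout_seconds(mode: str) -> int:
    """Return the time until a single-link LACP timeout is declared."""
    if mode not in LACP_PDU_INTERVAL_S:
        raise ValueError("mode must be 'short' or 'long'")
    return LACP_PDU_INTERVAL_S[mode] * MISSED_PDUS_BEFORE_TIMEOUT


if __name__ == "__main__":
    for mode in ("short", "long"):
        print(mode, lacp_timeout_seconds(mode), "seconds")
```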
To configure single-link LACP keep-alive timeout intervals:
NetIron(config)# lacp-timeout short | long
Figure 10 shows an example of a single-link LACP keep-alive configuration.
Figure 10. Single-Link LACP Keep-Alive example
IEEE 802.1ag CFM
The IEEE 802.1ag Connectivity Fault Management (CFM) OAM tool facilitates path discovery, fault
detection, fault verification and isolation, fault notification, and fault recovery.
CFM terminology (see Figure 11):
MD (Maintenance Domain). The part of a network for which faults in Layer 2 connectivity can be managed.
MEP (Maintenance End Point). A Maintenance Point (MP) at the edge of a domain that actively sources CFM messages. There are two types of MEPs, as shown in Figure 12:
  Up (inward) MEP: Considering a MEP on a given physical port, an up MEP sends 802.1ag messages into the node.
  Down (outward) MEP: A down MEP sends 802.1ag messages out of the node.
  Note that up and down MEPs can be used to include or exclude more of the internal path inside a switch, as shown in Figure 13.
MIP (Maintenance Intermediate Point). A maintenance point internal to a domain that only responds when triggered by certain CFM messages. A MIP does not actively source CFM messages.
MA (Maintenance Association). A set of MEPs established to verify the integrity of a single service instance, for example, a VLAN or a VPLS.
ME (Maintenance Entity). A point-to-point relationship between two MEPs within a single MA.
MD Level. An integer from 0 to 7 in a field in a CFM PDU that is used, along with the VLAN ID, to identify which MIPs/MEPs would be interested in the contents of a CFM PDU. MD levels are used to separate the MAs of customers, service providers, and operators. The MD levels that 802.1ag recommends for customers, service providers, and operators are shown in Figure 11.
CFM Hierarchy. MD levels create a hierarchy in which 802.1ag messages sent by customers, service providers, and operators are processed only by MIPs and MEPs at the respective level of the message.
A common practice is for the service provider to set up a MIP at the customer MD level at the edge of the network, as shown in Figure 11, to allow the customer to check continuity of the Ethernet service to the edge of the network. Similarly, operators set up MIPs at the service provider level at the edge of their respective networks, as shown in Figure 11, to allow service providers to check the continuity of the Ethernet service to the edge of the operators' networks. Inside an operator network, all MIPs are at the respective operator level, also shown in Figure 11.
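The MD level hierarchy can be sketched in Python as follows. The level ranges (5 to 7 for customers, 3 and 4 for service providers, 0 to 2 for operators) are taken from the labels in Figure 11. The function names are hypothetical. The rule that a MEP processes messages at its own level comes from the text; the handling of other levels (forwarding higher levels transparently and dropping lower levels) is an assumption based on commonly described 802.1ag MEP behavior and is not stated in this paper.

```python
def md_level_role(level: int) -> str:
    """Map an MD level to the role suggested by the Figure 11 labels."""
    if not 0 <= level <= 7:
        raise ValueError("MD level must be an integer from 0 to 7")
    if level >= 5:
        return "customer"
    if level >= 3:
        return "service provider"
    return "operator"


def mep_action(mep_level: int, pdu_level: int) -> str:
    """Decide what a MEP does with a received CFM PDU.

    "process": the PDU is at the MEP's own level (stated in the text).
    "forward": the PDU belongs to a higher level (assumed behavior).
    "drop":    the PDU belongs to a lower level (assumed behavior).
    """
    for level in (mep_level, pdu_level):
        md_level_role(level)  # reuse the range check
    if pdu_level == mep_level:
        return "process"
    if pdu_level > mep_level:
        return "forward"
    return "drop"
```

Under these assumptions, an operator MEP at level 1 would pass a customer CCM at level 7 through untouched, which is what lets the customer run CFM end to end across the operator networks.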
Figure 11. IEEE 802.1ag terminology
Figure 12. Up and down MEPs
Figure 13. Using up and down MEPs to include or exclude the path inside a switch
IEEE 802.1ag CFM supports Continuity Check Messages (CCM), Linktrace, and Loopback Messages, which
are described in the following sections.
[Figure 11 labels: customer networks at Site 1 and Site 2 connected through the Operator A and Operator B networks (Ethernet and MPLS) of the service provider; Customer MA at MD level 5 (7, 6, or 5); Service Provider MA at MD level 3 (4 or 3); Operator A MA and Operator B MA at MD level 1 (2, 1, or 0); each MA is shown with its MEs, up and down MEPs, and MIPs.]
[Figure 12 and Figure 13 labels: down MEPs and up MEPs on the ports of a switch.]
Continuity Check Messages (CCM)
CCMs are periodic hello messages multicast by a MEP within the maintenance domain to detect continuity
failures. If a MEP stops receiving periodic CCMs from a peer MEP on a remote bridge, it assumes that either
the remote bridge has failed or the continuity of the path between the two bridges has been interrupted.
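A continuity check of this kind reduces to a timer comparison, sketched below in Python. The paper does not say how many CCMs must be missed before a failure is declared; the sketch assumes the commonly cited IEEE 802.1ag threshold of 3.5 CCM intervals. The names CCM_LOSS_MULTIPLIER and continuity_lost are illustrative.

```python
# Assumption: loss of continuity is declared when no CCM has arrived for
# 3.5 CCM intervals. This multiplier is not stated in the paper.
CCM_LOSS_MULTIPLIER = 3.5


def continuity_lost(last_ccm_rx_s: float, now_s: float, ccm_interval_s: float) -> bool:
    """Return True when the peer MEP is considered unreachable.

    last_ccm_rx_s:  time the last CCM from the peer was received
    now_s:          current time, on the same clock
    ccm_interval_s: configured CCM period in seconds
    """
    if ccm_interval_s <= 0:
        raise ValueError("CCM interval must be positive")
    return (now_s - last_ccm_rx_s) > CCM_LOSS_MULTIPLIER * ccm_interval_s
```

With a 10 ms CCM period, for example, this threshold would correspond to a detection time of about 35 ms.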
Figure 14. 802.1ag Continuity Check Messages (CCM)
Loopback Messages (LBM)
LBM is a unicast message used to verify the connectivity between a MEP and a peer MEP or MIP. Loopback
messages are also used for fault localization.
To verify the connectivity between a MEP and a peer MEP or a MIP, an LBM is initiated by the source MEP
with a destination MAC address set to the MAC address of the desired peer MEP or MIP. The receiving MIP or
MEP responds to the LBM with a (unicast) Loopback Reply (LBR) addressed to the source MEP.
LBM helps a MEP identify the location of a continuity fault along a given MA. A MIP in front of the continuity
fault responds with a loopback reply. A MIP or MEP behind the continuity fault does not respond. For
loopback to work, the MEP must know the MAC address of the target MIP or MEP. These MAC addresses
can be discovered using the Linktrace Message.
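The localization procedure described above can be sketched in Python. The sketch assumes the MEP already holds an ordered list of MAC addresses along the path (for example, learned earlier through Linktrace) and has some way to send an LBM and wait for the LBR. Both localize_fault and the replies_to_lbm callback are hypothetical names, not part of any real API.

```python
from typing import Callable, Optional, Sequence, Tuple


def localize_fault(
    path: Sequence[str],
    replies_to_lbm: Callable[[str], bool],
) -> Optional[Tuple[Optional[str], str]]:
    """Locate a continuity fault by probing each maintenance point in order.

    path:            MAC addresses of MIPs/MEPs, nearest first
    replies_to_lbm:  returns True if an LBR came back from that MAC

    Returns None when every point replies. Otherwise returns
    (last_responder, first_silent); the fault lies between the two.
    last_responder is None when even the nearest point is silent.
    """
    last_responder: Optional[str] = None
    for mac in path:
        if not replies_to_lbm(mac):
            return (last_responder, mac)
        last_responder = mac
    return None


if __name__ == "__main__":
    path = ["00:00:00:00:00:01", "00:00:00:00:00:02", "00:00:00:00:00:03"]
    reachable = {"00:00:00:00:00:01"}  # simulated: only the first MIP answers
    print(localize_fault(path, lambda mac: mac in reachable))
```

The loop stops at the first silent point because, as the text notes, every MIP or MEP behind the fault is also expected to stay silent.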
Figure 15. 802.1ag Loopback Message (LBM)
Linktrace Messages (LTM)
LTM is a multicast message used by a source MEP to trace the path to other MEPs in the same MA. All
reachable MIPs and MEPs respond back with a Linktrace Reply (LTR) message addressed to the source
MEP. The originating MEP can then determine the MAC addresses of all MIPs and MEPs belonging to the
same MA.
Note that the source MEP sends a single LTM to the next hop along the trace path. However, it can receive
many LTR messages from different MIPs along the trace path and different MEPs terminating the branches
of the trace path.
Linktrace can also be used when no faults are apparent in order to discover the routes normally taken by
data through the network.
Figure 16. 802.1ag Linktrace Message (LTM)
Brocade Implementation of 802.1ag:
CCM period: 3.3 ms, 10 ms, 100 ms, 1 sec, 1 min, 10 min
Support for minimum CCM timers (3.3 ms) using hardware offload
Support for MIPs and up/down MEPs
Support for all 8 MD levels (0 to 7)
Support for the following types of end-points/services: VLANs, VPLS, and VLL
Hierarchical Fault Detection using 802.1ag
As shown in Figure 11, 802.1ag CFM defines a domain hierarchy in which customers, service providers, and
operators use different MD levels. This hierarchy is also used for fault detection.
Figure 17 illustrates an example in which a customer has an Ethernet service between Sites 1 and 2. This
Ethernet service is provided by Operators A and B. Operator B supports the service at the core with an MPLS
network. Operator A supports the service at Metro Locations 1 and 2 using a Layer 2 Ethernet network.
In Figure 17, a service continuity fault occurs inside Operator B's network. The customer can detect an end-
to-end service continuity fault using CCM, but it cannot determine the location of the fault within the
operators' networks. Operator A can detect that a service continuity fault exists within Operator B's network.
Operator B can detect the service continuity fault, but it cannot isolate the location of the continuity fault
using 802.1ag CFM, since it has an MPLS network. Operator B needs to use MPLS OAM tools to isolate the
fault location.
Figure 17. Example of 802.1ag hierarchical fault detection (refer to the numbered items below)
To simplify this example, the service provider level is not shown. If it were, the service provider would be
represented by the overall network from Operator A in Location 1 through Operator B to Operator A in
Location 2.
The following is an example of how this fault can be detected at the different levels of the hierarchy:
1. The customer detects a service continuity fault using CCMs.
2. Using Linktrace, the customer finds that the fault is beyond the MIPs at the border of Operator A.
3. Operator A detects a service continuity fault using CCMs.
4. Using Linktrace, Operator A determines that the fault is inside Operator B's network.
5. Operator B detects a service continuity fault using CCMs.
Operator B uses MPLS OAM tools to determine the location of the fault in its MPLS network. See the MPLS
OAM section for details on MPLS-specific OAM tools. This statement is included here to emphasize the fact
that you need to use the appropriate OAM tools for the type of network being used. In this case, Operator B
has an MPLS network and needs to use MPLS OAM tools. Operator A has a Layer 2 Ethernet network and
can use 802.1ag CFM. Note that Operator B's MPLS network is required to support 802.1ag CFM messages
over VPLS and VLL to allow customers and Operator A to use 802.1ag end-to-end (see footnote 1).
Note that the customer, Operator A, and Operator B can concurrently and independently detect the
continuity fault and run Linktrace to determine the location of the fault. The steps above are numbered to
allow for easy reference to the respective actions depicted in Figure 17. The numbering does not imply an
ordered sequence of events. That is, Operator A does not have to wait for the customer to tell it that the
service is broken before it runs its own Continuity Check.
Note that the CCMs shown in Figure 17 can be set up to run continuously to detect potential continuity
faults or they can be set up on demand as needed.
IEEE 802.1ag Configuration Example
In Figure 18, a customer has a point-to-point service (VLL) over an MPLS network. In this example, the
customer runs CCM at 10 ms intervals at MD level 7 between CE1 and CE2. The service provider runs CCM
at 10 ms intervals at MD level 4 between PE1 and PE2.
Figure 19 and Figure 20 show example configurations for CE1, CE2, PE1, and PE2 shown in Figure 18.
1 Brocade supports 802.1ag CFM over VPLS and VLL to allow Ethernet OAM to function end-to-end over an
MPLS core network.
Figure 18. Example of 802.1ag configuration
Figure 19. CE1 and CE2 configurations
[Figure 18 labels: CE1, PE1, PE2, and CE2 connected by a VLL across an MPLS network; ports 1/1 and 2/1; customer down MEPs and customer MIPs at MD level 7; service provider up MEPs at MD level 4; Customer CCM @ 10 sec; Service provider CCM @ 10 sec.]
Figure 20. PE1 and PE2 configurations
IEEE 802.1ag CFM versus ITU-T Y.1731 OAM
ITU-T Y.1731 OAM is a superset of IEEE 802.1ag CFM (see footnote 2). ITU-T Y.1731's ETH-CC (Ethernet Continuity
Check), ETH-LB (Ethernet Loopback), and ETH-LT (Ethernet Linktrace) OAM functions are equivalent to
802.1ag CCM, LBM, and LTM, respectively. Devices deploying 802.1ag CCM, LBM, and LTM can
interoperate with devices deploying Y.1731 ETH-CC, ETH-LB, and ETH-LT, respectively. However, Y.1731
ETH-CC supports either multicast or unicast messages, while 802.1ag CCM supports multicast messages
only. Therefore, to interoperate 802.1ag CCM with Y.1731 ETH-CC, the Y.1731 device must be set up to use
ETH-CC multicast messages.
ITU-T Y.1731 Performance Management
ITU-T Y.1731 Performance Management (PM) supports on-demand measurement of round-trip Frame Delay
(FD) and Frame Delay Variation (FDV). These measurements are made between defined MEPs (see Figure
21).
The main benefit of Y.1731 PM is for Service Level Agreement (SLA) monitoring and verification of services
provided to customers in aggregation, metro, and core networks. SLA monitoring and verification is
essential for delay-sensitive applications, for example, voice, and for services with SLA guarantees.
The Brocade implementation supports a high-precision, hardware-based time-stamping mechanism that
provides measurements with microsecond granularity. It also supports delay measurements for Layer 2
bridging services and for VPLS and VLL services.
Figure 21. Y.1731 delay measurement
2 Besides CFM and other functionality, ITU-T Y.1731 also includes Performance Management, which is
addressed in this paper.
[Figure 21 labels: ETH-DM exchanged between MEP 2 and MEP 3 on two Brocade MLX devices.]
Figure 22 shows an example of the Y.1731 delay measurement between MEP3 and MEP2 shown in Figure
21. The command sends a selectable number (default is 10) of delay measurement PDUs (ETH-DM), which
are time-stamped in hardware at the source and destination MEPs to achieve high-precision measurement
independent of software delays. The command averages the individual measurements and lists the
resulting minimum, average, and maximum delays.
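The way four timestamps yield a delay figure that is independent of processing time at the far end can be sketched in Python. This is an illustration of the usual two-way delay calculation, not the Brocade implementation: the DmSample fields and function names are hypothetical, timestamps are taken as integer microseconds, and frame delay variation is computed here as the difference between consecutive delay samples, which is one common convention.

```python
from statistics import mean
from typing import Dict, List, NamedTuple


class DmSample(NamedTuple):
    """Timestamps (microseconds) for one delay measurement exchange."""
    tx_request: int  # source MEP clock: request sent
    rx_request: int  # destination MEP clock: request received
    tx_reply: int    # destination MEP clock: reply sent
    rx_reply: int    # source MEP clock: reply received


def round_trip_delay(s: DmSample) -> int:
    """Round-trip delay with the responder's processing time removed.

    Each subtraction uses timestamps from a single clock, so the two
    MEPs do not need synchronized clocks.
    """
    return (s.rx_reply - s.tx_request) - (s.tx_reply - s.rx_request)


def summarize(samples: List[DmSample]) -> Dict[str, float]:
    """Return min, average, and max delay plus the largest variation."""
    if not samples:
        raise ValueError("at least one sample is required")
    delays = [round_trip_delay(s) for s in samples]
    variations = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "min": min(delays),
        "avg": mean(delays),
        "max": max(delays),
        "max_fdv": max(variations, default=0),
    }
```

Because the responder's residence time (tx_reply minus rx_request) is subtracted, software or queuing delay inside the destination MEP does not inflate the result.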
Figure 22. Y.1731 delay measurement example
IEEE 802.3ah Ethernet First Mile (EFM) Link OAM
IEEE 802.3ah Ethernet First Mile (EFM) link OAM monitors individual links and supports troubleshooting them.
That is, 802.3ah OAM operates on a point-to-point link and does not propagate beyond a single hop. As
shown in Figure 23, this IEEE standard was originally developed to monitor the link between a service
provider and customer, which is usually called the first mile link.
802.3ah EFM OAM supports the following functions:
OAM discovery
  Used to discover the 802.3ah EFM OAM capabilities of the peer device
Remote failure indication (critical events)
  Used to inform the peer node that the receive path of the link is non-operational
  Also includes communication of conditions such as dying gasp
Link monitoring
  Can generate event notifications (alarms) when defined error thresholds are exceeded
Remote loopback testing
  Puts the peer in data loopback state
802.3ah supports two modes of operation:
Active mode
  Normally used by a device controlled by a service provider
  The device can source OAM PDU packets in order to initiate an EFM OAM discovery process
Passive mode
  Normally used by customer devices connected to a service provider device
  The device cannot source OAM PDU packets, but it can respond to received OAM PDUs
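One consequence of the two modes is that discovery only starts when at least one end of the link is active. The Python sketch below captures that rule as it follows from the descriptions above; the function name discovery_can_start is illustrative.

```python
VALID_MODES = ("active", "passive")


def discovery_can_start(local_mode: str, peer_mode: str) -> bool:
    """Return True if EFM OAM discovery can be initiated on the link.

    A passive device only responds to received OAM PDUs, so discovery
    needs at least one active end to send the first OAM PDU.
    """
    for mode in (local_mode, peer_mode):
        if mode not in VALID_MODES:
            raise ValueError(f"unknown mode: {mode}")
    return "active" in (local_mode, peer_mode)


if __name__ == "__main__":
    for local in VALID_MODES:
        for peer in VALID_MODES:
            print(local, peer, discovery_can_start(local, peer))
```

This matches the typical deployment in the text: the service provider device is active, and the customer device can remain passive.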
Figure 23. IEEE 802.3ah EFM OAM
Figure 24 shows an example of the output of an 802.3ah EFM OAM show command. Note that the show
command displays not only local link OAM information, but also remote link OAM information.
Figure 24. Example of 802.3ah EFM OAM show command
Layer 2 OAM Summary
Table 1 presents a summary of the Layer 2 OAM tools described in this section.
| | Layer 2 Trace | Port Loop Detection | UDLD | Single-Link Keep-Alive | 802.1ag CFM | Y.1731 PM | 802.3ah EFM OAM |
|---|---|---|---|---|---|---|---|
| Intended Application | Layer 2 network troubleshooting and detection of misconfiguration | Layer 2 network troubleshooting and detection of misconfiguration | Single-link keep alive | Single-link keep alive | Service verification | Performance (SLA) verification | Customer access verification |
| Supports | Layer 2 topology discovery, Layer 2 loop detection | Layer 2 loop detection | Single-link keep alive | Single-link keep alive | Layer 2 Connectivity Check, Linktrace, loopback | One-way delay and delay variation | Single-link OAM: fault detection, discovery, loopback |
| Generation | Manual | Automatic | Automatic | Automatic | CC: auto; LT, LB: manual | Manual | Auto, Manual (LB) |
| Standard | No | No | No | Yes | Yes | Yes | Yes |
MPLS OAM TOOLS
This section addresses the MPLS OAM tools listed in Figure 4:
LSP Ping
LSP Traceroute
BFD for RSVP-TE LSPs

LSP Ping
LSP Ping provides OAM functionality for MPLS networks based on RFC 4379. LSP Ping is used to detect
data plane failure and to check the consistency between the data plane and the control plane.
LSP Ping verifies that packets that belong to a particular Forwarding Equivalence Class (FEC) actually end
their MPLS path on a Label Switching Router (LSR) that is an egress for that FEC. LSP Ping sends MPLS
echo requests following the same data path that normal MPLS packets would traverse (Figure 25).
LDP LSP Ping and RSVP LSP Ping are supported, as shown in Figure 26 and Figure 27, respectively.
Figure 25. LSP Ping operation
Figure 26. LDP LSP Ping example
Figure 27. RSVP LSP Ping example
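The check that LSP Ping performs can be modeled in a few lines. The sketch below is a simplified simulation, not an implementation of RFC 4379: the per-router forwarding table `data_plane`, the set `egress_routers` (standing in for the control plane's view), and the router names are all hypothetical.

```python
def lsp_ping(ingress, fec, data_plane, egress_routers, max_hops=32):
    """Follow the data path for `fec` and report where the echo request ends.

    data_plane maps router -> {fec: next_router}; a missing entry means the
    MPLS path for that FEC ends on that router.
    Returns (last_router, ok), where ok is True only if the path ends on a
    router that the control plane lists as an egress for the FEC.
    """
    router = ingress
    for _ in range(max_hops):
        next_hop = data_plane.get(router, {}).get(fec)
        if next_hop is None:
            # The echo request stops here; compare against the control plane.
            return router, router in egress_routers
        router = next_hop
    return router, False  # hop limit reached: treat as a forwarding loop


if __name__ == "__main__":
    fec = "10.0.0.2/32"
    healthy = {"PE1": {fec: "P1"}, "P1": {fec: "PE2"}}
    broken = {"PE1": {fec: "P1"}}  # P1 has lost its forwarding entry
    print(lsp_ping("PE1", fec, healthy, {"PE2"}))
    print(lsp_ping("PE1", fec, broken, {"PE2"}))
```

In the broken case the path ends on a transit router that is not an egress for the FEC, which is the data plane versus control plane inconsistency that LSP Ping is meant to expose.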
LSP Traceroute
LSP Traceroute provides OAM functionality for MPLS networks based on RFC 4379. LSP Traceroute is used
to isolate a data plane failure to a particular router and to provide LSP path tracing.
With LSP Traceroute, an echo request packet is sent to each transit LSR and the LER. The echo request
follows the same data path that normal MPLS packets would traverse. A transit LSR or an LER receiving the
echo request checks that it is indeed a transit LSR or LER for this path and returns echo replies (Figure 28).
LDP LSP Traceroute and RSVP LSP Traceroute are supported, as exemplified in Figure 29 and Figure 30,
respectively.
Figure 28. LSP Traceroute operation
Figure 29. LDP LSP Traceroute example
Figure 30. RSVP LSP Traceroute example
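The fault isolation behavior can be sketched as a loop that probes one hop at a time. In RFC 4379 traceroute mode this is done by sending echo requests with TTL values of 1, 2, 3, and so on; the simulation below assumes that mechanism and uses hypothetical router names and a hypothetical `failed_router` argument to model a data plane failure.

```python
def lsp_traceroute(path, failed_router=None):
    """Probe each hop of `path` (routers after the ingress, egress last) in turn.

    Returns a list of (ttl, replying_router). replying_router is None when the
    probe times out, which localizes the fault to that hop.
    """
    results = []
    for ttl, router in enumerate(path, start=1):
        # A probe with this TTL expires at `router`, but only if every router
        # up to and including it is still forwarding on the data plane.
        if failed_router is not None and failed_router in path[:ttl]:
            results.append((ttl, None))
            break
        results.append((ttl, router))
    return results


if __name__ == "__main__":
    path = ["P1", "P2", "PE2"]
    print(lsp_traceroute(path))
    print(lsp_traceroute(path, failed_router="P2"))
```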
LSP Ping and LSP Traceroute Considerations
The following are common considerations for LSP Ping and LSP Traceroute:
Redundant RSVP LSPs. LSP Ping or LSP Traceroute on an LSP is performed on the currently active path.
One-to-one Fast ReRoute (FRR) LSPs. LSP Ping or LSP Traceroute on a one-to-one FRR LSP is performed on the active path. If a path switchover occurs while a Ping or Traceroute is in progress, the echo request is sent out on the old active path.
FRR bypass LSPs. You can Ping or Traceroute the protected LSP and the bypass tunnel separately, for example, by specifying the name of the LSP.
Transit-originated detour. The user can initiate a Ping or Traceroute operation on a transit-originated detour LSP. Because the session name does not uniquely identify a session on a transit LSR, the user needs to specify the entire session ID (including the tunnel end-point, tunnel ID, and extended tunnel ID) for the detour LSP to which the LSP Ping or Traceroute command is applied.
LSP re-optimization. If LSP re-optimization occurs while the Ping or Traceroute is in progress, the echo request will be sent out on the current LSP instance until the new instance is created.
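The transit-originated detour case comes down to a lookup-key problem: on a transit LSR, two sessions can share a name, so only the full session ID is unambiguous. The sketch below illustrates this with a hypothetical `RsvpSessionId` tuple and made-up addresses and names; it does not reflect any actual command syntax.

```python
from typing import NamedTuple


class RsvpSessionId(NamedTuple):
    tunnel_endpoint: str      # egress address of the tunnel
    tunnel_id: int
    extended_tunnel_id: str   # hypothetical value; often the ingress address


def find_by_name(sessions, name):
    """Return every session ID on this LSR whose session name matches."""
    return [sid for sid, session_name in sessions.items() if session_name == name]


if __name__ == "__main__":
    # Sessions seen by one transit LSR: two ingress routers reuse the same name.
    sessions = {
        RsvpSessionId("10.0.0.2", 1, "10.0.0.1"): "to-pe2",
        RsvpSessionId("10.0.0.2", 1, "10.0.0.9"): "to-pe2",
    }
    print(find_by_name(sessions, "to-pe2"))                  # ambiguous: two matches
    print(sessions[RsvpSessionId("10.0.0.2", 1, "10.0.0.1")])  # exact match
```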
BFD for RSVP-TE LSPs
Bidirectional Forwarding Detection (BFD) for RSVP-TE LSPs defines a method for rapid detection of the failure of
the data path of an LSP (Figure 31). While LSP Ping can be used for this purpose, BFD for RSVP-TE LSP
provides the following advantages:
BFD for RSVP-TE LSP can be configured to dynamically detect data plane failure of MPLS RSVP LSPs.
BFD for RSVP-TE LSP provides faster failure detection, since it does not require control plane verification as LSP Ping does.
BFD for RSVP-TE LSP can be used to concurrently detect faults on a number of LSPs without the manual interaction that LSP Ping requires.
BFD allows for the detection of a forwarding path failure in 300 milliseconds or less (depending on the
configuration).
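The dependence on configuration can be made concrete with the detection-time rule from RFC 5880 (asynchronous mode): the local detection time is the remote detect multiplier times the agreed interval, where the agreed interval is the larger of the local required minimum receive interval and the remote desired minimum transmit interval. The timer values below are assumptions chosen to reproduce the 300 ms figure; they are not stated defaults of any platform.

```python
def bfd_detection_time_ms(local_min_rx_ms, remote_min_tx_ms, remote_detect_mult):
    """Detection time seen by the local system, in milliseconds (RFC 5880 rule)."""
    agreed_interval_ms = max(local_min_rx_ms, remote_min_tx_ms)
    return agreed_interval_ms * remote_detect_mult


if __name__ == "__main__":
    # Hypothetical timers: 100 ms intervals with a multiplier of 3.
    print(bfd_detection_time_ms(100, 100, 3))
    # A slower peer stretches the detection time even if the local side is fast.
    print(bfd_detection_time_ms(50, 200, 3))
```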
Figure 31. BFD for RSVP-TE operation
BFD for RSVP-TE LSP should be used selectively to monitor unreliable paths, such as those through
non-MPLS devices (for example, optical switches). In Figure 32, for example, the LSP traverses optical switches.
The optical switches keep the links to the MPLS routers up even in the event of a failure between the
optical switches. This would prevent the MPLS routers from performing a path switchover (since, as far as the
MPLS routers are concerned, the link between them is up). BFD for RSVP-TE LSP would detect the LSP path
failure and would trigger a path switchover (see footnote 3).
Since a link failure will trigger FRR directly, the only benefit of using BFD for RSVP-TE LSP when there are no
optical switches (or other transport types that would prevent MPLS routers from detecting the physical path
as down) would be to detect control plane failures.
3. In configurations in which there is no alternative path, the LSP is brought down and the BFD session is deleted.
The LSP then follows the normal retry procedures to come back up.
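The optical-switch scenario can be summarized as a small decision function. This is a sketch under stated assumptions: the function name, its parameters, and the 300 ms default are illustrative and do not describe the router's actual state machine.

```python
def should_switch_over(local_link_up, bfd_enabled, ms_since_last_bfd_packet,
                       detection_time_ms=300):
    """Decide whether the router would move traffic off the current LSP path."""
    if not local_link_up:
        # A local link failure triggers FRR directly, with or without BFD.
        return True
    if bfd_enabled and ms_since_last_bfd_packet > detection_time_ms:
        # The link still looks up (for example, held up by an optical switch),
        # but no BFD packets are arriving, so the data path is down.
        return True
    return False


if __name__ == "__main__":
    # Failure between two optical switches: the local link stays up.
    print(should_switch_over(True, bfd_enabled=False, ms_since_last_bfd_packet=10_000))
    print(should_switch_over(True, bfd_enabled=True, ms_since_last_bfd_packet=301))
```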
Figure 32. BFD for RSVP-TE LSP used to monitor paths through non-MPLS devices
BFD for RSVP-TE LSP can be enabled or disabled on the fly at the global MPLS level (see Figure 33 and footnote 4) or for
each individual RSVP LSP (see Figure 34) without affecting the LSP operational status. In addition, BFD for
RSVP-TE LSP parameters can be changed on the fly without changing the state of the BFD session.
Figure 33. Enabling BFD for RSVP LSP globally
Figure 34. Enabling BFD for a specific RSVP-TE LSP
MPLS OAM Summary
Table 2 presents a summary of the MPLS OAM tools described in this section.
| | LSP Ping | LSP Traceroute | BFD for RSVP-TE LSPs |
|---|---|---|---|
| Intended Application | To detect data plane failure and to check the consistency between the data plane and the control plane | To isolate the data plane failure to a particular router and to provide LSP path tracing | Fast data plane failure detection for RSVP LSPs |
| Supports | Connectivity verification | Connectivity troubleshooting, fault localization | Fast data plane failure detection (link may be up, but data path is down) |
| Generation | Manual | Manual | Automatic |
| Standard | Yes | Yes | Yes |
4. The number of BFD sessions supported by the system must be taken into account when enabling BFD for RSVP-TE
globally.
IP AND VRF OAM TOOLS
This section addresses the IP and L3VPN (VRF) OAM tools listed in Figure 4:
IP and VRF Ping
IP and VRF Traceroute
BFD for OSPFv2, OSPFv3, IS-IS, and BGP4

IP and VRF Ping
IP Ping is a tool used to verify connectivity at the IP level. The IP ping command sends an Internet Control
Message Protocol (ICMP) echo request to the IP address or selected hostname and waits for a reply (see
Figure 35). The Ping VRF option lets you ping an address on a specific L3VPN, that is, an address
associated with a VRF table.
Figure 36 shows an example of IPv4 Ping, while Figure 37 shows an example of IPv6 Ping. Note that Ping
VRF is supported for both IPv4 and IPv6.
Figure 35. IP Ping operation
Figure 36. IPv4 Ping example
Figure 37. IPv6 Ping example
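To make the echo request concrete, the sketch below builds an ICMPv4 echo request message (type 8, code 0) and computes its Internet checksum. It only constructs the bytes and does not send them, since raw sockets usually need elevated privileges. The identifier, sequence number, and payload are arbitrary illustrative values, and IPv6 Ping uses ICMPv6 with a different type code and checksum scope that this sketch does not cover.

```python
import struct


def internet_checksum(data: bytes) -> int:
    """16-bit one's-complement checksum used by ICMP."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return an ICMPv4 echo request message (without the IP header)."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", 8, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", 8, 0, checksum, identifier, sequence) + payload


if __name__ == "__main__":
    packet = build_echo_request(0x1234, 1, b"oam")
    print(packet.hex())
```

A receiver can validate the message by running `internet_checksum` over the whole message, which yields zero when the message is intact.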
IP and VRF Traceroute
The IP Traceroute tool identifies the path that packets take through a network on a hop-by-hop basis. The
IP Traceroute tool works by sending ICMP echo packets with varying IP Time-to-Live (TTL) values to the
destination (see Figure 38).
The Traceroute VRF option lets you traceroute an address on a specific L3VPN, that is, an address
associated with a VRF table.
Figure 39 shows an example of IPv4 Traceroute, while Figure 40 shows an example of IPv6 Traceroute.
Note that Traceroute VRF is supported for both IPv4 and IPv6.
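Both the Ping VRF and Traceroute VRF options resolve the target address in one VRF table rather than in the global routing table. The sketch below shows why that matters when customers use overlapping address space. The VRF names, prefixes, and next hops are hypothetical, and the linear longest-prefix match is a simplification of what a router does in hardware.

```python
import ipaddress


def vrf_lookup(vrf_tables, vrf, destination):
    """Longest-prefix match for `destination`, restricted to one VRF's table."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in vrf_tables.get(vrf, {}).items():
        net = ipaddress.ip_network(prefix)
        if addr.version == net.version and addr in net:
            if best is None or net.prefixlen > best[0].prefixlen:
                best = (net, next_hop)
    return best[1] if best else None


if __name__ == "__main__":
    tables = {
        "cust-a": {"10.1.0.0/16": "pe2", "10.1.2.0/24": "pe3"},
        "cust-b": {"10.1.0.0/16": "pe4"},
    }
    # The same address resolves differently depending on the VRF.
    print(vrf_lookup(tables, "cust-a", "10.1.2.5"))
    print(vrf_lookup(tables, "cust-b", "10.1.2.5"))
```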
Figure 38. IP Traceroute operation
Figure 39. IPv4 Traceroute example
Figure 40. IPv6 Traceroute example
BFD for OSPFv2, OSPFv3, IS-IS, and BGP4
Bidirectional Forwarding Detection (BFD) defines a method for rapid detection of the failure of a forwarding
path by checking that the next-hop router is alive. Without BFD enabled, it can take from 3 to 30 seconds to
detect that a neighboring router is not operational (and packet losses would occur during that time).
BFD can detect data path failures when a link is up but the data path is not, for example, failures caused by
misconfiguration or failures on paths through optical switches (see Figure 41). BFD allows for the detection of a
forwarding path failure in 300 ms or less (depending on the configuration). When BFD is enabled on a
routed interface, a BFD session is automatically established when a neighbor router is discovered.
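The practical effect of the shorter detection time is easiest to see as packets lost while the failure goes unnoticed. The arithmetic below uses the 3-second, 30-second, and 300 ms figures from the text; the traffic rate of 10,000 packets per second is a hypothetical value chosen only for illustration.

```python
def packets_lost(detection_time_ms, packets_per_second):
    """Packets sent into a failed path before the failure is detected."""
    return detection_time_ms * packets_per_second // 1000


if __name__ == "__main__":
    rate_pps = 10_000  # hypothetical traffic rate toward the failed next hop
    for label, detection_ms in [("without BFD (best case)", 3_000),
                                ("without BFD (worst case)", 30_000),
                                ("with BFD", 300)]:
        print(f"{label}: {packets_lost(detection_ms, rate_pps)} packets lost")
```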
Figure 41. BFD operation
Figure 42 shows an example of BFD configuration. BFD can be enabled or disabled for all interfaces or
per interface for use with OSPFv2 (that is, IPv4), OSPFv3 (that is, IPv6), and IS-IS, as shown in Figure 43,
Figure 44, and Figure 45, respectively.
Figure 42. BFD configuration example
Figure 43. Enabling/disabling BFD for OSPFv2 for all interfaces (top) or per interface (bottom)
Figure 44. Enabling/disabling BFD for OSPFv3 for all interfaces (top) and per interface (bottom)
Figure 45. Enabling/disabling BFD for IS-IS for all interfaces (top) and per interface (bottom)
BFD for BGP4 supports single-hop and multi-hop BFD on Ethernet, POS, and Virtual Interfaces. BFD for
BGP4 can be enabled or disabled at the global BGP router level, for each individual peer, or for a peer
group, as shown in Figure 46, Figure 47, and Figure 48, respectively.
Figure 46. Enabling/disabling BFD globally for BGP4
Figure 47. Enabling/disabling BFD for a specific BGP4 peer
Figure 48. Enabling/disabling BFD for a BGP4 peer group
IP and VRF OAM Summary
Table 3 presents a summary of the IP and VRF OAM tools described in this section.
| | IP Ping, VRF Ping | IP Traceroute, VRF Traceroute | BFD for OSPFv2, OSPFv3, IS-IS, BGP4 |
|---|---|---|---|
| Intended Application | Connectivity verification at the IP level | Identification of the path that IP packets take through a network on a hop-by-hop basis | Fast data path failure detection |
| Supports | Connectivity verification | Connectivity troubleshooting, fault localization | Data path failure detection (link may be up, but data path is down) |
| Generation | Manual | Manual | Automatic |
| Standard | Yes | Yes | Yes |
SUMMARY
This paper reviewed OAM tools available for MPLS, IP, and Ethernet networks at various layers of the stack
and reviewed best practices for choosing the right OAM tool to use in a particular network deployment.
These tools provide unparalleled power for an operator to proactively manage networks and customer
Service Level Agreements (SLAs). These OAM tools address fault detection, fault verification, and fault
isolation; enable proactive detection of service degradation; and provide service performance monitoring
and SLA verification.
© 2010 Brocade Communications Systems, Inc. All Rights Reserved. 11/10 GS-BP-356-00
Brocade, the B-wing symbol, BigIron, DCFM, DCX, Fabric OS, FastIron, IronView, NetIron, SAN Health, ServerIron, TurboIron, and Wingspan
are registered trademarks, and Brocade Assurance, Brocade NET Health, Brocade One, Extraordinary Networks, MyBrocade, VCS, and VDX
are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or
service names mentioned are or may be trademarks or service marks of their respective owners.
Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning
any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes
to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes
features that may not be currently available. Contact a Brocade sales office for information on feature and product availability.
Export of technical data contained in this document may require an export license from the United States government.