Oam Best Practices

8/3/2019 Oam Best Practices


SERVICE PROVIDER

OAM Best Practices in Mission-Critical

MPLS, IP, and Carrier Ethernet Networks

A variety of Operations, Administration, and Management (OAM)

protocols and tools have been developed recently for MPLS, IP, and

Ethernet networks, which provide the unparalleled power to proactively

manage networks and customer Service-Level Agreements (SLAs).

This paper reviews the OAM tools available in MPLS, IP, and Ethernet

networks at various layers and describes best practices for choosing

the right OAM tool to use for particular network deployments.



SERVICE PROVIDER BEST PRACTICES GUIDE


CONTENTS

Overview
    OAM Layering
    OAM Tools and Network Layers

Layer 2 OAM Tools
    Layer 2 Trace
    Port Loop Detection
    Unidirectional Link Detection
    Single-Link LACP Keep-Alive
    IEEE 802.1ag CFM
        Continuity Check Messages (CCM)
        Loopback Messages (LBM)
        Linktrace Messages (LTM)
        Brocade Implementation of 802.1ag
        Hierarchical Fault Detection using 802.1ag
        IEEE 802.1ag Configuration Example
        IEEE 802.1ag CFM versus ITU-T Y.1731 OAM
    ITU-T Y.1731 Performance Management
    IEEE 802.3ah Ethernet First Mile (EFM) Link OAM
    Layer 2 OAM Summary

MPLS OAM Tools
    LSP Ping
    LSP Traceroute
    LSP Ping and LSP Traceroute Considerations
    BFD for RSVP-TE LSPs
    MPLS OAM Summary

IP and VRF OAM Tools
    IP and VRF Ping
    IP and VRF Traceroute
    BFD for OSPFv2, OSPFv3, IS-IS, and BGP4
    IP and VRF OAM Summary

Summary





OVERVIEW

A variety of OAM tools have been developed in recent years for MPLS, IP, and Ethernet networks. These

tools provide unparalleled power for an operator to proactively manage networks and customer Service-

Level Agreements (SLAs). These OAM tools address fault detection, fault verification, and fault isolation and

provide proactive detection of service degradation, service performance monitoring, and SLA verification.

In MPLS, IP, and Ethernet networks, Operations, Administration, Management, and Provisioning

(OAM&P) spans the Management Plane (see Figure 1), represented by Network Management

Systems (NMS) and Element Management Systems (EMS), and the Network Plane, represented by Network

Elements (NE) and the OAM tools that run across NEs.

This white paper reviews the OAM tools available in MPLS, IP, and Ethernet networks at various layers of the

networking stack and describes best practices for choosing the right OAM tool to use for a

particular network deployment.

Figure 1. OAM tools

OAM Layering

OAM tools can be classified into three main types based on the OAM layer (Figure 2):

Service Layer OAM. Tools applicable to services on an end-to-end basis

Network Layer OAM. Tools applicable to services over a particular network

Transport Layer OAM. Tools applicable to the transport layer of the network

Figure 2. OAM layers

These OAM layers are hierarchical in nature. For example, in Figure 3 the Service Layer OAM for Operator A

can be seen as a Transport Layer OAM for the service provider, who sees the service provided by Operator A

as a transport tunnel for the customer.

[Figure 1 labels: Management Plane (NMS, EMS); OAM&P; Network Plane (Network Elements). The scope of this paper is OAM tools across Network Elements.]

[Figure 2 labels: Service Layer OAM; Network Layer OAM; Transport Layer OAM.]





NOTE: The terms customer, service provider, and operator are commonly used to reflect the

business relationships that often exist among organizations and individuals. An operator provides a

single Layer 2 or Layer 3 backbone network to a service provider. An operator can be identical to, or a

part of the same organization as, a service provider.

The best OAM tools to use at a particular network layer depend on the type of network. For example, in

Figure 3, Operator A has an MPLS network and uses MPLS OAM tools, while Operator B has an Ethernet network and uses Ethernet OAM tools.

Figure 3. Customer, operator, and service provider views of OAM layering

OAM Tools and Network Layers

Each network layer has its own best-suited OAM tools. Figure 4 lists common OAM tools applicable to

Layer 2, MPLS, IP (Layer 3), and the Virtual Private Network (VPN), which includes Layer 2 and Layer 3 VPNs.

Note that certain OAM tools, for example, 802.1ag CFM and Y.1731 PM, are applicable to Layer 2 networks

and also to Layer 2 VPN services, as shown in Figure 4.

The following sections address the OAM tools shown in Figure 4.

Figure 4. Each network layer has its own best-suited OAM tools

[Figure 3 labels: Customer network (Site 1); Customer network (Site 2); Operator B Network (Ethernet); Operator A Network (MPLS); Service Provider; Link OAM on each link; Ethernet OAM (Operator B); MPLS OAM (Operator A); Service OAM.]

[Figure 4 labels, by layer:

VPN: VRF Ping and Traceroute (L3VPN); 802.1ag CFM for VPLS/VLL and Y.1731 PM for VPLS/VLL (L2VPN)

IP: Ping and Traceroute; BFD for OSPF and IS-IS

MPLS: LSP Ping and Traceroute; BFD for RSVP-TE LSPs

Layer 2: Layer 2 trace; Port loop detection; UDLD; Single-link LACP keep-alive; 802.1ag CFM/Y.1731 PM; 802.3ah EFM OAM]





LAYER 2 OAM TOOLS

This section addresses the Layer 2 OAM tools listed in Figure 4. These tools function in Layer 2 networks to

monitor:

Layer 2 services and connectivity (VLANs): Layer 2 Trace, Port Loop Detection, 802.1ag CFM, and Y.1731 PM

Layer 2 links: UDLD, single-link keep-alive, and 802.3ah EFM OAM

Layer 2 Trace

Layer 2 Trace is a Brocade proprietary OAM tool that traces the traffic path in a VLAN. Layer 2 Trace is run

on demand using a CLI command. Layer 2 Trace can be used to trace a particular IP, MAC, or hostname in a

given VLAN. The Layer 2 Trace command (trace-l2) probes the entire Layer 2 topology and displays the

input or output ports of each hop in the path, the round trip travel time of each hop, and each hop's Layer 2

protocol (such as STP, RSTP, 802.1w, SSTP, metro ring, or route-only).

Figure 5 shows an example of the Layer 2 Trace command (trace-l2) executed for the given network

configuration. The probed Layer 2 information is discarded after 10 minutes or when a new trace-l2

command is issued.

Layer 2 Trace can also display hops that form a forwarding loop in a VLAN. Figure 6 is an example in which

the active topology for VLAN 2 forms a forwarding loop. In this case, Layer 2 Trace on VLAN 2 detects the

forwarding loop and issues the indicated warning message.

Layer 2 Trace configuration considerations:

The devices that will participate in the Layer 2 Trace protocol must be assigned to a VLAN, and all devices on that VLAN must be Brocade devices that support the Layer 2 Trace protocol.

Devices that do not support the Layer 2 Trace protocol simply forward Layer 2 Trace packets without a reply and are transparent to the Layer 2 Trace protocol.

The destination for the packet with the trace-l2 protocol must be a device that supports the Layer 2 Trace protocol.

The destination cannot be a client, such as a personal computer, or a device from another vendor.

Figure 5. Layer 2 Trace example





Figure 6. Layer 2 trace in a loop topology

Port Loop Detection

Port Loop Detection is a Brocade proprietary OAM tool used to detect Layer 2 forwarding loops. Upon detecting a Layer 2 forwarding loop, the Port Loop Detection tool disables the errant port(s). The device can

be configured to automatically re-enable ports after a timeout period.

This OAM tool sends special protocol packets from the device and detects Layer 2 forwarding loops when

these packets are received on ports on the same device.

Layer 2 Trace can also detect forwarding loops. However, the difference is that Port Loop Detection does

not require manual interaction to detect loops. That is, Layer 2 Trace is run on demand using a CLI

command, while Port Loop Detection runs continuously to provide automatic detection and reduce

downtime due to misconfigurations.

Port Loop Detection supports two modes of operation:

Strict mode. Detects a Layer 2 forwarding loop where packets loop back to the same physical port,

that is, a hairpin loop.

NetIron(config)#interface ethernet 1/1

NetIron(config-if-e1000-1/1)#loop-detection

Loose mode. Detects Layer 2 forwarding loops for a given VLAN or a VLAN group. Loose mode floods test packets to the entire VLAN or VLAN group. See Figure 7.

NetIron(config)#vlan 20

NetIron(config-vlan-20)#loop-detection

NetIron(config)#vlan-group 10

NetIron(config-vlan-group-10)#add-vlan 1 to 100

NetIron(config-vlan-group-10)#loop-detection
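
The difference between the two modes can be sketched in a few lines of Python. This is an illustrative model only, not the Brocade implementation: the Probe fields, the function name, and the rule that a strict-mode probe returning on a different port is ignored are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Probe:
    """Hypothetical loop-detection probe (field names are assumptions)."""
    device_id: str              # device that originated the probe
    port: str                   # port the probe was sent from
    vlan: Optional[int] = None  # VLAN flooded in loose mode; None in strict mode


def classify_probe(local_id: str, rx_port: str, probe: Probe) -> str:
    """Return 'none', 'strict-loop', or 'loose-loop' for a received probe."""
    if probe.device_id != local_id:
        # A probe from another device says nothing about a loop on this one.
        return "none"
    if probe.vlan is None:
        # Strict mode: a loop exists only if the probe comes back on the
        # same physical port that sent it (hairpin loop).
        return "strict-loop" if rx_port == probe.port else "none"
    # Loose mode: our own probe coming back on any port of the VLAN
    # indicates a forwarding loop in that VLAN.
    return "loose-loop"


if __name__ == "__main__":
    print(classify_probe("sw1", "1/1", Probe("sw1", "1/1")))           # strict-loop
    print(classify_probe("sw1", "1/2", Probe("sw1", "1/1", vlan=20)))  # loose-loop
```

In this model a device would disable the receiving port whenever the result is not 'none', which corresponds to the errant-port behavior described above.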





Figure 7. Port Loop Detection example (loose mode)

Unidirectional Link Detection

Unidirectional Link Detection (UDLD) is a Brocade proprietary OAM tool used to monitor an Ethernet link

between two Brocade NetIron devices and to provide fast detection of link failures.

Ports enabled for UDLD exchange proprietary health-check packets once every keep-alive interval. The

keep-alive interval can be configured between 100 ms and 6000 ms in increments of 100 ms. The default

keep-alive interval is 500 ms.

If a port does not receive a health-check packet from the port at the other end of the link after a number of keep-alive retry intervals, UDLD brings the port down. As a consequence, UDLD brings the ports on both

ends of the link down if the link goes down in one direction. The number of keep-alive retries can be configured

from 3 to 10, and the default is 5.
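
As a rough guide, the time UDLD needs to declare a failure is about the keep-alive interval multiplied by the retry count. The following Python sketch validates the ranges quoted above and computes that product. The function name is hypothetical, and treating detection time as exactly interval times retries is a simplifying assumption.

```python
def udld_detection_time_ms(keepalive_ms: int = 500, retries: int = 5) -> int:
    """Approximate UDLD failure-detection time in milliseconds.

    Ranges follow the text: interval 100-6000 ms in 100 ms steps,
    retries 3-10. Defaults are 500 ms and 5 retries.
    """
    if not 100 <= keepalive_ms <= 6000 or keepalive_ms % 100 != 0:
        raise ValueError("keep-alive interval must be 100-6000 ms in 100 ms steps")
    if not 3 <= retries <= 10:
        raise ValueError("retries must be between 3 and 10")
    # Assumption: the port is brought down after `retries` consecutive
    # intervals pass without a health-check packet.
    return keepalive_ms * retries


if __name__ == "__main__":
    print(udld_detection_time_ms())          # defaults: 2500
    print(udld_detection_time_ms(100, 3))    # fastest setting: 300
    print(udld_detection_time_ms(6000, 10))  # slowest setting: 60000
```

With the defaults, a unidirectional failure is detected in roughly 2.5 seconds; the most aggressive setting brings this down to about 300 ms.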

When UDLD is enabled on a port, the port transitions into an init state to detect if the other end supports

UDLD. The port does not go down if the other end is not UDLD-enabled.

Figure 8 illustrates UDLD used to monitor a link between two nodes. Figure 9 is an example of a global

show UDLD command. The show command also supports showing information for a specific port (not

shown in the figure).

Configuration considerations include the following:

UDLD is supported only on Ethernet ports.

To configure UDLD on a LAG group, you must configure the feature on each port of the group individually. Configuring UDLD on a LAG group's primary port enables the feature on that port only.

Dynamic LAG is not supported. If you want to configure a LAG group that contains ports on which UDLD is enabled, you must remove the UDLD configuration from the ports. After you create the LAG group, you can add the UDLD configuration back.

Tagged UDLD is also supported:

NetIron(config)# link-keepalive ethernet 1/18 vlan 22

Figure 8. UDLD configuration example





Figure 9. Displaying UDLD information

Single-Link LACP Keep-Alive

The Single-Link Link Aggregation Control Protocol (LACP) Keep-Alive OAM tool supports a single-port Link

Aggregation Group (LAG). Single-Link LACP Keep-Alive is used to monitor an Ethernet link between two

devices and to provide for fast detection of link failures. This is similar to the UDLD OAM tool, except that

the Single-Link LACP Keep-Alive OAM tool uses LACP, which is a standard protocol, instead of a proprietary

protocol between nodes.

When should you use Single-link LACP Keep-Alive instead of UDLD?

UDLD is a proprietary protocol. Single-link LACP Keep-Alive can be used to interoperate with third-party equipment that also supports this feature.

With Single-Link LACP Keep-Alive, LACP PDUs are exchanged between the two nodes to determine if the

connection between the devices is still active. If no LACP PDUs are received from the other node after 3

lacp-timeout periods, a timeout event occurs and the port is blocked.

The LACP keep-alive PDUs can be sent every 1 second (lacp-timeout short) or every 30 seconds (lacp-

timeout long). Since a timeout is declared after missing 3 consecutive LACP keep-alive PDUs, a timeout can

be declared in 3 seconds or 90 seconds, depending on the selected LACP keep-alive PDU interval.

To configure single-link LACP keep-alive timeout intervals:

NetIron(config)# lacp-timeout short | long

Figure 10 shows an example of a single-link LACP keep-alive configuration.

Figure 10. Single-Link LACP Keep-Alive example





IEEE 802.1ag CFM

The IEEE 802.1ag Connectivity Fault Management (CFM) OAM tool facilitates path discovery, fault

detection, fault verification and isolation, fault notification, and fault recovery.

CFM terminology (see Figure 11):

MD (Maintenance Domain). The part of a network for which faults in Layer 2 connectivity can be managed.

MEP (Maintenance End Point). A Maintenance Point (MP) at the edge of a domain that actively sources CFM messages. There are two types of MEPs, as shown in Figure 12:

Up (inward) MEP: Considering a MEP on a given physical port, an up MEP sends 802.1ag messages into the node.

Down (outward) MEP: A down MEP sends 802.1ag messages out of the node.

Note that up and down MEPs can be used to include or exclude more of the internal path inside a switch, as shown in Figure 13.

MIP (Maintenance Intermediate Point). A maintenance point internal to a domain that only responds when triggered by certain CFM messages. A MIP does not actively source CFM messages.

MA (Maintenance Association). A set of MEPs established to verify the integrity of a single service instance, for example, a VLAN or a VPLS.

ME (Maintenance Entity). A point-to-point relationship between two MEPs within a single MA.

MD Level. An integer from 0 to 7 in a field in a CFM PDU that is used, along with the VLAN ID, to identify which MIPs/MEPs would be interested in the contents of a CFM PDU. MD levels are used to separate the MAs of customers, service providers, and operators. The MD levels that 802.1ag recommends for customers, service providers, and operators are shown in Figure 11.

CFM Hierarchy. MD levels create a hierarchy in which 802.1ag messages sent by customers, service providers, and operators are processed only by MIPs and MEPs at the respective level of the message.

A common practice is for the service provider to set up a MIP at the customer MD level at the edge of the network, as shown in Figure 11, to allow the customer to check continuity of the Ethernet service to the edge of the network. Similarly, operators set up MIPs at the service provider level at the edge of their respective networks, as shown in Figure 11, to allow service providers to check the continuity of the Ethernet service to the edge of the operators' networks. Inside an operator network, all MIPs are at the respective operator level, also shown in Figure 11.
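
The MD level hierarchy can be illustrated with a short Python sketch. The level ranges are the ones shown in Figure 11 (customer 5 to 7, service provider 3 to 4, operator 0 to 2). The rule that a MEP discards lower-level CFM frames and passes higher-level frames through comes from IEEE 802.1ag rather than from this paper, and the function and type names are hypothetical.

```python
from enum import Enum


class Action(Enum):
    DROP = "drop"        # PDU from a lower level must not leak past this MEP
    PROCESS = "process"  # PDU belongs to this MEP's own level
    FORWARD = "forward"  # PDU from a higher level passes through transparently


# MD level ranges recommended by 802.1ag, as shown in Figure 11.
RECOMMENDED_LEVELS = {
    "customer": range(5, 8),
    "service provider": range(3, 5),
    "operator": range(0, 3),
}


def _check_level(level: int) -> None:
    if not 0 <= level <= 7:
        raise ValueError("MD level must be an integer from 0 to 7")


def role_for_level(level: int) -> str:
    """Map an MD level to the role that 802.1ag recommends it for."""
    _check_level(level)
    for role, levels in RECOMMENDED_LEVELS.items():
        if level in levels:
            return role
    raise AssertionError("unreachable: levels 0-7 are fully covered")


def mep_action(mep_level: int, pdu_level: int) -> Action:
    """Decide what a MEP at mep_level does with a CFM PDU at pdu_level."""
    _check_level(mep_level)
    _check_level(pdu_level)
    if pdu_level == mep_level:
        return Action.PROCESS
    if pdu_level > mep_level:
        return Action.FORWARD
    return Action.DROP


if __name__ == "__main__":
    # An operator MEP (level 1) lets a customer CCM (level 7) pass through.
    print(role_for_level(7), mep_action(mep_level=1, pdu_level=7))
```

This is why a customer's level 7 CCMs cross both operator networks untouched, while each operator's own CFM traffic stays inside its domain.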





Figure 11. IEEE 802.1ag terminology

Figure 12. Up and down MEPs

Figure 13. Using up and down MEPs to include or exclude the path inside a switch

IEEE 802.1ag CFM supports Continuity Check Messages (CCM), Linktrace, and Loopback Messages, which

are described in the following sections.

[Figure 11 labels: Customer network (Site 1); Customer network (Site 2); Operator B Network; Operator A Network; Ethernet; MPLS; Service Provider; MEP; MIP; ME; Up MEP; Down MEP. Customer MA: MD level 5 (7, 6, or 5). Service Provider MA: MD level 3 (4 or 3). Operator A MA and Operator B MA: MD level 1 (2, 1, or 0).]

[Figure 12 labels: Switch; Port; Down MEP; Up MEP.]

[Figure 13 labels: Switch; Down MEP; Up MEP.]





Continuity Check Messages (CCM)

CCMs are periodic hello messages multicast by a MEP within the maintenance domain to detect continuity

failures. If a MEP stops receiving periodic CCMs from a peer MEP on a remote bridge, it assumes that either

the remote bridge has failed or the continuity of the path between the two bridges has been interrupted.

Figure 14. 802.1ag Continuity Check Messages (CCM)

Loopback Messages (LBM)

LBM is a Unicast message used to verify the connectivity between a MEP and a peer MEP or MIP. Loopback

messages are also used for fault localization.

To verify the connectivity between a MEP and a peer MEP or a MIP, an LBM is initiated by the source MEP

with a destination MAC address set to the MAC address of desired peer MEP or MIP. The receiving MIP or

MEP responds to the LBM with a (Unicast) Loopback Reply (LBR) addressed to the source MEP.

LBM helps a MEP identify the location of a continuity fault along a given MA. A MIP in front of the continuity

fault responds with a loopback reply. A MIP or MEP behind the continuity fault does not respond. For

loopback to work, the MEP must know the MAC address of the target MIP or MEP. These MAC addresses

can be discovered using the Linktrace Message.
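
The fault-localization procedure described above can be sketched as a simple search along the path. This is an illustrative Python model, not a CFM implementation: the loopback callable stands in for sending an LBM and waiting for an LBR, and the target names are placeholders for MAC addresses learned through Linktrace.

```python
from typing import Callable, Optional, Sequence, Tuple


def localize_fault(
    targets: Sequence[str],
    loopback: Callable[[str], bool],
) -> Tuple[Optional[str], Optional[str]]:
    """Find the segment that contains a continuity fault.

    targets  -- MIP/MEP addresses ordered from nearest to farthest
    loopback -- returns True if an LBR is received for the given target

    Returns (last responding target, first silent target). The fault lies
    between the two. The second element is None when every target answers.
    """
    last_ok: Optional[str] = None
    for mac in targets:
        if loopback(mac):
            last_ok = mac          # this point is in front of the fault
        else:
            return last_ok, mac    # this point is behind the fault
    return last_ok, None


if __name__ == "__main__":
    path = ["mip-1", "mip-2", "mip-3", "mep-remote"]
    reachable = {"mip-1", "mip-2"}  # simulated fault between mip-2 and mip-3
    print(localize_fault(path, lambda mac: mac in reachable))
```

In the simulated run the fault is reported between mip-2 and mip-3, matching the rule that points in front of the fault reply and points behind it do not.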

Figure 15. 802.1ag Loopback Message (LBM)

Linktrace Messages (LTM)

LTM is a multicast message used by a source MEP to trace the path to other MEPs in the same MA. All reachable MIPs and MEPs respond with a Linktrace Reply (LTR) message addressed to the source

MEP. The originating MEP can then determine the MAC addresses of all MIPs and MEPs belonging to the

same MA.

Note that the source MEP sends a single LTM to the next hop along the trace path. However, it can receive

many LTR messages from different MIPs along the trace path and different MEPs terminating the branches

of the trace path.

Linktrace can also be used when no faults are apparent in order to discover the routes normally taken by

data through the network.

Figure 16. 802.1ag Linktrace Message (LTM)





Brocade Implementation of 802.1ag:

CCM period: 3.3 ms, 10 ms, 100 ms, 1 sec, 1 min, 10 min

Support for minimum CCM timers (3.3 ms) using hardware offload

Support for MIPs and up/down MEPs

Support for all 8 MD levels (0-7)

Support for the following types of end-points/services: VLANs, VPLS, and VLL

Hierarchical Fault Detection using 802.1ag

As shown in Figure 11, 802.1ag CFM defines a domain hierarchy in which customers, service providers, and

operators use different MD levels. This hierarchy is also used for fault detection.

Figure 17 illustrates an example in which a customer has an Ethernet service between Sites 1 and 2. This

Ethernet service is provided by Operators A and B. Operator B supports the service at the core with an MPLS

network. Operator A supports the service at Metro Locations 1 and 2 using a Layer 2 Ethernet network.

In Figure 17, a service continuity fault occurs inside Operator B's network. The customer can detect an end-

to-end service continuity fault using CCM, but it cannot determine the location of the fault within the

operators' networks. Operator A can detect that a service continuity fault exists within Operator B's network.

Operator B can detect the service continuity fault, but it cannot isolate the location of the continuity fault

using 802.1ag CFM, since it has an MPLS network. Operator B needs to use MPLS OAM tools to isolate the

fault location.

Figure 17. Example of 802.1ag hierarchical fault detection (refer to the numbered items below)





To simplify this example, the service provider level is not shown. If it were, the service provider would be

represented by the overall network from Operator A in Location 1 through Operator B to Operator A in

Location 2.

The following is an example of how this fault can be detected at the different levels of the hierarchy:

1. The customer detects a service continuity fault using CCMs.

2. Using Linktrace, the customer finds that the fault is beyond the MIPs at the border of Operator A.

3. Operator A detects a service continuity fault using CCMs.

4. Using Linktrace, Operator A determines that the fault is inside Operator B's network.

5. Operator B detects a service continuity fault using CCMs.

Operator B uses MPLS OAM tools to determine the location of the fault in its MPLS network. See the MPLS

OAM section for details on MPLS-specific OAM tools. This statement is included here to emphasize the fact

that you need to use the appropriate OAM tools for the type of network being used. In this case, Operator B

has an MPLS network and needs to use MPLS OAM tools. Operator A has a Layer 2 Ethernet network and

can use 802.1ag CFM. Note that Operator B's MPLS network is required to support 802.1ag CFM messages

over VPLS and VLL to allow customers and Operator A to use 802.1ag end-to-end.¹

Note that the customer, Operator A, and Operator B can concurrently and independently detect the

continuity fault and run Linktrace to determine the location of the fault. The steps above are numbered to

allow for easy reference to the respective actions depicted in Figure 17. The numbering does not imply an

ordered sequence of events. That is, Operator A does not have to wait for the customer to tell it that the

service is broken before it runs its own Continuity Check.

Note that the CCMs shown in Figure 17 can be set up to run continuously to detect potential continuity

faults or they can be set up on demand as needed.

IEEE 802.1ag Configuration Example

In Figure 18, a customer has a point-to-point service (VLL) over an MPLS network. In this example, the

customer runs CCM at 10 ms intervals at MD level 7 between CE1 and CE2. The service provider runs CCM at 10 ms intervals at MD level 4 between PE1 and PE2.

Figure 19 and Figure 20 show example configurations for CE1, CE2, PE1, and PE2 shown in Figure 18.

1 Brocade supports 802.1ag CFM over VPLS and VLL to allow Ethernet OAM to function end-to-end over an

MPLS core network.





Figure 18. Example of 802.1ag configuration

Figure 19. CE1 and CE2 configurations

[Figure 18 labels: CE1, PE1, PE2, and CE2 connected through ports 1/1 and 2/1; VLL across the MPLS network; MD level markers 7 and 4; Customer down MEP; Customer MIP; Service Provider up MEP; Customer CCM @ 10 sec; Service provider CCM @ 10 sec.]





Figure 20. PE1 and PE2 configurations

IEEE 802.1ag CFM versus ITU-T Y.1731 OAM

ITU-T Y.1731 OAM is a superset of IEEE 802.1ag CFM.² ITU-T Y.1731's ETH-CC (Ethernet Continuity

Check), ETH-LB (Ethernet Loopback), and ETH-LT (Ethernet Linktrace) OAM functions are equivalent to

802.1ag CCM, LBM, and LTM, respectively. Devices deploying 802.1ag CCM, LBM, and LTM can

interoperate with devices deploying Y.1731 ETH-CC, ETH-LB, and ETH-LT, respectively. However, Y.1731

ETH-CC supports either multicast or unicast messages, while 802.1ag CCM supports multicast messages

only. Therefore, to interoperate 802.1ag CCM with Y.1731 ETH-CC, the Y.1731 device must be set up to use

ETH-CC multicast messages.

ITU-T Y.1731 Performance Management

ITU-T Y.1731 Performance Management (PM) supports on-demand measurement of round-trip Frame Delay

(FD) and Frame Delay Variation (FDV). These measurements are made between defined MEPs (see Figure

21).

The main benefit of Y.1731 PM is for Service Level Agreement (SLA) monitoring and verification of services

provided to customers in aggregation, metro, and core networks. SLA monitoring and verification is

essential for delay-sensitive applications, for example, voice, and for services with SLA guarantees.

The Brocade implementation supports a high-precision, hardware-based time-stamping mechanism that

provides measurements with microsecond granularity. It also supports delay measurements for Layer 2

bridging services and for VPLS and VLL services.

Figure 21. Y.1731 delay measurement

2 Besides CFM and other functionality, ITU-T Y.1731 also includes Performance Management, which is

addressed in this paper.

[Figure 21 labels: Brocade MLX; MEP 2; ETH-DM; MEP 3; Brocade MLX.]





Figure 22 shows an example of the Y.1731 delay measurement between MEP3 and MEP2 shown in Figure

21. The command sends a selectable number (default is 10) of delay measurement PDUs (ETH-DM), which

are time-stamped in hardware at the source and destination MEPs to achieve high-precision measurement

independent of software delays. The command averages the individual measurements and lists the

resulting minimum, average, and maximum delays.
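
The arithmetic behind such a measurement can be sketched in Python. The two-way delay formula follows the ITU-T Y.1731 definition, which subtracts the responder's processing time so that the two clocks need not be synchronized. The data structure, the function names, the microsecond unit, and the choice to compute delay variation between consecutive samples are assumptions made for this example, not a description of the Brocade command.

```python
from statistics import mean
from typing import Dict, List, NamedTuple, Sequence


class DmSample(NamedTuple):
    """Timestamps (microseconds) for one ETH-DM request/reply exchange."""
    tx_f: int  # request sent by the initiating MEP (initiator clock)
    rx_f: int  # request received by the responding MEP (responder clock)
    tx_b: int  # reply sent by the responding MEP (responder clock)
    rx_b: int  # reply received by the initiating MEP (initiator clock)


def two_way_delay_us(s: DmSample) -> int:
    """Round-trip frame delay with responder processing time removed."""
    return (s.rx_b - s.tx_f) - (s.tx_b - s.rx_f)


def summarize(samples: Sequence[DmSample]) -> Dict[str, float]:
    """Return min/avg/max frame delay and the largest delay variation."""
    delays: List[int] = [two_way_delay_us(s) for s in samples]
    if not delays:
        raise ValueError("at least one sample is required")
    # Assumption: FDV is taken between consecutive delay measurements.
    fdv = [abs(b - a) for a, b in zip(delays, delays[1:])]
    return {
        "min": min(delays),
        "avg": mean(delays),
        "max": max(delays),
        "max_fdv": max(fdv, default=0),
    }


if __name__ == "__main__":
    samples = [
        DmSample(0, 1000, 1010, 110),    # delay 100 us
        DmSample(200, 1205, 1215, 330),  # delay 120 us
        DmSample(400, 1400, 1420, 510),  # delay 90 us
    ]
    print(summarize(samples))
```

Note that the responder timestamps in the sample data are offset by about 1000 microseconds from the initiator clock, and the result is unaffected, which is the point of the subtraction.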

Figure 22. Y.1731 delay measurement example

IEEE 802.3ah Ethernet First Mile (EFM) Link OAM

IEEE 802.3ah Ethernet First Mile (EFM) link OAM monitors and supports troubleshooting individual links.

That is, 802.3ah OAM operates on a point-to-point link and does not propagate beyond a single hop. As

shown in Figure 23, this IEEE standard was originally developed to monitor the link between a service

provider and customer, where it is usually called the first mile link.

802.3ah EFM OAM supports the following functions:

OAM discovery: Used to discover the 802.3ah EFM OAM capabilities of the peer device

Remote failure indication (critical events): Used to inform the peer node that the receive path of the link is non-operational. Also includes communication of conditions such as dying gasp

Link monitoring: Can generate event notifications (alarms) when defined error thresholds are exceeded

Remote loopback testing: Puts the peer in data loopback state

802.3ah supports two modes of operation:

Active mode: Normally used by a device controlled by a service provider. The device can source OAM PDU packets in order to initiate an EFM OAM discovery process

Passive mode: Normally used by customer devices connected to a service provider device. The device cannot source OAM PDU packets, but it can respond to received OAM PDUs





Figure 23. IEEE 802.3ah EFM OAM

Figure 24 shows an example of the output of an 802.3ah EFM OAM show command. Note that the show

command displays not only local link OAM information, but also remote link OAM information.

Figure 24. Example of 802.3ah EFM OAM show command

Layer 2 OAM Summary

Table 1 presents a summary of the Layer 2 OAM tools described in this section.

Tool | Intended application | Supports | Generation | Standard
-----|----------------------|----------|------------|---------
Layer 2 Trace | Layer 2 network troubleshooting and detection of misconfiguration | Layer 2 topology discovery, Layer 2 loop detection | Manual | No
Port Loop Detection | Layer 2 network troubleshooting and detection of misconfiguration | Layer 2 loop detection | Automatic | No
UDLD | Single-link keep-alive | Single-link keep-alive | Automatic | No
Single-Link Keep-Alive | Single-link keep-alive | Single-link keep-alive | Automatic | Yes
802.1ag CFM | Service verification | Layer 2 connectivity check, Linktrace, loopback | CC: automatic; LT, LB: manual | Yes
Y.1731 PM | Performance (SLA) verification | One-way delay and delay variation | Manual | Yes
802.3ah EFM OAM | Customer access verification | Single-link OAM: fault detection, discovery, loopback | Automatic; manual (LB) | Yes






MPLS OAM TOOLS

This section addresses the MPLS OAM tools listed in Figure 4:

LSP Ping

LSP Traceroute

BFD for RSVP-TE LSPs

LSP Ping

LSP Ping provides OAM functionality for MPLS networks based on RFC 4379. LSP Ping is used to detect

data plane failure and to check the consistency between the data plane and the control plane.

LSP Ping verifies that packets that belong to a particular Forwarding Equivalence Class (FEC) actually end

their MPLS path on a Label Switching Router (LSR) that is an egress for that FEC. LSP Ping sends MPLS

echo requests following the same data path that normal MPLS packets would traverse (Figure 25).

LDP LSP Ping and RSVP LSP Ping are supported, as shown in Figure 26 and Figure 27, respectively.

Figure 25. LSP Ping operation

Figure 26. LDP LSP Ping example

Figure 27. RSVP LSP Ping example

[Figure 25 labels: MPLS Network; PE (LER), P (LSR), and PE (LER) along the LSP; Echo Request; Echo Reply.]





LSP Traceroute

LSP Traceroute provides OAM functionality for MPLS networks based on RFC 4379. LSP Traceroute is used to isolate a data plane failure to a particular router and to provide LSP path tracing.

With LSP Traceroute, an echo request packet is sent to each transit LSR and the LER. The echo request follows the same data path that normal MPLS packets would traverse. A transit LSR or an LER receiving the echo request checks that it is indeed a transit LSR or LER for this path and returns echo replies (Figure 28).

LDP LSP Traceroute and RSVP LSP Traceroute are supported, as exemplified in Figure 29 and Figure 30, respectively.
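The way successive echo requests isolate a failure to a particular router can be sketched as follows. This is a simplified simulation under stated assumptions: the path is given as an ordered list of hypothetical router names, and `broken_after` names the router whose downstream forwarding has failed. Neither function corresponds to a real command or API.

```python
def lsp_traceroute(path, broken_after=None):
    """Simulate one echo request per hop along an LSP.

    path: router names from the first transit LSR to the egress LER.
    broken_after: router whose downstream forwarding is broken, or None.
    Returns a list of (ttl, responder), where responder is None on timeout.
    """
    replies = []
    for ttl in range(1, len(path) + 1):
        upstream = path[:ttl - 1]          # hops the request must pass through
        if broken_after in upstream:
            replies.append((ttl, None))    # request lost before reaching this hop
        else:
            replies.append((ttl, path[ttl - 1]))
    return replies


def isolate_failure(replies):
    """Return the last responder before the first timeout, or None if all replied."""
    last_ok = "ingress"
    for _ttl, responder in replies:
        if responder is None:
            return last_ok
        last_ok = responder
    return None


replies = lsp_traceroute(["P1", "P2", "PE2"], broken_after="P1")
print(replies)
print(isolate_failure(replies))
```

The last router that answers before the replies stop is the natural place to begin troubleshooting, which is the fault-localization role the text assigns to LSP Traceroute.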

Figure 28. LSP Traceroute operation

Figure 29. LDP LSP Traceroute example

Figure 30. RSVP LSP Traceroute example

LSP Ping and LSP Traceroute Considerations

The following are common considerations for LSP Ping and LSP Traceroute:

• Redundant RSVP LSPs. LSP Ping or LSP Traceroute on an LSP is performed on the currently active path.
• One-to-one Fast ReRoute (FRR) LSPs. LSP Ping or LSP Traceroute on a one-to-one FRR LSP is performed on the active path. If a path switchover occurs while a Ping or Traceroute is in progress, the echo request is sent out on the old active path.
• FRR bypass LSPs. You can Ping or Traceroute the protected LSP and bypass tunnel separately, for example, by specifying the name of the LSP.




• Transit-originated detour. The user can initiate a Ping or Traceroute operation on a transit-originated detour LSP. Because the session name does not uniquely identify a session on a transit LSR, the user needs to specify the entire session ID (including the tunnel end-point, tunnel ID, and extended tunnel ID) for the detour LSP to which the LSP Ping or Traceroute command is applied.
• LSP re-optimization. If LSP re-optimization occurs while the Ping or Traceroute is in progress, the echo request will be sent out on the current LSP instance until the new instance is created.
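The transit-originated detour consideration above can be made concrete with a short sketch of why a session name is ambiguous on a transit LSR while the full session ID is not. The addresses, tunnel IDs, and the name `to-pe2` are invented for illustration, and the `SessionId` class is a hypothetical model rather than any product's data structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionId:
    """Session identifier fields named in the text (values are illustrative)."""
    tunnel_endpoint: str
    tunnel_id: int
    extended_tunnel_id: str


# Two ingress routers signal LSPs to the same endpoint with the same name.
# Both sessions pass through the same transit LSR.
sessions = {
    SessionId("10.0.0.2", 1, "10.0.0.1"): "to-pe2",
    SessionId("10.0.0.2", 1, "10.0.0.9"): "to-pe2",
}


def find_by_name(name):
    """Look up sessions by name; may return more than one match."""
    return [sid for sid, session_name in sessions.items() if session_name == name]


print(len(find_by_name("to-pe2")))                      # ambiguous: two matches
print(SessionId("10.0.0.2", 1, "10.0.0.9") in sessions)  # unique by full ID
```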

BFD for RSVP-TE LSPs

Bidirectional Forwarding Detection (BFD) for RSVP-TE LSPs defines a method for rapid detection of the failure of the data path of an LSP (Figure 31). While LSP Ping can be used for this purpose, BFD for RSVP-TE LSP provides the following advantages:

• BFD for RSVP-TE LSP can be configured to dynamically detect data plane failure of MPLS RSVP LSPs.
• BFD for RSVP-TE LSP provides faster failure detection, since it does not require control plane verification as LSP Ping does.
• BFD for RSVP-TE LSP can be used to concurrently detect faults on a number of LSPs without the manual interaction that LSP Ping requires.

BFD allows for the detection of a forwarding path failure in 300 milliseconds or less (depending on the configuration).
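To show how the detection time depends on the configuration, the sketch below applies the asynchronous-mode rule from RFC 5880 (a standard this paper does not cite explicitly): the local detection time is the detect multiplier received from the peer times the agreed transmit interval of the peer. The function name and the sample timer values are assumptions chosen for illustration, not settings taken from this paper.

```python
def detection_time_ms(local_required_min_rx_ms: int,
                      remote_desired_min_tx_ms: int,
                      remote_detect_mult: int) -> int:
    """Detection time in asynchronous mode, per the rule in RFC 5880.

    The peer transmits no faster than the local system is willing to receive,
    so the agreed interval is the larger of the two values.
    """
    agreed_tx_interval_ms = max(local_required_min_rx_ms, remote_desired_min_tx_ms)
    return remote_detect_mult * agreed_tx_interval_ms


# Assumed timers: 100 ms in both directions, multiplier of 3.
print(detection_time_ms(100, 100, 3))
```

With these assumed values the result is 300 ms, which matches the upper figure quoted above; shorter intervals or a smaller multiplier reduce the detection time accordingly.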

Figure 31. BFD for RSVP-TE operation

BFD for RSVP-TE LSP should be used selectively to monitor unreliable paths such as those through non-MPLS devices, for example, optical switches. In Figure 32, for example, the LSP traverses optical switches. The optical switches keep the links to the MPLS routers up even in the event of a failure between the optical switches. This would prevent the MPLS routers from triggering a path switchover (since, as far as the MPLS routers are concerned, the link between them is up). BFD for RSVP-TE LSP would detect the LSP path failure and would trigger a path switchover (see footnote 3).

Since a link failure will trigger FRR directly, the only benefit of using BFD for RSVP-TE LSP when there are no optical switches (or other transport types that would prevent MPLS routers from detecting the physical path as down) would be to detect control plane failures.
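The "link is up, but the data path is down" case can be illustrated with a small timer simulation. It is a sketch under simple assumptions: the session is up at time 0, BFD packets normally arrive every 100 ms, and the detection time is 300 ms. The function name and the timeline are hypothetical.

```python
def declared_down_at(arrivals_ms, detection_time_ms, observe_until_ms):
    """Return when the session is declared down, or None if it stays up.

    arrivals_ms: times at which BFD control packets arrived from the peer.
    The session is assumed to be up at time 0.
    """
    deadline = detection_time_ms
    for t in sorted(arrivals_ms):
        if t > deadline:
            return deadline          # detection timer expired before this packet
        deadline = t + detection_time_ms
    return deadline if deadline <= observe_until_ms else None


# Packets arrive every 100 ms until the path between the optical switches
# fails at t = 450 ms; the router's own link never goes down.
arrivals = [0, 100, 200, 300, 400]
print(declared_down_at(arrivals, detection_time_ms=300, observe_until_ms=1000))
```

In this timeline the session is declared down at 700 ms, that is, 300 ms after the last packet received, even though no link-down event is ever seen by the router.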

Footnote 3: In configurations in which there is no alternative path, the LSP is brought down and the BFD session is deleted. The LSP then follows the normal retry procedures to come back up.




Figure 32. BFD for RSVP-TE LSP used to monitor paths through non-MPLS devices

BFD for RSVP-TE LSP can be enabled or disabled on the fly at the global MPLS level (see Figure 33 and footnote 4) or for each individual RSVP LSP (see Figure 34) without affecting the LSP operational status. In addition, BFD for RSVP-TE LSP parameters can be changed on the fly without changing the state of the BFD session.

Figure 33. Enabling BFD for RSVP LSP globally

Figure 34. Enabling BFD for a specific RSVP-TE LSP

MPLS OAM Summary

Table 2 presents a summary of the MPLS OAM tools described in this section.

| | LSP Ping | LSP Traceroute | BFD for RSVP-TE LSPs |
| Intended Application | To detect data plane failure and to check the consistency between the data plane and the control plane | To isolate the data plane failure to a particular router and to provide LSP path tracing | Fast data plane failure detection for RSVP LSPs |
| Supports | Connectivity verification | Connectivity troubleshooting, fault localization | Fast data plane failure detection (link may be up, but data path is down) |
| Generation | Manual | Manual | Automatic |
| Standard | Yes | Yes | Yes |

Footnote 4: The number of BFD sessions supported by the system must be taken into account when enabling BFD for RSVP-TE globally.




IP AND VRF OAM TOOLS

This section addresses the IP and L3VPN (VRF) OAM tools listed in Figure 4:

• IP and VRF Ping
• IP and VRF Traceroute
• BFD for OSPFv2, OSPFv3, IS-IS, and BGP4

IP and VRF Ping

IP Ping is a tool used to verify connectivity at the IP level. The IP ping command sends an Internet Control Message Protocol (ICMP) echo request to the IP address or selected hostname and waits for a reply (see Figure 35). The Ping VRF option lets you ping an address on a specific L3VPN, that is, an address associated with a VRF table.

Figure 36 shows an example of IPv4 Ping, while Figure 37 shows an example of IPv6 Ping. Note that Ping VRF is supported for both IPv4 and IPv6.
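For readers who want to see what an echo request looks like on the wire, the sketch below builds an ICMPv4 echo request (type 8, code 0, per RFC 792) with its Internet checksum. It covers IPv4 only and does not send anything, since that would require a raw socket and elevated privileges. The function names, identifier, and payload are illustrative assumptions.

```python
import struct

ICMP_ECHO_REQUEST = 8  # ICMPv4 message type for echo request


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"                      # pad to a whole number of 16-bit words
    total = sum(struct.unpack(f"!{len(data) // 2}H", data))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)   # fold carries back in
    return ~total & 0xFFFF


def build_echo_request(identifier: int, sequence: int, payload: bytes = b"") -> bytes:
    """Return the ICMP header and payload of an echo request."""
    # The checksum is computed with the checksum field set to zero.
    header = struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, 0, identifier, sequence)
    checksum = internet_checksum(header + payload)
    return struct.pack("!BBHHH", ICMP_ECHO_REQUEST, 0, checksum,
                       identifier, sequence) + payload


packet = build_echo_request(identifier=0x1234, sequence=1, payload=b"ping")
print(packet.hex())
print(internet_checksum(packet) == 0)  # a valid message checksums to zero
```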

Figure 35. IP Ping operation

Figure 36. IPv4 Ping example

Figure 37. IPv6 Ping example

IP and VRF Traceroute

The IP Traceroute tool identifies the path that packets take through a network on a hop-by-hop basis. The IP Traceroute tool works by sending ICMP echo packets with varying IP Time-to-Live (TTL) values to the destination (see Figure 38).

The Traceroute VRF option lets you traceroute an address on a specific L3VPN, that is, an address associated with a VRF table.

Figure 39 shows an example of IPv4 Traceroute, while Figure 40 shows an example of IPv6 Traceroute. Note that Traceroute VRF is supported for IPv4 and IPv6.
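The varying-TTL mechanism can be sketched as a small simulation. It assumes a fixed, non-empty path of hypothetical router names and models only the TTL behavior: each router decrements the TTL and answers when it reaches zero, and the destination answers with an echo reply. No packets are sent.

```python
def probe(path, ttl):
    """Simulate one probe sent with the given TTL along a fixed path.

    path: router names in order, ending with the destination.
    Returns (responder, kind) where kind is "time exceeded" or "echo reply".
    """
    for hop in path[:-1]:
        ttl -= 1                     # every router decrements the TTL
        if ttl == 0:
            return hop, "time exceeded"
    return path[-1], "echo reply"


def traceroute(path, max_ttl=30):
    """Send probes with TTL 1, 2, 3, ... until the destination replies."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, kind = probe(path, ttl)
        hops.append((ttl, responder))
        if kind == "echo reply":
            break
    return hops


print(traceroute(["R1", "R2", "dest"]))
```

Each probe reveals one more hop, so the list of responders, read in TTL order, is the hop-by-hop path described above.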




Figure 38. IP Traceroute operation

Figure 39. IPv4 Traceroute example

Figure 40. IPv6 Traceroute example

BFD for OSPFv2, OSPFv3, IS-IS, and BGP4

Bidirectional Forwarding Detection (BFD) defines a method for rapid detection of the failure of a forwarding path by checking that the next-hop router is alive. Without BFD enabled, it can take from 3 to 30 seconds to detect that a neighboring router is not operational (and packet losses would occur during that time).

BFD can detect data path failures when a link is up but the data path is not, for example, failures caused by misconfiguration or failures on paths through optical switches (see Figure 41). BFD allows for the detection of a forwarding path failure in 300 ms or less (depending on the configuration). When BFD is enabled on a routed interface, a BFD session is automatically established when a neighbor router is discovered.
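As a rough illustration of why the detection time matters, the following sketch estimates how many packets are forwarded into a failed path before the failure is detected. The traffic rate of 10,000 packets per second is an assumed figure, not a value from this paper, and the function name is hypothetical.

```python
def packets_lost(packets_per_second: int, detection_time_ms: int) -> int:
    """Packets sent toward a failed next hop before the failure is detected."""
    return packets_per_second * detection_time_ms // 1000


RATE_PPS = 10_000  # assumed traffic rate, for illustration only

for label, detection_ms in [("without BFD, best case", 3_000),
                            ("without BFD, worst case", 30_000),
                            ("with BFD", 300)]:
    print(f"{label}: {packets_lost(RATE_PPS, detection_ms)} packets lost")
```

At the assumed rate, detection in 3 to 30 seconds costs 30,000 to 300,000 packets, while detection in 300 ms costs 3,000.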

Figure 41. BFD operation




Figure 42 shows an example of BFD configuration. BFD can be enabled or disabled for all interfaces or per interface for use with OSPFv2 (that is, IPv4), OSPFv3 (that is, IPv6), and IS-IS, as shown in Figure 43, Figure 44, and Figure 45, respectively.

Figure 42. BFD configuration example

Figure 43. Enabling/disabling BFD for OSPFv2 for all interfaces (top) and per interface (bottom)

Figure 44. Enabling/disabling BFD for OSPFv3 for all interfaces (top) and per interface (bottom)

Figure 45. Enabling/disabling BFD for IS-IS for all interfaces (top) and per interface (bottom)





BFD for BGP4 supports single-hop and multi-hop BFD on Ethernet, POS, and Virtual Interfaces. BFD for BGP4 can be enabled or disabled at the global BGP router level, for each individual peer, or for a peer group, as shown in Figure 46, Figure 47, and Figure 48, respectively.

Figure 46. Enabling/disabling BFD globally for BGP4

Figure 47. Enabling/disabling BFD for a specific BGP4 peer

Figure 48. Enabling/disabling BFD for a BGP4 peer group

IP and VRF OAM Summary

Table 3 presents a summary of the IP and VRF OAM tools described in this section.

| | IP Ping, VRF Ping | IP Traceroute, VRF Traceroute | BFD for OSPFv2, OSPFv3, IS-IS, BGP4 |
| Intended Application | Connectivity verification at the IP level | Identification of the path that IP packets take through a network on a hop-by-hop basis | Fast data path failure detection |
| Supports | Connectivity verification | Connectivity troubleshooting, fault localization | Data path failure detection (link may be up, but data path is down) |
| Generation | Manual | Manual | Automatic |
| Standard | Yes | Yes | Yes |




SUMMARY

This paper reviewed OAM tools available for MPLS, IP, and Ethernet networks at various layers of the stack and reviewed best practices for choosing the right OAM tool to use in a particular network deployment. These tools provide unparalleled power for an operator to proactively manage networks and customer Service Level Agreements (SLAs). These OAM tools address fault detection, fault verification, and fault isolation; enable proactive detection of service degradation; and provide service performance monitoring and SLA verification.

© 2010 Brocade Communications Systems, Inc. All Rights Reserved. 11/10 GS-BP-356-00

Brocade, the B-wing symbol, BigIron, DCFM, DCX, Fabric OS, FastIron, IronView, NetIron, SAN Health, ServerIron, TurboIron, and Wingspan

are registered trademarks, and Brocade Assurance, Brocade NET Health, Brocade One, Extraordinary Networks, MyBrocade, VCS, and VDX

are trademarks of Brocade Communications Systems, Inc., in the United States and/or in other countries. Other brands, products, or

service names mentioned are or may be trademarks or service marks of their respective owners.

Notice: This document is for informational purposes only and does not set forth any warranty, expressed or implied, concerning

any equipment, equipment feature, or service offered or to be offered by Brocade. Brocade reserves the right to make changes

to this document at any time, without notice, and assumes no responsibility for its use. This informational document describes

features that may not be currently available. Contact a Brocade sales office for information on feature and product availability.

Export of technical data contained in this document may require an export license from the United States government.
