FabricPath PPT by NETWORKERS HOME


FABRICPATH

Why layer 2 in DC?

Typical DC Design

End to End L2

Limitation of Traditional L2

Cisco FabricPath Goal

Why FabricPath?

Control Plane

Key FabricPath control plane elements:

•Routing table – FabricPath IS-IS learns switch IDs (SIDs) and builds routing table

•Multidestination trees – FabricPath IS-IS elects roots and builds multidestination forwarding trees

•Mroute table – IGMP snooping learns group membership at the edge, FabricPath IS-IS floods group-membership LSPs (GM-LSPs) into the fabric
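These control-plane pieces can be inspected from the NX-OS CLI; a minimal sketch (standard NX-OS show commands, output omitted):

! IS-IS adjacencies to neighboring FabricPath switches
show fabricpath isis adjacency
! switch-ID routing table built by FabricPath IS-IS
show fabricpath route
! IS-IS link-state database, including GM-LSPs
show fabricpath isis database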

MAC-Based Routing?

•NO!

•Routing information consists of Switch IDs

•Forwarding in fabric based on Switch IDs, not MAC addresses

FabricPath Routing Table

•Contains shortest path(s) to each SID, based on link metrics / path cost

•Equal-cost multipath (ECMP) supported on up to 16 next-hop interfaces

FabricPath Routing Table

ECMP Load-Sharing

•ECMP path chosen based on hash function

•Hash uses SIP/DIP + L4 + VLAN by default

•Use show fabricpath load-balance unicast to determine ECMP path for a given packet
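As a sketch of that command in use (the forwarding-path keywords and values below are assumptions from memory and vary by platform/release; check CLI help before relying on them):

! ask the hardware which ECMP next hop a flow toward SID 20 would take
show fabricpath load-balance unicast forwarding-path ftag 1 switchid 20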

Multidestination Tree

MDT Root Selection

•FabricPath network elects a primary root switch for the first multidestination tree in the topology

•Switch with highest priority value becomes root for the tree

–Tie break: root priority → highest system ID → highest SID

•Primary root determines roots of additional trees and announces them in Router Capability TLV

–Roots spread among available switches to balance load
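Root placement can be steered by raising a switch's root priority under the FabricPath IS-IS process; a minimal sketch (value illustrative; higher wins, default is 64):

fabricpath domain default
  root-priority 255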

Root? Tree? Is it STP?

•NO! – More like IP multicast routing

•Trees do NOT dictate forwarding path of unicast frames, only multidestination frames

•Multiple trees allow load-sharing for any multidestination frames

•Control plane state further constrains IP multicast forwarding (based on mrouter and receiver activity)

Data Plane

Key FabricPath data plane elements:

•MAC table – Hardware performs MAC lookups at the CE/FabricPath edge only

•Switch table – Hardware performs destination SID lookups to forward unicast frames to other switches

•Multidestination table – Hash function selects tree; multidestination table identifies on which interfaces to flood based on the selected tree

FP MAC Table

•Edge switches perform MAC table lookups on ingress frames

•Lookup result identifies output interface or destination FabricPath switch

Encapsulation

SwitchID

•Every FabricPath switch is automatically assigned a Switch ID

–Optionally, network administrator can manually configure SIDs

•FabricPath network automatically detects conflicting SIDs and prevents data path initialization on the violating switch

•Encoded in “Outer MAC addresses” of FabricPath MAC-in-MAC frames

•Enables deterministic numbering schemes, e.g.:

–Spine switches assigned two-digit SIDs

–Leaf switches assigned three-digit SIDs

–VPC+ virtual SIDs assigned four-digit SIDs

–etc.
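A scheme like that is applied with the global switch-id command; a minimal sketch (IDs illustrative):

! on a spine: two-digit SID, overriding DRAP's dynamic assignment
fabricpath switch-id 11
! on a leaf it would instead be e.g.  fabricpath switch-id 101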

More about SID

F Tag

•Forwarding tag – Unique 10-bit number encoded in the FabricPath header

•Overloaded field that identifies FabricPath topology or multidestination tree

•For unicast packets, identifies which FabricPath IS-IS topology to use

•For multidestination packets (broadcast, multicast, unknown unicast), identifies which multidestination tree to use

•FTAG (Forwarding TAG): Used for multidestination traffic; carries the ID of the tree chosen at the FabricPath ingress switch. DRAP is responsible for keeping FTAGs unique and consistent. For known unicast, the FTAG carries the topology ID

Terminology

• Classical Ethernet (CE) – Regular Ethernet with regular flooding, regular STP, etc.

• Leaf Switch – Connects CE domain to FP domain

• Spine Switch – FP backbone switch with all ports in the FP domain only

• FP Core Ports – Links on Leaf up to Spine, or Spine to Spine – i.e. the switchport mode fabricpath links

• CE Edge Ports – Links on Leaf connecting to regular Classical Ethernet domain – i.e. not the switchport mode fabricpath links

FabricPath Support

Configuration
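The configuration reduces to a few steps; a minimal baseline sketch (VLAN, SID, and interface numbers are illustrative):

! 1. enable the FabricPath feature set
install feature-set fabricpath
feature-set fabricpath

! 2. optionally pin a deterministic Switch ID (otherwise DRAP assigns one)
fabricpath switch-id 101

! 3. mark the VLANs that will be carried across the fabric
vlan 10
  mode fabricpath

! 4. FP core ports toward the spine
interface ethernet 1/1
  switchport mode fabricpath

! 5. CE edge ports remain regular access/trunk ports
interface ethernet 1/10
  switchport mode trunk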

More in Encapsulation

Outer SA

More in Encapsulation

Outer DA

Conflict Resolution

FabricPath Tree

Forwarding Tree + VLAN

Root Election / Tree Construction

Other Encapsulation

Reverse Path Forwarding Check

Topologies

•Routing table & Trees (FTAGs) are per topology

•Switch ID is shared across all topologies

•FP interface may belong to several topologies

•N7K: up to 8 topologies supported starting in 6.2

•N5K/N6K: 2 topologies supported since 5.2.1; main use is to permit separate L2 pods to use the same local VLAN set

Config
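A rough multi-topology sketch in N5K/N6K style (the member vlan keyword and numbering are assumptions to verify against the platform configuration guide):

! define a second topology and map a VLAN range into it
fabricpath topology 2
  member vlan 100-199

! verify
show fabricpath topology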

FabricPath Software Architecture & Hardware tables

On the Supervisor Engine:

•FabricPath IS-IS – Routing protocol process that forms the core of the FabricPath control plane

•DRAP – Dynamic Resource Allocation Protocol, ensures network-wide unique and consistent Switch IDs and FTAGs

–Resolves Switch ID conflicts

•IGMP – Provides IGMP snooping support for building the multicast forwarding database

•M2RIB – Multicast Layer 2 RIB, contains the multicast Layer 2 routing information

•U2RIB – Unicast Layer 2 RIB, contains the “best” unicast Layer 2 routing information

•L2FM – Layer 2 Forwarding Manager, controls the MAC address table

•MFDM – Multicast Forwarding Distribution Manager, connects platform-independent control-plane processes and platform-specific processes on I/O modules

On the Linecards:

•U2FIB – Unicast Layer 2 FIB, manages the hardware unicast routing table

•MTM – MAC Table Manager, manages the hardware MAC address table

•M2FIB – Multicast Layer 2 FIB, manages the hardware multicast routing table

FabricPath: Forwarding Tables

FabricPath uses 3 tables to forward frames

•MAC address table – VLAN, MAC address, port (local or remote), FTAG (for non-unicast)

•Switch-ID table – remote Switch ID, local next-hop interfaces (up to 16)

•Multidestination tree table – per tree: remote Switch ID, local next-hop/RPF interface

–Tree #1 (broadcast, unknown unicast, IP multicast)

–Tree #2 (IP multicast)
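Each table has a corresponding show command; a minimal sketch:

! MAC address table (local and remote entries)
show mac address-table dynamic
! Switch-ID table with per-SID next-hop interfaces
show fabricpath route
! Switch-ID assignments across the fabric
show fabricpath switch-id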

Forwarding: unicast CE->FP

Unicast: Known Destination MAC

Forwarding: broadcast/multicast CE->FP

Multidestination (broadcast, multicast, unicast flood)

Forwarding: FP->FP or FP->CE

• Multicast lookups are done using VLAN, FTAG, and ODA (each multicast MAC appears twice)

• SubSwitchID lookups are omitted here

• Remember about special LIDs (Sup, Flood, …)

• FF frames are forwarded out of CE ports only when DA is locally learned

Load-balancing

•Symmetric: idea is to make a->b and b->a flows take the same path by sorting addresses before feeding them to the hash

•Rotate: polarization avoidance; the hash result is rotated by a specified number of bytes. The number is derived by default from the unique system MAC
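Both behaviors hang off the load-balance command; a rough sketch in N5K/N6K style (keyword availability and argument units differ by platform; verify before use):

! sort SA/DA before hashing so a->b and b->a hash identically
fabricpath load-balance unicast symmetric
! rotate the hash result to avoid polarization across tiers
fabricpath load-balance unicast rotate-amount 3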

Reducing impact of forwarding loops

•Transient loops might occur during convergence (as with L3 routing)

•To contain the impact of these loops FabricPath uses TTL. Starting in 6.2(2), the initial TTL can be set via fabricpath [multicast | unicast] ttl

•For multidestination trees, a Reverse Path Forwarding check is performed on the source Switch ID

MAC Address Learning

•Learning MAC addresses is not required in the FabricPath core, as switching is based on Switch ID

•FP Edge switches learn local MAC addresses (behind edge ports) conventionally

•FP Edge devices learn remote addresses (behind core-facing ports) using conversational learning

–For packets arriving from FP, the source MAC (not the outer SA!) is learned only when the destination MAC of the frame is already known on an Edge port of this switch

•No learning from broadcasts (though existing entries will be updated)

•Normal Learning from multicasts (example: HSRP address)

Conversational MAC Address Learning

FabricPath Multicast Control Plane

•IGMP/IGMP snooping tracks connected hosts'/routers' interest in receiving multicast

•IS-IS distributes information from IGMP snooping to other FP nodes using GM-LSPs. Intermediate nodes flood GM-LSPs

•A pruned subtree is created for each group (+ flood, OMF) per VLAN per FTAG

STP & FabricPath

• No STP inside FP network

• BPDUs do not traverse FP network (dropped at FP edge, with the exception of TCNs)

• FP network pretends to be one switch from STP point of view: all FP edge switches send BPDUs with the same Bridge ID c84c.75fa.60xx (xx is domain ID in hex, default 00)

• Before FP ports are up, switch will use its own Bridge ID (like STP without FP would do)

• Ports inside FP cannot be blocked; FP edge switches will always want to have the STP designated role. If a superior BPDU is received, such a port will be blocked as L2GW inconsistent
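In practice the FP edge must therefore be the best bridge the CE domain can see; a minimal sketch using standard STP priority (values and VLAN range illustrative):

! make the FP edge win root for CE VLANs so its edge ports keep the
! designated role (avoids L2GW-inconsistent blocking)
spanning-tree vlan 10-20 priority 8192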

STP, FabricPath & TCNs

• When CE STP domains are connected to multiple FP switches, STP TCN handling might be needed to maintain accuracy of MAC address tables inside CE

• Example: if link CE1-CE2 goes down, link CE2-CE3 will become forwarding. Now, to reach MAC B, switches inside FP need to send traffic to S5 instead of S4…

• To achieve this, FP switches when receiving a TCN from CE will propagate it to all FP switches in the network (via ISIS)

• Each FP switch will flush all remote MAC addresses learned from switches in the same STP domain as domain originating the TCN

• In addition, if FP switch is also part of the same STP domain, it will propagate TCN to the CE domain

• TCNs are not propagated to CE in domain 0 (default domain)
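Since domain 0 suppresses TCN propagation toward CE, a non-default domain is configured where that behavior is needed; a sketch (domain number illustrative; the spanning-tree domain global command is assumed as on FP-capable NX-OS):

spanning-tree domain 5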

Control Plane Protection

•N7K, N6K, and N5K all recognize and protect FP IS-IS traffic at CoPP level

•CoPP needs to be updated when deploying FabricPath; standard profiles are FP-aware as of 5.2(1)

•In case of complex CE-side STP topologies (with blocking ports), usual STP safeguards are recommended (Bridge Assurance & Dispute / UDLD)

•On N7K-F1 cards: rate-limiters allow up to 4500 PPS worth of control plane FabricPath packets

VPC+: Why, What and How

•Goal: provide redundant, active-active L2 links to separate FP switches with active-active HSRP

•Challenge: depending on the path the packet A->B takes, switch S3 will learn MAC A behind S1 or S2 (or the MAC will keep moving)

•Solution: introduce Emulated Switch S100 to represent devices behind VPCs: MAC A will appear behind S100 in S3's MAC address table. The HSRP MAC is advertised with the emulated switch as a source – taking advantage of VPC+ multipathing

VPC vs. VPC+

•To enable VPC+, an Emulated Switch ID must be configured in the VPC domain on both peers (must be the same on both peers and globally unique). The ES represents ALL VPC+ channels of the domain

•Peer-link and VPC+ ports must be FabricPath-capable

•Peer-link is an FP interface (no STP, only FP VLANs are carried, and the VPC consistency check no longer applies). VPC+ channels are CE

•VPC+ domain must be the root for CE STP, otherwise VPC+ channels will be blocked as L2GW inconsistent

•FP switches use same STP bridge ID so peer-switch is implicit
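A minimal VPC+ sketch (domain, SID, and port-channel numbers illustrative):

vpc domain 10
  ! emulated switch: same value on both peers, globally unique
  fabricpath switch-id 100

! peer-link runs as a FabricPath core port
interface port-channel 1
  switchport mode fabricpath
  vpc peer-link

! VPC+ member channels stay CE (trunk) ports
interface port-channel 11
  switchport mode trunk
  vpc 11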

VPC+: Prevention of Duplicate Packets

•How is a packet received from a VPC+ and flooded on S1 prevented from being flooded by S2 to the same VPC+ again?

•N7K-F1 linecards: Each VPC+ will have its own Sub-Switch ID. MAC addresses will be learned behind <es_id>.<subsw_id>.<lid>, for example 100.11.65535 (emulated switch 100, sub-switch 11, LID 65535). S2 will recognize the ES + SubSwitch tuple as its own port and will not flood the frame back to the VPC

•N7K-F2, N7K-F3 linecards & N5K, N6K:

–By default, same as above; with ‘fabricpath multicast load-balance’ configured, as below:

–Each VPC+ peer will be forwarding for only one FTAG, and traffic coming from the other peer will have a different FTAG. For example, a flooded packet coming from S1 will have FTAG 1, but S2 will only flood FTAG 2 packets out of the VPC

VPC+ Failover

•VPC+ member link goes down

–Traffic diverted over Peer-Link

•Peer-Link goes down (but Peer-Keepalive up)

–Primary: No action

–Secondary: Bring down VPC+ channels

–Stop advertising reachability to Emulated Switch

•Dual-active is much less likely than with normal VPC: if Peer-Link and Peer-Keepalive both go down but the peer is still reachable via FP, the secondary will not become primary

FabricPath: What command comes from where

MAC

Questions?