Participate in the session polling and Q&A
We have 5 questions that we would like your input on.
On your browser: https://clsandiego15.cnf.io/
Search for BRKACI-3503
Extending ACI to Multiple Sites Dual Site Deployment Deep Dive
Santiago Freitas ([email protected]), Customer Solutions Architect
Patrice Bellagamba ([email protected]), Distinguished Systems Engineer
BRKACI-3503
Multi-Fabric Design Options: Single APIC Cluster / Single Domain vs. Multiple APIC Clusters / Multiple Domains
[Diagram: ACI Fabric 1 and ACI Fabric 2 connected back to back, each hosting Web / App / DB tiers]
Dual-Fabric connected with back-to-back vPC
[Diagram: ACI Fabric 1 and ACI Fabric 2 interconnected via L2 DCI, each hosting Web / App / DB tiers]
Dual-Fabric with L2 Extension - L2 DCI
• Single fabric stretched to two sites. Works the same way as a single fabric deployed within a single DC.
• One APIC cluster. One management and configuration point.
• Anycast GW on all leaf switches.
• Works with one or more transit leafs per site. Any leaf can be a transit leaf.
• The number of transit leafs and links is a redundancy and bandwidth-capacity decision.
Stretched ACI Fabric
Supported Distances and Interconnection Technologies Dark Fiber
Transceiver      Cable  Distance
QSFP-40G-LR4     SMF    10 km
QSFP-40GE-LR4    SMF    10 km
QSFP-40GLR4L     SMF    2 km
QSFP-40G-ER4     SMF    30 km in 1.0(4h) or earlier; 40 km in 1.1 and later (planned)
Supported Distances and Interconnection Technologies DWDM
• The DWDM system provides connectivity between the two sites.
• SR optics with an MTP-LC breakout cable between the ACI node and the DWDM system.
• 1.0(3f) release or later; max 10 ms RTT between sites.
• Under normal conditions, 10 ms allows two DCs up to 800 km apart.
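The distance figure can be sanity-checked with simple propagation-delay arithmetic. A sketch: the ~200,000 km/s speed of light in fiber is standard, but the 2 ms equipment/overhead allowance below is an assumption chosen to illustrate why a 10 ms RTT budget lands near 800 km rather than the theoretical 1000 km.

```python
# Back-of-the-envelope check of the "10 ms RTT ~= 800 km" guideline.
# Light in fiber travels at roughly 200,000 km/s (~5 us per km one way).
FIBER_KM_PER_MS = 200.0  # one-way propagation, km per millisecond


def max_distance_km(rtt_budget_ms: float, overhead_ms: float = 2.0) -> float:
    """One-way fiber distance that fits inside the RTT budget.

    overhead_ms is an assumed allowance for DWDM equipment, queuing and
    serialization delay; it is not a figure from the session.
    """
    one_way_ms = (rtt_budget_ms - overhead_ms) / 2.0
    return one_way_ms * FIBER_KM_PER_MS


print(max_distance_km(10.0))  # -> 800.0
```
With zero overhead the same budget would allow 1000 km, so the 800 km guideline leaves a few milliseconds of headroom for the optical gear and forwarding.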
Supported Distances and Interconnection Technologies DWDM - Considerations
IS-IS adjacency
• IS-IS hello interval is 10 seconds, hold time 30 seconds.
• Timers are not configurable; enhancement CSCut62675 requested.
• If the DWDM system goes down, it must shut down the ports facing the ACI fabric; otherwise there is a 30-second outage.
• If one attachment circuit goes down, the remote port must be shut down; otherwise there is a 30-second outage.
Supported Distances and Interconnection Technologies Ethernet over MPLS (EoMPLS)
• Port-mode EoMPLS used to stretch the ACI fabric over long distance.
• DC interconnect links can be 10G (minimum) or higher, with 40G facing the leafs / spines.
• DWDM or dark fiber provides connectivity between the two sites.
• 1.0(3f) release or later; max 10 ms RTT between sites.
• Under normal conditions, 10 ms allows two DCs up to 800 km apart.
• Other ports on the router are used for connecting to the WAN via L3Out.
Please provide your input to the questions asked on the mobile app.
What is the distance between your Data Centers?
What kind of links do you have between your Data Centers?
• Fabric-to-fabric connect is just a point-to-point leaf-to-spine logical link
• 40 Gbps as seen by the fabric
• 10 Gbps on the long-distance links
• ASR 9K performs
• EoMPLS port xconnect
• Speed adaptation with QoS
• Validated platform is ASR 9K with XR 5.3.2*
• *CCO FCS Sept 2015; for deployment before that, 5.2.4 + Eng. SMU for CSCut79961
Fabric to Fabric connection
interface FortyGigE0/2/0/0            <== 40G facing the fabric
 description To-Spine-2-Eth1/5
 mtu 9216
 load-interval 30
 l2transport
  propagate remote-status             <== Critical command for fast failover
!
l2vpn
 router-id 5.5.5.1
 xconnect group ASR9k_Grp_1
  p2p ASR9k_1_to_4
   interface FortyGigE0/2/0/0
   neighbor ipv4 5.5.5.4 pw-id 104
!
interface TenGigE0/2/1/0              <== 10G towards the remote site
 description To-ASR9k-4
 cdp
 mtu 9216
 service-policy output QoS_Out_to_10G_DCI_Network
 ipv4 address 5.5.2.1 255.255.255.252
 load-interval 30
EoMPLS Xconnect
router ospf 1
 log adjacency changes
 router-id 5.5.5.1
 nsf ietf
 area 0
  interface Loopback0
   passive enable
  !
  interface TenGigE0/2/1/0
   bfd fast-detect                    <== BFD for fast detection of DWDM/indirect failures
   network point-to-point
   mpls ldp sync
!
mpls ldp
 log
  hello-adjacency
  graceful-restart
 !
 router-id 5.5.5.1
 interface TenGigE0/2/1/0
DWDM Link protection
• Due to the difference in speed (40G => 10G), QoS is a must on the DCI.
• Fabric control traffic must be protected.
• Demonstration of stability versus link overload.
Speed Adaptation with QoS
Qos-group  Significance         Dot1p in VXLAN
0          Level 3 User Class   0
1          Level 2 User Class   1
2          Level 1 User Class   2
3          APIC Class           3
4          Span Class           4
5          Sup / Control Class  5
6          iTraceroute Pkts     6
7          Undefined            7
The fabric classes are not configurable. The user matches traffic and assigns it to one of the three user classes.
class-map match-any SUP_Traffic
match mpls experimental topmost 5
match cos 5
end-class-map
!
class-map match-any SPAN_Traffic
match mpls experimental topmost 7 4 <== Span Class + Undefined merged
match cos 4 7
end-class-map
!
class-map match-any User_Data_Traffic_1
match mpls experimental topmost 1
match cos 1
end-class-map
!
class-map match-any User_Data_Traffic_2
match mpls experimental topmost 0
match cos 0
end-class-map
!
class-map match-any APIC+Traceroute_Traffic
match mpls experimental topmost 3 6
match cos 3 6
end-class-map
!
class-map match-any MPLS_CE_BGP+ASA+vASA+vF5_HA_Traffic
match mpls experimental topmost 2
match cos 2
end-class-map
Speed Adaptation with QoS class-map
User Class Level 1 (COS 2) is used to mark BGP, ASA and F5 control-plane packets, via a custom QoS policy applied to the EPG.
policy-map QoS_Out_to_10G_DCI_Network
 class SUP_Traffic
  priority level 1
  police rate percent 15
 class APIC+Traceroute_Traffic
  priority level 2
  police rate percent 15
 class MPLS_CE_BGP+ASA+vASA+vF5_HA_Traffic
  bandwidth 500 mbps
  queue-limit 40 kbytes
 class User_Data_Traffic_1
  bandwidth 3200 mbps
  queue-limit 40 kbytes
 class User_Data_Traffic_2
  bandwidth 3200 mbps
  queue-limit 40 kbytes
 class SPAN_Traffic
  bandwidth 100 mbps
  queue-limit 40 kbytes
 class class-default
Speed Adaptation with QoS policy-map
• One DVS stretched across two sites
• vCenter manages vSphere servers for both sites
VMM Integration
EPG-EPG Atomic Counters in Stretched ACI Fabric
• EPG-to-EPG atomic counters work when the EPG is not present on the transit leaf.
• Other atomic counters work fine:
• Leaf-to-leaf (or TEP-to-TEP) works with ALE2-based Nexus 9300.
• Between endpoints (EP to EP).
ALE2 = N9396PX, N9396TX, N93128TX and N93128PX with 6-port GEM N9K-6PQ; N9372TX, N9372PX and N9332PQ.
Transit Leaf and WAN Traffic
• Same IS-IS metric for inter-site links and local links.
• When a WAN router is connected to a transit leaf at both sites, non-border leaf switches see 2-way ECMP for external subnets.
• Recommended design: the WAN router is not connected to a transit leaf, so the local WAN router is 2 hops away and the WAN router at the other site is 4 hops away.
Connecting a Router to a regular EPG port - WAN edge router and firewall peering through the fabric
[Diagram: WAN edge router and firewall (OSPF / BGP peering) attached to EPG A in BD "Blue"; CDP and LLDP disabled on the fabric port]
Connecting an external routing device to a regular EPG port on the fabric requires CDP / LLDP to be disabled on the external device or on the fabric port.
We are treating the WAN router and the firewall as regular endpoints, inside a regular EPG - no L3 Outs / external EPGs.
You MUST disable CDP / LLDP for the EP info to be learnt.
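On the fabric side, this is done with interface-level CDP/LLDP policies attached to the interface policy group used by the port facing the external device. A sketch of the APIC REST payload (the policy names are hypothetical; the session does not show this configuration):

```xml
<polUni>
  <infraInfra>
    <!-- Interface policies that disable CDP and LLDP on fabric ports
         facing the WAN router / firewall endpoints. -->
    <cdpIfPol name="CDP-Disabled" adminSt="disabled"/>
    <lldpIfPol name="LLDP-Disabled" adminRxSt="disabled" adminTxSt="disabled"/>
  </infraInfra>
</polUni>
```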
S-N Traffic Flow (N-S is symmetric)
Odd tenants = DC1 primary; even tenants = DC2 primary
[Diagram: RealWeb EPG 10.1.4.1/24; WAN EPG, Layer 2]
Logical Topology Deep Dive - ASA failover link and state link through the Fabric
EPG set up under the common tenant, with static bindings to the physical ports (Leaf 3 in DC1, Leaf 5 in DC2). BD set up in Layer 2 mode.
Logical Topology Deep Dive - WAN-CE to ASA, BGP peering through the Fabric
WAN EPG with an L2 BD, with static bindings towards the ASA and the WAN CE.
ASA/T4/act(config)#route-map set-localpref-200-inprefixes permit 10
ASA/T4/act(config-route-map)# set local-preference 200
ASA/T4/act(config-if)# interface TenGigabitEthernet0/7.1041
ASA/T4/act(config-if)# nameif outside
ASA/T4/act(config-if)# ip address 10.1.1.254 255.255.255.0 standby 10.1.1.253
ASA/T4/act(config)# router bgp 65001
ASA/T4/act(config-router)# address-family ipv4 unicast
ASA/T4/act(config-router-af)# neighbor 10.1.1.21 remote-as 65001
ASA/T4/act(config-router-af)# neighbor 10.1.1.31 remote-as 65001
ASA/T4/act(config-router-af)# neighbor 10.1.1.41 remote-as 65001
ASA/T4/act(config-router-af)# neighbor 10.1.1.51 remote-as 65001
ASA/T4/act(config-router-af)# redistribute static
ASA/T4/act(config-router-af)# neighbor 10.1.1.31 route-map set-localpref-200-inprefixes in
ASA/T4/act(config-router-af)# neighbor 10.1.1.51 route-map set-localpref-200-inprefixes in
ASA/T4/act(config)# route inside 10.1.3.0 255.255.255.0 10.1.2.3
BGP towards the CEs; static route towards the WEB subnet with next hop in the fabric.
Even-numbered tenants use the primary path into/out of the fabric via DC2; odd-numbered tenants use the primary path via the "left side", DC1.
Logical Topology Deep Dive - External L3 Out towards ASA
External L3 Out configuration steps on ACI:
- Create a logical node profile with border leafs Leaf-3 and Leaf-5, where the ASA is connected.
- Configure a static default route from each border leaf node with the next hop pointing to the ASA inside interface IP.
Logical Topology Deep Dive - External L3 Out towards ASA
External L3 Out configuration steps on ACI:
- On the logical interface profile, create a secondary IP address (floating IP) under each logical transit interface created between the border leafs and the external physical ASA.
This secondary address is a "floating IP" owned by the border leafs; it helps achieve seamless convergence during border leaf failures.
Remark: DC1-ASA/T4/act(config)# route inside 10.1.3.0 255.255.255.0 10.1.2.3
Logical Topology Deep Dive - Load Balancer to Real Servers
Default gateway located in the fabric.
Deployed as a regular endpoint, not part of a service graph.
RealWebEPG
Logical Topology Deep Dive - MP-BGP Route Reflector Placement
Spine 1 == DC 1; Spine 3 == DC 2
The fabric uses MP-BGP to distribute external routes within the ACI fabric.
The current SW release supports a maximum of two MP-BGP route reflectors.
In a stretched fabric implementation, place one route reflector at each site to provide redundancy.
- 20 Tenants, each tenant with 1 Private Network (VRF)
- 20 Application Profiles (APs) per tenant each with
- 3 EPGs per AP
- 1 BD : Subnet per EPG
Total: 20 tenants, 20 private networks (VRFs), 1200 bridge domains, 1200 subnets and 1200 EPGs
End Points: 9600 endpoints distributed on multiple EPGs, across all Leaf switches
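The totals follow directly from the per-tenant breakdown; a quick check of the arithmetic:

```python
# Verify the scale-test totals from the per-tenant configuration.
tenants = 20
aps_per_tenant = 20
epgs_per_ap = 3

epgs = tenants * aps_per_tenant * epgs_per_ap  # 20 * 20 * 3
bds = subnets = epgs                           # one BD and one subnet per EPG
vrfs = tenants                                 # one private network (VRF) per tenant

print(vrfs, bds, subnets, epgs)  # -> 20 1200 1200 1200
```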
Verified Scalability Limits for Release 1.0(4h) available at
http://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/1-x/release/notes/apic_rn_104.html
Scale tested. These numbers don't replace the Cisco verified scale numbers.
Traffic Flow - Unicast, Test Traffic
[Diagram: West-East flows across the two DCs]
1. 10 IP endpoints per EPG
2. WebEPG <-> AppEPG and AppEPG <-> DbEPG
3. Stateless traffic
4. Flows spread across leaf switches as shown above
5. Data rate: 10 Gbps in DC1, 10 Gbps in DC2 and 10 Gbps across DCs
Traffic Flow - Multicast, Test Traffic
1. Per tenant (T3 and T4), 500 groups and 500 flows
2. Traffic rate: sent at 500 Mbps
3. Stateless traffic
4. Intra-EPG traffic
Firewall and Load Balancer failover - Improving ASA failover time
- ASA 9.3(x) introduced BGP support for nonstop forwarding.
- The ASAs and the DC1/DC2 CE routers were enabled for BGP graceful restart.
- Reduced the ASA failover unit poll timers from a 15-second timeout to 5 seconds: failover polltime unit 1 holdtime 5
- Active ASA powered down: 7-8 seconds failover time.
- ASA recovery (failback): 2-3 seconds.
- Virtual F5 failure: 8 seconds. Failback: no losses.
Intra-DC Link Failure
Leaf-1# show ip interface e1/49
IP Interface Status for VRF "overlay-1"
eth1/49, Interface status: protocol-down/link-down/admin-up, iod: 180,
Leaf-1#
Convergence on failover (worst case):
1040 ms Multicast
208 ms Unicast
Convergence on recovery (worst case):
253 ms Multicast
no losses Unicast
SPINE switch failure
Spine-1# show interface ethernet1/1 | include rate
30 seconds input rate 5015903136 bits/sec, 1297758 packets/sec <<< Note rate
30 seconds output rate 5078158032 bits/sec, 1297760 packets/sec <<< Note rate
input rate 5019981528 bps, 1299016 pps; output rate 5082398064 bps, 1299016 pps
Spine-1#
Spine-1# show interface ethernet1/2 | include rate
30 seconds input rate 2512703448 bits/sec, 650200 packets/sec <<< Note rate
30 seconds output rate 2541790328 bits/sec, 650155 packets/sec <<< Note rate
input rate 2509865664 bps, 649565 pps; output rate 2540928232 bps, 649520 pps
Spine-1#
Spine-1# show interface ethernet1/3 | include rate
30 seconds input rate 5522160848 bits/sec, 1429036 packets/sec <<< Note rate
30 seconds output rate 3559252656 bits/sec, 909389 packets/sec <<< Note rate
input rate 5522598512 bps, 1428972 pps; output rate 3557939512 bps, 909354 pps
Spine-1# show interface ethernet1/4 | include rate
30 seconds input rate 1003454536 bits/sec, 259872 packets/sec <<< Note rate
30 seconds output rate 3050673104 bits/sec, 780397 packets/sec <<< Note rate
input rate 1004067560 bps, 259885 pps; output rate 3049726472 bps, 779564 pps
Spine-1#
Spine 1 Failed/Restored
Convergence on failover (worst case):
1040 ms Multicast (650 ms with 11.1 image)
571 ms Unicast
Convergence on recovery (worst case):
15196 ms Multicast with 11.0 image (505 ms with 11.1 image)
no losses Unicast
LEAF switch failure
Leaf-1# show lldp nei
Capability codes:
(R) Router, (B) Bridge, (T) Telephone, (C) DOCSIS Cable Device
(W) WLAN Access Point, (P) Repeater, (S) Station, (O) Other
Device ID Local Intf Hold-time Capability Port ID
N3K-A1-8-32 Eth1/16 120 BR Eth1/1 <<< Eth1/16
Spine-1 Eth1/49 120 BR Eth1/1
Spine-2 Eth1/50 120 BR Eth1/1
Total entries displayed: 3
Leaf-1# show interface eth1/16 | include rate
30 seconds input rate 505351360 bits/sec, 144322 packets/sec <<<
30 seconds output rate 200479760 bits/sec, 57737 packets/sec <<<
input rate 399586040 bps, 114265 pps; output rate 158359176 bps, 45710 pps
Leaf-1#
Leaf 1 to be failed
Convergence on failover (worst case):
664 ms Multicast
286 ms Unicast
Convergence on recovery (worst case):
725 ms Multicast
33 ms Unicast
ASR 9K failure / recovery Identified ASR9K-2 in DC2 as the target device
Powered Off
From APICs, DCI link missing (as expected)
DC1 ASR 1
RP/0/RSP0/CPU0:DC1-ASR9K-1#sh log
(snip)
LC/0/2/CPU0:Apr 14 10:17:43.609 : vic_0[365]: %L2-ETHERNET-3-TX_DISABLE :
Interface FortyGigE0/2/0/0, link no longer forced down due to remote signalling
LC/0/2/CPU0:Apr 14 10:23:20.404 : bfd_agent[125]: %L2-BFD-6-SESSION_STATE_DOWN
: BFD session to neighbor 5.5.2.2 on interface TenGigE0/2/1/0 has gone down.
Reason: Echo function failed
Spine-2# show interface eth1/5
Ethernet1/5 is down (link-failure) <<< I/F is brought down
admin state is up, Dedicated Interface
Hardware: 40000 Ethernet, address: 0000.0000.0000 (bia f40f.1bc1.e7b2)
MTU 9150 bytes, BW 40000000 Kbit, DLY 1 usec
reliability 255/255, txload 1/255, rxload 1/255
Convergence on failover (worst case):
720 ms Multicast
475 ms Unicast
Convergence on recovery (worst case):
725 ms Multicast
176 ms Unicast
RP/0/RSP0/CPU0:DC1-ASR9K-2#show int tenGigE 0/2/1/0
Fri Apr 10 11:35:58.657 UTC
TenGigE0/2/1/0 is down, line protocol is down
Interface state transitions: 6
ASR 9K 10GE (DCI) link failure - the 10G link failed is between ASR9k-2 in DC1 and ASR9k-1 in DC2
"l2transport propagate remote-status" brings down the remote attachment circuit (AC) if the local AC goes down, and brings down the attachment circuits when the DCI link goes down.
This command improves ACI fabric (IS-IS) convergence during ASR 9K PE DCI-link and local-link failures; otherwise there is a 30-second outage.
Convergence on failover (worst case):
375 ms Multicast
314 ms Unicast
Convergence on recovery (worst case):
195 ms Multicast
No Loss Unicast
Convergence on failover (worst case):
720 ms Multicast
270 ms Unicast
Convergence on recovery (worst case):
89 ms Multicast
No Loss Unicast
Fabric 40GE-to-ASR9K link failure - the 40G link failed is between ASR9K-1 in DC1 and Spine-2 in DC1
Failed the DC1 ASR9K-1 40G link by physically removing the fiber from Spine-2 Eth1/5.
RP/0/RSP0/CPU0:DC1-ASR9K-1#show int fortyGigE 0/2/0/0
Fri Apr 10 16:45:40.812 UTC
FortyGigE0/2/0/0 is down, line protocol is down <<<
“l2transport propagate remote-status” command on DC1
ASR9K-1 local AC will automatically bring down the DC2
ASR9K-2 40G link to Leaf-5.
RP/0/RSP0/CPU0:DC2-ASR9K-1#LC/0/2/CPU0:Apr 10 16:44:22.204 : vic_0[365]: %L2-
ETHERNET-3-TX_DISABLE : Interface FortyGigE0/2/0/0, link forced down due to
remote signaling
Dual Link Failure - "Split Brain" scenario: the 40G links failed are between ASR9K-1 and Spine-2 in DC1 and between ASR9K-2 and Leaf-4 in DC1
• The APICs in DC1 take around 10-15 seconds to realize the loss of reachability to all APIC/fabric nodes in DC2.
• The DC1 APIC controllers can execute policy read and write operations.
• The DC2 APIC controller can only perform read-only operations.
• DC2 fabric nodes were able to learn endpoints in the data plane. No disruption.
- The DC2 APIC controller doesn't show the learnt endpoints.
• vCenter located in DC1 lost management connections with the ESXi hosts in DC2. This places the ESXi hosts into "Not Responding" state and the VMs into "Disconnected" state.
- Actual N-S stateful traffic to DC2 tenants 2 and 4 was working fine.
• No intra-DC packet loss (North-South or East-West) observed during the 2 x 40G DCI link failure.
• Configurations (e.g. a new tenant) were performed on the DC1 APIC controllers.
Dual Link Failure - "Split Brain" scenario: Recovery
• Once the DCI links come up, it takes 30 to 35 seconds for the APICs in DC1 to see APIC 3 and the fabric nodes in the DC2 site.
This includes the time taken for Leaf 4 / Spine 3 and Spine 2 / Leaf 5 to establish LLDP adjacency with their peers.
• The APIC cluster synchronized the configuration changes made on the DC1 APICs.
• The APIC controllers in DC1 and DC2 synced up, and the APIC controller in DC2 started showing the learnt endpoints.
• The external physical ASA HA keepalives and LAN failover state recovered. The virtual F5's HA keepalives recovered.
• The IBGP sessions between the MPLS CE routers and the ASAs going over the DCI PW links recovered.
• vCenter Server recovered the management connections with the ESXi hosts in DC2.
• No intra-DC packet loss (North-South/East-West) observed during the 2 x 40G DCI link recovery.
Quality of Service (QoS) - High-priority traffic protected by QoS settings on the ASR 9K and the fabric
• Overload the fabric with user traffic (COS 0 or COS 1) by sending more than the DCI links can handle. QoS on the ASR 9K engaged to protect SUP_Traffic (COS 5), APIC+Traceroute_Traffic (COS 3 and 6) and MPLS_CE_BGP+vServiceNodes_HA_Traffic (COS 2), and to limit SPAN_Traffic (COS 4 and 7).
With congestion on the 10G DCI links:
• The APIC controller in DC1 was able to push policy changes to the DC2 APIC and fabric nodes.
• Traceroute between the DC1 and DC2 sites completed successfully.
• SPAN (ERSPAN) running from a DC1 leaf to a DC2 leaf was sent successfully.
• The MPLS CE to active ASA IBGP sessions remained up and stable.
• The external ASA HA remained in sync.
• The internal vF5 HA remained in sync.
• When site 1 goes down, the user can access and monitor the ACI fabric via the controller in site 2, but can't make configuration changes.
Data Center Failure - Site failure on the site with two APICs
• Connect a standby APIC appliance (4th APIC) in site 2 after the APIC cluster is formed and operational.
• The standby appliance remains shut down until needed.
• When site 1 is down, the user decommissions APIC nodes 1 and 2 and commissions the new APIC node 2.
• The "standby" APIC appliance joins the APIC cluster.
• Site 2 now has a majority of APICs (2 out of 3). The user can start to make changes.
Data Center Failure - Restoring the ability to make configuration changes
1. Simulated DC failure by failing all devices in DC1 (powered off).
2. Promote the DC2 standby APIC to active (to become APIC #2).
3. Check traffic flow is still possible from the WAN to DC2 and within DC2.
4. Make a configuration change: added a new tenant.
5. Recover DC1.
1. Follow the procedure below to clean the APICs and switches.
2. Confirm that configuration changes are synced to the DC1 APIC/switches.
6. Check traffic can now flow via DC1 and within DC1.
7. Put the previously promoted standby APIC in DC2 back into standby mode.
Stretched Fabric APIC Cluster Recovery Procedures:
http://www.cisco.com/c/en/us/td/docs/switches/datacenter/aci/apic/sw/kb/b_kb-aci-stretched-fabric.html#concept_4B9644FE992A43D9A261F1531DBC9588
Data Center Failure Test Results
- One APIC cluster. One management and configuration point.
  Anycast GW on all leaf switches. Works the same way as a single fabric deployed within a single DC.
- Cisco Validated Design. Extensively tested and passed the validation criteria.
- 10 ms RTT between the sites. Under normal conditions, 10 ms allows two DCs up to 800 km / 500 miles apart.
- Interconnection can be dark fiber, DWDM or an EoMPLS pseudowire.
  If EoMPLS, the DC interconnect links can be 10G (minimum) or higher, with 40G facing the leaf/spine. QoS is required to protect critical control-plane traffic.
- APIC release 1.0(3f) or later.
Summary - Single ACI fabric stretched to two sites
DEMO available
Stretched Fabric Link failures – https://www.youtube.com/watch?v=xgxPQNR_42c
vMotion over Stretched Fabric with EoMPLS - https://www.youtube.com/watch?v=RLkryVvzFM0
The solutions presented from this slide onwards are still under testing / validation.
Target: Q4CY2015.
Please contact the presenters if you need to perform a proof of concept earlier.
Disclaimer
Dual-Fabric Design Scenarios
• Two independent ACI fabrics. Two management and configuration domains.
• Design goals:
• Active/active workload.
• Extend L2 and subnets across sites.
• Anycast GW on both fabrics.
• Interconnect technologies:
• Dark fiber or DWDM (back-to-back vPC)
• VXLAN/OTV/VPLS/PBB for L2 extension over IP
Dual-Fabric with Common Anycast GW IP
• Multiple anycast GW IPs assigned on ACI for the same subnet.
• Unique primary IP and common secondary IP for the same subnet between fabrics.
• Different GW MAC per site.
• Unique SVI MAC and common virtual MAC (roadmap Q4CY2015).
• On the bridge domain, ARP and L2 unknown-unicast flooding must be enabled.
[Diagram: all leafs share VMAC: MAC-common]
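These BD settings can be sketched as an APIC REST payload. The tenant/BD names and addresses are hypothetical, and using the `virtual` flag for the common secondary GW IP is an assumption, not a configuration shown in the session:

```xml
<fvTenant name="T1">
  <!-- BD with ARP flooding and L2 unknown-unicast flooding enabled -->
  <fvBD name="BD-WEB" arpFlood="yes" unkMacUcastAct="flood">
    <fvSubnet ip="100.1.1.1/24"/>                  <!-- unique primary GW IP -->
    <fvSubnet ip="100.1.1.254/24" virtual="yes"/>  <!-- common secondary GW IP -->
  </fvBD>
</fvTenant>
```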
• Use static binding to extend the EPG between the sites.
• The VLAN-ID-to-EPG mapping matches between fabrics.
• The fabric treats the remote endpoints as if they were locally attached.
• Simple and consistent contracts on the two fabrics.
Extending the EPG outside the fabric - Contract relationship with EPG static binding
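A static binding can be sketched as an APIC REST payload; the key point is using the same encap VLAN on both fabrics so the VLAN maps to the same EPG on each side. Tenant, EPG, node and port values below are hypothetical:

```xml
<fvTenant name="T1">
  <fvAp name="AP1">
    <fvAEPg name="WEB">
      <!-- Static path binding of the WEB EPG to the DCI-facing leaf port;
           repeat the same encap VLAN on the peer fabric's DCI port. -->
      <fvRsPathAtt tDn="topology/pod-1/paths-101/pathep-[eth1/33]"
                   encap="vlan-100" mode="regular"/>
    </fvAEPg>
  </fvAp>
</fvTenant>
```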
Dual-Fabric with Active/Active GW - VMM Consideration, Option 1: VMM Integration without Live Migration (vSphere 5.x)
[Diagram: ACI Fabric 1 (APIC, VMM Domain DC1, DVS1, ESX hosts, VLAN 100) and ACI Fabric 2 (APIC, VMM Domain DC2, DVS2, ESX hosts, VLAN 200), each with its own vCenter Server; EPG WEB 100.1.1.0/24 on both fabrics, extended over VLAN 300; one L2 domain, one subnet]
• One vCenter (actually one DVS) can only be provisioned by one APIC cluster.
• One DVS for the ESXi hosts attached to each ACI fabric.
• VMM integration.
• L2 extended across the two fabrics.
• No live VM migration across DVSs prior to vSphere 6.0.
Dual-Fabric with Active/Active GW - VMM Consideration, Option 2: VMM Integration with Live Migration (vSphere 6)
[Diagram: same topology as Option 1: ACI Fabric 1 (VMM Domain DC1, VLAN 100) and ACI Fabric 2 (VMM Domain DC2, VLAN 200), each with its own vCenter Server; EPG WEB 100.1.1.0/24 on both fabrics, extended over VLAN 300; live migration with vSphere 6]
• One vCenter/DVS for each fabric.
• VMM integration with vSphere 6 support on ACI is planned.
• Allows live migration between sites, enabled by Cross-vCenter vMotion.
Please provide your input to the questions asked on the mobile app.
What virtualization platform do you expect to be using in the next 12 months?
Which of the dual-site deployment models do you plan to adopt in the next 12 months?
If your requirement is for Dual-Fabric with L2 DCI extension, how many EPGs/VLANs do you need to extend between the sites?
ACI Dual Fabric with vSphere 6.0 for Cross-vCenter vMotion (Tech Preview)
[Diagram: vSphere / vCenter 6.0 per site; DVS-DC1 with Server 1 (10.1.5.81) behind ESX-DC1, DVS-DC2 with Server 2 (10.1.5.92) behind ESX-DC2; EPG static binding at each fabric edge; Nexus 9300 switches in NX-OS mode perform the VLAN-to-VXLAN mapping and run a VXLAN overlay with BGP EVPN (MP-BGP EVPN) over the L3 DCI]
• Anycast VTEP (Virtual Tunnel End Point): anycast IP facing the vPC edge, used for the VXLAN encap.
• The VXLAN source/destination address is the next hop in the BGP EVPN address family.
VXLAN Overlay - BGP EVPN Peering for Layer 2 DCI Extension (Nexus 9300)
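The VLAN-to-VXLAN mapping with BGP EVPN on the NX-OS-mode Nexus 9300 could look roughly like the sketch below. This is an assumed configuration, not the tested one; the VLAN, VNI, AS number and addresses are hypothetical:

```
nv overlay evpn
feature nv overlay
feature vn-segment-vlan-based
!
vlan 100
  vn-segment 10100                 ! VLAN from the EPG static binding -> VNI
!
interface nve1
  no shutdown
  source-interface loopback1       ! anycast VTEP IP shared by the vPC pair
  host-reachability protocol bgp
  member vni 10100
    ingress-replication protocol bgp
!
router bgp 65000
  neighbor 10.255.0.2 remote-as 65000
    address-family l2vpn evpn
      send-community extended
!
evpn
  vni 10100 l2
    rd auto
    route-target import auto
    route-target export auto
```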
• Not all EPGs have to be extended; some subnets are local to a fabric.
• L3 peering between the fabrics is required.
• ACI supports iBGP or OSPF with the 11.0 release; eBGP with 11.1.
Cross Fabric L3 Extension
[Diagram: ACI Fabric 1 and ACI Fabric 2, each with vCenter 6.0, a WEB EPG and an APP EPG joined by a contract; EP1 and EP2 attach to Fabric 1, EP11 and EP12 to Fabric 2. EPG WEB is statically bound to the DCI ports using VLAN X on both fabrics; EPG APP is statically bound to the DCI ports using VLAN Y on both fabrics]
Contracts / Policy View - EPG-to-EPG policy synchronization across sites
As the EPGs are extended via the static bindings, Fabric 1 sees EP12 as a local EP of the WEB EPG and EP11 as a local EP of the APP EPG. Site 2 likewise sees EP2 and EP1 as local EPs.
Policy enforcement example: when EP1 communicates with EP12, the local contracts ensure policy is enforced (orange arrow).
ACI Toolkit Policy
Participate in the “My Favorite Speaker” Contest
• Promote your favorite speaker through Twitter and you could win $200 of Cisco Press products (@CiscoPress)
• Send a tweet and include
• Your favorite speaker’s Twitter handle @thiagovazquez @pbellaga
• Two hashtags: #CLUS #MyFavoriteSpeaker
• You can submit an entry for more than one of your “favorite” speakers
• Don’t forget to follow @CiscoLive and @CiscoPress
• View the official rules at http://bit.ly/CLUSwin
Promote Your Favorite Speaker and You Could Be a Winner
Complete Your Online Session Evaluation
Don’t forget: Cisco Live sessions will be available for viewing on-demand after the event at CiscoLive.com/Online
• Give us your feedback to be entered into a Daily Survey Drawing. A daily winner will receive a $750 Amazon gift card.
• Complete your session surveys through the Cisco Live mobile app or your computer on Cisco Live Connect.
Continue Your Education
• Demos in the Cisco campus
• Walk-in Self-Paced Labs
• Table Topics
• Meet the Engineer 1:1 meetings
• Related sessions