Navaid Shamsee Network Architecture
-
8/10/2019 Navaid Shamsee Network Architecture
Data Center Networking
Architecture and Design Guidelines
-
Data Center Architecture Overview
Layers of the Data Center Multi-Tier Model
Layer 2 and Layer 3 access topologies
Dual- and single-attached 1RU and blade servers
Multiple aggregation modules
Web/app/database multi-tier environments
L2 adjacency requirements
Mix of oversubscription requirements
Environmental implications
Stateful services for security and load balancing
[Figure: DC Core, DC Aggregation, and DC Access layers, with access variations: blade chassis w/ integrated switch, blade chassis w/ pass-thru, L3 access, mainframe w/ OSA, and L2 access w/ clustering and NIC teaming]
-
Core Layer Design
-
Core Layer Design
Requirements
Is a separate DC core layer required? Consider:
10GigE port density
Administrative domains
Anticipated future requirements
Key core characteristics:
Distributed forwarding architecture
Low-latency switching
10GE scalability
Advanced ECMP and hashing algorithms
Scalable IP multicast support
[Figure: Campus Core and DC Core above the Aggregation layer; campus distribution and campus access on one side, the server farm access layer on the other, with scaling in both directions]
-
Core Layer Design
L2/L3 Characteristics
Layer 2/3 boundaries:
All Layer 3 links at the core
L2 contained at/below the aggregation module
L2 extension through the core is not recommended
CEF hashing algorithm:
Default hash is on L3 IP addresses only
L3 + L4 port hashing is optional and may improve load distribution:
CORE1(config)# mls ip cef load-sharing full simple
Leverages automatic source-port randomization in the client TCP stack
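As a hedged sketch of what enabling and then spot-checking the L3 + L4 hash might look like on a Catalyst 6500 core (the source and destination addresses used in the check are illustrative):

```
! Include L4 ports in the CEF load-sharing hash (Catalyst 6500)
CORE1(config)# mls ip cef load-sharing full simple
CORE1(config)# end
! Check which equal-cost path a given flow would take
CORE1# show mls cef exact-route 10.20.10.50 10.10.1.1
```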
[Figure: CEF hash applied to packets on equal-cost routes between the Campus Core/DC Core (L3) and the Aggregation/Access layers (L2) serving web, application, and database servers]
-
Core Layer Design
Routing Protocol Design: OSPF
NSSA helps to limit LSA propagation, but permits route redistribution (RHI)
Advertise default into the NSSA, summarize routes out
OSPF default reference bandwidth is 100M; use auto-cost reference-bandwidth set to a 10G value
VLANs on 10GE trunks otherwise get the OSPF cost of a 1G link; adjust the bandwidth value to reflect 10GE on the interswitch L3 VLAN
Loopback interfaces simplify troubleshooting (neighbor ID)
Use passive-interface default: open up only the links that should form adjacencies
Use authentication: more secure and avoids undesired adjacencies
Timers: spf 1/1, interface hello/dead = 1/3
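A minimal configuration sketch tying these OSPF recommendations together; the process number, area number, interface names, and key string are illustrative, not from the original deck:

```
router ospf 10
 router-id 10.10.3.3
 auto-cost reference-bandwidth 10000         ! scale link costs for 10GE
 area 10 nssa default-information-originate  ! default into the NSSA
 area 10 range 10.20.0.0 255.255.255.0       ! summarize DC subnets out
 area 10 authentication message-digest
 passive-interface default
 no passive-interface TenGigabitEthernet1/1
 timers spf 1 1
!
interface TenGigabitEthernet1/1
 ip ospf message-digest-key 1 md5 s3cret
 ip ospf hello-interval 1
 ip ospf dead-interval 3
```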
[Figure: Area 0 spans the Campus Core and DC Core (loopbacks 10.10.1.1/10.10.2.2); the aggregation layer sits in an NSSA (loopbacks 10.10.3.3/10.10.4.4); default is advertised into the NSSA and DC subnets are summarized out toward the core over L3 VLAN-OSPF links]
-
Core Layer Design
Routing Protocol Design: EIGRP
Advertise default into the DC with an interface command on the core:
ip summary-address eigrp 20 0.0.0.0 0.0.0.0 200
The distance of 200 is required so this route is preferred over the Null0 route installed by EIGRP
If other default routes exist (from the internet edge, for example), you may need distribute lists to filter them out
Use passive-interface default
Summarize DC subnets toward the core with an interface command on the agg:
ip summary-address eigrp 20 10.20.0.0 255.255.0.0
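Put together, the EIGRP pieces might look like the following sketch; AS number 20 and the summary addresses come from the slide, while the interface numbering and network statement are illustrative:

```
! Core switch: advertise default toward the aggregation layer
interface TenGigabitEthernet1/1
 ip summary-address eigrp 20 0.0.0.0 0.0.0.0 200
!
! Aggregation switch: summarize the DC subnets toward the core
interface TenGigabitEthernet1/2
 ip summary-address eigrp 20 10.20.0.0 255.255.0.0
!
router eigrp 20
 network 10.20.0.0 0.0.255.255
 passive-interface default
 no passive-interface TenGigabitEthernet1/2
```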
[Figure: defaults advertised from the Campus Core/DC Core down to the aggregation layer; DC subnets (summarized) advertised up over L3 VLAN-EIGRP links toward the web, application, and database server access]
-
Aggregation Layer Design
-
Aggregation Layer Design
Spanning Tree Design
Rapid PVST+ (802.1w) or MST (802.1s)
Choice of .1w/.1s based on the scale of logical + virtual ports required
Rapid PVST+ is recommended and is the best replacement for 802.1d
Fast converging: inherits proprietary enhancements (UplinkFast, BackboneFast)
Access layer uplink failures: ~300 ms to 2 sec
Most flexible design options
Combined with RootGuard, BPDUGuard, LoopGuard, and UDLD, achieves the most stable STP environment
Global UDLD only enables on fiber ports; it must be enabled manually on copper ports
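A hedged sketch of the hardening features above on an aggregation/access switch pair; VLAN range and interface numbers are illustrative:

```
spanning-tree mode rapid-pvst
spanning-tree vlan 1-10 root primary
spanning-tree loopguard default
udld enable                    ! global: covers fiber ports only
!
interface GigabitEthernet1/1
 description downlink toward access switch
 spanning-tree guard root
!
interface GigabitEthernet2/1
 description copper server port
 spanning-tree portfast
 spanning-tree bpduguard enable
 udld port aggressive          ! copper: must be enabled per port
```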
[Figure: below the core, Agg1 is root primary/HSRP primary/active context and Agg2 is root secondary/HSRP secondary/standby context; RootGuard toward the access layer, LoopGuard, BPDU Guard on server ports, UDLD enabled globally]
-
Aggregation Layer Design
Integrated Services
L4-L7 services integrated in the Cisco Catalyst 6500
Server load balancing, firewall, and SSL services may be deployed in:
Active-standby pairs (ACE, FWSM 2.x)
Active-active pairs (ACE, FWSM 3.1)
Integrated blades optimize rack space, cabling, and management, providing flexibility and economies of scale
Influences many aspects of the overall design
Services: firewall, load balancing, SSL encryption/decryption
-
Aggregation Layer Design
Active-Standby Service Design
Active-standby services:
Application Control Engine (ACE)
Firewall Services Module (3.x)
SSL module
Underutilizes access layer uplinks
Underutilizes service modules and switch fabrics
Multiple service module pairs do permit an active-active design
[Figure: below the core, Agg1 is root primary/HSRP primary/active context and Agg2 is root secondary/HSRP secondary/standby context]
-
Aggregation Layer Design
Active-Active Service Design
Application Control Engine (ACE):
SLB + SSL + FW
4G/8G/16G switch fabric options
Active-standby distribution per context
Firewall Services Module (3.x):
Two active-standby groups permit distribution of contexts across two FWSMs (not per context)
Permits uplink load balancing while having services applied
Increases overall service performance
[Figure: VLAN 5 root primary/HSRP primary/active context on one agg switch with VLAN 6 standby; VLAN 6 root primary/HSRP primary/active context on the other agg switch with VLAN 5 standby]
-
Aggregation Layer Design
Establishing Path Preference
Use the route health injection (RHI) feature of ACE
Introduce a route-map on the RHI-injected route to set the desired metric
Aligns the advertised route of the VIP with the active context on the ACE, FWSM, and SSL service modules
Avoids unnecessary use of the inter-switch link and asymmetrical flows
Establish route preference for service-enabled applications:
1. ACE probes the real servers in the VIP to determine health
2. If healthy, ACE installs a host route to the VIP on the local MSFC
3. A route-map on the host route sets the preferred metric:
route-map RHI permit 10
 match ip address 44
 set metric-type type-1
4. If context failover occurs, RHI and route preference follow
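The route-map above comes from the slide; as a sketch of how it might be wired in, assuming the RHI host route appears as a static route on the MSFC (ACL 44, the VIP address, and the OSPF process number are illustrative):

```
! ACL 44 matches the RHI-injected VIP host route (VIP address illustrative)
access-list 44 permit host 10.20.5.100
!
route-map RHI permit 10
 match ip address 44
 set metric-type type-1
!
! Redistribute the RHI-installed route so the advertised metric
! follows the agg switch that holds the active context
router ospf 10
 redistribute static route-map RHI subnets
```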
-
Core and Aggregation Layer Design
STP, HSRP and Service Context Alignment
Align server access to the primary components in the aggregation layer:
VLAN root
Primary HSRP instance
Active service context
Provides a more predictable design
Simplifies troubleshooting
More efficient traffic flow: decreases the chance of flows ping-ponging across the inter-switch link
[Figure: below the core, Agg1 is root primary/HSRP primary/active context and Agg2 is root secondary/HSRP secondary/standby context]
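A sketch of aligning the STP root and HSRP primary on the same aggregation switch; VLAN number, addresses, priorities, and the preempt delay are illustrative:

```
! Agg1: primary for VLAN 10
spanning-tree vlan 10 root primary
interface Vlan10
 ip address 10.20.10.2 255.255.255.0
 standby 1 ip 10.20.10.1
 standby 1 priority 110
 standby 1 preempt delay minimum 180
!
! Agg2: secondary for VLAN 10
spanning-tree vlan 10 root secondary
interface Vlan10
 ip address 10.20.10.3 255.255.255.0
 standby 1 ip 10.20.10.1
 standby 1 priority 90
 standby 1 preempt
```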
-
Aggregation Layer Design
Scaling the Aggregation Layer
Aggregation modules provide:
Spanning tree scaling
HSRP scaling
Access layer density
10GE/GEC uplinks
Application services scaling (SLB/firewall)
Fault domain sizing
Core layer provides inter-agg-module transport:
Low-latency distributed forwarding architecture (use DFCs)
100,000s of PPS forwarding rate
Provides the inter-agg-module transport medium in the multi-tier model
[Figure: Campus Core and DC Core above Aggregation Modules 1-3, with the server farm access layer below; scale by adding modules]
-
Aggregation Layer Design
Using VRFs in the DC (1)
Enables virtualization/partitioning of network resources (MSFC, ACE, FWSM)
Permits use of application services with multiple access topologies
Maps well to path-isolation MAN/WAN designs such as MPLS
Security policy management and deployment by user group/VRF
[Figure: Agg1 and Agg2 carry VRF-Green, VRF-Blue, and VRF-Red over 802.1Q trunks; VLANs isolate contexts on the access layer; alternate primary contexts on Agg1 and Agg2 to achieve an active-active design; firewall and SLB contexts for green, blue, and red; the DC core connects to an MPLS or other core]
-
Aggregation Layer Design
Using VRFs in the DC (2)
Maps the virtualized DC to the MPLS MAN/WAN cloud
[Figure: PE nodes at the DC core edge face the WAN/branch/campus; the red, green, and blue VRFs extend from Agg Modules 1 and 2 over 802.1Q trunks through the core P-nodes]
-
Access Layer Design
-
Access Layer Design
Defining Layer 2 Access
L3 routing is first performed in the aggregation layer
L2 topologies consist of looped and loop-free models
Stateful services at the agg layer can be provided across the L2 access (FW, SLB, SSL, etc.)
VLANs are extended across the inter-switch link trunk for looped access
VLANs are not extended across the inter-switch link trunk for loop-free access
[Figure: the DC core above Agg1 (primary root/primary HSRP/active services) and Agg2 (secondary root/secondary HSRP/standby services), joined by an inter-switch link at the L2/L3 boundary]
-
Access Layer Design
Establish a Deterministic Model
Align active components in the traffic path on a common agg switch:
Primary STP root:
Aggregation-1(config)# spanning-tree vlan 1-10 root primary
Primary HSRP (outbound):
standby 1 priority X
Active service modules/contexts
Path preference (inbound), using RHI and a route-map:
route-map preferred-path
 match ip address x.x.x.x
 set metric -30
(also see the STP, HSRP and Service Context Alignment slide)
[Figure: the DC core above Agg1 (primary root/primary HSRP/active services, preferred path) and Agg2; default gateway on Agg1; L3+L4 hash on the uplinks]
-
Access Layer Design
Balancing VLANs on Uplinks: L2 Looped Access
Distributes traffic load across uplinks
STP blocks the uplink path for VLANs toward the secondary root switch
If active/standby service modules are used, consider inter-switch link utilization to reach the active instance:
Multiple service modules may be distributed across both agg switches to achieve balance
Consider the bandwidth of the inter-switch link in a failure situation
If active/active service modules are used (ACE and FWSM 3.1):
Balance contexts + HSRP across the agg switches
Consider establishing inbound path preference
[Figure: 802.1Q trunks from the access layer to Agg1 (blue primary root/HSRP, red secondary) and Agg2 (red primary root/HSRP, blue secondary); default gateways on both; L3+L4 hash toward the DC core]
-
Access Layer Design
L2 Looped Design Model
-
Access Layer Design
Looped Design Model
VLANs are extended between aggregation switches, creating the looped topology
Spanning tree is used to prevent actual loops (Rapid PVST+, MST)
A redundant path exists through a second path that is blocking
Two looped topology designs: triangle and square
VLANs may be load balanced across access layer uplinks
Inter-switch link utilization must be considered, as it may be used to reach active services
[Figure: looped triangle and looped square topologies; F = forwarding port, B = blocking port; primary STP root/HSRP/active services on one agg switch, secondary root/HSRP/standby services on the other, joined by a .1Q trunk inter-switch link at the L2/L3 boundary]
-
Access Layer Design
L2 Looped Topologies
Looped Triangle Access
Supports VLAN extension/L2 adjacency across the access layer
Resiliency achieved with dual homing and STP
Quick convergence with 802.1w/s
Supports stateful services at the aggregation layer
Proven and widely used
Looped Square Access
Supports VLAN extension/L2 adjacency across the access layer
Resiliency achieved with dual homing and STP
Quick convergence with 802.1w/s
Supports stateful services at the aggregation layer
Active-active uplinks align well to ACE/FWSM 3.1
Achieves a higher-density access layer, optimizing 10GE aggregation layer density
[Figure: L2 looped triangle and L2 looped square topologies extending VLAN 10 between Row 2, Cabinet 3 and Row 9, Cabinet 8]
-
Access Layer Design
Benefits of L2 Looped Design
Services like firewall and load balancing can easily be deployed at the aggregation layer and shared across multiple access layer switches
VLANs are primarily contained between pairs of access switches, but VLANs may be extended to different access switches to support:
NIC teaming
Clustering (L2 adjacency)
Administrative reasons
Geographical challenges
[Figure: a VLAN extended at L2 between access switches in Row A, Cab 7 and Row K, Cab 3, below the DC core]
-
Access Layer Design
Drawbacks of Layer 2 Looped Design
Main drawback: if a loop occurs, the network may become unmanageable due to the infinite replication of frames
802.1w Rapid PVST+ combined with STP-related features and best practices improves stability and helps to prevent loop conditions:
UDLD
LoopGuard
RootGuard
BPDUGuard
Limited domain size
Stay under the STP watermarks for logical and virtual ports
[Figure: a frame for DST MAC 0000.0000.4444 replicating endlessly between Switch 1 and Switch 2 over ports 3/1 and 3/2]
-
Access Layer Design
L2 Loop-Free Design Model
-
Access Layer Design
Loop-Free Design
An alternative to the looped design
Two models: loop-free U and loop-free inverted U
Benefit: spanning tree is enabled but no port is blocking, so all links are forwarding
Benefit: less chance of a loop condition due to misconfigurations or other anomalies
The L2-L3 boundary varies by the loop-free model used: U or inverted U
Considerations apply for service modules, L2 adjacency, and single-attached servers
[Figure: loop-free U and loop-free inverted U topologies below the DC core, with the L2/L3 boundary placed differently in each]
-
Access Layer Design
Loop-Free Topologies
Loop-Free Inverted U Access
Supports VLAN extension
No STP blocking; all uplinks active
Access switch uplink failure black-holes single-attached servers
Supports all service module implementations
Loop-Free U Access
VLANs contained in switch pairs (no extension outside of switch pairs)
No STP blocking; all uplinks active
Autostate implications for certain service modules (CSM)
ACE supports autostate and per-context failover
[Figure: L2 loop-free U keeps VLAN 10 and VLAN 20 within access pairs; L2 loop-free inverted U extends VLAN 10 across the aggregation switches]
-
Access Layer Design
Loop-Free U Design and Service Modules (1)
If the uplink connecting access and aggregation goes down, the VLAN interface on the MSFC goes down as well, due to the way autostate works
CSM and FWSM have implications, as autostate is not conveyed (leaving a black hole)
Tracking and monitoring features may be used to allow failover of service modules based on uplink failure, but would you want a service module failover for one access switch uplink failure?
It is not recommended to use loop-free L2 access with active-standby service module implementations (see the ACE discussion next)
[Figure: Agg1 (active services) and Agg2 (standby services) with VLANs 10/11 and 20/21; the SVI for interface VLAN 10 goes down if the uplink is the only interface in VLAN 10; HSRP secondary on Agg2 takes over as default gateway, but how does traffic reach the active service modules?]
-
Access Layer Design
Loop-Free U Design and Service Modules (2)
With ACE:
Per-context failover with autostate
If the uplink to Agg1 fails, ACE can switch over to Agg2 (under 1 sec)
Requires ACE on the access trunk side for autostate failover
May be combined with FWSM 3.1 for an active-active design
[Figure: ACE context 1 on each agg switch with VLANs 10/11 and 20/21; HSRP secondary on Agg2 takes over as default gateway]
-
Access Layer Design
Drawbacks of Loop-Free Inverted-U Design
Single-attached servers are black-holed if the access switch uplink fails
Distributed EtherChannel can reduce the chance of black-holing
NIC teaming improves resiliency in this design
Inter-switch link scaling needs to be considered when using active-standby service modules
[Figure: loop-free inverted U topology below the DC core at the L2/L3 boundary]
-
Access Layer Design
Comparing Looped and Loop-Free Designs

                       Uplink VLANs on  VLAN extension  Service module  Single-attached  Access switch  Must consider
                       agg switch in    supported       black-holing    server black-    density per    inter-switch
                       blocking or      across access   on uplink       holing on        agg module     link scaling
                       standby state    layer           failure (5)     uplink failure
Looped Triangle              -                +               +               +                -              +
Looped Square                +                +               +               +                +              -
Loop-Free U                  +                -               -               +                +              +
Loop-Free Inverted U         +                +               + (3)           +/- (1,2)       +              - (4)

1. Use of distributed EtherChannel greatly reduces chances of a black-holing condition
2. NIC teaming can eliminate the black-holing condition
3. When service modules are used and active service modules are aligned to Agg1
4. The ACE module permits L2 loop-free access with per-context switchover on uplink failure
5. Applies when using CSM or FWSM in an active/standby arrangement
-
Access Layer Design
L3 Design Model
-
Access Layer Design
Defining Layer 3 Access
L3 access switches connect to aggregation with a dedicated subnet
L3 routing is first performed in the access switch itself
.1Q trunks between pairs of L3 access switches support L2 adjacency requirements (limited to access switch pairs)
All uplinks are active up to the ECMP maximum of 8; no spanning tree blocking occurs
Convergence time is usually better than spanning tree (Rapid PVST+ is close)
Provides isolation/shelter for hosts affected by broadcasts
[Figure: the L2/L3 boundary moves down to the DC access layer, below the DC aggregation and DC core layers]
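A routed-access sketch for an L3 access switch as described above; the /30 addressing, OSPF process/area numbers, and interface names are illustrative:

```
! Dedicated L3 uplink toward the aggregation layer
interface TenGigabitEthernet1/1
 description uplink to Agg1
 no switchport
 ip address 10.20.101.2 255.255.255.252
!
router ospf 10
 passive-interface default              ! keep server VLANs passive
 no passive-interface TenGigabitEthernet1/1
 network 10.20.0.0 0.0.255.255 area 10
```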
-
Access Layer Design
Need L3 for Multicast Sources?
Multicast sources on L2 access work well when IGMP snooping is available
IGMP snooping at the access switch automatically limits the multicast flow to interfaces with registered clients in the VLAN
Use L3 when IGMP snooping is not available, or when particular L3 administrative functions are required
[Figure: L3 access with multicast sources below the DC aggregation and DC core layers]
-
Access Layer Design
Benefits of L3 Access
Minimizes broadcast domains, attaining a high level of stability
Meets server stability requirements or isolates particular application environments
Creates smaller failure domains, increasing stability
All uplinks are available paths; no blocking (up to the ECMP maximum)
Fast uplink convergence: failover and fallback, with no ARP table to rebuild on the agg switches
[Figure: the L2/L3 boundary at the DC access layer, below the DC aggregation and DC core layers]
-
Access Layer Design
Drawbacks of Layer 3 Design
L2 adjacency is limited to access pairs (clustering and NIC teaming limited)
IP address space management is more difficult than with L2 access
If migrating to L3 access, IP address changes may be difficult on servers (and may break apps)
Would normally require services to be deployed at each access layer pair to maintain L2 adjacency with the server and provide stateful failover
[Figure: the L2/L3 boundary at the DC access layer, below the DC aggregation and DC core layers]
-
Access Layer Design
L2 or L3? What Are My Requirements?
The choice of one design versus the other has to do with:
Difficulties in managing loops; staff skillset; time to resolution
Convergence properties
NIC teaming (adjacency)
HA clustering (adjacency)
Ability to extend VLANs; specific application requirements
Broadcast domain sizing
Oversubscription requirements
Link utilization on uplinks
Service module support/placement
[Figure: Layer 2 access runs Rapid PVST+ or MST between aggregation and access; Layer 3 access runs OSPF or EIGRP, moving the L2/L3 boundary down to the access switches]
-
Access Layer Design
Blade Servers
-
Blade Server Requirements
Connectivity Options
[Figure: two blade server chassis connecting to the aggregation layer — one using pass-through modules to external L2 switches, the other using integrated L2 switches; each blade server has interface 1 and interface 2]
-
Blade Server Requirements
Connectivity Options
A Layer 3 access tier may be used to aggregate blade server uplinks
Permits using 10GE uplinks into the agg layer
Avoid dual-tier Layer 2 access designs:
STP blocking
Oversubscription
Larger failure domain
Consider the trunk failover feature of the integrated switch
[Figure: integrated L2 switches in the blade chassis uplink to a Layer 3 access tier, which connects over 10GE to the aggregation layer; a dual-tier Layer 2 access alternative is shown for contrast]
-
Blade Server Requirements
Trunk Failover Feature
The switch takes down server interfaces if the corresponding uplink fails, forcing NIC teaming failover
Solves NIC teaming limitations; prevents black-holing of traffic
Achieves maximum bandwidth utilization: no blocking by STP, but STP is enabled for loop protection
Can distribute trunk failover groups across switches
Dependent upon the NIC feature set for NIC teaming/failover
[Figure: blade server chassis with integrated L2 switches uplinked to the aggregation layer; interfaces 1 and 2 on each blade server]
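On Catalyst blade switches this feature is configured as link state tracking; a sketch, with illustrative interface numbers:

```
! Define a link state tracking group
link state track 1
!
interface GigabitEthernet0/17
 description uplink toward aggregation
 link state group 1 upstream
!
interface GigabitEthernet0/1
 description server-facing blade port
 link state group 1 downstream   ! shut if all upstream links fail
```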
-
Density and Scalability Implications in the Data Center
-
Density and Scalability Implications
-
Density and Scalability Implications
Server Farm Cabinet Layout
~30-40 1RU servers per rack, N racks per row
Considerations:
How many interfaces per server
Top-of-rack switch placement, or end-of-row/within-the-row
Separate switch for the OOB network
Cabling overhead vs. under floor
Patch systems
Cooling capacity
Power distribution/redundancy
-
Density and Scalability Implications
Cabinet Design with 1RU Switching
Minimizes the number of cables to run from each cabinet/rack
If NIC teaming is supported, two 1RU switches are required
Will two 1RU switches provide enough port density?
Cooling requirements usually do not permit a full rack of servers
Redundant switch power supplies are an option
Redundant switch CPU considerations: not an option
GEC or 10GE uplink considerations
[Figure: single-rack and dual-rack designs with two 1RU switches each; servers connect directly to a 1RU switch; cabling remains in cabinets]
-
Density and Scalability Implications
Network Topology with the 1RU Switching Model
Pro: efficient cabling
Pro: improved cooling
Con: number of devices to manage
Con: spanning tree load
With ~1,000 servers across 25 cabinets = 50 switches
4 uplinks per cabinet = 100 uplinks total to the agg layer
GEC or 10GE uplinks?
[Figure: cabinets 1-25, each with two access switches, uplinked to the aggregation layer and DC core]
-
Density and Scalability Implications
Cabinet Design with Modular Access Switches
Cable bulk at the cabinet floor entry can be difficult to manage and can block cool air flow
Typically spaced out by placement at the ends of the row or within the row
Minimizes cabling to aggregation
Reduces the number of uplinks/aggregation ports
Redundant switch power and CPU are options
GEC or 10GE uplink considerations
NEBS considerations
[Figure: servers connect directly to a modular switch; cables route under the raised floor or in overhead trays]
-
Density and Scalability Implications
Network Topology with Modular Switches in the Access
Pro: fewer devices to manage (~8 access switches)
Con: cabling challenges
Con: cooling challenges
With ~1,000 servers on 9-slot access switches = 8 switches
2 uplinks per 6509 = 16 uplinks to the agg layer
GEC or 10GE uplinks?
[Figure: modular access switches uplinked to the aggregation layer and DC core]
-
Density and Scalability Implications
1RU and Modular Comparison
1RU access: more uplinks, fewer I/O cables, higher STP processing
Modular access: fewer uplinks, more I/O cables, lower STP processing
-
Density and Scalability Implications
Density: How Many NICs to Plan For?
Three to four NICs per server are common:
Front-end or public interface
Storage interface (GE, FC)
Backup interface
Back-end or private interface
Integrated Lights-Out (iLO) for OOB management
May require more than two 1RU switches per rack:
30 servers @ 4 ports = 120 ports required in a single cabinet (3 x 48-port 1RU switches)
May need hard limits on cabling capacity
Avoid cross-cabinet and other cabling nightmares
[Figure: single-rack and dual-rack designs with two switches each; front-end, back-end, backup network, OOB management, and storage HBA or GE NIC connections; cabling remains in cabinets]
-
Density and Scalability Implications
Hybrid Example with Separate OOB
1RU end-of-row approach:
Smaller failure domains
Low power consumption (kW/hr)
GEC or 10GE uplinks
Permits server distribution via patch
Eliminates the ToR port density issue
Requires a solid patch and cabling system
Separate OOB switch:
Usually 10M Ethernet (iLO)
Very low utilization
Doesn't require high performance
A separate low-end switch is OK
Isolated from the production network
Hybrid modular + 1RU:
Provides flexibility in design
Dual CPU + power for critical apps
Secondary port for NIC teaming
[Figure: OOB switches alongside 1RU and modular access switches uplinked to the aggregation layer]
-
Density and Scalability Implications
Oversubscription and Uplinks
What is the oversubscription ratio per uplink?
Develop an oversubscription reference model
Identify by application, tier, or other means
Considerations:
Future true server capacity (PCI-X, PCI Express)
The server platform upgrade cycle will increase levels of outbound traffic
Uplink choices available:
Gigabit EtherChannel
10GE
10G EtherChannel
Flexibility in adjusting the oversubscription ratio:
Can I upgrade to 10GE easily? 10G EtherChannel?
Upgrade CPU and switch fabric? (Sup1-2-720-?)
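As an illustrative worked example of the per-switch ratio (a 48-port GE access switch is assumed; the deck does not specify these numbers):

```
48 x 1GE server ports / 4 x 1GE uplinks (GEC) = 12:1 oversubscription
48 x 1GE server ports / 1 x 10GE uplink       = 4.8:1
48 x 1GE server ports / 2 x 10GE uplinks      = 2.4:1
```

The same arithmetic can be applied per application tier to build the oversubscription reference model.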
-
Density and Scalability Implications
Spanning Tree
1RU switching increases the chance of a larger spanning tree diameter
Blade server switches are logically similar to adding 1RU switches to the access layer
A higher number of trunks will increase STP logical port counts in the aggregation layer
Determine spanning tree logical and virtual interfaces before extending VLANs or adding trunks
Use aggregation modules to scale STP and 10GE density
-
Scaling Bandwidth and Density
-
Scaling B/W with GEC and 10GE
Optimizing EtherChannel Utilization
The ideal distribution is the graph on the top right; the bottom-left graph is more typical
Analyze the traffic flows in and out of the server farm:
IP addresses (how many?)
L4 port numbers (randomized?)
The default L3 hash may not be optimal for GEC; an L4 hash may improve it:
agg(config)# port-channel load-balance src-dst-port
10GE effectively gives you the full bandwidth without hash implications
-
Scaling B/W with GEC and 10GE
Using EtherChannel Min-Links
The Min-Links feature is available as of 12.2(18)SXF
Sets the minimum number of member ports that must be in the link-up state, or the EtherChannel is declared down
Permits higher-bandwidth alternate paths to be used, or lower-bandwidth paths to be avoided
Locally significant configuration; only supported on LACP EtherChannels
Router# configure terminal
Router(config)# interface port-channel 1
Router(config-if)# port-channel min-links 2
Router(config-if)# end
-
Scaling B/W with GEC and 10GE
Migrating Access Layer Uplinks to 10GE
How do I scale as I migrate from GEC to 10GE uplinks?
How do I increase the 10GE port density at the agg layer?
Is there a way to regain slots used by service modules?
[Figure: access pair 1 uplinked to the aggregation layer and DC core]
-
Scaling B/W with GEC and 10GE
Service Layer Switch
Move certain services out of the aggregation layer
Ideal for ACE and FWSM modules
Opens slots in the agg layer for 10GE ports
May need QoS or separate links for FT paths
Extend only the necessary L2 VLANs to the service switches via .1Q trunks (GEC/10GE)
RHI installs the route in the local MSFC only, requiring L3 peering with aggregation
[Figure: Service Switch 1 and Service Switch 2 (redundant) hang off the aggregation layer, between the DC core and access]
-
Scaling B/W with GEC and 10GE
Consolidate to ACE
Consider consolidating multiple service modules onto the ACE module:
SLB
Firewall
SSL
4/8/16G fabric connected
Active-active designs
Higher CPS + concurrent connections
Single TCP termination, lower latency
A feature gap may not permit this until a later release
[Figure: multiple service modules in the aggregation layer consolidated onto ACE, between the DC core and access]
-
Increasing HA in the Data Center
-
Server High Availability
Common points of failure:
1. Server network adapter
2. Port on a multi-port server adapter
3. Network media (server access)
4. Network media (uplink)
5. Access switch port
6. Access switch module
7. Access switch
These network failure issues can be addressed by deploying dual-attached servers using network adapter teaming software.
[Figure: server attachment with and without the data center HA recommendations, at the L2/L3 boundary]
-
Common NIC Teaming Configurations
SFT (Switch Fault Tolerance): Eth0 active, Eth1 standby, each connected to a different access switch; heartbeats run between the adapters; on failover, the source MAC and IP address of Eth1 become those of Eth0 (IP 10.2.1.14, MAC 0007.e910.ce0f)
AFT (Adapter Fault Tolerance): Eth0 active, Eth1 standby on the same switch; heartbeats run between the adapters; on failover, the source MAC and IP address of Eth1 become those of Eth0
ALB (Adaptive Load Balancing): Eth0 and Eth1-X all active; one port receives, all ports transmit; one IP address and multiple MAC addresses; incorporates fault tolerance
In each case the default gateway is 10.2.1.1, provided by HSRP
Note: NIC manufacturer drivers are changing and may operate differently; server OSes have also started integrating NIC teaming drivers, which may operate differently.
-
Server Attachment: Multiple NICs
You can bundle multiple links to generate higher throughput between servers and clients
EtherChannels — all links active: load balancing
Fault-tolerant mode: only one link active
The EtherChannel hash may not permit full utilization for certain applications (backup, for example)
-
Failover: What Is the Time to Beat?
The overall failover time is the combination of convergence of the L2, L3, and L4 components:
Stateful devices can replicate connection information and typically fail over within 3-5 sec
EtherChannels: < 1 sec
STP converges in ~1 sec (802.1w)
HSRP can be tuned to ~3 sec (hello/dead of 1/3) or less
-
Failover Time Comparison
OSPF/EIGRP: sub-second to ~1 sec
STP (802.1w): ~1 sec
ACE module with autostate: ~1 sec
HSRP: ~3 sec (using 1/3 timers; may be tuned to less)
FWSM module: ~3 sec
CSM module: ~5 sec
WinXP/2003 server TCP stack tolerance: ~9 sec
-
Non-Stop Forwarding/Stateful Switch-Over
NSF/SSO is a supervisor redundancy mechanism for intra-chassis supervisor failover
SSO synchronizes Layer 2 protocol state, hardware L2/L3 tables (MAC, FIB, adjacency table), and ACL and QoS tables
SSO synchronizes state for: trunks, interfaces, EtherChannels, port security, SPAN/RSPAN, STP, UDLD, and VTP
NSF with EIGRP, OSPF, IS-IS, and BGP makes it possible to have no route flapping during the recovery
Aggressive routing protocol timers may not work in an NSF/SSO environment
[Figure: NSF-aware neighbors on either side of the NSF/SSO chassis]
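A minimal sketch of enabling SSO with NSF for OSPF on a dual-supervisor chassis; the OSPF process number is illustrative:

```
redundancy
 mode sso        ! stateful switch-over between supervisors
!
router ospf 10
 nsf             ! graceful restart so neighbors keep forwarding
```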
-
NSF/SSO in the Data Center
SSO in the access layer:
Improves availability for single-attached servers
SSO in the aggregation layer:
Consider it in the primary agg layer switch
Avoids rebuilding the ARP, IGP, and STP tables
Prevents service module switchover (which can take up to ~6 sec, depending on the module)
SSO switchover time is less than two seconds (12.2(18)SXD3 or higher)
Possible implications:
HSRP state between agg switches is not tracked and will show a switchover until the control plane recovers
IGP timers cannot be aggressive (a tradeoff)
-
Best Practices: STP, HSRP, Other
Rapid PVST+; UDLD enabled globally; spanning-tree pathcost method long
Agg1: STP primary root, HSRP primary, HSRP preempt with delay, dual supervisors with NSF+SSO
Agg2: STP secondary root, HSRP secondary, HSRP preempt with delay, single supervisor
Rapid PVST+ limits: maximum of 8000 STP active logical ports and 1500 virtual ports per linecard
Toward the access: RootGuard on agg downlinks, LoopGuard, PortFast + BPDUGuard on server ports
Uplinks: LACP with L4 hash, distributed EtherChannel, Min-Links, L3 + L4 CEF hash
Blade chassis with integrated switch: LACP + L4 port hash, distributed EtherChannel for FT and data VLANs
-
Q and A