VMware NSX - Lessons Learned from real project

NSX Architecture Design: Lessons Learned from a Real Project
End to End QoS Solution for VMware vSphere with NSX on top of Cisco UCS

David Pasek, Infrastructure Architect, VCDX #200

© 2014 VMware Inc. All rights reserved.


Agenda

1 Project Overview
2 NSX Conceptual & Logical Design
3 Deep Dive into Network QoS - Design Decision Point
4 Q & A


Project Overview

• Private Cloud: EMC FEHC-CA with custom enhancements
  • vSphere VM as a Service
  • Hyper-V VM as a Service
  • Physical Server as a Service
  • Backup as a Service
  • Storage as a Service

• Environment / Facilities
  • Two datacenters in metro distance (<5 ms)
  • Remote Offices (Technical Rooms) in MPLS distance

• Products and Technologies
  • CMP: vRealize Automation, vRealize Orchestrator, vRealize Business
  • Infrastructure Virtualization: VMware vSphere, Hyper-V, NSX-v
  • Servers: Cisco UCS
  • Networking: Cisco Nexus
  • Storage: EMC ViPR, EMC VPLEX, EMC VNX, VMware VSAN
  • Backup: EMC Avamar, EMC NetWorker, EMC Data Domain
  • Security: NSX + Palo Alto Networks

Overall Project High Level Concept

[Diagram: Datacenter A, Datacenter B and a remote Technical Room location around the Existing Core Network. Labels recoverable from the diagram:]

• Users: Cloud Consumers, Cloud Administrators, IT Finance
• Cloud Management Software Stack (Cloud Management Platform, vSphere Management, NSX Management workloads): vRealize Automation, vCenter Orchestrator, vRealize Business Std. + Adv., vRealize Log Insight, vRealize Operations Manager
• Cloud Management Infrastructure Cluster: VMware vSphere Metro Cluster stretched across two datacenters; storage stretched across two datacenters (VPLEX)
• vSphere Resource Pool - GOLD TIER: VMware vSphere Metro Cluster stretched across two datacenters; storage stretched across two datacenters (VPLEX)
• vSphere Resource Pool - SILVER TIER (shown twice, once per datacenter): cluster in single datacenter; storage in single datacenter (different storage tiers)
• Hyper-V Resource Pool (shown twice, once per datacenter): cluster in single datacenter; storage in single datacenter
• Physical Servers Resource Pool (shown twice, once per datacenter): server in single datacenter; storage in single datacenter
• Technical Room Resource Pool - TR TIER (vSphere + VSAN): remote location

NSX-v Conceptual Architecture

[Diagram: NSX-v conceptual architecture across Datacenter A (CDP-A), Datacenter B (CDP-B) and a Technical Room (TR), drawn as a virtual network overlay on top of a physical network underlay. Labels recoverable from the diagram:]

• Physical network underlay
  • Core network (a dynamic routing protocol has to be implemented)
  • L3 Fabric
  • Palo Alto FW physical appliance (shown twice)
• Virtual network overlay
  • Logical switches (VXLAN segments); one logical switch is labeled as a VLAN segment
  • NSX DLR (Distributed Logical Router): east-west routing in DCs
  • NSX Edge Services GWs (highly available): north-south routing
  • ECMP + dynamic routing between PAN, NSX Edge GWs and NSX DLRs
  • NSX Edge L2 VPN (highly available): TR VPN termination
  • L2 VPN tunnel (TR <-> DC) to the NSX Edge GW L2 VPN in the Technical Room
• Security
  • NSX Distributed Logical Firewall (NSX FW instances shown alongside the ESXi hosts)
  • Traffic steering to vPaloAlto L7 FW instances
  • Palo Alto Panorama: centralized security management
• Management
  • VMware NSX Manager: centralized virtual network management
• vSphere clusters (ESXi hosts)
  • GOLD vSphere Cluster - STRETCHED
  • SILVER vSphere Cluster - LOCAL (shown more than once)
  • FEHC Management vSphere Cluster - STRETCHED

NSX-v Security Concept

[Diagram: vRA Business Groups HR and FINANCE, each with its own logical network, connected by the NSX Distributed Logical Router. VMs carry the tags MSZ-SAP, MSZ-A and MSZ-B (some also APP-MSSQL) and are grouped into Micro Security Zones. A physical or Hyper-V server can also belong to a Micro Security Zone.]

Micro Security Zones

• Technical Service - SAP [NSX Security Group of all VMs having tag MSZ-SAP]
• Technical Service - A [NSX Security Group of all VMs having tag MSZ-A]
• Technical Service - B [NSX Security Group of all VMs having tag MSZ-B]

Security Tags

• Security tags for technical services: MSZ-<Technical-Service-from-CMDB>. For example: MSZ-SAP, MSZ-A, MSZ-B
• Security tags for applications: APP-<gkpke.APP-SEC-TAG[x]>. For example: APP-MSSQL, APP-IIS, APP-EXCHANGE

NSX Security Groups

There is an NSX Security Group for each Technical Service.

This security group forms the Micro Security Zone for that particular Technical Service.

For example: MSZ-SAP, MSZ-A, MSZ-B

All VMs tagged with the Security Group name belong to this security group.

Default NSX Security Policy

NAME     SOURCE  DESTINATION  SERVICE  ACTION
Default  Any     Any          Any      Block

NSX Security Policy for Micro Security Zones

NAME            SOURCE   DESTINATION  SERVICE  ACTION
Inside MSZ-A    MSZ-A    MSZ-A        Any      Allow
Inside MSZ-B    MSZ-B    MSZ-B        Any      Allow
Inside MSZ-SAP  MSZ-SAP  MSZ-SAP      Any      Allow

Other NSX Security Groups and Policies

Other NSX security groups and policies can be created based on application tags and other metadata available to NSX.
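The tag-based zoning described above can be illustrated with a small model. This is only a sketch of the logic, not NSX code or the NSX API: it assumes group membership is decided purely by a tag equal to the group name, that rules are evaluated first match wins with the default Block rule last, and it ignores the SERVICE column because every rule on the slide uses Any. The VM names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VM:
    name: str
    tags: frozenset  # security tags such as "MSZ-SAP" or "APP-MSSQL"


@dataclass(frozen=True)
class Rule:
    name: str
    source: str       # security group name, or "Any"
    destination: str  # security group name, or "Any"
    action: str       # "Allow" or "Block"


def in_group(vm: VM, group: str) -> bool:
    # Assumption: a VM is in a group when it carries a tag named like the group.
    return group == "Any" or group in vm.tags


# Rules from the slide: allow traffic inside each zone, block everything else.
RULES = [
    Rule("Inside MSZ-A", "MSZ-A", "MSZ-A", "Allow"),
    Rule("Inside MSZ-B", "MSZ-B", "MSZ-B", "Allow"),
    Rule("Inside MSZ-SAP", "MSZ-SAP", "MSZ-SAP", "Allow"),
    Rule("Default", "Any", "Any", "Block"),
]


def evaluate(src: VM, dst: VM, rules=RULES) -> str:
    # First matching rule decides (assumed evaluation order for this sketch).
    for rule in rules:
        if in_group(src, rule.source) and in_group(dst, rule.destination):
            return rule.action
    return "Block"


# Hypothetical VMs for illustration only.
sap_db = VM("sap-db-01", frozenset({"MSZ-SAP", "APP-MSSQL"}))
sap_app = VM("sap-app-01", frozenset({"MSZ-SAP"}))
fin_web = VM("fin-web-01", frozenset({"MSZ-B"}))

print(evaluate(sap_db, sap_app))   # same zone
print(evaluate(sap_app, fin_web))  # different zones, falls to Default
```

In this model the first call matches the "Inside MSZ-SAP" rule and the second falls through to the default rule, which mirrors the idea that a zone is closed to everything outside it unless another policy opens it.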

End to End Network QoS - Design Decision Point

• Requirements
  • End-to-end network QoS is required to provide guarantees for particular types of network traffic:
    • FCoE Storage
    • vSphere Management
    • vSphere vMotion
    • VM production
    • VM guest OS agent-based backup <== this is the most complex requirement in the context of QoS

• Constraints
  • Cisco Nexus 7k
  • VMware NSX-v
  • Cisco UCS B200 M4 servers with virtual interface card VIC 1340 (2x10Gb ports, each port connected to a different fabric interconnect)
  • Cloud Automation (vRA, vRO)

End to End Network QoS - Option 1 of 3

[Diagram: UCS Blade Server B200 M4 with Cisco VIC 1340 (4x10Gb port), two 10Gb NIC ports (NIC-A1, NIC-B1) connected to UCS Fabric Interconnect A and B (EHM), which uplink to a pair of Cisco Nexus 7k switches in a vPC domain and to SAN A / SAN B. One dedicated UCS vNIC per traffic type on each fabric. Labels recoverable from the diagram:]

UCS vNICs and vHBAs (marking and bandwidth set in UCS)

Interface (NIC-A1 / NIC-B1)  Traffic     VLAN  Marking  Bandwidth
vHBA0 / vHBA1                FCoE        -     CoS 3    40%
vNIC0 / vNIC1                Mgmt        100   CoS 1    10%
vNIC2 / vNIC3                vMotion     101   CoS 2    10%
vNIC4 / vNIC5                VM Traffic  102   CoS 0    20%
vNIC6 / vNIC7                Backup      103   CoS 4    20%

VMware vSphere - ESXi

• vmnic0, vmnic2, vmnic4, vmnic6 and vmnic1, vmnic3, vmnic5, vmnic7
• Two VMware Standard vSwitches and one VMware Distributed vSwitch
• vmkernel Mgmt (native VLAN)
• vmkernel vMotion (native VLAN)
• vmkernel VTEP
• DVS portgroup VTEP (native VLAN)
• DVS portgroup Backup (native VLAN)

VMware NSX

• NSX Logical Switch (VXLAN), logical segment - Business Group
• DVS portgroup Virtual Wire - Business Group 1
• VM vNIC Production
• VM vNIC Backup

UCS uplink & N7K downlink QoS Settings

• CoS 0: 50% (VM Traffic)
• CoS 1: 10% (Mgmt)
• CoS 2: 10% (vMotion)
• CoS 4: 30% (Backup)

Cisco UCS QoS Policies - Bandwidth Management & QoS Marking

• UCS QoS Policy UP (Uplinks): CoS 0: 50% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 4: 30% (Backup)
• UCS QoS Policy 1 (vNIC): CoS 0: 20% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 3: 40% (FCoE), CoS 4: 20% (Backup)
• UCS all vNIC Templates: Host Control: None
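As a quick sanity check of the Option 1 numbers, the percentage shares can be turned into per-class bandwidth on a single 10Gb NIC port. This is a sketch under stated assumptions: the shares are treated as minimum guarantees that apply when the link is congested (the usual reading of ETS-style bandwidth management), and the function name is made up for illustration. The uplink speed is not given in the slides, so the uplink policy is only checked for adding up to 100%.

```python
# Shares taken from the slide "UCS QoS Policy 1 (vNIC)" and "UCS QoS Policy UP".
VNIC_POLICY = {0: 20, 1: 10, 2: 10, 3: 40, 4: 20}   # CoS -> percent
UPLINK_POLICY_OPTION1 = {0: 50, 1: 10, 2: 10, 4: 30}

PORT_GBPS = 10  # each NIC port is labeled as a 10Gb port


def guaranteed_gbps(policy: dict, link_gbps: float) -> dict:
    """Convert percentage shares into Gb/s on a link of the given speed."""
    total = sum(policy.values())
    if total != 100:
        raise ValueError(f"shares add up to {total}%, expected 100%")
    return {cos: link_gbps * pct / 100 for cos, pct in policy.items()}


per_port = guaranteed_gbps(VNIC_POLICY, PORT_GBPS)
for cos, gbps in sorted(per_port.items()):
    print(f"CoS {cos}: {gbps} Gb/s")

# The uplink policy carries no FCoE class (CoS 3); it must still total 100%.
assert sum(UPLINK_POLICY_OPTION1.values()) == 100
```

Under these assumptions FCoE (CoS 3) gets 4 Gb/s per port, VM traffic and backup 2 Gb/s each, and management and vMotion 1 Gb/s each.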

End to End Network QoS - Option 2 of 3

[Diagram: same physical topology as Option 1 (UCS Blade Server B200 M4 with Cisco VIC 1340 (4x10Gb port), UCS Fabric Interconnect A and B (EHM), Cisco Nexus 7k pair in a vPC domain, SAN A / SAN B). Instead of one UCS vNIC per traffic type, two trunk vNICs feed a single VMware Distributed vSwitch, and CoS marking moves into the DVS portgroups. Labels recoverable from the diagram:]

UCS vNICs and vHBAs

Interface (NIC-A1 / NIC-B1)  Traffic  Bandwidth
vHBA0 / vHBA1                FCoE     CoS 3 40% (mark as CoS 3)
vNIC0 / vNIC1                trunk    CoS 0 20%, CoS 1 10%, CoS 2 10%, CoS 4 20%

VMware vSphere - ESXi, VMware Distributed vSwitch (DVS)

• vmnic0, vmnic1, vmnic2, vmnic3
• DVS portgroup VLAN 100, mark as CoS 1: vmkernel Mgmt
• DVS portgroup VLAN 101, mark as CoS 2: vmkernel vMotion
• DVS portgroup VLAN 102, mark as CoS 0: vmkernel VTEP
• DVS portgroup VLAN 103, mark as CoS 4: Backup

DVS per PortGroup Marking

• CoS 0: System: VM Traffic
• CoS 1: System: Mgmt
• CoS 2: System: vMotion
• CoS 4: User-def: Backup

VMware NSX

• NSX Logical Switch (VXLAN), logical segment - Business Group
• DVS portgroup Virtual Wire - Business Group 1
• VM vNIC Production
• VM vNIC Backup

UCS uplink & N7K downlink QoS Settings

• CoS 0: 40% (VM Traffic)
• CoS 1: 10% (Mgmt)
• CoS 2: 10% (vMotion)
• CoS 4: 40% (Backup)

Cisco UCS QoS Policies - Bandwidth Management & QoS Marking

• UCS QoS Policy UP (Uplinks): CoS 0: 40% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 4: 40% (Backup)
• UCS QoS Policy 1 (vNIC 0,1): CoS 0: 20% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 3: 40% (FCoE), CoS 4: 20% (Backup)
• UCS all vNIC Templates: Host Control: None

End to End Network QoS - Option 3 of 3

[Diagram: same physical topology and trunk vNIC layout as Option 2. The VM has a single vNIC for both production and backup traffic, and the virtual wire portgroup marks traffic conditionally by destination IP. Labels recoverable from the diagram:]

UCS vNICs and vHBAs

Interface (NIC-A1 / NIC-B1)  Traffic  Bandwidth
vHBA0 / vHBA1                FCoE     CoS 3 40% (mark as CoS 3)
vNIC0 / vNIC1                trunk    CoS 0 20%, CoS 1 10%, CoS 2 10%, CoS 4 20%

VMware vSphere - ESXi, VMware Distributed vSwitch (DVS)

• vmnic0, vmnic1, vmnic2, vmnic3
• DVS portgroup VLAN 100, mark as CoS 1: vmkernel Mgmt
• DVS portgroup VLAN 101, mark as CoS 2: vmkernel vMotion
• DVS portgroup VLAN 102: vmkernel VTEP

DVS per PortGroup Marking

• CoS 0: System: VM Traffic
• CoS 1: System: Mgmt
• CoS 2: System: vMotion
• CoS 4: User-def: Backup

VMware NSX

• NSX Logical Switch (VXLAN), logical segment - Business Group
• DVS portgroup Virtual Wire - Business Group 1: if DST IP = Backup Server mark as CoS 4 else CoS 0
• VM vNIC Production & Backup

UCS uplink & N7K downlink QoS Settings

• CoS 0: 40% (VM Traffic)
• CoS 1: 10% (Mgmt)
• CoS 2: 10% (vMotion)
• CoS 4: 40% (Backup)

Cisco UCS QoS Policies - Bandwidth Management & QoS Marking

• UCS QoS Policy UP (Uplinks): CoS 0: 40% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 4: 40% (Backup)
• UCS QoS Policy 1 (vNIC 0,1): CoS 0: 20% (VM Traffic), CoS 1: 10% (Mgmt), CoS 2: 10% (vMotion), CoS 3: 40% (FCoE), CoS 4: 20% (Backup)
• UCS all vNIC Templates: Host Control: None
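The conditional marking rule on the Option 3 virtual wire portgroup ("if DST IP = Backup Server mark as CoS 4 else CoS 0") can be expressed as a tiny classifier. This is only a model of the rule's logic, not DVS configuration: the backup server address and the function name are hypothetical, and a real deployment would define the rule in the portgroup's traffic filtering and marking policy.

```python
import ipaddress

# Hypothetical backup server address (documentation range), for illustration.
BACKUP_SERVERS = {ipaddress.ip_address("192.0.2.10")}

COS_VM_TRAFFIC = 0
COS_BACKUP = 4


def cos_for_destination(dst_ip: str) -> int:
    """Return the 802.1p CoS value to mark on a frame sent to dst_ip."""
    if ipaddress.ip_address(dst_ip) in BACKUP_SERVERS:
        return COS_BACKUP
    return COS_VM_TRAFFIC


print(cos_for_destination("192.0.2.10"))    # traffic to the backup server
print(cos_for_destination("198.51.100.7"))  # any other destination
```

Because the decision depends only on the destination address, the VM needs just one vNIC: backup and production flows leave through the same interface and are separated into CoS 4 and CoS 0 at the portgroup.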

End to End Network QoS - Final Decision

• Decision
  • Option 3: QoS (802.1p) marking in VDS and end-to-end bandwidth management in UCS

• Justification
  • The decision is fully compliant with the end-to-end network QoS requirement.
  • VXLAN encapsulation keeps the L2 CoS tag by copying it from the inner Ethernet header into the outer Ethernet header => the CoS tag set in the virtual overlay is kept in the physical network underlay, where Cisco UCS bandwidth management (aka DCB ETS - Enhanced Transmission Selection) can use it to guarantee bandwidth for each CoS traffic class.
  • A single vNIC in the VM has a positive impact on
    • NSX Security Policies
    • Simple in-guest OS routing (default gateway only) without the need for additional static routes
    • vRealize Automation custom integrations, which become simpler (single hostname, simpler integration with IPAM, etc.)

• Impact
  • The DVS QoS Policy (conditional 802.1p marking) has to be configured manually for each DVS portgroup used as an NSX virtual wire (aka VXLAN); this can be automated by a custom integration (SOLUTION IMPROVEMENT)
  • A detailed test plan has to be prepared to validate correct QoS behavior (RISK MITIGATION)
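To make the CoS preservation argument concrete, the sketch below models the 802.1Q Tag Control Information (TCI) field, where the 802.1p priority (PCP) occupies the top 3 bits, followed by 1 DEI bit and a 12-bit VLAN ID. It is a simplified illustration of the behavior the justification relies on, not VTEP code: the helper names are made up, the inner frame is assumed to carry a priority tag, and VLAN 102 is used as the transport VLAN because the slides show it on the VTEP portgroup.

```python
def build_tci(pcp: int, vid: int, dei: int = 0) -> int:
    """Pack PCP (3 bits), DEI (1 bit) and VLAN ID (12 bits) into a 16-bit TCI."""
    if not 0 <= pcp <= 7:
        raise ValueError("PCP must be in 0..7")
    if dei not in (0, 1):
        raise ValueError("DEI must be 0 or 1")
    if not 0 <= vid <= 4095:
        raise ValueError("VLAN ID must be in 0..4095")
    return (pcp << 13) | (dei << 12) | vid


def pcp_of(tci: int) -> int:
    """Extract the 802.1p priority from a TCI value."""
    return (tci >> 13) & 0x7


def outer_tci_for(inner_tci: int, transport_vid: int) -> int:
    """Model of the encapsulation step: reuse the inner PCP on the outer tag."""
    return build_tci(pcp_of(inner_tci), transport_vid)


# Backup traffic marked CoS 4 in the overlay (VLAN ID 0 = priority tag only).
inner = build_tci(pcp=4, vid=0)
outer = outer_tci_for(inner, transport_vid=102)

print(hex(outer), "PCP:", pcp_of(outer), "VLAN:", outer & 0x0FFF)
```

The outer tag ends up with the same priority as the inner one, which is what lets the UCS bandwidth management act on overlay traffic without looking inside the VXLAN payload.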

Questions and Answers

Blog post with additional details: http://blog.igics.com/2015/12/end-to-end-qos-solution-for-vmware.html

Twitter: @david_pasek

Blog: http://blog.igics.com