VMware® and Brocade® Network Virtualization Reference Whitepaper
Table of Contents
EXECUTIVE SUMMARY
VMWARE NSX WITH BROCADE VCS: SEAMLESS TRANSITION TO SDDC
VMWARE'S NSX NETWORK VIRTUALIZATION PLATFORM
    OVERVIEW
    COMPONENTS OF VMWARE NSX
        DATA PLANE
        CONTROL PLANE
        MANAGEMENT PLANE
        CONSUMPTION PLATFORM
    FUNCTIONAL SERVICES OF NSX FOR VSPHERE
WHY DEPLOY BROCADE NETWORK FABRIC WITH VMWARE NSX
DESIGN CONSIDERATIONS FOR VMWARE NSX AND BROCADE NETWORK FABRIC
    DESIGN CONSIDERATIONS FOR BROCADE NETWORK FABRIC
        BROCADE VCS FABRIC AND VDX SWITCHES
        SCALABLE BROCADE VCS FABRICS
        FLEXIBLE BROCADE VCS FABRIC BUILDING BLOCKS FOR EASY MIGRATION
        BROCADE VDX SWITCHES DISCUSSED IN THIS GUIDE
        MIXED SWITCH FABRIC DESIGN
        MULTI-FABRIC DESIGNS
        DEPLOYING THE BROCADE VDX 8770 AND VCS FABRICS AT THE CLASSIC AGGREGATION LAYER
        VCS FABRIC BUILDING BLOCKS
    VMWARE NSX NETWORK DESIGN CONSIDERATIONS
        DESIGNING FOR SCALE AND FUTURE GROWTH
        COMPUTE RACKS
        EDGE RACKS
        INFRASTRUCTURE RACKS
        LOGICAL SWITCHING
        TRANSPORT ZONE
        LOGICAL SWITCH REPLICATION MODES
        LOGICAL SWITCH ADDRESSING
        WITH NETWORK ADDRESS TRANSLATION
        LOGICAL ROUTING
        CENTRALIZED ROUTING
        LOGICAL SWITCHING AND ROUTING DEPLOYMENTS
        LOGICAL FIREWALLING, ISOLATION AND MICRO-SEGMENTATION
        ADVANCED SECURITY SERVICE INSERTION, CHAINING AND STEERING
        LOGICAL LOAD BALANCING
CONCLUSION
Executive Summary
This document is targeted at networking and virtualization architects interested in deploying VMware network virtualization in a vSphere hypervisor environment based on the joint solution of VMware NSX and Brocade Virtual Cluster Switching (VCS) technology. VMware's Software Defined Data Center (SDDC) vision leverages core data center virtualization technologies to transform data center economics and business agility through automation and non-disruptive deployment that embraces and extends existing compute, network, and storage infrastructure investments. VMware NSX is the component providing the network virtualization pillar of this vision. With NSX, customers can build an agile "overlay" infrastructure for public and private cloud environments, leveraging Brocade's robust and resilient Virtual Cluster Switching (VCS) technology for the physical "underlay" network. Together, Brocade and VMware help customers realize the promise of the SDDC vision, enabling the power, intelligence, and analytics of the network with a flexible, end-to-end solution.
VMware NSX with Brocade VCS: Seamless Transition to SDDC
New technologies and applications are driving constant change in organizations both large and small, and nowhere are the effects felt more keenly than in the network. Large-scale server virtualization is generating unpredictable bandwidth requirements driven by virtual machine (VM) mobility. The move toward cloud computing demands a high-performance network interconnect that can be driven by servers and VMs numbering in the tens of thousands. Modern virtualized multi-tier applications generate massive levels of east-west inter-server traffic. Unfortunately, traditional network topologies and solutions were not designed to support these highly virtualized environments with mobile VMs and demanding modern workloads. VMware NSX has emerged as an attractive solution to these challenges, bringing dramatic improvements over the inefficiencies, rigidity, fragility, and management challenges of classic hierarchical Ethernet networks. For optimal performance, NSX should run on a resilient physical network or fabric underlay that provides robust network connectivity. Brocade VCS Fabric technology is ideal for this scenario, enabling organizations to migrate to a highly available and automated fabric at their own pace, without disrupting their existing data center network architecture. Here are some typical instances when customers may choose to transition to a Brocade network fabric as part of their evolution to an NSX SDDC architecture:
• Transitioning from Gigabit Ethernet (GbE) to 10 GbE – Many organizations are consolidating multiple workloads onto fewer, more powerful servers, creating a demand for greater network bandwidth.
• Scaling the network – The elasticity, manageability, flexibility, and scalability of Ethernet fabrics make them ideal for new virtualization and cloud computing environments.
• Adding storage – Storage virtualization and organizations building Ethernet-based Storage Area Networks (SANs) require a truly lossless fabric.
• Adopting network virtualization – Network virtualization introduces additional parameters to set up and manage, and typically requires new skill sets as well. Ethernet fabrics provide a simpler, highly resilient, low-latency foundation on which to virtualize the network and reach the SDDC.
The combined Brocade and VMware NSX solution delivers the IT agility demanded by today's constantly evolving workloads through automated, zero-touch VM discovery, configuration, and mobility.
VMware's NSX Network Virtualization Platform
Overview
IT organizations have gained significant benefits as a direct result of server virtualization. Reduced physical complexity, increased operational efficiency, and the ability to dynamically re-purpose underlying resources to quickly and optimally meet the needs of increasingly dynamic business applications are just a handful of the gains that have already been realized. Now, VMware's Software Defined Data Center (SDDC) architecture is extending virtualization technologies across the entire physical data center infrastructure. VMware NSX, the network virtualization platform, is a key product in the SDDC architecture. With NSX, virtualization now delivers for networking the same value and advantages it has provided for compute and storage. In much the same way that server virtualization programmatically creates, snapshots, deletes, and restores software-based virtual machines (VMs), NSX network virtualization programmatically creates, snapshots, deletes, and restores software-based virtual networks. The result is a completely transformative approach to networking that not only enables data center managers to achieve orders-of-magnitude better agility and economics, but also allows for a vastly simplified operational model for the underlying physical network. With the ability to be deployed on any IP network, including both existing traditional networking models and next-generation fabric architectures from any vendor, NSX is a completely non-disruptive solution.
Figure 1 Server and Network Virtualization Analogy
Figure 1 draws an analogy between compute and network virtualization. With server virtualization, a software abstraction layer (server hypervisor) reproduces the familiar attributes of an x86 physical server (e.g., CPU, RAM, disk, NIC) in software, allowing them to be programmatically assembled in any arbitrary combination to produce a unique virtual machine (VM) in a matter of seconds. With network virtualization, the functional equivalent of a "network hypervisor" reproduces the complete set of Layer 2 to Layer 7 networking services (e.g., switching, routing, access control, firewalling, QoS, and load balancing) in software. As a result, these services can be programmatically assembled in any arbitrary combination to produce unique, isolated virtual networks in a matter of seconds. Not surprisingly, similar benefits are also derived. For example, just as VMs are independent of the underlying x86 platforms and allow IT to treat physical hosts as a pool of compute capacity, virtual networks are independent of the underlying IP network hardware and allow IT to treat the physical network as a pool of transport capacity that can be consumed and repurposed on demand. Unlike legacy architectures, virtual networks can be provisioned, changed, stored, deleted, and restored programmatically without reconfiguring the underlying physical hardware or topology. By matching the capabilities and benefits derived from familiar server and storage virtualization solutions, this transformative approach to networking unleashes the full potential of the software defined data center. With VMware NSX, you already have the network you need to deploy a next-generation software defined data center. This paper highlights the design factors you should consider to fully leverage your existing network investment and optimize that investment with VMware NSX.
Components of VMware NSX
VMware NSX is a distributed system. It consists of the components shown in Figure 2 below:
Figure 2 NSX Components
Data Plane
The NSX data plane consists of the NSX vSwitch. The vSwitch in NSX for vSphere is based on the vSphere Distributed Switch (VDS), with additional components to enable rich services. The add-on NSX components include kernel modules (VIBs) that run within the hypervisor kernel, providing services such as distributed routing, distributed firewalling, and VXLAN bridging capabilities.
The NSX VDS vSwitch abstracts the physical network and provides access-level switching in the hypervisor. It is central to network virtualization because it enables logical networks that are independent of physical constructs such as VLANs. Some of the benefits of the NSX vSwitch are:
• Support for overlay networking with protocols such as VXLAN and centralized network configuration. Overlay networking enables the following capabilities:
o Creation of a flexible logical layer 2 (L2) overlay over existing IP networks on existing physical infrastructure without the need to re-‐architect any of the data center networks
o Provision of communication (east-west and north-south) while maintaining isolation between tenants
o Application workloads and virtual machines that are agnostic of the overlay network and operate as if they were connected to a physical L2 network
• NSX vSwitch facilitates massive scalability of hypervisors and their attached workloads.
• Multiple features—such as Port Mirroring, NetFlow/IPFIX, Configuration Backup and Restore, Network Health Check, QoS, and LACP—provide a comprehensive toolkit for traffic management, monitoring and troubleshooting within a virtual network.
Additionally, the data plane includes gateway devices that provide L2 bridging from the logical networking space (VXLAN) to the physical network (VLAN). The gateway device is typically an NSX Edge virtual appliance. NSX Edge offers L2, L3, perimeter firewall, load balancing, and other services such as SSL VPN and DHCP.
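To make the overlay encapsulation used by the data plane more concrete, the sketch below packs a minimal VXLAN header (per RFC 7348: an 8-byte header with an 8-bit flags field and a 24-bit VNI) in front of an inner Ethernet frame, the way a VTEP does before sending the result in a UDP datagram to destination port 4789. This is an illustrative sketch only, not NSX code; in NSX this work is done by the kernel modules in the hypervisor.

```python
import struct

VXLAN_UDP_PORT = 4789  # IANA-assigned VXLAN destination port

def vxlan_encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Prepend an RFC 7348 VXLAN header to an inner Ethernet frame.

    The result would be carried in a UDP datagram between source and
    destination VTEP IP addresses; the physical fabric only ever sees the
    outer IP/UDP headers, never the inner VM addresses.
    """
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08           # 'I' bit set: a valid VNI is present
    # Header layout: flags(1) + reserved(3) + VNI(3) + reserved(1) = 8 bytes
    header = struct.pack("!B3s3sB", flags, b"\x00" * 3,
                         vni.to_bytes(3, "big"), 0)
    return header + inner_frame

# Example: wrap a dummy inner frame on logical switch (VNI) 5001
payload = vxlan_encapsulate(b"\xff" * 14 + b"inner VM traffic", vni=5001)
print(len(payload), "bytes, VNI =", int.from_bytes(payload[4:7], "big"))
```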
Control Plane
The NSX control plane runs in the NSX Controller cluster and does not carry any data plane traffic. The controller nodes are deployed in a cluster with an odd number of members in order to enable high availability and scale. A failure of a controller node does not impact any data plane traffic.
Management Plane
The NSX management plane is built upon the NSX Manager. The NSX Manager provides the single point of configuration and is the target for REST API entry points in a vSphere NSX environment.
Consumption Platform
NSX can be consumed directly through the NSX Manager UI, which is available in the vSphere Web Client. Typically, end users tie network virtualization into their cloud management platform (CMP) for deploying applications. NSX provides rich integration with virtually any CMP via the REST API. Out-of-the-box integration is also available through VMware vRealize Automation (vRA), previously known as vCloud Automation Center.
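As a minimal illustration of the REST-based consumption model, the sketch below queries an NSX Manager for its configured transport zones (called "scopes" in the NSX for vSphere API). The manager address, credentials, certificate handling, and the exact resource path are assumptions based on the NSX-v API; verify them against the NSX API guide for your release before relying on them.

```python
import requests
from xml.etree import ElementTree

NSX_MANAGER = "https://nsxmgr.example.local"   # hypothetical NSX Manager address
AUTH = ("admin", "changeme")                    # hypothetical credentials

def list_transport_zones():
    """Return (id, name) pairs for each transport zone ('vdn scope')."""
    resp = requests.get(f"{NSX_MANAGER}/api/2.0/vdn/scopes",
                        auth=AUTH, verify=False)   # lab only: skip cert checks
    resp.raise_for_status()
    root = ElementTree.fromstring(resp.text)
    return [(s.findtext("objectId"), s.findtext("name"))
            for s in root.findall("vdnScope")]

if __name__ == "__main__":
    for zone_id, name in list_transport_zones():
        print(zone_id, name)
```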
Functional Services of NSX for vSphere
In this design guide we discuss how all of the components described above provide the following functional services:
• Logical Layer 2 – Enabling extension of an L2 segment/IP subnet anywhere in the fabric, irrespective of the physical network design.
• Distributed L3 Routing – Routing between IP subnets can be done in the logical space without traffic going out to a physical router. This routing is performed in the hypervisor kernel with minimal CPU and memory overhead, providing an optimal data path for routing traffic within the virtual infrastructure. Similarly, the NSX Edge provides a mechanism for full dynamic route peering, using OSPF and BGP, with the physical network to enable seamless integration.
• Distributed Firewall – Security enforcement is done in the kernel at the vNIC level. This enables firewall rule enforcement in a highly scalable manner without creating bottlenecks on physical appliances. Because the firewall is distributed in the kernel, it has minimal CPU overhead and can perform at line rate (a simplified rule-evaluation sketch follows this list).
• Logical Load Balancing – Support for L4-L7 load balancing with the ability to do SSL termination.
• VPN Services – SSL VPN services as well as L2 VPN services.
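The distributed firewall bullet above describes enforcement at the vNIC, where each packet is checked against an ordered rule table before it ever reaches the wire. The sketch below is a simplified, conceptual model of that evaluation (first matching rule wins, default deny); it is not NSX code, and the rule fields shown are only generic 5-tuple-style attributes chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Rule:
    src: Optional[str]      # source security group / subnet label, None = any
    dst: Optional[str]      # destination label, None = any
    service: Optional[str]  # e.g. "tcp/3306", None = any
    action: str             # "allow" or "deny"

def evaluate(rules, src, dst, service, default="deny"):
    """Return the action of the first rule matching the flow (vNIC-level check)."""
    for rule in rules:
        if ((rule.src in (None, src)) and
                (rule.dst in (None, dst)) and
                (rule.service in (None, service))):
            return rule.action
    return default

# Example three-tier policy: web may reach app, app may reach db, all else denied.
policy = [
    Rule("web-tier", "app-tier", "tcp/8443", "allow"),
    Rule("app-tier", "db-tier", "tcp/3306", "allow"),
]
print(evaluate(policy, "web-tier", "db-tier", "tcp/3306"))  # -> deny
```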
Why Deploy Brocade Network Fabric with VMware NSX
Open and Reliable Infrastructure: As a leader in the networking and data center space, Brocade has over a decade of experience building high-performance, reliable networks for the most demanding workloads and some of the world's largest data centers. Brocade VDX switches support both open standards and more elegant deployment options for cloud-based architectures and the SDDC. For example, Brocade supports standard link aggregation for connecting with legacy networking equipment, but also offers Brocade trunking for more efficient link utilization and higher performance. Equal-Cost Multipath (ECMP) is also supported to provide predictable performance and resiliency across the network as a whole. By supporting industry standards, Brocade provides interoperability and consistency for customers, while still offering higher-level functionality for particularly intensive SDDC environments that other network vendors do not.
Agile: Brocade networks are highly agile and can start with as little as one switch, which provides the foundation and running image of the network. As additional network elements are added, they inherit the running configuration. This provides a level of automation that allows users to scale their SDDC without having to configure each element. By leveraging ECMP, trunking, and fabric elasticity, architectural complexity is eliminated from small enterprise deployments to large multi-tenant cloud provider environments. With the ability to support up to 8,000 physical ports in a single domain and up to 384,000 MAC addresses in a single chassis, you can build massively scalable virtual environments that provide zero-touch VM discovery, network configuration, and VM mobility. VCS Fabric automation provides self-healing and self-provisioning capabilities that allow customers to reduce up to 50% of the operational cost associated with traditional networks. This allows VMware customers to focus on managing virtualized applications and infrastructure instead of the physical underlay.
Efficient: Brocade VCS fabrics support Equal-Cost Multipath (ECMP) and make use of all links in the network, with multipathing and traffic load balancing at all layers. Brocade VDX switches provide the industry's deepest buffers, giving customers confidence that even when bursts of traffic occur at peak times the network can minimize latency and packet loss. By supporting 10/40/100 GbE ports and efficient Layer 1-3 load balancing, Brocade networks ensure proper performance for even the largest, most demanding environments.
Highly Manageable: Proactive network monitoring helps minimize business disruption by focusing on early indicators. With Brocade's support for sFlow monitoring and integration with VMware vRealize Operations and vRealize Operations Insight, users can understand where traffic is traversing the fabric, where bandwidth is most heavily consumed, and, most importantly, where potential hot spots are forming.
Design Considerations for VMware NSX and Brocade Network Fabric
VMware NSX network virtualization can be deployed over existing data center networks. In this section, we discuss how logical overlay networks using VXLAN encapsulation can be deployed over common data center network topologies. We first address requirements for the physical network and then look at the network designs that are optimal for network virtualization. Finally, the logical networks, related services, and scale considerations are explained.
Design Considerations for Brocade Network Fabric
Brocade VCS Fabric and VDX Switches
Brocade VCS fabrics provide advanced Ethernet fabric technology, eliminating many of the drawbacks of classic Ethernet networks in the data center. In addition to standard Ethernet fabric benefits, such as logically flat networks without the need for Spanning Tree Protocol (STP), Brocade VCS Fabric technology also brings advanced automation with logically centralized management. Brocade VCS Fabric technology includes unique services that are ideal for simplifying traffic in a cloud data center, such as scalable network multi-tenancy capabilities, automated VM connectivity, and highly efficient multipathing at Layers 1, 2, and 3 with multiple Layer 3 gateways.
The VCS architecture conforms to the Brocade strategy of "revolution through evolution"; therefore, Brocade VDX switches with Brocade VCS Fabric technology connect seamlessly with existing data center Ethernet products, whether offered by Brocade or other vendors. At the same time, the VCS architecture allows newer data center solutions to be integrated quickly. For example, Brocade VDX switches are hardware-enabled to support emerging SDN protocols such as Virtual Extensible LAN (VXLAN). Logical Chassis technology and northbound Application Programming Interfaces (APIs) provide operationally scalable management and access to emerging management frameworks such as VMware vRealize Automation (vRA), previously known as vCloud Automation Center (vCAC).
Scalable Brocade VCS Fabrics
Brocade VCS fabrics offer dramatic improvements over the inefficiencies, inherent limitations, and management challenges of classic hierarchical Ethernet networks. Implemented on Brocade VDX switches, Brocade VCS fabrics drastically simplify the deployment and management of scale-out architectures.
• Brocade VCS fabrics are elastic, self-forming, and self-healing, allowing administrators to focus on service delivery instead of basic network operations and administration. All-active connections and load balancing throughout Layers 1-3 provide resilience that is not artificially hampered by arbitrary limitations at any network layer. The distributed control plane ensures that all nodes are aware of the health and state of their peers and that they forward traffic accordingly across the shortest path in the topology. Nodes can be added and removed non-disruptively, automatically inheriting predefined configurations and forming new links upon entry or removal of a node.
• Brocade VCS fabrics offer uniform, multidimensional scalability that enables the broadest diversity of deployment scenarios and operational flexibility. Large or small, Brocade VCS fabrics work and act the same, offering operational efficiencies that span a very wide range of deployed configurations and requirements.
• Brocade VCS fabrics are easy to manage, with a shared control plane and unified management plane that allow the fabric nodes to function and to be managed as a single entity, regardless of fabric size. Open APIs and OpenStack support facilitate orchestration of VCS fabrics within The On-Demand Data Center™.
Brocade VCS fabrics offer considerable scale and capacity, as shown in Table 1.
Criteria | Brocade Switches
Number of switches in a cluster | Up to 32
Number of ports in a cluster | 8,000+
Switching fabric capacity | 10.7+ Tbps
Data forwarding capacity | 7.7 Tbps
MAC addresses | 384,000
Maximum ports per switch | 384 x 10 GbE, 216 x 40 GbE, or 48 x 100 GbE
Table 1: Brocade VCS Fabric Scalability
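The table gives headline fabric limits; when sizing an actual leaf-spine VCS fabric, the more useful figures are usable server ports and the edge-to-uplink oversubscription ratio. The short sketch below works those out for a hypothetical fabric of VDX 6740 leaves (48 x 10 GbE server-facing ports and four 40 GbE uplinks per leaf). The switch count and port speeds are illustrative assumptions, not a recommended design.

```python
def leaf_spine_capacity(leaf_count, server_ports_per_leaf=48,
                        uplinks_per_leaf=4, server_gbps=10, uplink_gbps=40):
    """Compute usable server ports and per-leaf oversubscription for a leaf-spine fabric."""
    server_ports = leaf_count * server_ports_per_leaf
    downlink_bw = server_ports_per_leaf * server_gbps   # per leaf, toward servers
    uplink_bw = uplinks_per_leaf * uplink_gbps           # per leaf, toward spines
    return server_ports, downlink_bw / uplink_bw

# Hypothetical 12-leaf fabric of VDX 6740s
ports, ratio = leaf_spine_capacity(leaf_count=12)
print(f"{ports} x 10 GbE server ports, {ratio:.1f}:1 oversubscription per leaf")
# -> 576 x 10 GbE server ports, 3.0:1 oversubscription per leaf
```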
Flexible Brocade VCS Fabric Building Blocks for Easy Migration
Brocade VCS fabrics can be deployed as one large single domain, or multiple smaller fabric domains can be configured to suit application needs or administrative boundaries (see Figure 3). A single larger domain affords a simple, highly efficient configuration that avoids STP while smoothly supporting the significant east-west traffic common to modern applications. Data Center Bridging (DCB) is supported on all nodes, allowing for unified storage access over Ethernet. Multiple Brocade VCS domains can be configured to easily scale out the data center, while offering multiple active Layer 3 gateways, contained failure domains, and MAC address scalability, all while avoiding STP.
Figure 3 Brocade VCS fabrics easily accommodate a wide range of configurations, from a single large VCS domain to multiple smaller domains.
Brocade VDX Switches Discussed in This Guide
Brocade VDX 6720 Switch
Available in both 1U and 2U versions, the Brocade VDX 6720 provides either 24 (1U) or 60 (2U) 1/10 GbE SFP+ ports, which can be acquired with the flexible Brocade Ports on Demand (PoD) licensing.
Brocade VDX 6730 Switch
The Brocade VDX 6730 adds Fibre Channel (FC) support, with the 1U version offering 24 1/10 GbE SFP+ ports and eight 8 gigabit-per-second (Gbps) FC ports, and the 2U version offering 60 1/10 GbE SFP+ ports and sixteen 8 Gbps FC ports. The Brocade VDX 6730 also supports PoD licensing.
Brocade VDX 6740 Switch
The Brocade VDX 6740 offers 48 10 GbE SFP+ ports and four 40 GbE quad SFP+ (QSFP+) ports in a 1U form factor. Each 40 GbE QSFP+ port can be broken out into four independent 10 GbE SFP+ ports, providing an additional 16 10 GbE SFP+ ports, which can be licensed with Ports on Demand.
Brocade VDX 8770 Switch
Available in 4-slot and 8-slot versions, the 100 GbE-ready Brocade VDX 8770 dramatically increases the scale that can be achieved in Brocade VCS fabrics, with 10 and 40 Gigabit Ethernet wire-speed switching, numerous line card options, and the ability to connect over 8,000 server ports in a single switching domain.
As shown in Figure 4, organizations can easily deploy Brocade VCS fabrics at the access layer, incrementally expanding the fabric over time. As the Brocade VCS fabric expands, existing network infrastructure can remain in place, if desired. Eventually, the advantages of VCS fabrics can extend to the aggregation layer, delivering the benefits of Brocade VCS Fabric technology to the entire enterprise, while allowing legacy aggregation switches to be redeployed elsewhere. Alternatively, a VCS fabric can be implemented initially in the aggregation tier, leaving existing access tier switches in place.
Figure 4 Incremental deployment of Brocade VCS fabrics in a brownfield environment
Mixed Switch Fabric Design
For dense server deployments and highly virtualized environments, multiple Brocade VDX switch types can be combined to form a single VCS fabric and leverage the administrative simplicity of a single logical chassis. For instance, a small and cost-effective Brocade VCS fabric can be piloted using the family of Brocade VDX 6700 products alone, and eventually scaled out using the Brocade VDX 8770 as the fabric grows and the organization moves toward deploying larger virtualized environments and cloud services. The configuration shown in Figure 5 is a typical leaf-spine fabric. Here the leaf (access) layer uses Brocade VDX 6740 switches to provide redundant Ethernet access to servers, as well as redundant 10 Gigabit Ethernet links to the spine layer, which uses Brocade VDX 8770s, while the core layer uses Brocade MLX Series switches with MCT technology (for more information on Brocade Multi-Chassis Trunking, see http://www.brocade.com/downloads/documents/white_papers/Brocade_Multi-Chassis_Trunking_WP.pdf). Note that Brocade VDX 8770 switches can be deployed at both the spine and leaf layers.
When used as a leaf switch, the Brocade VDX 8770 greatly expands the VCS fabric, providing high-volume connectivity for large numbers of servers. When used at the spine, the Brocade VDX 8770 can provide Layer 3 routing capabilities. Deploying Layer 3 routing at the spine layer shields the core switches from unnecessary routing traffic, enabling additional network scale and enhancing application performance. Multiple active Layer 3 gateways at the spine layer provide high availability through an architecturally hierarchical, but logically flat, network.
Figure 5 The Brocade VDX 8770 switch can be added to existing small-‐to-‐medium-‐scale VCS fabrics at both the leaf and spine for additional scale while containing Layer 3 traffic.
For very data-intensive applications with very low-latency requirements, the Brocade VDX 8770 switch can be paired with Brocade VDX 6730 switches for connecting to FC arrays, as shown in Figure 6. This highly redundant, dual-fabric configuration offers the benefits of both Brocade VCS fabrics and FC fabrics.
Figure 6 Highly redundant, dual-fabric design for an HPC environment can consolidate multiple data stores into a single managed service.
Multi-fabric Designs
The Brocade VDX 8770 can be used to accomplish phased data center deployments of VCS fabrics, or to achieve truly massive scalability through multi-fabric Brocade VCS deployments. By deploying the Brocade VDX 8770 switch as a spine switch, multiple fabrics can be interconnected to provide additional scale and Layer 3 flexibility. Figure 7 illustrates separate fabrics built from Brocade VDX 6740 and 8770 switches. As shown, virtual LAGs (vLAGs) connect the separate fabric domains, using both 40 GbE connections and 10 GbE DCB connections for storage access. Note: Link aggregation allows you to bundle multiple physical Ethernet links to form a single logical trunk, providing enhanced performance and redundancy. The aggregated trunk is referred to as a Link Aggregation Group (LAG). Virtual LAG (vLAG) is a feature included in Brocade VCS Fabric technology that extends the concept of a LAG to include edge ports on multiple VCS switches. A conceptual sketch of how flows are spread across LAG members appears after Figure 7.
Figure 7 Separate fabrics built from Brocade VDX 6740 and 8770 switches
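The note above explains that a LAG or vLAG bundles several physical links into one logical trunk. Traffic is typically spread across the member links by hashing packet header fields, so that all packets of a given flow stay on one link. The sketch below shows that idea with a generic source/destination hash; the exact fields and hash algorithm used by any particular switch are platform-specific, so treat this purely as a conceptual illustration.

```python
import hashlib

def pick_lag_member(src_ip: str, dst_ip: str, src_port: int, dst_port: int,
                    member_links: int) -> int:
    """Map a flow to one member link of a LAG via a stable header hash.

    All packets of the same flow hash to the same link, preserving in-order
    delivery while spreading different flows across the bundle.
    """
    key = f"{src_ip}-{dst_ip}-{src_port}-{dst_port}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % member_links

# Two different flows between the same hosts may land on different links
print(pick_lag_member("10.1.1.10", "10.2.2.20", 33012, 443, member_links=4))
print(pick_lag_member("10.1.1.10", "10.2.2.20", 33013, 443, member_links=4))
```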
Deploying the Brocade VDX 8770 and VCS Fabrics at the Classic Aggregation Layer
Many medium-to-large data centers are looking for opportunities to move toward cloud computing solutions while realizing the benefits of Ethernet fabrics. Often these organizations need to improve the performance of their existing networks, but they also want to protect investments in existing networking technology. Even in traditional hierarchical deployment scenarios, the combination of the Brocade VDX 8770 switch and Brocade VCS Fabric technology can offer significant benefits in terms of future-proofing the network, advancing network convergence, and offering a migration path to 40 GbE and, eventually, 100 GbE technologies.
The Brocade VDX 8770 switch can provide many advantages, especially for organizations that are tied to a tiered network architecture for now but want to deploy a hybrid architecture for investment protection. Deploying a Brocade VCS fabric at the traditional aggregation layer can dramatically improve the performance of the existing network, while protecting both investments in existing infrastructure and new investments in Brocade VCS technology. Advantages of deploying Brocade VCS Fabric technology at the traditional aggregation layer include:
• Multiple Layer 3 gateways for redundancy and optimal load balancing
• Standard Layer 2 and Layer 3 functionality
• Wire-speed performance
• High-density 10 GbE, 40 GbE, and 100 GbE
• ~4 µs latency within the VCS fabric
• Resiliency through high availability
• Reduced demand on core switches for east-west traffic
Figure 8 Dual Brocade VDX 8770 switches configured as a VCS fabric at the aggregation/distribution layer convey many benefits to traditional tiered networks.
VCS Fabric Building Blocks
Multiple data center templates can be defined, tested, and deployed from a common set of building blocks. This promotes reusability of building blocks (and technologies), reduces testing, and simplifies support. Brocade VCS Fabric technology flattens the network. Within a single fabric, both Layer 2 and Layer 3 switching are available on any or all switches in the fabric. A VCS fabric of ToR switches can be configured to create a Layer 2 fabric with Layer 2 links to an aggregation block. In this set of building blocks, the aggregation and access switching are combined into a single VCS fabric of VDX switches. A single fabric is a single logical management domain, simplifying configuration of the network.
VCS Fabric Topologies
Fabric topology is also flexible. For example, a leaf-spine topology is a good design choice for virtualized data centers where consistent low latency and constant bandwidth are required between end devices. Fabric resiliency is automatic, so link or port failures on inter-switch links or Brocade ISL Trunks are detected and traffic is automatically rerouted over the remaining least-cost paths. Below is an example of a leaf-spine topology for a VCS fabric.
Figure 9 Leaf-Spine VCS Fabric Topology with L3 at Spine
Each leaf switch at the bottom is connected to all spine switches at the top. The connections are Brocade ISL Trunks, which can contain up to 16 links per trunk, for resiliency. All servers can reach each other with two switch hops in between. As shown, all leaf switches operate at Layer 2 and the spine switches create the L2/L3 boundary. However, the L2/L3 boundary can be placed at the leaf switches as well, as shown below.
Figure 10 Leaf-Spine VCS Fabric Topology with L3 at Leaf
In this option, VLAN traffic is routed across the spine and each leaf switch includes Layer 3 routing services. Brocade ISL Trunks continue to provide consistent latency and large cross-sectional bandwidth with link resiliency; however, ECMP at Layer 3 provides the multipath forwarding rather than ECMP at Layer 2. An alternative is a collapsed spine, typically using VDX 8770 switches, as shown below.
Figure 11 Collapsed Spine VCS Fabric Topology
The VDX 8770 is a modular chassis switch with a high density of 10 GbE and/or 40 GbE ports. A collapsed-spine topology can be an efficient building block for server virtualization with NAS storage pools. Multiple racks of virtualized servers and NAS servers are connected to a middle-of-row (MoR) or end-of-row (EoR) cluster of VDX 8770 switches. The collapsed-spine topology lends itself to data center scale-out that relies on pods of compute, storage, and networking connected to a common data center routing core. For cloud computing environments, pod-based scale-out architectures are attractive. The following sections describe several VCS fabric building blocks.
VCS Fabric Leaf-Spine Topology
A VCS fabric leaf-spine topology can be used to create a scalable fabric with consistent latency, high-bandwidth multipath switch links, and automatic link resiliency. This block forms the spine, with each spine switch connecting to all leaf switches. Fabric connections shown in red are Brocade ISL Trunks with up to 16 links per auto-forming trunk. Layer 2 traffic moves across the fabric, while Layer 3 traffic exits the fabric on ports configured for a routing protocol. As shown by the black arrows, uplinks to the core router are routed, for example using OSPF, and the connection to an IP services block also uses Layer 3 ports on the spine switches. The blue links show Layer 2 ports that can be used to attach NAS storage to the spine switches. This option creates a topology for NAS storage that is similar to best practices for SAN storage fabrics based on a core/edge topology. For most applications, storage IOPS and bandwidth per server are less than a NAS port can service; an economical use of NAS ports, particularly 10 GbE ports, is to fan out multiple servers to each NAS port. Attaching NAS storage nodes to the spine switches facilitates this architecture.
Figure 12 VCS Fabric, Spine Block, Leaf-Spine Topology
Collapsed Spine
This is a collapsed spine with a two-switch VCS fabric. Typically, high-port-count modular switches such as the VDX 8770 series would be used. This block works efficiently for data centers that scale out by replicating a pod of compute, storage, and networking. Each pod is connected via Layer 3 routing to the data center core routers. Local traffic within the pod does not transit the core routers, but inter-pod traffic does. The collapsed spine uses VRRP/VRRP-E for IP gateway resiliency, with the VCS fabric providing Layer 2 resiliency. As shown, the collapsed spine can be used effectively when connecting a large number of compute nodes to NAS storage, as is commonly found in cloud computing environments and data analytics configurations such as a Hadoop cluster. The blue arrows represent 10 GbE links that use vLAG for link resiliency within the VCS fabric and NIC teaming for NAS server and compute server resiliency. As shown, IP services blocks can be attached to the spine switches, providing good scalability for load balancing and IDS/IPS services.
Figure 13 VCS Fabric Spine: Collapsed Spine Topology
VDX switches deployed as leaf nodes can be used with the VCS fabric spine block in the leaf-spine topology. They can also be used to convert the collapsed-spine block into a leaf-spine topology.
VMware NSX Network Design Considerations
Network virtualization consists of three major aspects: decouple, reproduce, and automate. All three functions are vital in achieving the desired efficiencies. This section focuses on decoupling, which is key to simplifying and scaling the physical infrastructure. While the NSX network virtualization solution can be successfully deployed on top of different network topologies, the focus of this document is a routed access design in which the leaf/access nodes provide full L3 functionality. In this model the network virtualization solution does not require VLANs to span beyond a single rack inside the switching infrastructure; VM mobility is provided by the overlay network topology.
Designing for Scale and Future Growth
When designing a new environment, it is essential to choose an architecture that allows for future growth. The approach presented is intended for deployments that begin small, with the expectation of growth to a larger scale while retaining the same overall architecture. This network virtualization solution does not require spanning of VLANs beyond a single rack. Although this appears to be a simple requirement, eliminating it has widespread impact on how the physical switching infrastructure can be built and on how it scales. Note the following three types of racks within the infrastructure:
• Compute
• Edge
• Infrastructure
Figure 14 Data Center Design with Layer 3 at the Access Layer
In Figure 14, to increase the resiliency of the architecture, Brocade recommends deploying a pair of ToR switches in each rack and leveraging technologies such as Brocade vLAG to dual-connect them to all the servers in the same rack.
Compute Racks
Compute racks are the section of the infrastructure where tenant virtual machines are hosted. Central design characteristics include:
• Interoperability with an existing network
• Repeatable rack design
• Connectivity for virtual machines without use of VLANs
• No requirement for VLANs to extend beyond a compute rack
A hypervisor typically sources three or more types of traffic. This example consists of VXLAN, management, vSphere vMotion, and storage traffic. VXLAN is a new traffic type that carries all the virtual machine communication, encapsulating it in UDP frames. The following sections discuss how the hypervisors connect to the external network and how these different traffic types are commonly configured.
Connecting Hypervisors
The servers in the rack are connected to the access layer switch via a number of Gigabit Ethernet (1 GbE) or 10 GbE interfaces. Physical server NICs are connected to the virtual switch on the other end. For best practices on how to connect the NICs to the virtual and physical switches, refer to the VMware vSphere Distributed Switch Best Practices technical white paper: http://www.vmware.com/files/pdf/techpaper/vsphere-distributed-switch-best-practices.pdf. The connections between each server in the rack and the leaf switch are usually configured as 802.1Q trunks. A significant benefit of deploying VMware NSX network virtualization is the drastic reduction of the number of VLANs carried on those trunk connections.
Figure 15. Example – Host and Leaf Switch Configuration in a Rack
In Figure 15, 802.1Q trunks are used to carry only a few VLANs, each dedicated to a specific type of traffic (e.g., VXLAN tunnel, management, storage, VMware vSphere vMotion). The leaf switch terminates and provides default gateway functionality for each VLAN; it has a switch virtual interface (SVI or RVI) for each VLAN. This enables logical isolation and clear separation from an IP addressing standpoint. The hypervisor leverages multiple routed interfaces (VMkernel NICs) to source the different types of traffic. Please refer to the "VLAN Provisioning" section for additional configuration and deployment considerations for VMkernel interfaces.
VXLAN Traffic
After the vSphere hosts have been prepared for network virtualization using VXLAN, a new traffic type is enabled on the hosts. Virtual machines connected to one of the VXLAN-based logical Layer 2 networks use this traffic type to communicate. The traffic from the virtual machine is encapsulated and sent out as VXLAN traffic. The external physical fabric never detects the virtual machine IP or MAC address; the virtual tunnel endpoint (VTEP) IP address is used to transport the frame across the fabric. In the case of VXLAN, the tunnels are initiated and terminated by a VTEP. Traffic that flows between virtual machines in the same data center is typically referred to as east-west traffic; for this type of traffic, both the source and destination VTEP are situated in hypervisors located in compute racks. Traffic leaving the data center flows between a tenant virtual machine and an NSX Edge, and is referred to as north-south traffic. VXLAN configuration requires an NSX VDS vSwitch. One requirement of a single-VDS-based design is that the same VLAN ID is defined on each hypervisor to source VXLAN-encapsulated traffic (VLAN ID 88 in the example in Figure 15). Because a VDS can span hundreds of hypervisors, it can reach beyond a single leaf switch. Note that the use of the same VLAN ID does not mean that the different VTEPs across hypervisors are necessarily in the same broadcast domain (i.e., VLAN); it simply means they encapsulate their traffic using the same VLAN ID. The host VTEPs, even if they are on the same VDS, can use IP addresses in different subnets, thus offering the capability to leverage an end-to-end L3 fabric.
Management Traffic
Management traffic can be categorized into two types: traffic sourced and terminated by the management VMkernel interface on the host, and traffic involved in the communication between the various NSX components. The traffic carried over the management VMkernel interface of a host includes the communication between vCenter Server and hosts, as well as communication with other management tools such as NSX Manager. The communication between NSX components includes the heartbeat between active and standby Edge appliances. Management traffic stays inside the data center. A single VDS can span multiple hypervisors that are deployed beyond a single leaf switch. The management interfaces of hypervisors participating in a common VDS and connected to separate leaf switches could reside in the same or in separate subnets.
vSphere vMotion Traffic
During the vSphere vMotion migration process, the running state of a virtual machine is transferred over the network to another host. The vSphere vMotion VMkernel interface on each host is used to move this virtual machine state, and each vSphere vMotion VMkernel interface is assigned an IP address.
The number of simultaneous vMotion migrations that can be performed is limited by the speed of the physical NIC; on a 10 GbE NIC, eight simultaneous vSphere vMotion migrations are allowed. Note: VMware has previously recommended deploying all the VMkernel interfaces used for vMotion as part of a common IP subnet. This is not possible when designing a network for network virtualization using Layer 3 at the access layer, where it is mandatory to select different subnets in different racks for those VMkernel interfaces. Until VMware officially relaxes this restriction, it is recommended that customers requiring vMotion over NSX go through VMware's RPQ ("Request for Product Qualification") process so that the customer's design can be validated on a case-by-case basis.
Storage Traffic
A VMkernel interface is used to provide features such as shared or non-directly attached storage. Typically this is storage that can be attached via an IP connection (e.g., NAS, iSCSI) rather than FC or FCoE. The same rules that apply to management traffic apply to storage VMkernel interfaces for IP address assignment. The storage VMkernel interface of servers inside a rack (i.e., connected to a
leaf switch) is part of the same subnet. This subnet cannot span beyond the leaf switch; therefore, the storage VMkernel interface IP of a host in a different rack is in a different subnet. For an example of the IP addresses used for these VMkernel interfaces, refer to the "VLAN Provisioning" section.
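Because each rack terminates its own VLANs at the leaf switch, every rack gets its own set of subnets for the four infrastructure traffic types. The sketch below generates one such addressing plan; the 10.0.0.0/8 scheme, the VLAN IDs, and the per-rack /24s are purely hypothetical placeholders, not values prescribed by NSX or Brocade (only VLAN 88 echoes the VXLAN example above).

```python
import ipaddress

TRAFFIC_TYPES = ["management", "vmotion", "vxlan", "storage"]
VLAN_IDS = {"management": 66, "vmotion": 77, "vxlan": 88, "storage": 99}  # hypothetical

def rack_addressing(rack_id: int) -> dict:
    """Build a per-rack subnet plan for layer 3 at the access layer.

    Each traffic type gets its own /24 per rack; the same VLAN IDs are reused
    in every rack but are never stretched between racks.
    """
    plan = {}
    for index, traffic in enumerate(TRAFFIC_TYPES, start=1):
        subnet = ipaddress.ip_network(f"10.{rack_id}.{index}.0/24")
        plan[traffic] = {"vlan": VLAN_IDS[traffic],
                         "subnet": str(subnet),
                         "gateway": str(list(subnet.hosts())[0])}
    return plan

print(rack_addressing(rack_id=3)["vxlan"])
# -> {'vlan': 88, 'subnet': '10.3.3.0/24', 'gateway': '10.3.3.1'}
```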
Edge Racks
Tighter interaction with the physical infrastructure occurs while bridging between the overlay world and the physical infrastructure. The main functions provided by an edge rack include:
• Providing on-ramp and off-ramp connectivity to physical networks
• Connecting with VLANs in the physical world
• Hosting centralized physical services
Tenant-specific addressing is exposed to the physical infrastructure wherever traffic is not encapsulated in VXLAN (e.g., when NAT is not used at the edge). In the case of a Layer 3 edge, the IP addresses within the overlay are exposed to the physical fabric. The guiding principle in these cases is to separate VXLAN (overlay) traffic from the un-encapsulated (native) traffic. As shown in Figure 16, VXLAN traffic hits the data center's internal Ethernet switching infrastructure, while native traffic traverses a dedicated switching and routing infrastructure facing the WAN or Internet and is completely decoupled from the data center internal network.
Figure 16 . VXLAN Traffic and the Data Center Internal Ethernet Switching Infrastructure
To maintain the separation, NSX Edge virtual machines can be placed in NSX Edge racks, assuming the NSX Edge has at least one native interface. For routing and high availability, the two interface types, overlay and native, must be examined individually. The failover mechanism is based on an active/standby model, where the standby Edge takes over after detecting the failure of the active Edge.
Layer 3 NSX Edge Deployment Considerations
When deployed to provide Layer 3 routing services, the NSX Edge terminates all logical networks and presents a Layer 3 hop between the physical and the logical world. Depending on the use case, either NAT or static/dynamic routing may be used to provide connectivity to the external network. In order to provide redundancy for the NSX Edge, each tenant should deploy an HA redundant pair of NSX Edge devices. There are three HA redundancy models supported by NSX: Stateful Active/Standby HA, Standalone, and ECMP.
ECMP is a newer capability, introduced in NSX software release 6.1 and later.
The figure below highlights the reference topology that will be used to describe the various HA models for NSX Edge router deployment between the Distributed Logical Router (DLR) and the physical network.
Figure 17 : Reference Topology for NSX Edge HA models
The next three sections briefly illustrate the HA models mentioned above.
Stateful Active/Standby HA Model
This is the redundancy model where a pair of NSX Edge Services Gateways is deployed for each tenant; one Edge functions in Active mode (i.e., it actively forwards traffic and provides the other logical network services), whereas the second unit is in Standby state, waiting to take over should the active Edge fail. Health and state information for the various logical network services is exchanged between the active and standby NSX Edges using an internal communication protocol. The first vNIC interface of type "Internal" deployed on the Edge is used by default to establish this communication, but the user can also explicitly specify the Edge internal interface to be used. Note: it is mandatory to have at least one Internal interface configured on the NSX Edge to be able to exchange keepalives between the Active and Standby units; deleting the last Internal interface would break this HA model. Figure 18 below highlights how the Active NSX Edge is active from both control and data plane perspectives. If the Active NSX Edge fails (for example, because of an ESXi host failure), both control and data planes must be activated on the Standby unit that takes over the active duties.
Figure 18 : NSX Edge Active Standby HA Model (left) and Traffic Recovery (right)
Standalone HA Model (NSX 6.0.x Releases)
The standalone HA model inserts two independent NSX Edge appliances between the DLR and the physical network and is supported when running NSX 6.0.x software releases.
Figure 19 : NSX Edge Standalone HA Model
In this case, both NSX Edge devices are active, from both a data and a control plane point of view, and can establish routing adjacencies with the physical router and the DLR Control VM. However, in all the 6.0.x NSX software releases the DLR cannot support Equal-Cost Multipathing.
As a consequence, even when it receives routing information from both NSX Edges for IP prefixes existing in the physical network, the DLR installs only one possible next-hop (the active path) in its forwarding table. This implies that all traffic in the south-to-north direction will flow through a single NSX Edge and cannot leverage both appliances. Traffic load balancing may instead happen in the north-to-south direction, since most physical routers and switches are ECMP-capable by default.
Figure 20 : Traffic Flows with Standalone HA model
ECMP HA Model (NSX 6.1 Release Onward)
NSX software release 6.1 introduces support for a new Active/Active ECMP HA model, which can be considered the improved and evolved version of the previously described Standalone model.
Figure 21 : NSX Edge ECMP HA Model (Left) and Traffic Recovery after Edge Failure (right)
In the ECMP model, the DLR and the NSX Edge have been enhanced to support up to eight equal-cost paths in their forwarding tables. Focusing for the moment on the ECMP capabilities of the DLR, this means that up to eight active NSX Edges can be deployed at the same time, and all the available control and data planes will be fully utilized, as shown in Figure 21.
This HA model provides two main advantages:
1. Increased available bandwidth for north-south communication (up to 80 Gbps per tenant).
2. Reduced traffic outage (in terms of the percentage of affected flows) in NSX Edge failure scenarios.
Notice from the diagram in Figure 21 that traffic flows are very likely to follow an asymmetric path, where the north-to-south and south-to-north legs of the same communication are handled by different NSX Edge gateways. The DLR distributes south-to-north traffic flows across the various equal-cost paths based on hashing of the source and destination IP addresses of the original packet sourced by the workload in the logical space. How the physical router distributes north-to-south flows depends instead on the specific hardware capabilities of that device.
Traffic recovery after a specific NSX Edge failure happens in a fashion similar to what was described for the standalone HA model: the DLR and the physical routers must quickly time out the adjacency to the failed unit and re-hash the traffic flows through the remaining active NSX Edge gateways.
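The DLR's south-to-north distribution described above is a per-flow choice among up to eight equal-cost next hops, keyed on the source and destination IP addresses of the original packet. The sketch below models that selection and what happens when one Edge fails and flows are re-hashed across the survivors; the hash function and edge names are illustrative only, not the algorithm used by the NSX kernel module.

```python
import hashlib

def select_edge(src_ip: str, dst_ip: str, edges: list) -> str:
    """Pick one of the equal-cost NSX Edge next hops for a given flow."""
    digest = hashlib.md5(f"{src_ip}->{dst_ip}".encode()).digest()
    return edges[int.from_bytes(digest[:4], "big") % len(edges)]

edges = [f"edge-{n}" for n in range(1, 9)]        # up to 8 ECMP next hops
flow = ("172.16.10.11", "198.51.100.20")

print("before failure:", select_edge(*flow, edges))
edges.remove(select_edge(*flow, edges))            # simulate failure of that edge
print("after re-hash: ", select_edge(*flow, edges))
```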
Infrastructure Racks
Infrastructure racks host the management components, including vCenter Server, NSX Manager, NSX Controller, the CMP, and other shared IP storage-related components. It is key that this portion of the infrastructure does not have any tenant-specific addressing. If bandwidth-intensive infrastructure services are placed in these racks (IP-based storage, for example), the bandwidth of these racks can be dynamically scaled, as discussed in the "High Bandwidth" subsection of the "Data Center Fabric Attributes" section.
VLAN Provisioning
Every compute rack has four different subnets, each supporting a different traffic type: tenant (VXLAN), management, vSphere vMotion, and storage traffic. Provisioning of IP addresses to the VMkernel NICs of each traffic type is automated using vSphere host profiles. The host profile feature enables creation of a reference host with properties that are shared across the deployment. After this host has been identified and the required sample configuration performed, a host profile can be created and applied across the deployment. This allows quick configuration of a large number of hosts. As shown in Figure 22, the same set of four VLANs (storage, vSphere vMotion, VXLAN, management) is provided in each rack.
Figure 22 : Host Infrastructure Traffic Types and IP address Assignment
Table 2 IP Address Management and VLANs
Multi-tier Edges and Multi-tier Application Design Considerations
Classical multi-tier compute architectures have functions that are logically separated, where each function has different requirements for resource access, data segregation, and security. A classical three-tier compute architecture typically comprises a presentation tier, an application or data access tier, and a database tier. Communication between the application tier and the database tier should be allowed, while an external user has access only to the presentation tier, which is typically a web-based service. The recommended solution for complying with data access policies is to deploy a two-tier edge design. The inner edge enables VXLAN-to-VXLAN east-west traffic among the presentation, database, and application tiers, represented by different logical networks. The outer edge connects the presentation tier with the outside world for on-ramp and off-ramp traffic. Communication within a specific logical network enables virtual machines to span multiple racks to achieve optimal utilization of the compute rack infrastructure. Note: At the current time, a logical network can span only a single vCenter domain. Figure 23 shows the placement of the logical elements of this architecture.
Figure 23 : Two Options for Logical Element Placement in a Multitier Application
It is preferable that the outer edges be physically placed in the edge racks. Inner edges can be centralized in the Edge racks or distributed across the compute racks, where web and application compute resources are located.
Logical Switching
The logical switching capability in the NSX platform provides customers the ability to spin up isolated logical Layer 2 networks with the same flexibility and agility as spinning up virtual machines. This section describes the various components that enable logical switching and the communication among those components.
Components
As shown in the figure below, there are three main components that help decouple the underlying physical network fabric and provide network abstraction. This decoupling is achieved by encapsulating the virtual machine traffic using the VXLAN protocol.
Figure 24 : Logical Switching Components
NSX Manager
The NSX Manager is the management plane component responsible for configuring logical switches and connecting virtual machines to them. It also provides an API interface, which allows deployment and management of these switches to be automated through a cloud management platform.
Controller Cluster
The Controller Cluster in the NSX platform is the control plane component responsible for managing the hypervisors' switching and routing modules. The Controller Cluster consists of controller nodes that manage specific logical switches. The controller manages the VXLAN configuration mode; three modes are supported: unicast, multicast, and hybrid. Recommendations and details for these modes are discussed in the section "Logical Switch Replication Modes". It is important to note that the data path (VM user traffic) does not go through the controller, even though the controller is responsible for managing the VTEP configuration. In unicast mode there is no need for multicast support from the physical network infrastructure; consequently, there is no requirement to provision multicast group IP addresses or to enable PIM routing or IGMP snooping features on physical switches or routers.
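A useful way to picture the controller's role is as a set of per-logical-switch tables that map VM MAC addresses to the VTEP that hosts them, learned from hypervisor reports rather than from flooding. The sketch below is a conceptual model of such a table and of the lookup a host performs before encapsulating a frame; it is not the controller's actual data structure or protocol.

```python
class VniTable:
    """Conceptual per-VNI MAC-to-VTEP table maintained by the controller cluster."""

    def __init__(self):
        self.mac_to_vtep = {}   # VM MAC -> VTEP IP of the hosting hypervisor

    def report(self, vm_mac: str, vtep_ip: str) -> None:
        """A hypervisor reports a locally connected VM (MAC learned on a vNIC)."""
        self.mac_to_vtep[vm_mac] = vtep_ip

    def lookup(self, dst_mac: str):
        """Hosts query this before encapsulating, avoiding unknown-unicast floods."""
        return self.mac_to_vtep.get(dst_mac)

table = VniTable()                        # table for logical switch / VNI 5001
table.report("00:50:56:aa:bb:01", "10.1.3.11")
table.report("00:50:56:aa:bb:02", "10.2.3.12")
print(table.lookup("00:50:56:aa:bb:02"))  # -> 10.2.3.12 (remote VTEP to send to)
```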
Transport Zone
As part of the host preparation process, the hypervisor modules are deployed and configured through the NSX Manager. After the logical switching components are installed and configured, the next step is to define the span of logical switches by creating a transport zone. The transport zone consists of a set of clusters. For example, if there are 10 clusters in the data center, a transport zone can include some or all of those 10 clusters; in this scenario a logical switch can span the whole data center. The figure below shows a deployment after the NSX components are installed to provide logical switching. The Edge Services Router in the edge rack provides the logical switches with access to the WAN and other network services.
Figure 25 : Logical Switching components in the racks
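Both steps described above, defining the transport zone and creating a logical switch within it, can be automated through the NSX Manager REST API. The following Python sketch is illustrative only: the endpoint paths and XML element names are assumed from the NSX for vSphere API, and the manager address, credentials, cluster identifier (domain-c7), and object names are placeholders to verify against the NSX API guide for the release in use.

import requests

NSX_MGR = "https://nsx-manager.example.local"   # placeholder NSX Manager address
AUTH = ("admin", "password")                     # placeholder credentials
HEADERS = {"Content-Type": "application/xml"}

def post_xml(path, body):
    # POST an XML payload to the NSX Manager and return the response body
    resp = requests.post(NSX_MGR + path, data=body, auth=AUTH, headers=HEADERS, verify=False)
    resp.raise_for_status()
    return resp.text

# 1. Create a transport zone (VDN scope) spanning one compute cluster.
tz_body = """<vdnScope>
  <name>TZ-DC1</name>
  <clusters><cluster><cluster><objectId>domain-c7</objectId></cluster></cluster></clusters>
  <controlPlaneMode>UNICAST_MODE</controlPlaneMode>
</vdnScope>"""
tz_id = post_xml("/api/2.0/vdn/scopes", tz_body)          # the API returns the new scope ID

# 2. Create a logical switch (virtual wire) whose span is that transport zone.
ls_body = """<virtualWireCreateSpec>
  <name>web-tier-ls</name>
  <tenantId>tenant-1</tenantId>
  <controlPlaneMode>UNICAST_MODE</controlPlaneMode>
</virtualWireCreateSpec>"""
ls_id = post_xml("/api/2.0/vdn/scopes/%s/virtualwires" % tz_id, ls_body)
print("Created logical switch", ls_id)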
Logical Switch Replication Modes

When two VMs connected to different ESXi hosts need to communicate directly, unicast VXLAN-encapsulated traffic is exchanged between the VTEP IP addresses associated with the two hypervisors. Traffic originated by a VM may also need to be sent to all the other VMs belonging to the same logical switch, specifically for three types of layer-2 traffic:
• Broadcast
• Unknown Unicast
• Multicast
Note: These multi-destination traffic types are commonly referred to by the acronym BUM (Broadcast, Unknown unicast, Multicast). In an NSX deployment with vSphere, there should never be a need to flood unknown unicast traffic on a given logical network, since the NSX controller is made aware of the MAC addresses of every actively connected VM. For these scenarios, traffic originated by a given ESXi host must be replicated to multiple remote hosts (those hosting other VMs on the same logical network). NSX supports three different replication modes to enable multi-destination communication on VXLAN-backed logical switches: unicast, hybrid, and multicast. By default a logical switch inherits its replication mode from the transport zone; however, this can be overridden per logical switch.

Unicast Mode

For unicast mode replication, the ESXi hosts that are part of the NSX domain are divided into separate groups (segments) based on the IP subnets of their VTEP interfaces. The NSX controller selects a specific ESXi host in each segment to serve as the Unicast Tunnel End Point (UTEP). The UTEP is responsible for replicating multi-destination traffic to all the ESXi hosts in its segment (i.e., those whose VTEPs belong to the same subnet as the UTEP's VTEP interface) and to the UTEPs belonging to the other segments.
To optimize the replication behavior, every UTEP replicates traffic only to the ESXi hosts on its local segment that have at least one VM actively connected to the logical network the multi-destination traffic is destined for. In addition, traffic is replicated to a remote UTEP only if there is at least one active VM connected to an ESXi host in that remote segment. The NSX controller is responsible for providing each ESXi host with the updated list of VTEP addresses used for replication of multi-destination traffic. Unicast mode replication requires no explicit configuration on the physical network to enable distribution of multi-destination VXLAN traffic. This mode is well suited for smaller deployments with fewer VTEPs per segment and few physical segments; it may not be suitable for very large environments, as the replication overhead grows with the number of segments. The figure below illustrates a unicast mode logical switch. In this example there are four VMs on logical switch 5001. When VM1 sends a frame to all VMs on the logical switch, the source VTEP replicates the packet only to the other VTEPs belonging to the local segment and to the UTEPs of the remote segments (only one remote segment is shown in this example).
Figure 26 : Unicast Mode Logical Switch
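The replication-list logic just described can be pictured with a short, purely illustrative sketch (this is not NSX code): VTEPs are grouped into segments by subnet, the source host replicates to every local VTEP that hosts a VM on the logical switch, and one copy goes to a stand-in UTEP per remote segment. In a real deployment the controller, not the source host, selects the UTEP; all addresses below are made up.

import ipaddress
from collections import defaultdict

# Hypothetical VTEPs that have at least one VM on logical switch 5001.
vteps = ["10.20.10.11", "10.20.10.12", "10.20.10.13",
         "10.20.11.21", "10.20.11.22"]

def segment(ip, prefix=24):
    # Group VTEPs into segments by the subnet of their VTEP interface
    return ipaddress.ip_network("%s/%d" % (ip, prefix), strict=False)

def replication_targets(source_vtep):
    by_segment = defaultdict(list)
    for v in vteps:
        by_segment[segment(v)].append(v)
    local = segment(source_vtep)
    # All other VTEPs in the local segment receive a direct copy
    targets = [v for v in by_segment[local] if v != source_vtep]
    # Plus exactly one UTEP per remote segment (lowest address as a stand-in)
    for seg, members in by_segment.items():
        if seg != local:
            targets.append(sorted(members)[0])
    return targets

print(replication_targets("10.20.10.11"))
# ['10.20.10.12', '10.20.10.13', '10.20.11.21']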
Multicast Mode

When multicast mode is chosen for a logical switch, NSX relies on the layer-2 and layer-3 multicast capabilities of the physical network to ensure VXLAN traffic is delivered to all the VTEPs. In this mode, layer-2 multicast is used to replicate traffic to all VTEPs in the local segment (i.e., to VTEP IP addresses that are part of the same IP subnet). IGMP snooping must be configured on the physical switches, and it is recommended to have an IGMP querier per VLAN. To ensure multicast traffic is delivered to VTEPs in a different subnet from the source VTEP, multicast routing and PIM must be enabled. The figure below shows a multicast mode logical switch: IGMP snooping enables the physical switch to replicate multicast traffic to all VTEPs in the segment, and PIM allows multicast traffic to be delivered to VTEPs in remote segments. Using multicast mode eliminates the additional replication overhead on the hypervisor as the environment scales.
Figure 27 : Multicast Mode Logical Switch
Hybrid Mode

Hybrid mode offers the simplicity of unicast mode (i.e., no IP multicast routing configuration in the physical network) while leveraging the layer-2 multicast capabilities of the physical switches. In hybrid mode, the controller selects one VTEP per physical segment to function as a Multicast Tunnel End Point (MTEP). When a frame is sent over VXLAN to VTEPs in multiple segments, the source VTEP creates one copy per remote segment and forwards the encapsulated frame to the IP address of that segment's MTEP. The source VTEP also encapsulates one copy of the original frame with the multicast address associated with the logical switch as the external destination IP address and sends it to the upstream physical switch. Layer-2 multicast configuration in the physical network is used to ensure that this VXLAN frame is delivered to all VTEPs in the local segment. This is illustrated in the figure below, where the source in segment 10.20.10.0/24 sends one unicast copy to the MTEP in segment 10.20.11.0/24.
Figure 28 : Hybrid Mode Logical Switch
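As a purely illustrative contrast with the unicast sketch above (again, not NSX code), the following shows the copies a source host generates for BUM traffic in hybrid mode: one VXLAN frame addressed to the logical switch's multicast group, which the local physical switch replicates within the segment via IGMP snooping, plus one unicast copy per remote-segment MTEP. The multicast group and addresses are made-up examples.

def hybrid_copies(mcast_group, remote_mteps):
    # One multicast-addressed copy covers the local segment (replicated by the ToR switch);
    # one unicast copy goes to the MTEP of each remote segment, which re-injects it locally.
    copies = [("multicast", mcast_group)]
    copies += [("unicast", mtep) for mtep in remote_mteps]
    return copies

print(hybrid_copies("239.1.1.1", ["10.20.11.21"]))
# [('multicast', '239.1.1.1'), ('unicast', '10.20.11.21')]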
Logical Switch Addressing

IP address management is a critical task in a large cloud environment with multiple tenants, or in large enterprises with multiple organizations and applications. This section focuses on IP address management for the virtual machines deployed on the logical switches. Each logical switch is a separate layer-2 broadcast domain that can be associated with a separate subnet using private or public IP space. Depending on whether private or public space is assigned to the logical networks, users choose either the NAT or non-NAT option on the NSX Edge Services Router. The IP address assignment depends on whether the virtual machine is connected to a logical switch through a NAT or a non-NAT configuration.
With Network Address Translation

In deployments where organizations have limited IP address space, NAT is used to provide address translation from private IP space to the limited public IP addresses. An Edge Services Router can provide individual tenants with the ability to create distinct pools of private IP addresses, which may be mapped to the publicly routable external IP address of the external NSX Edge Services Router interface. The figure below shows a three-tier application deployment with each tier's virtual machines connected to a separate logical switch. The web, app, and DB logical switches are connected to the three internal interfaces of the NSX Edge Services Router; its external interface is connected to the Internet via an external data center router.
Figure 29 : NAT and DHCP configuration on NSX Edge Services Router
Configuration details of the NSX Edge Services Router include:

• The web, app, and DB logical switches are connected to the internal interfaces of the NSX Edge Services Router
• The NSX Edge Services Router uplink interface is connected to the VLAN port group in subnet 192.168.100.0/24
• The DHCP service is enabled on the internal interface by providing a pool of IP addresses (e.g., 10.20.10.10 to 10.20.10.50)
• The NAT configuration on the external interface enables VMs on a logical switch to communicate with devices on the external network; this communication is allowed only when the requests are initiated by VMs connected to the internal interfaces of the NSX Edge Services Router

A hedged API sketch of this configuration follows the list.
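The sketch below scripts the NAT and DHCP portions of this configuration, assuming the NSX for vSphere Edge REST endpoints. The edge ID (edge-1), manager address, credentials, and addresses are placeholders, and the XML element names should be verified against the NSX API guide before use.

import requests

NSX_MGR = "https://nsx-manager.example.local"
AUTH = ("admin", "password")
EDGE = "edge-1"   # hypothetical NSX Edge Services Router ID
HEADERS = {"Content-Type": "application/xml"}

def put_xml(path, body):
    # Replace the edge feature configuration at the given path
    resp = requests.put(NSX_MGR + path, data=body, auth=AUTH, headers=HEADERS, verify=False)
    resp.raise_for_status()

# SNAT: the web-tier private subnet is translated to the edge's external address,
# so sessions initiated by the VMs can reach the external network.
nat_body = """<nat><natRules><natRule>
  <action>snat</action>
  <vnic>0</vnic>
  <originalAddress>10.20.10.0/24</originalAddress>
  <translatedAddress>192.168.100.3</translatedAddress>
</natRule></natRules></nat>"""
put_xml("/api/4.0/edges/%s/nat/config" % EDGE, nat_body)

# DHCP pool served on the internal interface of the edge.
dhcp_body = """<dhcp><enabled>true</enabled><ipPools><ipPool>
  <ipRange>10.20.10.10-10.20.10.50</ipRange>
  <defaultGateway>10.20.10.1</defaultGateway>
</ipPool></ipPools></dhcp>"""
put_xml("/api/4.0/edges/%s/dhcp/config" % EDGE, dhcp_body)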
In situations where overlapping IP and MAC address support is required, one NSX Edge Services Router per tenant is recommended. Figure below shows an overlapping IP address deployment with two tenants and two separate NSX Edge Services Routers.
Figure 30 : Overlapping IP and MAC addresses
Without Network Address Translation

The static and dynamic routing features of the NSX platform are appropriate for organizations that are not limited by routable IP addresses, have VMs with public IP addresses, or do not want to deploy NAT.
Logical Routing

The NSX platform supports two different modes of logical routing: distributed routing and centralized routing. Distributed routing provides better throughput and performance for east-west traffic, while centralized routing handles north-south traffic. This section provides more detail on the two modes and describes common routing topologies. For the additional network services required by applications in the data center, refer to the logical firewall and logical load balancer sections.
Distributed Routing

The distributed routing capability in the NSX platform provides an optimized and scalable way of handling east-west traffic within a data center. Communication between virtual machines or resources within the data center is referred to as east-west traffic, and the amount of it is growing: collaborative, distributed, service-oriented application architectures demand higher bandwidth for server-to-server communication. If these servers are VMs running on hypervisors and are connected to different subnets, the communication must go through a router. If a physical router provides the routing service, the traffic must leave the hypervisor, reach the physical router, and return to the server after the routing decision. This suboptimal traffic flow is referred to as "hairpinning".
Distributed routing in the NSX platform prevents hairpinning by providing routing functionality at the hypervisor level. Each hypervisor has a routing kernel module that performs routing between the logical interfaces (LIFs) defined on that distributed router instance. The Routing Components section below describes the various modules involved in distributed routing and the communication between them.
Centralized Routing

The NSX Edge Services Gateway provides the traditional centralized routing support in the NSX platform. Along with routing services, the NSX Edge Services Router also supports other network services including DHCP, NAT, and load balancing.
Routing Components

The figure below shows the components involved in logical routing, some related to distributed routing and others to centralized routing, and illustrates the interaction between them.
Figure 31 : Logical Routing Components
Logical routing component interaction steps (refer to the figure above):

1. A dynamic routing protocol is configured on the logical router (LR) instance
2. The controller pushes the new logical router (LR) configuration, including the logical interfaces (LIFs), to the ESXi hosts
3. OSPF/BGP peering is established between the NSX Edge and the logical router (LR) control VM; the protocol address is used for control communication
4. Learned routing tables are pushed to the controller cluster for distribution
5. The controller sends the route updates to the hosts
6. The routing kernel modules on the hosts handle the data path traffic
NSX Manager

The NSX Manager helps configure and manage logical routing services. A logical router can be deployed as either a distributed or a centralized logical router. If the distributed router is selected, the NSX Manager deploys the logical router control VM and pushes the logical interface configurations to each host through the controller cluster. In the case of centralized routing, the NSX Manager simply deploys the NSX Edge Services Router VM. The API interface of the NSX Manager allows deployment and management of these logical routers to be automated through a cloud management platform.

Logical Router Control VM

The logical router control VM is the control plane component of the routing process. It supports the dynamic routing protocols OSPF and BGP. The logical router control VM communicates with the next-hop router using the dynamic routing protocol and pushes the learned routes to the hypervisors through the controller cluster. High Availability (HA) may be configured while deploying the control VM; two VMs are deployed in active-standby mode when HA is selected.

Logical Router Kernel Module

The logical router kernel module is configured as part of the host preparation process through the NSX Manager. The kernel modules are similar to the line cards in a modular chassis supporting layer-3 routing. Each kernel module has a routing information base (RIB) that is pushed to it through the controller cluster. The data plane functionality of route and ARP entry lookup is performed by the kernel modules.

Controller Cluster

The controller cluster is responsible for distributing the routes learned from the control VM across the hypervisors. Each controller node in the cluster takes responsibility for distributing the information for a particular logical router instance; in a deployment with multiple logical router instances, the load is distributed across the controller nodes.

NSX Edge Services Router

The NSX Edge Services Router is the centralized services router that provides DHCP, NAT, firewall, load balancing, and VPN capabilities along with the routing protocols OSPF and BGP.
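The data path lookup performed by the kernel module can be pictured conceptually with the sketch below (this is not NSX code): a per-instance RIB holds the connected LIF prefixes plus routes learned via the controller cluster, and each packet is matched against the most specific prefix. The prefixes and next hops are invented examples.

import ipaddress

# Hypothetical RIB for one distributed logical router instance: connected LIF
# prefixes plus a default route learned from the NSX Edge via OSPF/BGP.
rib = [
    ("10.10.10.0/24", "LIF-web"),
    ("10.10.20.0/24", "LIF-app"),
    ("10.10.30.0/24", "LIF-db"),
    ("0.0.0.0/0",     "uplink toward NSX Edge (192.168.10.1)"),
]

def lookup(dst_ip):
    dst = ipaddress.ip_address(dst_ip)
    matches = [(ipaddress.ip_network(prefix), next_hop)
               for prefix, next_hop in rib
               if dst in ipaddress.ip_network(prefix)]
    # Longest-prefix match: pick the most specific matching entry
    return max(matches, key=lambda m: m[0].prefixlen)[1]

print(lookup("10.10.20.15"))   # app-tier VM: routed locally in the hypervisor via LIF-app
print(lookup("8.8.8.8"))       # north-south: follows the default route toward the Edge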
Figure 32 : Logical Routing Components in the Racks
Logical Switching and Routing Deployments

Various topologies can be built using the logical switching and logical routing features of the NSX platform. Examples of two routing topologies that use both distributed and centralized logical routing are provided:
• Physical Router as Next Hop
• Edge Services Router as Next Hop
Physical Router as Next Hop

As shown in the figure below, an organization hosting multiple applications wants to provide connectivity among the different tiers of each application as well as to the external network. Separate logical switches provide layer-2 network connectivity for the VMs in each tier. The distributed logical routing configuration allows the VMs on two different tiers to communicate with each other. Dynamic routing protocol support on the logical router enables the exchange of routes with the physical next-hop router, which allows external users to access the applications connected to the logical switches in the data center.
Figure 33 : Physical router as next hop
NSX Edge Services Gateway as Next Hop

A service provider environment may have multiple tenants, each requiring a different number of isolated logical networks and network services such as load balancers, firewalls, and VPNs. In these environments, the NSX Edge Services Gateway provides network services capabilities along with dynamic routing protocol support. The figure below shows two tenants connected to the external network through the NSX Edge Services Router. Each tenant has its own logical router instance that provides routing within the tenant. The dynamic routing protocol configuration between the tenant logical router and the NSX Edge Services Gateway provides connectivity from the tenant VMs to the external network.
Figure 34 : Tenants connected via Edge Router
In this example the NSX Edge Services Gateway establishes a single routing adjacency with the routers in the physical infrastructure, independent of the number of deployed logical router instances. East-west traffic is routed by the distributed router in the hypervisor, while north-south traffic flows through the NSX Edge Services Gateway.

Scalable Topology
The service provider topology can be scaled out as shown in the figure below. The diagram shows nine tenants served by the NSX Edge on the left and another nine served by the Edge on the right. The service provider can easily provision another NSX Edge to serve additional tenants.
Figure 35 : Scalable Topology
Logical Firewalling, Isolation and Micro-segmentation

The VMware NSX platform includes a distributed, kernel-enabled firewall (DFW) with line-rate performance that is virtualization- and identity-aware and provides activity monitoring. Other network security capabilities native to network virtualization are also available.

Network Isolation

Isolation is the foundation of most network security, whether for compliance, containment, or keeping distinct environments apart. Access control lists (ACLs) and firewall rules on physical devices have traditionally been used to enforce isolation policies. With the Distributed Firewall (DFW), individual virtual networks are isolated from other virtual network segments, as well as from the underlying physical network, by default, conforming to the security principles of least privilege and the "zero-trust" model. Virtual networks are created in isolation and remain isolated unless specifically connected together by policy. No physical subnets, VLANs, ACLs, or firewall rules are required to enable this isolation. The isolation controls and management interfaces of the DFW exist in the hypervisor and are not accessible through the workload data plane.

An isolated virtual network can be made up of workloads distributed anywhere in the data center. Workloads in the same virtual network can reside on the same or separate hypervisors, and workloads in multiple isolated virtual networks can reside on the same hypervisor. Isolation between virtual networks occurs at layer 2 of the OSI stack and thus allows for overlapping IP addresses and ranges among isolated segments. One use of this capability is to run isolated development, test, and production virtual networks, each with different application versions but with the same IP addresses, all operating at the same time on the same underlying physical infrastructure.

Virtual networks are also isolated from the underlying physical infrastructure. Because traffic between hypervisors is encapsulated, physical network devices operate in a distinct address space from the workloads connected to the virtual networks. For instance, a virtual network could support IPv6 application workloads on top of an IPv4 physical network. This isolation also protects the underlying physical infrastructure from any possible attack initiated by workloads attached to any virtual network, and it functions independently of any VLANs, ACLs, or firewall rules that would traditionally be required.

Network Segmentation

Segmentation is straightforward with network virtualization. Segmentation is related to isolation but is typically applied within a multi-tier virtual network. Network segmentation is traditionally a function of a physical firewall or router, designed to allow or deny traffic between network segments or tiers (e.g., segmenting traffic between a web tier, application tier, and database tier). Traditional processes for defining and configuring segmentation are time consuming and highly prone to human error, resulting in a large percentage of security breaches. Implementation requires deep and specific expertise in device configuration syntax, network addressing, application ports, and protocols. Additionally, traditional network segmentation relies on correct topology for the placement of expensive physical control devices that were not originally designed for the management challenges of distributed operation within the data center perimeter.
Network segmentation, like isolation, is a core capability of VMware NSX network virtualization. A virtual network can support a multi-tier network environment defined through the NSX DFW: for example, multiple layer-2 segments with layer-3 segmentation, or micro-segmentation on a single layer-2 segment using distributed firewall rules. These segments could represent a web tier, an application tier, and a database tier. Physical firewalls and access control lists deliver a proven segmentation function, trusted by network security teams and compliance auditors; however, confidence in manual processes continues to fall as attacks, breaches, and downtime attributed to human error rise. In a virtual network, network services (e.g., layer-2, layer-3, ACL, firewall, QoS) that are provisioned with a workload are programmatically created and distributed to the hypervisor vSwitch. Network services, including layer-2 and layer-3 segmentation and firewalling, are enforced at the virtual interface. Communication within a virtual network never leaves the virtual environment, removing the requirement for network segmentation to be configured and maintained in the physical network or firewall.
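The three-tier segmentation model described above can be summarized as a small, purely conceptual rule set: explicit allows between adjacent tiers and a default deny, evaluated per flow in the same spirit as DFW rules applied at each virtual interface. The tier names, ports, and rule format below are illustrative and are not an NSX rule syntax.

# (source tier, destination tier, destination port, action)
RULES = [
    ("external", "web", 443,  "allow"),   # users reach only the web (presentation) tier
    ("web",      "app", 8443, "allow"),   # web tier talks to the application tier
    ("app",      "db",  3306, "allow"),   # application tier talks to the database tier
]
DEFAULT_ACTION = "deny"                    # zero-trust: anything not explicitly allowed is blocked

def evaluate(src_tier, dst_tier, dst_port):
    # First matching rule wins; otherwise fall through to the default deny
    for src, dst, port, action in RULES:
        if (src, dst, port) == (src_tier, dst_tier, dst_port):
            return action
    return DEFAULT_ACTION

print(evaluate("web", "app", 8443))   # allow
print(evaluate("web", "db", 3306))    # deny: the web tier may not reach the database directly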
Taking Advantage of Abstraction
Network security has traditionally required the security team to have a deep understanding of network addressing, application ports, and protocols as they are bound to network hardware, workload location, and topology. Network virtualization abstracts application workload communication from the physical network hardware and topology, allowing network security to break free from these physical constraints and apply network security based on user, application, and business context.
Advanced Security Service Insertion, Chaining and Steering

The base VMware NSX network virtualization platform provides basic stateful firewalling features to deliver segmentation within virtual networks. In some environments there is a requirement for more advanced network security capabilities; in these instances, customers can leverage VMware NSX to distribute, enable, and enforce advanced network security services in a virtualized network environment. NSX distributes network services into the vSwitch to form a logical pipeline of services applied to virtual network traffic, and third-party network services can be inserted into this pipeline, allowing physical or virtual services to be consumed directly. Network security teams are often challenged to coordinate network security services from multiple vendors in relationship to each other. A powerful benefit of the NSX approach is its ability to build policies that leverage NSX service insertion, chaining, and steering to drive service execution in the logical services pipeline based on the result of other services, making it possible to coordinate otherwise completely unrelated network security services from multiple vendors.
Logical Load Balancing

Load balancing is another network service available within NSX. This service distributes workloads across multiple servers and provides high availability of applications:
Figure 36 : NSX Load Balancing
The NSX load balancing service is designed for the cloud: it is fully programmable via the API and offers the same central point of management and monitoring as other NSX network services (a hedged configuration sketch follows the feature list below). The NSX load balancing service provides the following functionality:
• Multiple architecture support (one-armed/proxy mode or two-armed/inline mode)
• Large feature set
• Broad TCP application support, including LDAP, FTP, HTTP, and HTTPS
• Multiple load balancing distribution algorithms: round robin, least connections, source IP hash, and URI
• Health checks for TCP, HTTP, and HTTPS, including content inspection
• Persistence through source IP, MSRDP, cookie, and SSL session ID
• Throttling of maximum connections and connections per second
• L7 manipulation, including URL block, URL rewrite, and content rewrite
• Optimization of SSL offload

Each NSX Edge scales up to:

• Throughput: 9 Gbps
• Concurrent connections: 1 million
• New connections per second: 131k
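As a minimal sketch of the programmability mentioned above, the following pushes a one-armed load balancer configuration (one pool, one virtual server) to an NSX Edge through the REST API. The endpoint path, the simplified XML payload, the edge ID, and all addresses are assumptions to confirm against the NSX for vSphere API guide; in particular, the real schema references pools by generated ID rather than by name.

import requests

NSX_MGR = "https://nsx-manager.example.local"
AUTH = ("admin", "password")
EDGE = "edge-1"   # hypothetical tenant edge

# Simplified payload: one round-robin pool with two web members and one HTTPS VIP.
lb_body = """<loadBalancer>
  <enabled>true</enabled>
  <pool>
    <name>web-pool</name>
    <algorithm>round-robin</algorithm>
    <member><ipAddress>10.20.10.11</ipAddress><port>443</port></member>
    <member><ipAddress>10.20.10.12</ipAddress><port>443</port></member>
  </pool>
  <virtualServer>
    <name>web-vip</name>
    <ipAddress>192.168.100.10</ipAddress>
    <protocol>https</protocol>
    <port>443</port>
    <defaultPoolId>web-pool</defaultPoolId>
  </virtualServer>
</loadBalancer>"""

resp = requests.put(NSX_MGR + "/api/4.0/edges/%s/loadbalancer/config" % EDGE,
                    data=lb_body, auth=AUTH,
                    headers={"Content-Type": "application/xml"}, verify=False)
resp.raise_for_status()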
The figure below details examples of tenants hosting different applications with different load balancing needs. Each of these applications is hosted in the same cloud, with the network services offered by NSX.
Figure 37 : NSX Load Balancing
The NSX load balancing service is fully distributed. Multiple benefits from this architecture include:
• Each tenant has its own load balancer.
• Individual tenant configuration changes do not impact other tenants.
• A load increase on one tenant's load balancer does not impact the scalability of other tenants' load balancers.
• Each tenant's load balancing service can scale up to the maximum performance limits.
When utilizing load balancing services, other network services are still fully available. The same tenant can mix its load balancing service with other network services such as routing, firewalling, and VPN.
Conclusion

The VMware network virtualization solution addresses current challenges with physical network infrastructure and brings flexibility, agility, and scale through VXLAN-based logical networks. Along with the ability to create on-demand logical networks using VXLAN, the NSX Edge Services Gateway helps users deploy logical network services such as firewall, DHCP, NAT, and load balancing on these networks. This is possible because of its ability to decouple the virtual network from the physical network and then reproduce those properties and services in the virtual environment. Brocade VCS Ethernet Fabric is the ideal network underlay technology to support VMware NSX in the cloud data center for the following reasons:
• Brocade leveraged decades of experience in building fabric-based networks to design VCS Ethernet Fabric as a high-performance, reliable, and highly automated physical network underlay on which to deploy VMware NSX. With features like ECMP, the VCS Fabric provides predictable performance and resiliency across the network as a whole.
• Customers deploying VMware NSX can leverage the VMapper feature available in Brocade VCS Fabric to automatically provision and configure the underlying physical network. By integrating Automatic Port Profile Migration (AMPP) with the VMware infrastructure, this feature enables automatic propagation of networking policies to the physical underlay.
• The simple architecture of Brocade VCS Fabric brings agility and scalability, making it an ideal underlay for VMware NSX. Customers deploying NSX can start with a single switch to build a small network and, without changing the underlay architecture, grow it into a large-scale network that runs VMware NSX across their cloud data center.
• The Brocade IP Analytics Management Pack for vRealize Operations Manager, the Brocade IP Content Pack for vRealize Operations Insight, and other tools such as Brocade Network Advisor help proactively monitor the network to minimize business disruption and keep the physical network underlay efficient. The Brocade VCS integration with vRealize Operations Manager is complementary to the VMware NSX integration with vRealize Operations Manager, helping customers manage their cloud network as a whole.
Brocade VCS Fabric in combination with VMware NSX has emerged as an attractive solution to the challenges brought by new technologies and applications. This combined solution brings dramatic improvements over the inefficiencies, rigidity, fragility, and management challenges of classic hierarchical Ethernet networks. Organizations are finding that Brocade VCS Fabric addresses key requirements such as scaling the network, transitioning from Gigabit Ethernet (GbE) to 10 GbE and beyond, deploying Ethernet/IP storage, and enabling network virtualization. Brocade VCS Fabric technology and VMware NSX are ideal for these scenarios, enabling organizations to migrate to a highly automated, fabric-based design at their own pace, without disrupting their existing data center network architecture.

References
I. VMware NSX Product and Partner Resources http://www.vmware.com/products/nsx/resources.html
II. VMware® NSX Network Virtualization Design Guide http://www.vmware.com/files/pdf/products/nsx/vmw-nsx-network-virtualization-design-guide.pdf
III. Brocade VCS Fabric Design Guide: Selecting and Deploying Brocade VDX Switches http://www.brocade.com/forms/getFile?p=documents/design-guides/deploying-vdx-switches-dg.pdf