How to Design a Scalable Private Cloud
Transcript of How to Design a Scalable Private Cloud
This presentation was given during the Spring 2012 Data Center World Conference and Expo. The contents are owned by AFCOM and Data Center World and can only be reused with the express permission of AFCOM. For questions or permission, contact: [email protected].
Interested in learning more about Cloud?
Look at the Cloud sessions offered at the upcoming Fall 2012 Data Center World Conference at:
www.datacenterworld.com.
How to Design a Scalable Private Cloud
Mark Sand
Datacenter Architect
Citrix Systems Inc.
Defining the Private & Public Clouds
• Private vs. Public Clouds (Infrastructure as a Service - IaaS)
  • The private cloud is a virtual environment deployed within an organization that is restricted to users within the company and usually resides behind the corporate firewall. It also includes an easy-to-use web portal that lets end users auto-provision and manage the lifecycle of their VMs, and it may or may not incorporate a chargeback model.
  • The public cloud is a virtual environment that is publicly available for any consumer to purchase computing resources, usually on a pay-per-use basis. Consumers purchase, manage, and monitor the lifecycle of their VMs through an easy-to-use web portal.
Designing the Cloud Infrastructure
• Proper planning and design are critical components to successfully implementing a scalable Cloud environment
• Here are some key design areas that we will address:
  • Capacity planning and sizing
  • Virtual Platform (hypervisor)
  • Datacenter locations (will this be a global Cloud or hosted from one DC)
  • Networking
  • SAN (NAS/Fibre)
  • Server Hardware
  • Power
  • Monitoring & Management Solutions
  • Documenting the solution
Capacity Planning and Sizing the Environment
• Accurate capacity planning and sizing will ensure that you implement a scalable, supportable, and successful environment
• Key sizing criteria:
  • Number of VMs you are looking to host per virtual server
  • Number and types of clusters/pools
  • Estimated yearly growth for VMs
  • Amount of storage required to host all of the VMs for current and future growth
  • Amount of estimated network bandwidth required to host the VMs for current and future growth
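The sizing criteria above can be turned into a rough host-count estimate. The following is a minimal sketch, not the presenter's method: it assumes compound yearly VM growth, and the function name and all input numbers (400 VMs, 15% growth) are hypothetical values chosen for illustration.

```python
import math

def hosts_required(current_vms: int, vms_per_host: int,
                   yearly_growth: float, years: int) -> int:
    """Hosts needed for the VM count projected after `years` of compound growth."""
    projected_vms = current_vms * (1 + yearly_growth) ** years
    # Round up: a partially filled host is still a whole host to buy.
    return math.ceil(projected_vms / vms_per_host)

# Hypothetical inputs: 400 VMs today, 20 VMs per host, 15% yearly growth.
for year in range(4):
    print(f"year {year}: {hosts_required(400, 20, 0.15, year)} hosts")
```

The result excludes any spare failover host, which would be added on top for each cluster/pool.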
Current Capacity and Sizing Example
• Cluster/pool(s) configuration:
  • We support a mix of 2, 4, 8, and 16 GB VMs in each of our cluster/pool(s)
  • We average approximately 20 VMs per host

Cluster/Pool   Number of Hosts   Total Storage
Production     20                20 TB
QA             8                 9.5 TB
Dev            11                15 TB
DMZ            6                 4 TB
DR             15                4 TB
Current Capacity and Sizing cont.
• Average Yearly Growth Statistics:
  • VMs account for approximately 85% of our yearly server growth
  • We add approximately 5 to 10 TB of storage per year (spread across all cluster/pool(s))
  • We have not needed to add any additional network bandwidth since the environment was implemented
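As a rough illustration of how these figures can feed a storage forecast, the sketch below sums the per-pool storage from the example table and applies the stated 5 to 10 TB of yearly growth. It assumes growth is linear; the function name and the chosen time horizons are illustrative only.

```python
# Total storage per cluster/pool in TB, taken from the sizing example table.
pool_storage_tb = {"Production": 20, "QA": 9.5, "Dev": 15, "DMZ": 4, "DR": 4}

def projected_storage_tb(current_tb: float, growth_tb_per_year: float,
                         years: int) -> float:
    """Linear projection: current capacity plus a fixed yearly increment."""
    return current_tb + growth_tb_per_year * years

total_tb = sum(pool_storage_tb.values())
for years in (1, 3, 5):
    low = projected_storage_tb(total_tb, 5, years)    # low end of stated growth
    high = projected_storage_tb(total_tb, 10, years)  # high end of stated growth
    print(f"after {years} year(s): {low} to {high} TB")
```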
Virtual Platform & Datacenter Locations
• Selecting the proper virtual platform (hypervisor):
  • Several hypervisors are available, each with benefits and drawbacks, so each organization should choose whichever option best fits its needs
• Datacenter Locations:
  • Determine if the cloud will be hosted from several global datacenters or from one central datacenter
  • If the cloud will be hosted from different locations, then it is also important to follow a set of standards for each of the areas we will be talking about (network, storage, server HW, etc.)
Datacenter Locations Example
• US Private Cloud
  • We currently have a large private cloud environment that is hosted out of our corporate datacenter as well as a smaller private cloud that is hosted in two additional datacenters in the US
• Global Private Cloud
  • We currently have a private cloud environment in three of our regional datacenters
• Global Standards
  • We have standardized on the same server hardware/configuration and networking devices for the global private cloud; however, we were required to create two different storage standards
Network Design
• Define the type of uplinks that will be used:
  • 1 Gb uplink
  • Multiple 1 Gb uplinks configured as a port channel
  • 10 Gb uplink
• Number/type of uplinks for each of the host's functions:
  • Virtual Server Management Interface
  • VM traffic
  • NFS/iSCSI traffic for environments utilizing NAS
• Utilize redundant uplinks from separate switches
• Evaluate the proper size & number of VLANs required
Network Description Example
• Network Components
  • Management Network
    • 2 x switches with 2 x 1 Gb uplinks connected to each switch. Each switch is connected to a different distribution layer switch to ensure network redundancy
  • VM Traffic
    • 2 x blade switches with 4 x 1 Gb uplinks, configured as a 2 Gb port channel connected to each switch
    • We have three dedicated /24 VLANs for new VMs, and we also trunk existing VLANs to the switches to account for servers that were P2V'ed and are unable to change their IP address
  • Storage Traffic (regional datacenters only)
    • 2 x blade switches with 4 x 1 Gb uplinks, configured as a 2 Gb port channel connected to each switch
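To get a feel for how many new VMs three /24 VLANs can address, the sketch below uses Python's standard `ipaddress` module. The 10.20.x.0 ranges are hypothetical, since the presentation does not list the real subnets, and the count ignores addresses reserved for gateways or other infrastructure.

```python
import ipaddress

# Hypothetical address ranges; the presentation does not list the real ones.
vm_vlans = [ipaddress.ip_network(f"10.20.{i}.0/24") for i in range(3)]

def usable_hosts(network: ipaddress.IPv4Network) -> int:
    """Usable host addresses: total minus the network and broadcast addresses."""
    return network.num_addresses - 2

total = sum(usable_hosts(n) for n in vm_vlans)
for n in vm_vlans:
    print(n, usable_hosts(n))
print("total usable addresses:", total)
```

Comparing this total against the projected VM count is one way to evaluate whether the size and number of VLANs is adequate.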
Network Diagram Example – Corporate DC
Note: Storage is connected via an HBA to our fibre channel SAN (not depicted here)
Network Diagram Example – Regional DCs
Note: Storage for the regional servers is connected to our NAS via NFS
SAN/NAS Design
• NAS vs. Fibre arrays:
  • Each technology has benefits and drawbacks, so each organization should choose whichever option best fits its needs
• Define a standard LUN size
• Define a standard naming convention when creating LUN/volume(s)
• For NAS, be sure to configure a dedicated VLAN for VM storage traffic
• For fibre channel SANs, ensure that you have two independent SAN fabrics (A & B) and utilize multipathing
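A standard LUN size and naming convention can be enforced with a small helper. This is only a sketch: the `site-pool-lunNN` pattern, the function name, and the example inputs are hypothetical and not a convention described in the presentation.

```python
import math

def lun_names(site: str, pool: str, required_tb: float,
              lun_size_tb: float) -> list[str]:
    """Return one name per standard-size LUN needed to cover `required_tb`."""
    count = math.ceil(required_tb / lun_size_tb)
    # Hypothetical pattern: <site>-<pool>-lun<two-digit sequence number>
    return [f"{site}-{pool}-lun{n:02d}" for n in range(1, count + 1)]

# Example values only: 20 TB carved into 2 TB LUNs.
names = lun_names("corp", "prod", 20, 2)
print(len(names), names[0], names[-1])
```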
Storage Example – Corporate DC
• Two fully populated blade enclosures connect into our fibre channel SAN via 4 Gb SAN switches
• We standardized on 2 TB LUNs (storage repositories / data stores) in our corporate DC and 1 TB LUNs regionally
Storage Diagram Example – Corporate DC
Server Hardware
• Scale Out vs. Scale Up Methodologies
  • Scale Out - several host servers configured with standard to moderate virtualization specs (2 x CPUs & 48 to 128 GB of RAM) make up a pool/cluster
    • Pros: The servers are less expensive, so you can usually grow the pool faster, and you will sustain less downtime for VMs if a server fails
    • Cons: There are more servers to manage in each pool/cluster
  • Scale Up - only a few host servers are configured with large virtualization specs (4 CPUs or greater & 128 GB of RAM or greater) that can handle a large number of VMs
    • Pros: You can run a large number of VMs on the host server due to the vast resources each server has available
    • Cons: The servers are costly, so you will likely not be able to grow the pool/cluster as fast, and you will potentially have a larger outage for VMs if a host fails
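The outage trade-off between the two methodologies is simple arithmetic. The sketch below assumes VMs are spread evenly across hosts; the VM and host counts are hypothetical.

```python
def vms_affected_by_host_failure(total_vms: int, host_count: int) -> float:
    """Average number of VMs that go down when one host fails (even spread assumed)."""
    return total_vms / host_count

# Hypothetical pool of 200 VMs.
scale_out = vms_affected_by_host_failure(200, 10)  # many smaller hosts
scale_up = vms_affected_by_host_failure(200, 4)    # few larger hosts
print(f"scale out: {scale_out:.0f} VMs, scale up: {scale_up:.0f} VMs per failure")
```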
Server Hardware Cont.
• Minimum specs for virtualization (blade or rack mount):
  • 2 x Quad Core CPUs
  • 48 GB of RAM (96 GB or greater is preferred for large environments)
  • Enough 1 Gb/10 Gb NICs to give you two connections to each uplink so you can bond the NICs for redundancy
  • HBA for servers that will connect to the SAN via fibre
• Ensure you plan for an additional host server to account for failover (HA) for each cluster/pool
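One way to check the "additional host for failover" rule is to confirm that the VMs still fit after one host is lost. The sketch below is deliberately simplified: it compares aggregate RAM only, assumes no memory overcommit, and ignores hypervisor overhead and per-host placement. The function name and the VM mix are hypothetical.

```python
def survives_host_failure(host_ram_gb: int, host_count: int,
                          vm_ram_gb: list[int]) -> bool:
    """True if total VM RAM fits on the hosts that remain after one failure."""
    surviving_capacity = host_ram_gb * (host_count - 1)
    return sum(vm_ram_gb) <= surviving_capacity

# Hypothetical mix: twenty 8 GB VMs and ten 4 GB VMs (200 GB in total).
vms = [8] * 20 + [4] * 10
print(survives_host_failure(96, 4, vms))  # 3 x 96 GB left after a failure
print(survives_host_failure(96, 3, vms))  # 2 x 96 GB left after a failure
```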
Server Diagram Example – Corporate DC
• Server specs:
  • 2 x Six Core CPUs
  • 96 GB of RAM
  • 6 x NICs (2 x embedded & 1 quad port mezzanine card)
  • 1 x dual port HBA mezzanine card
• Interconnect specs:
  • 4 x network switches (1 Gb)
  • 2 x 4 Gb SAN switches
  • 1 x 1 Gb Ethernet pass-thru module (for backups)
Server Diagram Example – Regional DC
• Server specs:
  • 2 x Quad Core CPUs
  • 96 GB of RAM
  • 8 x NICs (2 x embedded, 1 quad port & 1 dual port NIC mezzanine card)
• Interconnect specs:
  • 6 x network switches (1 Gb)
Power Design
• It is important to properly size the power circuits the host servers will use, since they draw more power than standard servers
• Ensure that the environment utilizes two load balanced circuits or two independent circuits for redundancy
• Ensure each circuit is terminated from a different feeder
• Separate the virtual host servers into at least two different racks
Power Diagram Example – Corporate DC
• Each rack contains:
  • 2 x L6-30 208 V (A & B) single-phase circuits
  • Each A & B circuit is load balanced
  • 4 x 30 A, 208 V single-phase PDUs
• The two blade enclosures that house all of the virtual hosts are located in two different racks
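A quick sanity check for a redundant A/B pair is whether one circuit alone could carry the whole rack if the other fails. The sketch below assumes the commonly used 80% derating for continuous loads, which is an assumption and not a figure from the presentation; the function names and rack loads are hypothetical.

```python
def usable_va(volts: float, amps: float, derate: float = 0.8) -> float:
    """Usable apparent power of a single-phase circuit after derating (assumed 80%)."""
    return volts * amps * derate

def rack_load_ok(total_load_va: float, volts: float = 208,
                 amps: float = 30) -> bool:
    """With A/B feeds, one circuit must be able to carry the entire rack load."""
    return total_load_va <= usable_va(volts, amps)

print(round(usable_va(208, 30)))  # usable VA on one 30 A, 208 V circuit
print(rack_load_ok(4500))         # hypothetical rack load in VA
print(rack_load_ok(5500))
```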
Monitoring Solution
• The health of the virtual environment is critical, so it is key to monitor and alert on the following areas:
  • Physical hardware - detect if a DIMM, disk, CPU, etc. goes bad
  • VMs - verify they are online and not over-utilized or over-subscribed
  • Virtual Platform - detect failures within the hypervisor
  • Capacity - verify that each host/cluster/pool is not running out of resources (storage, RAM, CPUs, etc.) that would prevent provisioning new VMs
• It usually requires a mix of native and 3rd party tools to successfully monitor all aspects of a virtual environment
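The capacity check described above can be sketched as a simple threshold test. The `PoolUsage` structure, the 80% threshold, and the sample numbers are hypothetical; a real monitoring tool would pull these values from the hypervisor and storage APIs.

```python
from dataclasses import dataclass

@dataclass
class PoolUsage:
    name: str
    ram_used_gb: float
    ram_total_gb: float
    storage_used_tb: float
    storage_total_tb: float

def capacity_alerts(pool: PoolUsage, threshold: float = 0.8) -> list[str]:
    """Return an alert for each resource at or above the utilization threshold."""
    alerts = []
    if pool.ram_used_gb / pool.ram_total_gb >= threshold:
        alerts.append(f"{pool.name}: RAM at or above {threshold:.0%}")
    if pool.storage_used_tb / pool.storage_total_tb >= threshold:
        alerts.append(f"{pool.name}: storage at or above {threshold:.0%}")
    return alerts

# Hypothetical usage figures for one pool.
qa = PoolUsage("QA", ram_used_gb=700, ram_total_gb=768,
               storage_used_tb=5.0, storage_total_tb=9.5)
print(capacity_alerts(qa))
```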
Management Solution
• Centralized VM and host management is extremely important, and all of the major virtualization vendors provide a centralized management solution
• Auto provisioning of VMs
  • This is a key component of the Cloud and is not always adequately addressed by the centralized management solution provided by the virtualization vendors
  • This also often requires a combination of custom (internally developed) applications and 3rd party products
  • A good provisioning tool will support a chargeback model for VMs and enforce proper approvals to control the growth of VMs
Management Solution cont.
• How to address VM sprawl?
  • Place proper controls/approvals on who can request VMs and how many VMs a user can request
  • Automatically track the number, hostname, and type of VMs a user creates via the self/auto provisioning process
  • Monitor the utilization of all VMs, and then either automatically power down the underutilized VMs or follow up with the VM owner
• We have had our own challenges with trying to implement a fully automated solution that incorporates all of our needs, and this is something that large companies within the IT industry have struggled with as well.
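Two of the sprawl controls above, a per-user quota and a utilization review, can be sketched as follows. The record fields (`hostname`, `owner`, `avg_cpu_pct`), the quota of 5, and the 5% CPU threshold are all hypothetical choices, not values from the presentation.

```python
def can_request_vm(owned_vms: int, quota: int) -> bool:
    """Approve a request only while the user is below the per-user quota."""
    return owned_vms < quota

def underutilized(vms: list[dict], cpu_threshold_pct: float = 5.0) -> list[str]:
    """Hostnames of VMs whose average CPU use is below the threshold."""
    return [vm["hostname"] for vm in vms if vm["avg_cpu_pct"] < cpu_threshold_pct]

# Hypothetical inventory produced by the provisioning process.
inventory = [
    {"hostname": "dev-web01", "owner": "alice", "avg_cpu_pct": 1.2},
    {"hostname": "dev-db01", "owner": "alice", "avg_cpu_pct": 37.0},
    {"hostname": "qa-app03", "owner": "bob", "avg_cpu_pct": 3.9},
]
print(can_request_vm(owned_vms=2, quota=5))
print(underutilized(inventory))
```

The flagged hostnames could then feed either an automatic power-down or a follow-up with the VM owner.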
Documenting the solution
• During the design and implementation phase of the environment, it is important to take detailed notes and diagram each of the phases
• A good design document will provide a clear and concise view of how all aspects of the environment are configured
• When we hand off any environment to our Operations team, we provide a detailed design doc and a runbook, and then hold an official handoff meeting to cover any questions or concerns the Operations team may have.
Questions?