Storage Consolidation: 10,000+ VMs on a Single Storage System
Bob Handlin, Principal Product Manager, ZFS Storage
Anderson Souza, Senior ZFS Software Engineer
Erik Swimm, ZFS Storage SC and Veteran Corporate ZFS Storage User
October 1, 2014
Copyright 2014, Oracle and/or its affiliates. All rights reserved.
Slide 3
Safe Harbor Statement
The following is intended to outline our general product direction.
It is intended for information purposes only, and may not be
incorporated into any contract. It is not a commitment to deliver any
material, code, or functionality, and should not be relied upon in
making purchasing decisions. The development, release, and timing of
any features or functionality described for Oracle's products remains
at the sole discretion of Oracle.
Slide 4
Oracle ZS3
Award-Winning Analytics and Record-Setting Performance and Efficiency
VM
Delivers Unprecedented Scale and Simplicity for Contemporary IT
Fast. Simple. Efficient. Engineered for Optimal Performance.
Slide 5
Program Agenda
1. Why Storage Matters When Deploying VMs
2. Booting Over 16,000 Simultaneous VMs
3. Real-World Experiences with Oracle ZFS
4. Where to Learn More
Slide 6
Program Agenda
1. Why Storage Matters When Deploying VMs
2. Booting Over 16,000 Simultaneous VMs
3. Real-World Experiences with Oracle ZFS
4. Five innovations in Oracle ZFS Storage
Slide 7
What's the Issue? VMware explains*
- Storage ALWAYS comes up in virtualization conversations as a pain point
- Fundamental problem: architecture
- Virtualization has changed the way shared platforms are used
- Virtualization forced the storage industry to look for new ways to build systems
*Hogan and Epping, Essential Virtual SAN, 2015, Pearson Education.
Slide 8
Storage vendors talk about it, too
[Diagram comparing Traditional and Virtualized storage layouts, built from LUN, OS, and AP (application) blocks. Labels: Mixed Workload, Singular Workload, Manual Tuning, Automation, 1000x]
Slide 9
Why Storage Matters in a VM Environment
- Application performance SLAs demand fast storage
  - Under layers of virtualization, all roads might lead to one disk!
  - VMs exacerbate this by adding layers the storage admin can't see
  - Better: Serve data from the fastest possible media automatically
- You can't control cost without consolidation
  - Storage systems need to support the maximum number of apps without buckling
  - Issue: Each VM creates stresses throughout the storage system
- Consolidation adds risk
  - 1000s of VMs per system is a large troubleshooting domain
  - You MUST be able to find the needle in the haystack or all users suffer
- SLAs live and die on storage performance
Slide 10
Evolving to a Virtual Datacenter O/S is Built on Storage (Source: VMware)
[Diagram labels: Datacenter Infrastructure; Linux; Windows; Future; Management vServices; Application vServices; Infrastructure vServices; vCompute; vStorage; vNetwork; Availability; Security; Scalability; Off-Premise Cloud; Cloud vServices]
Slide 11
Storage became increasingly opaque during the virtualization
process...
Slide 12
[Diagram: many virtual servers (O/S, Application) running on hypervisors, connected through VHBAs, HBAs, and NICs to shared storage (CIFS, NFS)]
Properly-Designed Storage Solves the Challenges
- Challenge: Unknown virtualization impact. Solution: Delivers virtualization-aware analytics
- Challenge: Application-unaware storage. Solution: Application engineered
- Challenge: Underpowered storage servers. Solution: Leverages SMP architecture
- Challenge: Unpredictable storage performance. Solution: Serves I/O out of the fastest media
Slide 13
Ideal architecture for virtualization
- Serves VM images straight from DRAM, the fastest media available
- SMP OS eliminates conventional filer CPU congestion
- Automated caching and data placement (DRAM, flash, disk) provide management simplicity
- Deep analytics provide 73% faster problem resolution*
- VMware and OpenStack APIs, and a RESTful API, simplify integration
For VMs: Architecture = Automation = Simplicity
*https://go.oracle.com/LP=4205/?elqCampaignId=6667
Slide 16
Traditional NAS Storage Architecture: Boot Storm Overview
[Diagram: hypervisor host (CPU, MEM, NIC) booting 1,000 VMs against a storage system with CPUs, a single NVRAM read/write cache, and a hybrid storage pool (SAS, SATA, SSD). Callout: Disk Latency Bottleneck!]
Booting 1,000 VMs: 17 minutes
Slide 17
Traditional NAS Storage Architecture
- Only one cache (tier 1) for R/W operations
- First-tier cache based on NVRAM devices
- NVRAM devices are expensive and slower than DRAM
- Architecture is NOT designed to work with multiple tiers of storage media
[Diagram: storage system with CPUs, a single NVRAM read/write cache, and a hybrid storage pool (SAS, SATA, SSD). Callout: Disk Latency Bottleneck!]
Slide 18
Storage Media Latencies
- DRAM: Nanosecond Response Time
- FLASH: Microsecond Response Time
- DISK: Millisecond Response Time
Intelligent Caching: Serve I/O from the Fastest Available Media
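To see why serving I/O from the fastest available media matters, here is a minimal sketch of the expected read latency when reads are split across the three tiers. The latency values are assumed orders of magnitude chosen for illustration (they are not figures from this deck or measurements of any product), and `average_read_latency` is a hypothetical helper name.

```python
# Illustrative latencies only: assumed orders of magnitude, not measurements.
LATENCY_SECONDS = {
    "dram": 100e-9,   # roughly nanosecond class
    "flash": 100e-6,  # roughly microsecond class
    "disk": 10e-3,    # roughly millisecond class
}


def average_read_latency(dram_hit: float, flash_hit: float) -> float:
    """Expected latency of one read, given the share of reads per tier.

    dram_hit and flash_hit are fractions of all reads (0..1). Whatever
    remains is assumed to be served from disk.
    """
    disk_share = 1.0 - dram_hit - flash_hit
    if min(dram_hit, flash_hit) < 0 or disk_share < -1e-12:
        raise ValueError("hit fractions must be non-negative and sum to at most 1")
    disk_share = max(disk_share, 0.0)  # absorb floating-point rounding
    return (
        dram_hit * LATENCY_SECONDS["dram"]
        + flash_hit * LATENCY_SECONDS["flash"]
        + disk_share * LATENCY_SECONDS["disk"]
    )


if __name__ == "__main__":
    for dram, flash in [(0.0, 0.0), (0.5, 0.3), (0.9, 0.09)]:
        latency = average_read_latency(dram, flash)
        print(f"DRAM {dram:.0%}, flash {flash:.0%}: {latency * 1e6:.1f} microseconds")
```

Under these assumed numbers, the small share of reads that still reaches disk dominates the average, which is the argument for keeping the hot working set in DRAM and flash.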
Slide 19
Oracle ZFS Storage Appliance Dynamic Caching
[Diagram labels: 2TB DRAM (LRU, MRU, LFU, MFU); Evicted; Read Flash: MLC NAND L2ARC; Write Flash: SLC NAND, Sync I/O Only; Adaptive I/O Staging; Adaptive Throttle Algorithm (Determine PIPE Size); SAS-2 disk (15K, 10K, 7K)]
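The labels on this slide suggest a read path in which blocks evicted from DRAM are staged into read flash before falling back to disk. The toy model below sketches that idea with two plain LRU tiers. It is a deliberate simplification and not the appliance's actual caching algorithm. The class name `TwoTierReadCache` and its parameters are hypothetical.

```python
from collections import OrderedDict


class TwoTierReadCache:
    """Toy model: a small LRU tier ("dram") that spills evicted blocks
    into a larger LRU tier ("flash"). Simplified on purpose."""

    def __init__(self, dram_blocks: int, flash_blocks: int, backing_store: dict):
        self.dram = OrderedDict()
        self.flash = OrderedDict()
        self.dram_blocks = dram_blocks
        self.flash_blocks = flash_blocks
        self.disk = backing_store
        self.hits = {"dram": 0, "flash": 0, "disk": 0}

    def _insert_dram(self, key, value):
        self.dram[key] = value
        self.dram.move_to_end(key)
        if len(self.dram) > self.dram_blocks:
            old_key, old_value = self.dram.popitem(last=False)
            # Blocks evicted from the DRAM tier are staged into the flash tier.
            self.flash[old_key] = old_value
            self.flash.move_to_end(old_key)
            if len(self.flash) > self.flash_blocks:
                self.flash.popitem(last=False)

    def read(self, key):
        if key in self.dram:
            self.hits["dram"] += 1
            self.dram.move_to_end(key)
            return self.dram[key]
        if key in self.flash:
            self.hits["flash"] += 1
            value = self.flash.pop(key)  # promote back to the DRAM tier
        else:
            self.hits["disk"] += 1
            value = self.disk[key]
        self._insert_dram(key, value)
        return value


if __name__ == "__main__":
    disk = {n: f"block-{n}" for n in range(10)}
    cache = TwoTierReadCache(dram_blocks=2, flash_blocks=4, backing_store=disk)
    for block in (0, 1, 2, 0, 2):
        cache.read(block)
    print(cache.hits)
```

In a boot storm many VMs read the same blocks, so after the first few reads most requests would land in the upper tiers of a model like this.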
Slide 20
Oracle ZFS Storage Appliance Architecture
[Diagram: hypervisor host (CPU, MEM, NIC) booting 1,000 VMs against a two-head appliance (Head1, Head2). Labels: Read Cache 1: ARC; Read Cache 2: L2ARC; Write Cache: LogZilla; CPU; Disk Latency]
Booting 1,000 VMs: 38.9 seconds
Slide 21
Oracle ZFS Storage Appliance Architecture
[Diagram: hypervisor host (CPU, MEM, NIC) booting 16,000 VMs against a two-head appliance (Head1, Head2). Labels: Read Cache 1: ARC; Read Cache 2: L2ARC; Write Cache: LogZilla; CPU; Disk Latency]
Booting 16,000 VMs: 7 minutes
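The boot times quoted on these slides can be turned into rough rates with a few lines of arithmetic. The sketch below uses only the three figures from the deck. The slides do not describe the test configurations, so the comparison is indicative rather than a controlled benchmark, and `boot_rate` is a hypothetical helper name.

```python
def boot_rate(vm_count: int, seconds: float) -> float:
    """Average number of VMs booted per second over the whole run."""
    return vm_count / seconds


# Figures quoted on the slides.
TRADITIONAL_1K_SECONDS = 17 * 60  # 1,000 VMs in 17 minutes
ZFS_1K_SECONDS = 38.9             # 1,000 VMs in 38.9 seconds
ZFS_16K_SECONDS = 7 * 60          # 16,000 VMs in 7 minutes

if __name__ == "__main__":
    speedup = TRADITIONAL_1K_SECONDS / ZFS_1K_SECONDS
    print(f"1,000-VM boot, traditional vs. ZFS: {speedup:.1f}x faster")
    print(f"Traditional, 1,000 VMs: {boot_rate(1000, TRADITIONAL_1K_SECONDS):.1f} VMs/s")
    print(f"ZFS, 1,000 VMs: {boot_rate(1000, ZFS_1K_SECONDS):.1f} VMs/s")
    print(f"ZFS, 16,000 VMs: {boot_rate(16000, ZFS_16K_SECONDS):.1f} VMs/s")
```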
Slide 22
Oracle ZFS Storage Appliance for Virtualization
Oracle ZFS Storage Appliances:
- Provide excellent performance, throughput, and bandwidth to meet the requirements of critical applications running in virtualized environments.
- Provide a realistically scaled amount of L1 cache based on DRAM, complemented by a large L2 cache extension based on SSDs.
- Use an intelligent cache architecture for fast read and write operations (especially helpful for random I/O workloads originated by virtual machines and hypervisors).
- Are designed to handle heavy I/O load in virtualization environments, providing better performance and throughput than traditional NAS storage architecture.
Slide 23
USE CASE 1: Booting Thousands of VMs with Oracle ZFS Storage
Appliance
Slide 26
Real-World Experiences with Oracle ZFS
Slide 27
Oracle ZFS Storage Grew Like a Weed In Our Data Center
- Major telecommunications firm
- Deployed Oracle Database Machine (ODA), and Oracle ZFS Storage for expansion
- Noticed how well suited ZFS is for virtualization workloads
- Started deploying Oracle ZFS Storage for virtualization as well
Slide 28
Oracle ZFS Storage had the key features we needed
- In-Memory Deduplication
  - VMs were built from the same base images
  - Dedup ratio was extremely high
- Large read cache
  - Combined with dedup, so the bulk of reads came from SSD
  - Lifesaver during boot storms
- Replication
  - Added the ability to replicate VMs
  - DR or migration between sites
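Why do VMs built from the same base image deduplicate so well? A minimal sketch helps: if blocks are identified by a content hash, every clone contributes its shared base blocks only once. The code below is a toy model with made-up image contents and an assumed 4096-byte block size. It does not reproduce the appliance's deduplication implementation, and `dedup_ratio` is a hypothetical helper name.

```python
import hashlib


def dedup_ratio(images: list[bytes], block_size: int = 4096) -> float:
    """Logical blocks divided by unique blocks across all images."""
    unique = set()
    total = 0
    for image in images:
        for offset in range(0, len(image), block_size):
            block = image[offset:offset + block_size]
            unique.add(hashlib.sha256(block).digest())
            total += 1
    return total / len(unique) if unique else 1.0


if __name__ == "__main__":
    # Hypothetical base image: eight distinct 4096-byte blocks.
    base_image = b"".join(bytes([i]) * 4096 for i in range(8))
    # Ten VMs cloned from the base, each with one block of its own data.
    vm_images = [base_image + bytes([100 + n]) * 4096 for n in range(10)]
    print(f"dedup ratio: {dedup_ratio(vm_images):.1f}")
```

The more the clones share with the base image, the higher the ratio climbs, which matches the experience described on this slide.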
Slide 29
Oracle ZFS Storage had the key features we needed
- Snapshots
  - Used in addition to normal backups
  - Allowed for quick recovery of VMs
- Integration with Oracle VM
  - Oracle ZFS Storage plugin
  - Enabled additional features (thin clones and snapshots) in Oracle VM
Slide 30
Production Environment
- Oracle ZFS 7420 cluster, each node with 32 cores, 512GB RAM, 4 500GB SSDs for read cache, and 4 dual-port 10GbE
- Six high-capacity storage shelves with 20 3TB HDDs and 4 write SSDs
- Two storage pools
- Twelve X2-4 servers (40 cores, 512GB) running OVM 3.1
- Twelve T4-4 servers (32 cores, 512GB) running Solaris 11 with OVM SPARC
[Diagram labels: X2-4 Servers; Oracle ZFS Storage; Capacity Disk Shelf; Oracle VMs; Oracle Enterprise Manager]
Slide 31
The Result: A highly robust implementation
- 400 high-availability VMs configured on a single server
- Pulled the power on the primary server and timed the failover to the other node
- Failover took under 5 minutes, and the limiting factor in booting 400 VMs was not the storage
Slide 34
Oracle ZS3 Storage addresses key pain points
Your storage admin should look to add it to the mix
- Automated caching keeps the right things in the fastest media (DRAM) to support performance SLAs
- True Symmetric Multiprocessing (SMP) manages VM boot storms, bottlenecks, hot spots, and I/O saturation to eliminate filer sprawl
- Advanced storage analytics allow you to find and eliminate issues 73% faster for maximum uptime*
*Source: Strategic Focus. https://go.oracle.com/LP=4206/?elqCampaignId=6667
Slide 35
Where to Learn More
- Demo Pods 3913 and 3914, and several live HOL pods
- Solution Brief: Oracle ZFS Storage for Virtualization and Cloud
  http://www.oracle.com/us/products/servers-storage/zfs-storage-cloud-virtualization-2225371.pdf?ssSourceSiteId=otnen
- White Paper: Deploying 10,000+ VMware Virtual Machines with Oracle ZFS Storage Appliance
  http://www.oracle.com/technetwork/server-storage/sun-unified-storage/documentation/deploy-10k-vm-zfssa-2245248.pdf
- White Paper: Architectural Overview of the Oracle ZFS Storage Appliance
  http://www.oracle.com/technetwork/server-storage/sun-unified-storage/documentation/o14-001-architecture-overview-zfsa-2099942.pdf
- Strategic Focus: Usability Comparison: Oracle ZFS Storage vs. NetApp FAS Storage
  https://go.oracle.com/LP=4205/?elqCampaignId=6667