SSD within Governmental & Enterprise IT Infrastructure
1 - 48
• Company overview
• Architecture & Performance
• Reliability
• SLC, eMLC, MLC – how to decide
• Maximizing SSD
• RamSan Flash SSD Product Overview
• Sales Success Stories
Markus Steinbrugger TMS EMEA Sales & Representative
SSD WITHIN GOVERNMENTAL & ENTERPRISE IT INFRASTRUCTURE
2 - 48
EXECUTIVE OVERVIEW
• Solid State Storage Leader: industry’s highest performance, highest reliability, lowest latency, lowest power SSD solutions
• Global Enterprise Customers: growing enterprise customer base across many verticals in over 34 countries
• Strong Financial Performance: no venture capital or long-term debt
• World Class Team: strong management and engineering team; over 400 man-years of SSD experience
• Deep Domain Expertise: 33 years of experience designing SSDs; 30+ patents granted and pending; many trade secrets
3 - 48
• Sent a text message
• Placed online bet
• Booked a cruise or flight
• Used an ATM
• Conducted a financial trade
• Shopped online
• Used pre-paid wireless
• Gamed online
…RamSan is Everywhere
Select RamSan Facts…
• The largest SSD installations in production in the world
• Currently operating in 10 major financial exchanges worldwide
• Used today by 7 out of 11 of the world’s largest telecoms
• Installed and in production in over 34 countries
5 - 48
Do more, do it faster and support more concurrent users with RamSan!
FINANCIAL
• Trading Systems
• Messaging Systems
• Periodic Reporting
• Batch Processing
• Data Acquisition
GOVERNMENT
• Oracle Databases
• Metadata
• Data Acquisition
• Data Warehousing
• Airborne Data Centers
E-COMMERCE
• Web Databases
• Shared Content
• Online Gaming
• Online Communities
HPC
• Scientific Computing
• Seismic Processing
• Rendering
• Video on Demand
• Data Acquisition
TELCO
• OLTP DB
• Batch Processing
• Data Warehousing
• CRM
Database and Application Acceleration (Oracle, SQL Server, SAP, …)
Key Verticals & Applications
30 - 250 - 500 Watts
6 - 48
Company History at-a-glance
TMS architects and creates The World’s Fastest Storage® in varying memory mediums and form factors ranging from RAM rack mount systems to Flash PCIe direct attach storage options.
Each product is developed with enterprise capability and needs in mind and leverages the very best of each previous design.
• Founded Texas Memory Systems
• Mass Memory Systems for Seismic Industry
• SSD for Super Computing
• SAM 400
• RamSan-210/220: 32-GB RAM, 4 FC (2-Gb)
• SAM 500: 64-GB RAM, 15 FC (1-Gb)
• RamSan-400: 128-GB RAM, 8 FC (4-Gb), 4 IB (4x)
• RamSan-440: 512-GB RAM, 8 FC (4-Gb)
• RamSan-620: 5-TB SLC Flash, 8 FC (4-Gb)
• RamSan-640: 8-TB SLC Flash, 10 FC (8-Gb), 10 IB (QDR)
• RamSan-320: 64-GB RAM, 8 FC (2-Gb)
• RamSan-500: 2-TB SLC Flash, 64-GB RAM, 8 FC (4-Gb)
• RamSan-20: 450-GB SLC Flash, PCIe x4
• RamSan-630: 10-TB SLC Flash, 10 FC (8-Gb), 10 IB (QDR)
• RamSan-70: 900-GB SLC Flash, PCIe x8 Gen 2
• RamSan-710: 5-TB SLC Flash, 4 FC (8-Gb), 4 IB (QDR)
• RamSan-810: 10-TB eMLC Flash, 4 FC (8-Gb), 4 IB (QDR)
• RamSan-720: 12-TB SLC Flash, 4 FC (8-Gb), 4 IB (QDR)
• RamSan-820: 24-TB eMLC Flash, 4 FC (8-Gb), 4 IB (QDR)
• RamSan-80: 450-GB eMLC Flash, PCIe x8 Gen 2
7 - 48
RamSan Flash SSD Advantages
• Latency: <100µs sustained for random READ & WRITE, even when operating under RAID 5 (RS-720)
• Capacity: 24TB usable in a 1U chassis, and scalable (RS-820)
• Bandwidth: 10GB/s available over FC or IB (RS-630)
• Power: 70%+ less electricity; runs cool (low HVAC load)
• Flexibility: no ‘lock-in’; boosts existing infrastructure
8 - 48
Keys to Performance: Architecture, Latency, Parallelism
• Hardware-only Data Path
– FPGA & Hardware Logic
– Faster than software-shared memory
• Software cannot add performance
– Virtualization is software overhead paid in order to use additional hardware
– QoS is software overhead paid to give one application priority over another on shared hardware
9 - 48
Series-7 Flash Controller Design (processes all of the “IN DATA” activities)
[Diagram: I/O Interface, Write Buffer, Flash Controller FPGA, Lookup Tables, FLASH Media, CPU, CPU RAM]
• Each controller handles 10 flash chips
• The Lookup Tables and Write Buffer are RAM accessible from the controller only
• The I/O Interface and the controller are separate FPGAs
• The CPU is an embedded processor that handles all out-of-band operations
• Best performance: 4K-aligned I/O
• DMAs* are all processed completely in FPGA hardware
*DMA = Direct Memory Access
10 - 48
DMAs are hardware only
[Diagram: I/O Interface, Write Buffer, Flash Controller FPGA, Lookup Tables, FLASH Media, CPU, CPU RAM]
• DMAs* are all processed completely in FPGA hardware
*DMA = Direct Memory Access
11 - 48
Decreasing Latency
[Diagram: I/O Interface, Write Buffer, Flash Controller FPGA, Lookup Tables, FLASH Media, CPU, CPU RAM]
The Embedded CPU
• Removed from the DMA* path; does all non-critical flash memory administration:
– Write setup
– Garbage collection
– Error handling
– System health calculations
– Wear leveling
– Statistics collection
– Formatting
– Backup/Restore
– Key generation
*DMA = Direct Memory Access
12 - 48
Increasing Parallelism
[Diagram: I/O Interface, Write Buffer, Flash Controller FPGA, Lookup Tables, FLASH Media, CPU, CPU RAM]
• Increases the number of flash chips that can run concurrently
• Done by increasing the number of flash chip controllers
• Each TMS flash chip controller can do 36 4KB DMAs* in parallel
• (40 if you include the background chip RAID, or VSR, operations)
• A RamSan-70 has 8 controllers, so it can do 288 4KB operations simultaneously
• A RamSan-810 has 40 controllers, so it can do 1440 4KB operations simultaneously
*DMA = Direct Memory Access
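The parallelism figures above are a straight multiplication of controller count by DMAs per controller. A minimal sketch of that arithmetic follows; the function and constant names are illustrative only and are not part of any TMS tooling.

```python
# Figures taken from the slide: each Series-7 flash controller handles
# 36 concurrent 4KB DMAs (40 when background VSR operations are counted).
DMAS_PER_CONTROLLER = 36
DMAS_PER_CONTROLLER_WITH_VSR = 40


def concurrent_4k_ops(controllers: int, per_controller: int = DMAS_PER_CONTROLLER) -> int:
    """Return how many 4KB operations can be in flight at the same time."""
    return controllers * per_controller


# Controller counts as stated on the slide.
systems = {"RamSan-70": 8, "RamSan-810": 40}

for name, controllers in systems.items():
    print(f"{name}: {concurrent_4k_ops(controllers)} concurrent 4KB operations")
```

Passing `DMAS_PER_CONTROLLER_WITH_VSR` as the second argument gives the count that includes the background chip RAID operations.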
13 - 48
Top 10 SPC-1 IOPS™
[Chart: Average Response Time (All ASUs) in ms, 0.0 to 5.0, plotted against SPC-1 IOPS™, 25,000 to 225,000, for the systems listed below]
IBM SVC 6.2 (8-node)
HP 3PAR V800
TMS RamSan-630
IBM SVC 5.1 (6-node)
IBM SVC 5.1 (4-node)
IBM Power 595
Huawei Symantec S8100 (8-node)
TMS RamSan-400
IBM SVC 4.3
IBM SVC 4.2
14 - 48
• If a chip fails, the Flash controller uses the parity bit to rebuild lost data.
– The entire RAID stripe must be relocated. All dies touched by the stripe can no longer be used.
– If the stripe runs across ten dies, a failure of one die means that nine good dies go to waste.
– If you map out the full chip, you throw out the remaining good dies in the chip.
The Problem with RAID-5
Variable Stripe RAID™ (VSR™)
15 - 48
• Patented VSR™ allows RAID stripe sizes to vary.
• If one die fails in a ten-chip stripe, only the failed die is bypassed, and then data is restriped across the remaining nine chips.
[Diagram: 10 Chips × 16 Planes, with one plane marked FAIL]
How is VSR™ Better?
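The slides do not describe how VSR™ is implemented, so the following is only a generic XOR-parity sketch of the idea stated above: rebuild the block lost on a failed die from parity, then re-lay the data across one fewer die instead of retiring the whole stripe. All function names are hypothetical, and the 9 data + 1 parity layout is an assumption made for illustration.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def write_stripe(data_blocks):
    """Return a stripe: the data blocks followed by one XOR parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, failed_die):
    """Recover the block on the failed die by XOR-ing every surviving block."""
    survivors = [block for i, block in enumerate(stripe) if i != failed_die]
    return xor_blocks(survivors)


def restripe(stripe, failed_die):
    """Assumed VSR-style step: keep all surviving dies in use with a narrower stripe."""
    data = list(stripe[:-1])
    if failed_die < len(data):          # a data block was lost, not the parity block
        data[failed_die] = rebuild(stripe, failed_die)
    new_width = len(stripe) - 2         # surviving dies minus one reserved for parity
    return [write_stripe(data[i:i + new_width]) for i in range(0, len(data), new_width)]


# Ten dies: nine 4-byte data blocks plus parity (toy sizes for illustration).
original = [bytes([n] * 4) for n in range(1, 10)]
stripe = write_stripe(original)
narrower = restripe(stripe, failed_die=3)
print([len(s) for s in narrower])  # stripe widths after the failure
```

The point of the sketch is only that nothing beyond the failed die has to be retired: the same data ends up protected by parity on stripes that are at most nine blocks wide.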
16 - 48
Flash Quality
• Flash type matters!
– SLC in most RamSans
– Enterprise MLC (eMLC) in RamSan-8x0
• SLC is best but most expensive/least dense
• eMLC chips last 10x longer vs. normal MLC
• TMS technologies like Variable Stripe RAID™ lengthen system life
[Chart: Typical Chip Endurance, P/E Cycles (Thousands) from 0 to 100, by Flash Type (MLC, eMLC, SLC)]
Endurance of system is calculated:
17 - 48
Let the numbers do the talking...
RamSan capacity: 1 TB = 1,024 GB user = 1,048,576 MB. Write rate: 600 MB/sec. Lifetime: 5 years.
Flash type | Write cycles | Max. write load | Max. write load per day over the 5-year lifetime | Days until degradation | Years until degradation
SLC | 100,000 | 100,000 TB = 102,400,000 GB = 104,857,600,000 MB | 55 TB = 56,110 GB = 57,456,219 MB | 2,023 | 5.5
eMLC | 30,000 | 30,000 TB = 30,720,000 GB = 31,457,280,000 MB | 16 TB = 16,833 GB = 17,236,866 MB | 607 | 1.7
MLC | 5,000 | 5,000 TB = 5,120,000 GB = 5,242,880,000 MB | 3 TB = 2,805 GB = 2,872,811 MB | 101 | 0.3
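The endurance figures on this slide can be reproduced with a few lines of arithmetic. The sketch below assumes, as the numbers suggest, that the "days until degradation" column corresponds to writing continuously at 600 MB/sec, that 1 TB is 1,024 × 1,024 MB, and that a year is 365 days; the function name is illustrative.

```python
SECONDS_PER_DAY = 86_400
MB_PER_TB = 1024 * 1024


def endurance(write_cycles, capacity_tb=1, write_rate_mb_s=600, lifetime_years=5):
    """Return (total TB writable, TB/day budget over the lifetime, days, years).

    Days and years assume continuous writing at write_rate_mb_s (an assumption).
    """
    total_tb = write_cycles * capacity_tb
    budget_tb_per_day = total_tb / (lifetime_years * 365)
    written_tb_per_day = write_rate_mb_s * SECONDS_PER_DAY / MB_PER_TB
    days = total_tb / written_tb_per_day
    return total_tb, budget_tb_per_day, days, days / 365


for flash, cycles in {"SLC": 100_000, "eMLC": 30_000, "MLC": 5_000}.items():
    total, budget, days, years = endurance(cycles)
    print(f"{flash}: {total:,} TB total, {budget:.0f} TB/day budget, "
          f"{days:.0f} days ({years:.1f} years) at 600 MB/sec")
```

Changing `capacity_tb` shows why larger capacity is an effective lever: the total write budget scales linearly with it.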
18 - 48
• Fight endurance with increased capacity
• eMLC has 2x Capacity for same cost
– 2/3rd endurance of SLC
• MLC is 3000 Writes where eMLC is 30000 Writes
• MLC is ~1/4th price of eMLC storage
– Sustained writes do not make sense for MLC
– MLC will last less than a year from sustained writes at same cost and half the write workload
Combat Endurance
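RamSan-710/5TB (SLC Flash):
$$\text{Endurance} = \frac{5\,\text{TB} \times 100{,}000}{1\,\text{GBps}} = 15.8\ \text{Years}$$
RamSan-810/10TB (eMLC Flash):
$$\text{Endurance} = \frac{10\,\text{TB} \times 30{,}000}{1\,\text{GBps}} = 9.5\ \text{Years}$$
$$\frac{1\,\text{TB} \times 3{,}000}{500\,\text{MBps}} = \text{Less than a year}$$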
19 - 48
• CPU utilization is the relationship between processing and waiting
• 20% CPU utilization = busy 20% of the time, waiting 80% of the time
• Speeding up the CPU (upgrading the server) only attacks 20% of the performance problem
• TMS attacks 80% of the performance problem
RamSan Benefit
20 - 48
Why is it waiting?
Waiting on user requests or waiting on storage I/O?
- Batch Processes/Autonomous systems
have no user requests – only storage I/O waits
- Large customer-facing applications
have constant feeds of user requests – only storage I/O waits
RamSan Benefit
21 - 48
Changing the time waiting (wasted)
[Timeline diagram, 1 request: CPU issues request (processing), CPU is waiting on request (waiting), CPU handles request (processing)]
RamSan Benefit
22 - 48
I/O Serviced by Disk
[Timeline diagram, 1 request: ~100 µs processing, ~5 ms waiting, ~100 µs processing]
• 200 μs + 5,000 μs = 5,200 μs
• 200 / 5,200 = ~4% CPU utilization
RamSan Benefit
23 - 48
I/O Serviced by RamSan
[Timeline diagram, 1 request: ~100 µs processing, ~200 µs waiting, ~100 µs processing]
• 200 μs + 200 μs = 400 μs
• 200 / 400 = ~50% CPU utilization
12X application benefit by only changing storage latency
RamSan Benefit
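The disk and RamSan comparison can be checked with a short calculation. This sketch uses the per-request timings from the two slides; the function name is illustrative. Note that the unrounded ratio comes out at 13x, and the slide's 12X appears to follow from rounding the disk case to ~4%.

```python
def cpu_utilization(cpu_us: float, wait_us: float) -> float:
    """Fraction of a request's elapsed time that the CPU spends processing."""
    return cpu_us / (cpu_us + wait_us)


CPU_US = 200          # ~100 µs to issue plus ~100 µs to handle the request
DISK_WAIT_US = 5_000  # ~5 ms storage wait on disk
RAMSAN_WAIT_US = 200  # ~200 µs storage wait on RamSan

disk = cpu_utilization(CPU_US, DISK_WAIT_US)
ramsan = cpu_utilization(CPU_US, RAMSAN_WAIT_US)
speedup = (CPU_US + DISK_WAIT_US) / (CPU_US + RAMSAN_WAIT_US)

print(f"Disk:   {disk:.1%} CPU utilization")
print(f"RamSan: {ramsan:.1%} CPU utilization")
print(f"Requests complete {speedup:.0f}x faster")
```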
24 - 48
• If a query takes 20 minutes at 25% CPU, then this implies that 15 minutes is wasted time.
• By making storage 30x faster, the 15 minutes of wait is reduced to 30 seconds. (15min * 60sec / 30x)
• With RamSan, that 20 minute query happens in 6 minutes and CPU utilization goes to ~85%.
• Only time for each I/O changes.
RamSan Benefit
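The query example follows the same pattern: only the waiting portion shrinks, while the CPU work stays constant. A minimal sketch, assuming all non-CPU time is storage wait (the function name is illustrative). Under this simple model the unrounded results are about 5.5 minutes and roughly 91% utilization, which the slide rounds to 6 minutes and ~85%.

```python
def accelerate(total_min: float, cpu_fraction: float, storage_speedup: float):
    """Return (new elapsed minutes, new CPU utilization) after speeding up storage.

    Assumes every minute not spent on the CPU is storage wait.
    """
    busy = total_min * cpu_fraction
    wait = total_min - busy
    new_total = busy + wait / storage_speedup
    return new_total, busy / new_total


new_total, new_utilization = accelerate(total_min=20, cpu_fraction=0.25, storage_speedup=30)
print(f"Elapsed: {new_total:.1f} minutes, CPU utilization: {new_utilization:.0%}")
```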
25 - 48
• RamSan: Block Storage – No application changes
– Same I/O as before
– Just quicker
• Only changes time – Less wait for same work
– Do same work with less (infrastructure, software, tuning)
– As latency approaches 0, CPUs approach 100%
$$\text{CPU Efficiency} = \frac{\text{Time in CPU}}{\text{CPU} + \text{Wait}}$$
RamSan delivers Time and Scale
26 - 48
RamSan - Key Product Line
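Model | RamSan-70 | RamSan-710/810 | RamSan-720/820 | RamSan-630
Flash | SLC Flash | SLC/eMLC Flash | SLC/eMLC Flash | SLC Flash
Capacity | 900GB | 5/10TB | 12/24TB | 10TB
IOPS | 650K IOPS | 400K/320K IOPS | 500K/450K IOPS | 1M IOPS
Bandwidth | 2.5GB/s | 5/4GB/s | 5/4GB/s | 10GB/s
Form factor | Full-height, half-length PCIe x8 2.0 | 1U rackmount, 4x IB or FC ports | 1U rackmount, 4x IB or FC ports | 3U rackmount, 10x IB or FC ports
Use | Single Server Apps; Distributed filesystems | Clustered Server Apps; Shared-storage filesystems (GPFS, GFS2, SAM-FS, etc.) | Clustered Server Apps; Shared-storage filesystems (GPFS, GFS2, SAM-FS, etc.) | Clustered Server Apps; Shared-storage filesystems (GPFS, GFS2, SAM-FS, etc.)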
27 - 48
RamSan-630 Overview
1-10TB capacity
1 Million IOPS
10 GB/s throughput
Latency 80-250 µs
Leverages proven flash core from the RamSan-70 and RamSan-710 & RamSan-810
Easily shared and multi-pathed through ten 8 Gbit Fibre Channel ports or QDR InfiniBand ports
Enterprise Reliability
Single-Level Cell (SLC) Flash
Fault Tolerant Flash (FTF) Architecture
Active Spare Flash
28 - 48
RamSan-630 Architecture
• Management Control Processor
• Redundant Power Supplies
• 5 Dual-ported FC or IB Interfaces
• Redundant Fans
• 1-10TB of SLC Flash Boards
• 3U Chassis
29 - 48
RamSan-710 Overview
Highest variable density SLC Flash SSD system available in a 1U
Series-7™ Flash Controller
Four 8 Gbit Fibre Channel ports or QDR InfiniBand ports
Enterprise reliability
Single-Level Cell (SLC) Flash
Variable Stripe RAID (VSR)™
Active Spare
1-5 TB Usable capacity (6.8 TB Raw)
400,000 IOPS
5 GB/s throughput
35-175 µs latency
150K+ Write/Erase Cycles per Cell
30 - 48
RamSan-810 Overview
Highest variable density eMLC Flash SSD system available in a 1U
Series-7™ Flash Controller
Four 8 Gbit Fibre Channel ports or QDR InfiniBand ports
Enterprise reliability
enterprise Multi-Level-Cell (eMLC) Flash
Variable Stripe RAID (VSR)™
Active Spare
2-10 TB Usable capacity (13.7 TB Raw)
320,000 IOPS
5 GB/s throughput
70-225 µs latency (est.)
30K+ Write/Erase Cycles per Cell
31 - 48
RamSan-710/810 Architecture
• Management control processor
• Redundant power supplies
• N+1 batteries
• 4-20 Flash modules + 1 “Active Spare”
• 1U chassis
• Redundant fans
• 2 dual-ported 8Gb FC or QDR IB interfaces
32 - 48
Motherboard
33 - 48
Toshiba Flash
Series-7 Flash Controller FPGAs
Gateway FPGA
DDR DRAM
PowerPC CPU @ 400 MHz
34 - 48
• Data Warehousing
• Web Content Hosting
• Low Bandwidth Log Files
• READ Intensive, Low WRITE Application
Applications Suited for eMLC
35 - 48
RamSan-720 Overview
Highest density SLC Flash SSD system available in a 1U
Series-7™ Flash Controller
Four 8 Gbit Fibre Channel ports or QDR InfiniBand ports
High Enterprise reliability
Single-Level-Cell (SLC) Flash
Variable Stripe RAID (VSR)™
2D Flash RAID™
6 or 12 TB Usable capacity (~ 7.8 or ~15.6 TB Raw)
500,000 IOPS (4K)
5 GB/s throughput
<100µs latency
No Single Point of Failure (SPOF)
Hot Swappable Flash Cards
36 - 48
RamSan-820 Overview
Highest density eMLC Flash SSD system available in a 1U
Series-7™ Flash Controller
Four 8 Gbit Fibre Channel ports or QDR InfiniBand ports
High Enterprise reliability
enterprise Multi-Level-Cell (eMLC) Flash
Variable Stripe RAID (VSR)™
2D Flash RAID™
12 or 24TB Usable capacity (~ 15.6 or ~31.2 TB Raw)
450,000 IOPS (4K)
4 GB/s throughput
<110µs latency
No Single Point of Failure (SPOF)
Hot Swappable Flash Cards
37 - 48
RamSan-x20 Architecture
38 - 48
Switch/RAID Controller
Interface Management
Module Management
Module Interface
Switch/RAID Controller
Power Module
Power Module
2D Flash RAID™ (RamSan-720 & RamSan-820)
RAID 5 across Flash Modules (10 data + 1 parity + 1 hot spare)
RAID 5 within Flash Modules (9 data + 1 parity)
TMS 2D Flash RAID™
39 - 48
4 Layers of Data Correction
RamSan-720/820 introduce System-Level RAID 5 across Flash modules, plus the other mechanisms found on all RamSan Flash storage systems.
RamSan-720/820 only
40 - 48
RamSan-x20 Facts
1. High Reliability and Performance with Lowest Power Drain.
2. Lowest Latency Logic with real Hardware.
3. Over Dependable 2D Flash RAID.
4. Powerful Performance in Smallest 1U Form Factor
5. Last Longer with patented Variable Stripe RAID
6. Lowest Power, Highest Performance
7. High Availability in Smallest 1U size
8. Maximum Bandwidth with Quad QDR InfiniBand Interfaces
9. Support for simultaneous 128 random Read Requests every 100µs
10. 500K IOPS and 400 Watts (1,250 IOPS per Watt)
41 - 48
RamSan-70 Overview
Series-7™ Flash Controller
PCIe 2.0 x8
High Enterprise reliability
Single-Level-Cell (SLC) Flash
Variable Stripe RAID (VSR)™
900GB Usable capacity (~ 1374 GB Raw)
650,000 IOPS (4K)
2.5 GB/s throughput
30µs Write Latency (4K)
100µs Read Latency (4K)
10 Years Life Expectancy
Low Host Overhead
Ultra Capacitors
42 - 48
RamSan-70 Architecture
[Diagram: RamSan-70 card with numbered callouts 1 to 7, keyed below]
• 450-900GB
• 650,000 IOPS (4K)
• 2.5 GB/s Bandwidth
• 30µs sustained 4K Write Latency
• 100µs 4K Read Latency
• 10 Years Life Expectancy
1. PCIe 2.0 x8
2. 900GB usable SLC Flash (~1,374 GB raw)
3. PowerPC CPU, 333 MHz
4. Xilinx FPGAs
5. 4GB DRAM
6. Ultra-Capacitors
7. Half-length card
43 - 48
CSCS Benchmark
• CSCS = Swiss National Supercomputing Centre
• Independent evaluation of PCIe SSDs
• RamSan-70 results:
– “…by far the best IOPS result we have ever measured…” (300K+ random 4K IOPS)
– “Unlike the FusionIO and Virident TachIOn devices, the bandwidth is almost independent of block size…”
Full paper available at http://www.cscs.ch/fileadmin/user_upload/customers/cscs/Documents/Performance_Analysis_of_TMS_SSD_sep_2011.pdf.
44 - 48
Parallel Filesystem Metadata
• IBM General Parallel File System (GPFS)
• Mirrored eMLC RamSan-810s for metadata
• Replaced 200x 15K RPM hard drives
Metric | Before | After | Improvement
Backup Time (hours) | 6 | 1 | 6X
Power (watts) | 5000 | 500 | 10X
45 - 48
Faster Than 2.5” SSDs
• US government entity
• 100 TB RamSan systems for Oracle database
• Writes mirrored to Hitachi USP-V
• RamSan selected by system integrator due to much lower latency than 2.5” hard drive form factor SSDs behind the USP-V
46 - 48
Major U.S. Financial Exchange
• Over 100 RamSan systems in production
• 3 core revenue-generating apps in production on RamSan
• Moving key Oracle tablespaces from the existing storage array to RamSan doubled application performance, even though the tablespaces in the storage array were already served from cache
47 - 48
See the Case Study section of our Web site for details of the RamSan customers you can reference:
TMS Reference List
48 - 48
Take Home Message
• TMS is Hardware
• “You cannot increase performance by adding lines of code.”
• RamSan systems provide a hardware-only datapath (FPGA, hardware logic). This keeps the performance cost of going to external storage very low.
• Lower latency on SAN systems (FC/IB) vs. competitors on DAS (PCIe or SAN)