Product Overview
MEDIAPOWER, 18.01.2012
Distributed Hyperscale
Collaborative Storage
Simone Arvigo
The Big Data Reality
The information universe in 2009: ~800 exabytes. By the 2020s: ~35 zettabytes.
New types of data are driving this growth:
• Structured data - relational tables or arrays
• Unstructured data - all other human-generated data
• Machine-generated data - growing as fast as Moore's Law
© 2011 DataDirect Networks. All rights reserved.
A Paradigm Shift is Needed
File Storage vs. Object Storage:
• Scalability: millions of files vs. hundreds of billions of objects
• Access: point-to-point, local vs. peer-to-peer, global
• Management: fault-tolerant vs. self-healing, autonomous
• Data model: files with extent lists vs. information objects with metadata
• Space utilization: 75% on average vs. near 100%
What Big Data Needs
• Hyper-scale
» World-wide single & simple namespace
» Dense, efficient & green
» High-performance, versatile on-ramp and off-ramp
• Geographically distributed
» Process the data close to where it's generated vs. copying vast amounts of data to the processing
» Cloud enabling
• Resiliency with extremely low TCO
» No complexity
» Near-zero administration
• Ubiquitous access
» Legacy protocols
» Web access
Storage should improve collaboration
• … Not make it harder
• Minutes to install, not hours
• Milliseconds to retrieve data, not seconds
• Replication built in, not added on
• Instantaneous recovery from disk failure, not days
• Built in data integrity, not silent data corruption
Introducing: DDN Web Object Scaler
• Content Storage Building Block for Big Data Infrastructure
» Industry's leading scale-out object storage appliance
» Unprecedented performance & efficiency
» Built-in single namespace & global content distribution
• Optimized for Collaborative Environments
» Geographic location intelligence optimizes access latency
» Just-in-time provisioning
• Lowest TCO in the Industry
» Simple, near-zero administration
» Automated "Continuity of Operations" architecture
Data locality & global collaboration
The WOS initiative
• Understand the data usage model in a collaborative environment where immutable data is shared and studied.
• A simplified data access system with minimal layers.
• Eliminate the concept of FAT and extent lists.
• Reduce the instruction set to PUT, GET, & DELETE.
• Add the concept of locality based on latency to data.
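The reduced instruction set above can be sketched as a toy store whose entire surface is PUT, GET, and DELETE, keyed by an opaque, store-assigned ID. All names here are illustrative, not the actual WOS client API:

```python
# A minimal sketch of the reduced instruction set: no FAT, no extent
# lists, just an opaque object ID mapping to immutable payloads.
import uuid

class TinyObjectStore:
    def __init__(self):
        self._objects = {}          # OID -> bytes

    def put(self, data: bytes) -> str:
        oid = uuid.uuid4().hex      # the store assigns the ID, not the caller
        self._objects[oid] = data
        return oid

    def get(self, oid: str) -> bytes:
        return self._objects[oid]

    def delete(self, oid: str) -> None:
        del self._objects[oid]

store = TinyObjectStore()
oid = store.put(b"immutable shared dataset")
assert store.get(oid) == b"immutable shared dataset"
store.delete(oid)
```

Because there is no rename, append, or seek, every object is immutable once written, which is exactly the usage model described above for shared, studied data.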
WOS Fundamentals
» No central metadata storage, distributed management
» Self-managed, online growth & balancing, replication
» Self-tuning, zero-intervention storage
» Self-healing to resolve all problems & failures with rapid recovery
» Single-Pane-of-Glass global, petabyte storage management
WOS – Architected for Big Data
Hyper-Scale
• 256 billion objects per cluster
• Scales to 23PB per cluster
• Start small, grow to tens of petabytes
• Network & storage efficient
• Building blocks: 3U, 16-drive and 4U, 60-drive WOS Nodes (SAS/SATA); 2PB / 11 units per rack

Universal Access
• MultiProtocol Gateway - NAS protocols (CIFS, NFS, etc.): LDAP/AD support, scalable, HA & DR protected, migration from and federation across existing NAS
• Cloud Store Platform - S3 compatibility: S3-compatible & WebDAV APIs, multi-tenancy, reporting & billing, remote storage, file sharing, and backup agents
• Native Object Store - C++, Python, Java, PHP, and HTTP REST interfaces: PUT, GET, DELETE object, RESERVE ObjectID, etc.

Global Reach & Data Locality
• Up to 4-way replication
• Global collaboration
• Access the closest data
• No risk of data loss

Resiliency with Near Zero Administration
• Self-healing
• All drives fully utilized
• 50% faster recovery than traditional RAID
• Reduce or eliminate service calls
• Dead simple to deploy & administer

(Diagram: WOS, replicated or Object Assure, spanning San Francisco, New York, London, and Tokyo.)
WOS & the Big Data Life Cycle

WOS is an intelligent, scalable object store, ideal for high-frequency transactions as well as content archiving and geo-distribution:
• Real-time processing - WOS delivers high performance
• Content distribution - WOS delivers automatic replication & geo-distribution
• Long-term preservation - WOS delivers low TCO & massive scalability

(Chart: access rate vs. content age, from day 1 out to n years, against content growth rates.)
Distributed Hyperscale Collaborative Storage: Global View, Local Access

Key Features
• Replication across up to 4 sites
• Geographic, location, & latency intelligence
• Data accessed at local speeds, even using NAS protocols
• Data and DR protected

Key Benefits
• Users can access and update data simultaneously across multiple sites
• Increases performance & optimizes access latency
• No risk of data loss

Data locality & global collaboration
Replicate & collaborate (ingest, access & update data) at local speeds across multiple locations.

(Diagram: Los Angeles and Madrid latency maps - 10 ms, 30-40 ms, and 80 ms paths between Zone 1 and Zone 2.)
WOS: Distributed Data Management

Write path (PUT):
1. A file is uploaded to the application or web server.
2. The application makes a call to the WOS client to store (PUT) a new object.
3. The WOS client stores the object on a node; subsequent objects are automatically load-balanced across the cloud.
4. The system replicates the data according to the WOS policy - in this case, the file is replicated to Zone 2.
5. The WOS client returns a unique Object ID (e.g. OID = 5718a36143521602), which the application stores in lieu of a file path and registers with the content database.

Read path (GET):
1. A user needs to retrieve a file.
2. The application makes a call to the WOS client to read (GET) the object, passing the unique Object ID.
3. The WOS client automatically determines which nodes hold the requested object, retrieves it from the lowest-latency source, and rapidly returns it to the application.
4. The application returns the file to the user.

(Diagram: app/web servers and a content database reaching Zone 1 and Zone 2 over LAN/WAN.)
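The application-side halves of these two paths can be sketched as follows. The client class is a dict-backed stand-in, not the real WOS client library; the point is that the application's content database records an OID where a file path would normally go:

```python
class WosClientStub:
    """Hypothetical stand-in for the WOS client (PUT returns an OID)."""
    def __init__(self):
        self._nodes = {}
    def put(self, data: bytes) -> str:
        oid = format(abs(hash(data)) % (1 << 64), "016x")
        self._nodes[oid] = data
        return oid
    def get(self, oid: str) -> bytes:
        return self._nodes[oid]

content_db = {}  # application's content database: file name -> OID

def upload(wos, name, data):
    oid = wos.put(data)          # PUT: client stores object, returns OID
    content_db[name] = oid       # register the OID in lieu of a file path
    return oid

def download(wos, name):
    oid = content_db[name]       # look up the OID in the content database
    return wos.get(oid)          # GET: served from the lowest-latency copy

wos = WosClientStub()
upload(wos, "report.pdf", b"%PDF...")
assert download(wos, "report.pdf") == b"%PDF..."
```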
WOS Building Blocks

WOS 6000 - 4U, high-density, 60-drive node (SAS/SATA): 2PB / 11 units per rack
WOS 1600 - 3U, high-performance, 16-drive node (SAS/SATA): 544TB / 15 units per rack

Key Metrics
» Built on DDN's industry-leading high-performance storage platforms
» 4 GigE connections per node
» Highest density and scalability in the market
- 1.98PB per rack, up to 23PB per cluster
- 660 spindles per rack
- 22B objects per rack, 256B objects per cluster
- 99% storage efficiency for any mix of file sizes between 512 bytes and 500GB
» Linear cluster performance scaling
» Low latency
- One disk I/O per read or write for objects < 1MB
WOS Under the Hood

WOS Node software stack (per node): WOS API, Object Request Handler, Local Data Store, Replication Engine, Network Services, Node Manager, Object Group Mgmt., Platform Services, Statistics, Custom Disk Driver.

WOS Tray Components
• Processor/controller motherboard
• WOS Node software
• SAS or SATA drives (2 or 3TB)

WOS Software
• Services I/O requests from clients
• Directs local I/O requests to disk
• Replicates objects during PUTs
• Replicates objects to maintain policy compliance
• Monitors hardware health

(Diagram: clients/apps reach WOS Nodes over 4 GigE via WOS-Lib or HTTP/REST; each node keeps a node OID map, drives its disks through array controllers, and replicates to other WOS nodes.)
Intelligent WOS Objects

Sample Object ID (OID): ACuoBKmWW3Uw1W2TmVYthA

Each OID carries:
• Policy - e.g. "replicate twice; zones 1 & 3"
• Signature - a random 64-bit key to prevent unauthorized access to WOS objects
• Checksum - a robust 64-bit checksum to verify data integrity during every read

Each object holds a full file or sub-object, user metadata (key-value or binary, e.g. Object = Photo, Tag = Beach), and optional thumbnails.
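The read-time integrity check can be sketched as follows. `zlib.crc32` is a 32-bit stand-in for WOS's 64-bit checksum, and the class is purely illustrative; the idea is that the stored checksum is re-verified on every read, so silent corruption is caught at access time rather than never:

```python
import zlib

class ChecksummedStore:
    """Illustrative store that verifies a checksum on every read."""
    def __init__(self):
        self._objects = {}   # oid -> (payload, checksum recorded at PUT)
        self._next = 0

    def put(self, data: bytes) -> str:
        oid = f"{self._next:016x}"
        self._next += 1
        self._objects[oid] = (data, zlib.crc32(data))
        return oid

    def get(self, oid: str) -> bytes:
        data, stored = self._objects[oid]
        if zlib.crc32(data) != stored:      # verified during every read
            raise IOError(f"object {oid} failed checksum: corrupt replica")
        return data
```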
The Result: Efficient Data Placement

WOS eliminates the wasted capacity seen with conventional NAS storage:
• WOS "buckets" contain objects of similar size to optimize placement.
• WOS object "slots": different-sized objects are written in slots contiguously.
• Slots can be as small as 512B to efficiently support the smallest of files.
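A rough sketch of why contiguous 512B slots waste so little space: an object occupies just enough slots, written contiguously, so the waste is at most one partial slot. The real bucket/slot sizing is internal to WOS; this arithmetic is only an assumption for illustration:

```python
SLOT = 512  # bytes - the smallest slot size mentioned above

def slots_needed(size: int) -> int:
    """Number of contiguous 512B slots an object of `size` bytes occupies."""
    return -(-size // SLOT)              # ceiling division

def utilization(size: int) -> float:
    """Fraction of allocated slot space actually holding object data."""
    return size / (slots_needed(size) * SLOT)

assert slots_needed(100) == 1            # even a tiny file costs one 512B slot
assert utilization(10_000) > 0.97        # ~98% for a 10KB-class object
```

Under this model, per-object overhead is bounded by 511 bytes regardless of object size, which is consistent with the near-99% efficiency figures quoted for WOS elsewhere in this deck.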
Object Storage vs. File System: Space Utilization

• WOS utilizes, on average, 25% more of the available disk space than SAN/NAS file systems do.
• In a 1PB deployment, stranded space totals 250TB, which adds $50K-$100K of system cost, as well as ongoing power & space costs.

Disk space utilization, GPFS vs. WOS, by file size:

File size:      500B   10KB   50KB   500KB   10MB   Mix
WOS utilized:   98%    97%    99%    99%     99%    98%
GPFS utilized:  65%    75%    90%    95%     95%    75%

WOS eliminates the stranded capacity inherent in SAN/NAS file-system storage.
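A quick check of the arithmetic above: 25% of a 1PB deployment stranded is 250TB, and at an assumed $0.20-$0.40/GB system cost (the per-GB figure is an illustration, back-derived from the slide's own $50K-$100K range) that stranded space maps to the quoted dollar cost:

```python
deployment_tb = 1_000                # 1PB expressed in TB
stranded_tb = deployment_tb * 0.25   # 25% stranded by the file system
assert stranded_tb == 250            # matches the 250TB claim

for cost_per_gb in (0.20, 0.40):     # assumed $/GB range, for illustration
    print(f"${stranded_tb * 1_000 * cost_per_gb:,.0f}")
# prints $50,000 then $100,000
```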
WOS – The TCO Leader

WOS by the TCO numbers:
» WOS annual operating costs are less than one third of S3 costs
» WOS total TCO is 50% of S3 TCO over a 3-year period
» The first year includes WOS acquisition and deployment costs
» Follow-on years include WOS storage growth and management costs

500TB comparison: moving an existing Amazon Web Services workload to an internal/private cloud with WOS storage can save 50%+ in TCO over 3 years.

(Chart: annual & 3-year cost comparison, WOS vs. S3 - Year 1, Year 2, Year 3, and total 3-year investment, $0-$6M.)
WOS Advantages: Simple Administration
• Designed with a simple, easy-to-use GUI
• "This feels like an Apple product"
» Early customer quote
WOS Deployment & Provisioning

WOS building blocks are easy to deploy & provision - in 10 minutes or less:
» Provide power & network for the WOS Node
» Assign an IP address to the WOS Node & specify the cluster name ("Acme WOS 1")
» Go to the WOS Admin UI; the WOS Node appears in the "Pending Nodes" list for that cluster
» Drag & drop the node into the desired zone
» Assign a replication policy (if needed)

Congratulations! You have just added 180TB to your WOS cluster!

Simply drag new nodes to any zone (San Francisco, New York, London, Tokyo) to extend storage.
Intelligent Data Protection: RAID Rebuild vs. WOS Re-Balance

Traditional RAID storage rebuilds drives:
• Lost capacity - spare drives strand capacity
• Long rebuild times - the whole drive must be rebuilt even though the failed drive may be only partially full
• Higher risk of data loss - if a spare drive is not available, no rebuild can occur
• Increased support costs - an immediate service call is required to fix a low-spares condition
• Reduced write performance - RAID reduces disk write performance, especially for small files

WOS re-balances data across drives:
• All drives fully utilized - any free capacity on any drive is part of the spare pool
• 50% shorter re-balance times - only actual data is copied
• Faster recovery times increase overall performance and reduce the risk of data loss
• Drive failures decrease overall capacity only by the size of the failed drives
• Total capacity may be restored by replacing drives during scheduled maintenance - the service call is optional and scheduled, not immediate

(Diagram: FS + RAID 6 with dedicated spares and an immediate service call vs. Web Object Scaler re-balancing, with available capacity stepping from 120TB to 118TB to 116TB as drives fail.)
WOS Accessibility

NAS Gateway (CIFS, NFS, etc.)
• CIFS/NFS protocols with LDAP/AD support
• Scalable to multiple gateways
• DR protected & HA failover
• Synchronized database across remote sites
• Local read & write cache
• LAN or WAN access to WOS
• Migration from, and federation across, existing NAS

Cloud Store Platform (S3 compatibility)
• S3-compatible & WebDAV APIs
• Targeted at cloud service providers and private clouds
• Enables S3-enabled apps to use WOS storage at a fraction of the price
• Supports full multi-tenancy, bill-back, and per-tenant reporting
• Remote storage, file sharing, and backup agents

Native Object Store
• C++, Python, Java, PHP, and HTTP REST interfaces
• PUT, GET, DELETE object, RESERVE ObjectID, etc.
Cloud & Service Provider Tools

Private or internal cloud
• Medium to large multi-site enterprise
• Provides services to internal BUs
• Lowers costs by optimizing utilization of CPU & storage
• Transfers EC2 & S3 workloads in-house to improve security & lower costs
• Bills back internal departments for services

Hosted managed service provider
• Provides hosting services for a few large customers
• Hosts at a local site or a third-party data center
• May share some resources across multiple customers
• Extremely security conscious

Public cloud
• Shares resources across many customers
• Hosts at third-party data centers
• Subscription pricing for CPU, storage, & network usage
• Offers the lowest CAPEX

Common needs: DR, multi-tenancy, data locality, standard interfaces, low TCO
WOS Multi-Protocol Gateway

• NFS/CIFS gateway for in-house IT & private clouds
» Optimizes both multi-site collaboration & data locality
» HA failover and DR protected
• Provides NAS access for POSIX FS applications
» Standard NFS/CIFS protocol access with LDAP integration
» NAS data migration capabilities

(Diagram: three data centers, each with a multi-protocol gateway (NFS, CIFS, FTP), a metadata store, and WOS for processing locality, linked by replication for multi-site collaboration.)
Failure Recovery - Data, Disk, or Net

GET operation ("GET A"):
1. WOS-Lib selects the replica with the least-latency path & sends the GET request
2. The node in zone "San Fran" returns object A to the application

(Diagram: client app with WOS-Lib and a WOS cluster group map; zones San Fran (10.8.24.101-105), New York (10.8.25.101-105), and London (10.8.26.101-105) each hold a replica of object A.)
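Step 1 of the GET operation can be sketched as a lookup against a client-side latency map: the library sends the request to whichever zone holding a replica has the cheapest path. The map and zone lists here are illustrative, not WOS-Lib internals:

```python
# Client-side latency map (ms) and replica placement, as in the diagram.
latency_ms = {"San Fran": 10, "New York": 40, "London": 80}
replica_zones = {"A": ["New York", "London", "San Fran"]}

def pick_replica(oid: str) -> str:
    """Choose the zone with the least-latency path among those holding `oid`."""
    zones = replica_zones[oid]
    return min(zones, key=latency_ms.get)

assert pick_replica("A") == "San Fran"   # 10 ms beats 40 ms and 80 ms
```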
GET operation - corrupted object, with repair:
1. WOS-Lib selects the replica with the least latency & sends the GET request
2. The node in zone "San Fran" detects object corruption
3. WOS-Lib finds the next-nearest copy & retrieves it for the client app
4. In the background, the good copy is used to replace the corrupted object in the San Fran zone

(Diagram: client app with WOS-Lib and a WOS cluster group map; latency map of 10 ms, 40 ms, and 80 ms paths.)
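The four steps above can be sketched as a read loop over replicas ordered by latency: verify the checksum, fall back to the next-nearest copy on corruption, and overwrite the bad replica with the good one in the background. All structures here are illustrative stand-ins, with `zlib.crc32` in place of the 64-bit WOS checksum:

```python
import zlib

latency_ms = {"San Fran": 10, "New York": 40, "London": 80}
good = b"object A payload"
replicas = {                      # zone -> (payload, checksum stored at PUT)
    "San Fran": (b"object A payl\x00\x00d", zlib.crc32(good)),  # corrupted
    "New York": (good, zlib.crc32(good)),
    "London": (good, zlib.crc32(good)),
}

def get_with_repair():
    for zone in sorted(replicas, key=latency_ms.get):   # nearest first
        data, stored = replicas[zone]
        if zlib.crc32(data) == stored:                  # step 2: verify
            # step 4: background repair of any corrupted replica
            for z, (d, s) in replicas.items():
                if zlib.crc32(d) != s:
                    replicas[z] = (data, stored)
            return data, zone                           # step 3: serve copy
    raise IOError("all replicas corrupt")

data, served_from = get_with_repair()
assert served_from == "New York"         # San Fran copy failed its checksum
assert replicas["San Fran"][0] == good   # and was repaired in the background
```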
Geographic Replica Distribution

PUT with asynchronous replication:
1. WOS-Lib selects the "shortest-path" node
2. The node in zone "San Fran" stores 2 copies of the object on different disks (nodes)
3. The San Fran node returns the OID to the application
4. Later (ASAP), the cluster asynchronously replicates to the New York & London zones
5. Once ACKs are received from the New York & London zones, the extra copy in the San Fran zone is removed
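The write path above can be sketched as two synchronous local copies plus a queue of deferred remote replications; the extra local copy is dropped only after the remote zones acknowledge. Zone names and the queue are illustrative, not cluster internals:

```python
from collections import defaultdict

copies = defaultdict(list)          # zone -> list of object copies
async_queue = []                    # pending remote replications

def put(oid, data, local="San Fran", remote=("New York", "London")):
    copies[local] = [data, data]    # step 2: two local copies, different nodes
    for zone in remote:             # step 4: queue asynchronous replication
        async_queue.append((zone, oid, data))
    return oid                      # step 3: OID returned before remotes ACK

def drain_replication(local="San Fran"):
    while async_queue:
        zone, oid, data = async_queue.pop(0)
        copies[zone] = [data]       # remote zone stores the object and ACKs
    copies[local] = copies[local][:1]   # step 5: drop the extra local copy

put("5718a36143521602", b"A")
assert len(copies["San Fran"]) == 2     # durable locally before remote ACKs
drain_replication()
assert len(copies["San Fran"]) == 1 and len(copies["New York"]) == 1
```

The design point this illustrates: the application sees a fast local acknowledgment, while geo-distribution happens in the background without ever dropping below two durable copies.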
(Diagram: client app with WOS-Lib and a WOS cluster group map; zones San Fran (10.8.24.101-105), New York (10.8.25.101-105), and London (10.8.26.101-105); copies of object A moving between zones; latency map of 10 ms, 40 ms, and 80 ms paths.)
Multi-site Post-Production Operation: Data Locality & Collaboration

1. An LA-site user edits video "A", which replicates to Mexico City & Madrid based on policy
2. The MP Gateway immediately synchronizes the metadata DB with the Madrid user
3. The Madrid user requests video "A" for processing; WOS-Lib selects the Madrid site (lowest latency) & retrieves it for the user
4. The Madrid user extracts frames from the video & writes them to WOS as a new object ("B"), which replicates to Mexico City & LA

(Diagram: real-time editing apps and NAS gateways with WOS-Lib in zones Los Angeles (10.8.24.101-105), Mexico City (10.8.25.101-105), and Madrid (10.8.26.101-105); Los Angeles and Madrid latency maps of 10 ms, 30-40 ms, and 80 ms paths.)
Rules-Oriented Application Meets Object-Oriented Storage: iRODS Integration

WOS:
• Petabyte scalability: scale out by simply adding storage modules
• Unrivaled simplicity: management simplicity translates directly to lower TCO
• Self-healing: zero intervention required for failures; automatically recovers from lost drives
• Rapid rebuilds: fully recover from lost drives in moments
• Replication ready: ingest & distribute data globally
• Disaster recoverable: for uninterrupted transactions, no matter what type of disaster occurs
• File layout: capacity and performance optimized
• Object metadata: user-defined metadata makes files smarter

iRODS, a rules-oriented distributed data management application, meets WOS, an object-oriented content scale-out and global distribution system.

WOS + iRODS is the simple solution for cloud collaboration:
• WOS is a flat, addressable, low-latency data structure.
• WOS creates a "trusted" environment with automated replication.
• WOS is not an extent-based file system with layers of v-nodes and i-nodes.
• iRODS is the ideal complement to WOS, allowing multiple-client access and incorporating an efficient DB for metadata search activities.
Some iRODS Examples

• NASA & iRODS
» Jet Propulsion Laboratory - selected for managing distribution of planetary data
» MODUS (NASA Center for Climate Simulation) - federated satellite image and reference data for climate simulation
• U.S. Library of Congress - manages the entire digital collection
• U.S. National Archives - manages ingest and distribution
• French National Library - iRODS rules control ingestion, access, and audit functions
• Australian Research Coordination Service - manages data between academic institutions
Surveillance to the Cloud Case Study: Eliminating Cost

Before - classic CCTV with NVRs:
• Multiple NVRs
• Multiple islands of iSCSI storage volumes
• Clip-reviewing software costs
• Admin & support costs
• In-band review only, over local Ethernet

After - distributed multi-site surveillance (cloud surveillance with WOS):
• Centralized video review
• Centralized storage
• IP cameras
• Eliminates admin costs
• In- and out-of-band review over the Internet, MAN, private network, or intranet (GigE)

In a 6,400-camera surveillance deployment, a WOS-based IP camera solution was 33% cheaper to deploy and reduced TCO by 30-50%.
WOS + iRODS: YottaBrain Program
• Each container: 5PB of WOS
• WOS clusters federated with iRODS
WOS Cloud Storage Advantages
The World’s Leading Object Storage Appliance
• Single Global Namespace for billions of files
• Fast, efficient content delivery - automated policy-based multi-site replication to the network edge
• World-Leading File Write and Read Performance
• Ability to grow non-disruptively in small increments to massive scale with leading energy and space efficiency
• Single management interface for a global WOS cloud
• Distributed, self-healing content infrastructure without bottlenecks or single points of failure.
Thank You