GLOBAL SPONSORS - Dell Technologies Germany...GLOBAL SPONSORS Dell EMC Data Analytics Strategy...
GLOBAL SPONSORS
Dell EMC Data Analytics Strategy
STEFFEN THUEMMEL
ADV. SYSTEMS ENGINEER, DELL EMC
Enterprise Challenges in Digital Transformation
Drive for deeper insights is accelerating need for architectures to enable unstructured analytics
[Chart: data volume of 11ZB (2016), 22ZB (2018), 44ZB (2020); IoT]
Manage Data Growth
Perform Advanced Analytics
Handle Unstructured Data Sources
Drive Real-Time Results
Organizations need to deliver analytics on more than
just their traditional structured data
Evolving spectrum of data analytics
Requires infrastructure that enables multiple applications and varied use cases
Predictive Analytics
Business Intelligence
Analytics of Things
Cyber security Analytics
Real-time Analytics
Machine Learning
Enables analytics for ALL of your data
Dell EMC Unstructured Analytics Portfolio
PowerEdge
Performance Centric
Storage Centric
Archive Centric
Predictive Analytics
Business Intelligence
Analytics of Things
Cyber security Analytics
Real-time Analytics
Machine Learning
© Copyright 2017 Dell Inc. 6
Dell EMC Unstructured Analytics Portfolio
PowerEdge
Solution accelerators
• Splunk Ready System
• Hadoop Ready Bundle
• QuickStart for Hadoop
• EDW Optimization Solutions
• Hadoop Backup Solutions
• SAS-Grid Solution with Isilon
• Streaming Analytics Solutions
Proven solutions for unstructured analytics
HADOOP DECISIONS
DAS
ECS
3 TRADITIONAL DISCOVERY QUESTIONS
1 What do you hope to achieve with Hadoop?
2 Why is this impactful to your business?
3 Which Hadoop Distribution will you choose?
NEXT LEVEL QUESTIONS
Access
Implementation
Compliance
Scalability
Tools & Apps
Business Units
Consolidate
EMC ISILON HDFS INTERFACE
• Native HDFS support
• Underlying file system is OneFS
• As simple as pointing the HDFS clients to the DNS name of the Isilon cluster!
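As a rough illustration of "pointing the HDFS clients to the DNS name", the sketch below generates a minimal client-side core-site.xml. `fs.defaultFS` is the standard Hadoop property for the default file system; the host name `isilon.example.com` and the port 8020 are assumptions made for this example, not values from the slides.

```python
import xml.etree.ElementTree as ET


def core_site_xml(cluster_dns_name: str, port: int = 8020) -> str:
    """Build a minimal core-site.xml whose default file system is one DNS name.

    cluster_dns_name and port are placeholders; use the values of your own cluster.
    """
    configuration = ET.Element("configuration")
    prop = ET.SubElement(configuration, "property")
    # fs.defaultFS tells Hadoop clients which HDFS endpoint to contact.
    ET.SubElement(prop, "name").text = "fs.defaultFS"
    ET.SubElement(prop, "value").text = f"hdfs://{cluster_dns_name}:{port}"
    return ET.tostring(configuration, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical DNS name, used only for illustration.
    print(core_site_xml("isilon.example.com"))
```

Every Hadoop client that loads this file addresses the storage cluster through the single DNS name rather than through an individual NameNode host.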
HADOOP WITH EMC DATA LAKE
1 Multi Protocol Scale-Out Storage Platform
• NFS, SMB, FTP, HTTP, HDFS, SWIFT
2 Enterprise Data Protection & Governance
• SnapshotIQ, SyncIQ, SmartLock, ACLs..
3 Industry-Leading Storage Efficiency
• >80% Storage Utilization
4 Independent Scalability with Optimized QoS
• Optimally Scale Storage & Compute
5 Consolidate Data Silos
• Industry Standard Protocols
• Bring Applications to Shared Data
6 Hadoop as a Service
• Eliminate Shadow IT
• Offer variations of Hadoop to all your BUs
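The storage-efficiency point can be made concrete with a little arithmetic. The sketch below compares the raw capacity needed for the same usable data under three-way replication (the R=3 mentioned elsewhere in this deck) and under an assumed utilization of exactly 80% (the slide claims ">80%"). The 100 TB data set is a hypothetical figure chosen for the example.

```python
def raw_tb_with_replication(usable_tb: float, replicas: int) -> float:
    """Raw capacity when every block is stored `replicas` times."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return usable_tb * replicas


def raw_tb_with_utilization(usable_tb: float, utilization: float) -> float:
    """Raw capacity when `utilization` (0..1] of the raw space holds user data."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    return usable_tb / utilization


if __name__ == "__main__":
    usable = 100.0  # hypothetical data set size in TB
    print(f"R=3 replication: {raw_tb_with_replication(usable, 3):.0f} TB raw")
    print(f"80% utilization: {raw_tb_with_utilization(usable, 0.80):.0f} TB raw")
```

Under these assumptions the replicated layout needs 300 TB of raw disk and the 80% layout needs 125 TB for the same 100 TB of data.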
Traditional “Share-Nothing” Hadoop
[Diagram: Existing Virtualized Data Center with Existing Primary Storage holding the Unstructured Data (copy 1), next to a SHARE-NOTHING Hadoop Infrastructure whose nodes hold copies 2, 3, 4]
• Hadoop on a Stick (R=3) means 5 data copies ($$$$)
• Data has to copy to the Hadoop cluster before analysis can begin (Time to Results)
How will you maintain data consistency when a file changes on your primary storage?
Isilon “Share-Everything” Hadoop
[Diagram: Existing Virtualized Data Center; Existing Primary Storage holding the Unstructured Data (copy 1); New Hadoop Compute Nodes; Use Native HDFS Protocol]
• Start using Hadoop NOW with unused processing and RAM available in your VMware environment
• No replication required (Use your existing data)
• Access to same data via NAS and HDFS protocols
• Time to results extremely fast using already existing data with NO COPIES or wasted $$$$
• Analysis Can Begin with the 1st VM
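To illustrate what "same data via NAS and HDFS" can mean in practice, the sketch below translates a cluster-side file path into the HDFS URI under which the same file would appear, assuming the HDFS root is mapped to one directory of the shared file system. The directory `/ifs/data/hadoop`, the host name and the port are hypothetical values picked for this example.

```python
from pathlib import PurePosixPath


def file_path_to_hdfs_uri(file_path: str, hdfs_root: str,
                          cluster_dns_name: str, port: int = 8020) -> str:
    """Map a file path below `hdfs_root` to the URI an HDFS client would use.

    Raises ValueError when the file lies outside the assumed HDFS root.
    """
    relative = PurePosixPath(file_path).relative_to(hdfs_root)
    return f"hdfs://{cluster_dns_name}:{port}/{relative.as_posix()}"


if __name__ == "__main__":
    # A file written over NFS or SMB (hypothetical path) ...
    nas_path = "/ifs/data/hadoop/logs/app.log"
    # ... and the address of the same file for a Hadoop job.
    print(file_path_to_hdfs_uri(nas_path, "/ifs/data/hadoop", "isilon.example.com"))
```

No bytes move in this translation; only the name by which the file is addressed changes, which is the point of in-place analytics.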
DEPENDENT SCALING
Traditional Hadoop HDFS
Storage to Compute ratio is fixed
Scaling compute means scaling capacity
Difficult to provide QoS
Compute upgrade is a forklift
[Chart: Compute and Storage; required performance/capacity versus required Hadoop cluster nodes]
INDEPENDENT SCALING
Traditional Hadoop HDFS
• Storage to Compute ratio is fixed
• Scaling compute means scaling capacity
• Difficult to provide QoS
• Compute upgrade is a forklift
Isilon HDFS
• Scale compute independent of storage
• Achieve optimal performance balance even as workloads evolve
• No data migrations, ever!
• Add new performance as hardware evolves
• Benefit from 80% utilization rate
[Chart: Compute and Storage; required performance/capacity versus required Hadoop cluster nodes]
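The difference between dependent and independent scaling can be sketched as a sizing calculation. With a fixed storage-to-compute ratio, the node count is driven by whichever requirement is larger; with separate tiers, each is sized on its own. All per-node figures below are hypothetical and serve only to show the mechanics.

```python
import math


def coupled_node_count(compute_units: float, capacity_tb: float,
                       units_per_node: float, tb_per_node: float) -> int:
    """Nodes needed when every node adds a fixed amount of compute AND capacity."""
    for_compute = math.ceil(compute_units / units_per_node)
    for_capacity = math.ceil(capacity_tb / tb_per_node)
    # The larger requirement dictates the cluster size; the other is overprovisioned.
    return max(for_compute, for_capacity)


def decoupled_node_counts(compute_units: float, capacity_tb: float,
                          units_per_compute_node: float,
                          tb_per_storage_node: float) -> tuple[int, int]:
    """(compute nodes, storage nodes) when both tiers scale independently."""
    return (math.ceil(compute_units / units_per_compute_node),
            math.ceil(capacity_tb / tb_per_storage_node))


if __name__ == "__main__":
    # Hypothetical workload: modest compute need, large capacity need.
    print(coupled_node_count(40, 2000, units_per_node=4, tb_per_node=48))
    print(decoupled_node_counts(40, 2000, units_per_compute_node=4,
                                tb_per_storage_node=100))
```

In this made-up workload the coupled cluster needs 42 nodes to reach the capacity target although 10 would cover the compute need, while the decoupled design uses 10 compute nodes and 20 storage nodes.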
SNAPSHOT & VERSION CONTROL
Before
• Traditional HDFS does not have replication
• No Snapshotting of data
• Loss of version control
• Not designed for Mission Critical
After
• Full SnapshotIQ integration identifies changes
• Multi-threaded, Multi-Node Scale-Out replication
• Improved RPO/RTO for business continuity
• Geo-replicated Hadoop!
ISILON REGULATORY COMPLIANCE
TIME-TO-RESULTS
Data Copy Analysis
[Diagram: Existing Primary Storage, Data Center Network, Hadoop on a Stick]
Have you ever copied 100TB from Primary Storage to a Hadoop system?
How long does it take to copy 100TB from one place to another over a 10Gb link? >24 Hours
In-Place Analysis
[Diagram: Existing Primary Storage, Data Center Network, Hadoop Compute Nodes reading relevant data to analysis]
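The ">24 Hours" figure can be checked with a back-of-the-envelope calculation. The sketch below assumes decimal units (1 TB = 10^12 bytes, 10Gb = 10^10 bits per second) and introduces an `efficiency` parameter, which is an assumption of this example, to model protocol overhead and links that do not run at full line rate.

```python
def copy_hours(data_tb: float, link_gbps: float, efficiency: float = 1.0) -> float:
    """Hours needed to move `data_tb` terabytes over a `link_gbps` link.

    efficiency is the fraction of the nominal line rate actually achieved (0..1].
    """
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = data_tb * 1e12 * 8                        # payload in bits
    seconds = bits / (link_gbps * 1e9 * efficiency)  # effective throughput
    return seconds / 3600


if __name__ == "__main__":
    print(f"ideal line rate: {copy_hours(100, 10):.1f} h")
    print(f"80% efficiency:  {copy_hours(100, 10, 0.8):.1f} h")
```

At a perfect 10 Gb/s the copy takes about 22 hours; at an assumed 80% effective throughput it takes almost 28 hours, which is consistent with the slide's ">24 Hours".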
MULTIPLE HADOOP DISTRIBUTIONS
[Diagram: several MAP Reduce compute clusters attached to one shared data store that is also reachable over SMB, NFS, HTTP, FTP and HDFS; labels: NameNode, Data, name node, data node]
HADOOP OUTLOOK
80% Utilization
Capacity
Independent Scalability
Implementation
Enterprise features
Data Access
Applications & Tools
Hadoop-as-a-service
Compliance & Regulation
OPPORTUNITY COSTS OF TRADITIONAL HADOOP DEPLOYMENTS
Opportunity Cost | Description | Benefit with Isilon
Data Transformation/Migration Projects | Hydrating a data lake involves data migration projects as pre-requisite for analytics use cases. | Data Migration Projects avoided with in-place analytics
Multiple Siloed Analytics/Hadoop platforms | Data management costs per each platform | Supports Multiple Analytics/Hadoop Platforms with single copy of data
Customized DR and data protection solutions | Design, implementation and life cycle management of data protection and disaster recovery solutions on DAS cluster | Supports Data Protection and Disaster Recovery out-of-the-box
Managing storage efficiently | Storage provisioning, monitoring, quota management on DAS cluster | Storage management out-of-the-box
Unstructured Analytics Use Cases
Enabling enterprises to improve operational efficiencies and monetize new revenue streams
• Fraud Detection & Risk Analytics
• Trading / Tick Data Analytics
• IoT
• Data Driven Business Transformation
• Customer 360 Analytics
Right Solution Configuration for the use case
CUSTOMER REQUIREMENTS drive CONFIGURATION
Performance-centric (one or more):
• High Performance w/ cost as main driver
• 100% Compliance to Hadoop Operational features
• Ability to scale down at cost
Storage-centric (one or more):
• Storage scaling faster than compute
• Enterprise Grade File Mgmt.
• Consolidation of IT Workloads
• Aggregate capacity > 100 TB
Archive-centric:
• Geo-distributed single namespace
[Diagram: Analytics and Hadoop on PowerEdge; Compute + Data on Direct Attached Storage; Compute and Data separated on Shared Storage]
Data analytics offerings
Visit: dellemc.com/bigdata