Piranha vs. mammoth predator appliances that chew up big data

Piranha vs. Mammoth Predator Appliances chew up BIG DATA

description

If you've also got the Big Data itch, here is something to ease the pain :-) Answers to these questions will be available soon (more info in the attached link):
• Which Big Data Appliance should YOU use? (click on the attached link for Poll results)
• Appliances are Small and Quick, Right?
• Revealing the 6 Types of Big Data Appliances
• Uncovering the Main Players
• Challenges, Pitfalls, and Winning the Big Data Game
• Where is all this leading YOU to?

Transcript of Piranha vs. mammoth predator appliances that chew up big data

Page 1: Piranha vs. mammoth   predator appliances that chew up big data

Piranha vs. Mammoth Predator Appliances chew up BIG DATA

Page 2: Piranha vs. mammoth   predator appliances that chew up big data

Piranha vs. Mammoth Predator Appliances chew up BIG DATA

• Appliances are Small and Quick, Right?
• Revealing the 6 Types of Big Data Appliances
• Uncovering the Main Players
• Which Big Data Appliance should YOU use?
• Challenges, Pitfalls, and Winning the Big Data Game
• Where is all this leading YOU to?

Page 3: Piranha vs. mammoth   predator appliances that chew up big data

Appliances are Small and Quick, Right?

Page 4: Piranha vs. mammoth   predator appliances that chew up big data

Well, in some cases. But a Big Data Appliance can be…

BIG…

Quantum StorNext M330, presented on YouTube: http://www.youtube.com/watch?v=X1IZpoyHxlY

Page 5: Piranha vs. mammoth   predator appliances that chew up big data

So what makes a great appliance?

Page 6: Piranha vs. mammoth   predator appliances that chew up big data

But first, let’s get to know You (Big Data Appliance Poll #1…)

Page 7: Piranha vs. mammoth   predator appliances that chew up big data

How deep have you dived into Big Data?

A. Just starting to learn it
B. Learning a lot, nothing done yet
C. Planning a Big Data Project
D. Running a Big Data Operation
E. I don't get it Yet! What's all the fuss about it?

Page 8: Piranha vs. mammoth   predator appliances that chew up big data

Results…

Page 9: Piranha vs. mammoth   predator appliances that chew up big data

So what makes a great appliance?

Page 10: Piranha vs. mammoth   predator appliances that chew up big data

So what makes a great appliance?

1. Does the job – no more, no less
2. Quick and simple setup
3. Quick and easy updates
4. Easy control of one or many instances
5. Simple Infrastructure requirements
6. Reliable underlying system
7. No delays doing its job
8. What else?

Page 11: Piranha vs. mammoth   predator appliances that chew up big data

What’s the most important Job for a Great Appliance? (Poll #2)

Page 12: Piranha vs. mammoth   predator appliances that chew up big data

What’s the most important Job for a Great Appliance? (Poll #2)

A. Does the job on time – no more, no less
B. Quick and simple Setup and Updates
C. Easy control of one or many instances
D. Simple Infrastructure requirements
E. Reliable underlying system

Page 13: Piranha vs. mammoth   predator appliances that chew up big data

Results…

Page 14: Piranha vs. mammoth   predator appliances that chew up big data

What is the job for your Big Data Appliance?

Page 15: Piranha vs. mammoth   predator appliances that chew up big data

What is the job for your Big Data Appliance?

1. Extend your Existing Data Warehouse to include Non-Structured Data?
2. Discover new types of insights to Increase Innovation
3. Run a pilot to verify it is worth it
4. Process more (types of) Data
5. Process Data faster
6. Process Data cheaper
7. Static or Continuous Analysis of Data
8. Flexibility and Lock-In prevention (yes, sure :-)) - Hadoop
9. Turn Operational Data into Assets
10. Break Data Silo barriers
11. Stick to existing Data vendors or work with new ones

Page 16: Piranha vs. mammoth   predator appliances that chew up big data

Revealing the 6 Types of Big Data Appliances

Page 17: Piranha vs. mammoth   predator appliances that chew up big data

Revealing the 6 Types of Big Data Appliances

• Hadoop Engine - Software Based Appliance
• Data Warehouse Hardware Engine + API to Hadoop / Analytics
• Hardware Storage “Only”
• Software Based Appliance, Compatible with Hadoop
• Cloud based VMs + Hadoop Engine
• Cloud Based API with Hooks to Hadoop

Page 18: Piranha vs. mammoth   predator appliances that chew up big data

What type of Big Data Appliance will you use? (Poll #3)

Page 19: Piranha vs. mammoth   predator appliances that chew up big data

What type of Big Data Appliance will you use? (Poll #3)

A. Hadoop Engine or Compatible - Software Based
B. Data Warehouse Hardware Engine + API to Hadoop / Analytics
C. Hardware Storage “Only”
D. Cloud based VMs + Hadoop Engine
E. Cloud Based API with Hooks to Hadoop

Page 20: Piranha vs. mammoth   predator appliances that chew up big data

Results…

Page 21: Piranha vs. mammoth   predator appliances that chew up big data

Uncovering (some of) the Main Players

Page 23: Piranha vs. mammoth   predator appliances that chew up big data

Oracle Big Data Appliance

• End goal: Get data into Oracle Database 11g
• Includes open source Hadoop (Now Cloudera)
• Oracle NoSQL Database (JVM DB vs. HDFS!)
• Oracle Loader for Hadoop (more next slide)
• Open source distribution of R
• Oracle Linux + Oracle Java HotSpot VM

Page 24: Piranha vs. mammoth   predator appliances that chew up big data

Oracle Big Data Appliance

• Oracle Data Integrator + Hadoop API
  – Easy upload to HDFS by automating MAP-R
  – Validate constraints of Hives
  – Add Data to Hives
  – Upload to Oracle using Oracle Loader for Hadoop
  – Allows query of Hives, using Oracle SQL, via a “connector” Oracle Table

Page 25: Piranha vs. mammoth   predator appliances that chew up big data

Oracle Big Data Appliance

• Type: Hadoop Engine - Software Based Appliance
• Does the job – See next slide
• Quick and simple setup – Medium (Oracle)
• Quick and easy updates – Medium (Oracle/CDH?)
• Easy control of one or many instances
• Simple Infrastructure requirements – Medium (Oracle)
• Reliable underlying system
• No delays doing its job - ?
• What else?
  – Great if you’ve got Oracle already
  – Add on to Oracle Exadata Hardware / Data Warehouse

Page 26: Piranha vs. mammoth   predator appliances that chew up big data

Oracle Big Data Appliance

• Can do most of the job requirements
• Exceptions:
  – Process Data faster – Looks like…
  – Process Data cheaper – Oracle is not a cheap product…
  – Flexibility and Lock-In prevention - Medium

Page 27: Piranha vs. mammoth   predator appliances that chew up big data

Cloudera

• Integrated, Tested collection of Open Source Apache Hadoop (more next slide)
• HDFS is the NOSQL Database...
• Management Console for rapid node deploy
• Free up to X nodes
• Paid Enterprise Subscription, includes support
• Integrated into a bunch of Data software Giants

Page 28: Piranha vs. mammoth   predator appliances that chew up big data

Cloudera Included Open Source Mods:

• Apache HBase: HDFS-based tables
• Apache Hive: SQL-like language
• Apache Mahout: Machine Learning algorithms
• Apache Pig: High-level data flow language
• Apache Sqoop: Engine integrating with SQL DBs
• Apache Whirr: Deploys Hadoop in the cloud
• Hue: Browser-based interface for Hadoop

Page 29: Piranha vs. mammoth   predator appliances that chew up big data

Cloudera

• Type: Hadoop Engine - Software Based Appliance
• Does the job – See next slide
• Quick and simple setup – Great once first node set
• Quick and easy updates
• Easy control of one or many instances
• Simple Infrastructure requirements
• Reliable underlying system
• No delays doing its job - maybe
• What else?
  – Easy to start as a pilot!
  – Great for old hardware

Page 30: Piranha vs. mammoth   predator appliances that chew up big data

Cloudera

• Can do most of the job requirements
• Exceptions:
  – Process Data faster – depends on allocated resources
  – Process Cheaper – Yes (but cheap HW can be costly)
  – Static or Continuous Analysis – needs more tools
  – Endorsement from Huge Players

Page 31: Piranha vs. mammoth   predator appliances that chew up big data

MapR Special Features (Do You need it?)

• ExpressLane – Small jobs finish quickly (medium)
• Mount / use HDFS over NFS (strategic?)
• NFS, allows data streaming (Important/lock in?)
• Volumes (manage, mirror, snap) – (Important?)
• X times more scalable / faster (lock in?)
• Name Node and Job Tracker HA (claims regular Hadoop has only 1 Name Node) (Medium)
• SW Snapshot/Mirror (Fast? Complex?)

Page 33: Piranha vs. mammoth   predator appliances that chew up big data

TeraData Aster MapR Appliance

• Hadoop is not at the front, MAP Reduce is
• Short learning curve, using current DW tool
• MPP is already built in for scale as part of DW
• Reliability and Performance done by HW
• Connectivity (JDBC, ODBC) to Big Data: Cloudera
• Guess Price is higher than Hadoop solutions
• Platform: SuSE Linux
• Aster Data nCluster Amazon AWS Cloud Edition

Page 34: Piranha vs. mammoth   predator appliances that chew up big data

TeraData Aster MapR Appliance

• Type: Data Warehouse Hardware Engine + API to Hadoop / Analytics
• Does the job – See next slide
• Quick and simple setup –
• Quick and easy updates
• Easy control of one or many instances
• Simple Infrastructure requirements – Specialized HW…
• Reliable underlying system
• No delays doing its job - maybe

Page 35: Piranha vs. mammoth   predator appliances that chew up big data

TeraData Aster MapR Appliance

• Can do most of the job requirements
• Exceptions:
  – Run a pilot to verify it is worth it – probably pricey… unless using the Software / Cloud editions
  – Process Data cheaper – probably not so…
  – Static or Continuous Analysis of Data – Should Excel!
  – Lock-In – probably, not sure how much
• Turn Operational Data into Assets - Should Excel at this…

Page 37: Piranha vs. mammoth   predator appliances that chew up big data

DataDirect Networks Big Data Storage Appliances

• “Science Fiction” I/O Performance
  – Single Array: 40 GB/s and 1.4 Million Flash IOPS
  – Up to 25 FC/InfiniBand hooked arrays: 1 TB/s +
  – More info and pricing
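
As a quick sanity check on the figures above: 25 arrays at 40 GB/s each add up to 1000 GB/s, which matches the quoted 1 TB/s in decimal units. The sketch below assumes ideal linear scaling across arrays, which a real deployment may not reach; the function name is made up for illustration.

```python
def aggregate_throughput_gb_s(per_array_gb_s: float, arrays: int) -> float:
    """Aggregate throughput in GB/s, assuming ideal linear scaling (an assumption)."""
    return per_array_gb_s * arrays


# Figures taken from the slide: 40 GB/s per array, up to 25 arrays.
total = aggregate_throughput_gb_s(40, 25)
print(f"{total:.0f} GB/s = {total / 1000:.1f} TB/s")  # prints "1000 GB/s = 1.0 TB/s"
```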

Page 38: Piranha vs. mammoth   predator appliances that chew up big data

Quantum StorNext Metadata Appliances

• Special additional features:
  – Huge file size support
  – Huge amount of files support
  – Varying Operating System direct access support

Page 39: Piranha vs. mammoth   predator appliances that chew up big data

Hardware Storage “Only”

• Does the job – See next slide
• Quick and simple setup – Once you set the HW
• Quick and easy updates - probably
• Easy control of one or many instances
• Simple Infrastructure requirements – Specialized HW…
• Reliable underlying system
• No delays doing its job

Page 40: Piranha vs. mammoth   predator appliances that chew up big data

Hardware Storage “Only”

• Can do SOME of the job requirements
• Exceptions: Can’t do all those without additional software
  – Run a pilot to verify it is worth it – too costly for a pilot?
  – Process Data faster
  – Process Data cheaper
  – Flexibility and Lock-In prevention

Page 42: Piranha vs. mammoth   predator appliances that chew up big data

Amazon Elastic MapReduce (Amazon EMR)

• Type: Cloud based VMs + Hadoop Engine
• Cost Effective (not always = cheap!)
• Includes Hadoop SW such as MapR including all MapR advanced SW based File Services
• Easily add or remove nodes
  – Pre set VMs
  – Easy mass deployment using AWS console
• HA integrated into Amazon S3
• Hadoop HBase DB as EMR service

Page 43: Piranha vs. mammoth   predator appliances that chew up big data

Google Compute Engine Special Features

• Type: Cloud based VMs + Hadoop Engine
• Based on CentOS (nice – open…)
• Various disk types (all encrypted, fast)
  – Non Persistent (dies with the VM)
  – Persistent – shared + snapshots
  – Cloud based (looks similar to Amazon S3)
• Cheaper than Amazon?

Page 44: Piranha vs. mammoth   predator appliances that chew up big data

Amazon Elastic MapReduce (Amazon EMR)

• Does the job – See next slide
• Quick and simple setup
• Quick and easy updates - probably
• Easy control of one or many instances
• Simple Infrastructure requirements
• Reliable underlying system
• No delays doing its job

Page 45: Piranha vs. mammoth   predator appliances that chew up big data

Amazon Elastic MapReduce (Amazon EMR)

• Can do most of the job requirements
• Exceptions:
  – Extend your Existing Data Warehouse to include Non-Structured Data - Your DW out in the cloud …
  – Run a pilot to verify it is worth it – Excels at this!
  – Process Data faster
  – Process Data cheaper
  – Static or Continuous Analysis of Data
  – Turn Operational Data into Assets – Operational in the Cloud…

Page 46: Piranha vs. mammoth   predator appliances that chew up big data

Cloud Based API with Hooks to Hadoop

• Google APP Engine Map Reduce

• Microsoft Big Data via Windows Azure

Page 47: Piranha vs. mammoth   predator appliances that chew up big data

Google APP Engine Map Reduce

• Open-source library for doing MapReduce on the Google App Engine platform
• Can process data store entities and blob files (probably Google Cloud Storage)
• Both in memory and disk operation
• Scale up or down “working threads”
• Python and Java support
• Experimental, still allows a look into the future…
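
For readers new to the pattern itself, here is a minimal in-memory sketch of map, shuffle, and reduce in plain Python. It does not use the App Engine MapReduce library or its API; the names `map_reduce`, `word_mapper`, and `count_reducer` are invented for this illustration, and a real job would distribute these phases across workers.

```python
from collections import defaultdict


def map_reduce(records, mapper, reducer):
    """Run a tiny single-process MapReduce over an iterable of records."""
    groups = defaultdict(list)
    # Map phase: each record yields zero or more (key, value) pairs.
    for record in records:
        for key, value in mapper(record):
            # Shuffle phase: collect values under their key.
            groups[key].append(value)
    # Reduce phase: collapse each key's values into one result.
    return {key: reducer(key, values) for key, values in groups.items()}


def word_mapper(line):
    """Emit (word, 1) for every whitespace-separated word in a line."""
    for word in line.lower().split():
        yield word, 1


def count_reducer(key, values):
    """Sum the counts emitted for one word."""
    return sum(values)


counts = map_reduce(
    ["big data", "Big appliances chew big data"],
    word_mapper,
    count_reducer,
)
print(counts)
```

The mapper and reducer are passed in as plain functions, so the same driver can be reused for other aggregations by swapping them out.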

Page 48: Piranha vs. mammoth   predator appliances that chew up big data

Google APP Engine Map Reduce

• Does the job – See next slide
• Quick and simple setup – Once you learn the API
• Quick and easy updates
• Easy control of one or many instances
• Simple Infrastructure requirements
• Reliable underlying system – still Beta…
• No delays doing its job

Page 49: Piranha vs. mammoth   predator appliances that chew up big data

Google APP Engine Map Reduce

• Can do SOME of the job requirements
• Exceptions:
  – Extend your Existing Data Warehouse – Cloud Security and DW
  – Run a pilot to verify it is worth it – could be great!
  – Process Data faster
  – Process Data cheaper
  – Static or Continuous Analysis of Data
  – Flexibility and Lock-In prevention – Code is open, but Process may not be
  – Turn Operational Data into Assets – Cloud Security…

Page 50: Piranha vs. mammoth   predator appliances that chew up big data

Microsoft Big Data via Windows Azure

• Provides SQL Server Hadoop Connector
• Provides ODBC Hadoop connector to tie MS Office and other Apps to Hadoop Hive
• Seems similar to DW providers who have connector to Hadoop
  – Reason: It is not clear exactly where and how Azure Cloud Implementation goes…

Page 51: Piranha vs. mammoth   predator appliances that chew up big data

Which Big Data Appliance should YOU use?

Page 52: Piranha vs. mammoth   predator appliances that chew up big data

Which Big Data Appliance should YOU use?

• Let’s look at the Big Data Appliance Job to be Done and ask questions:

• Where are you and what is your goal?
  – So you have some of the puzzle pieces?
  – Any constraints?
  – Long term vs. Short term?
  – (Always start with a Pilot, if this is your first time…)
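
One way to make these questions concrete is a weighted scorecard: weight each "job to be done" by how much it matters to you, score each candidate against it, and compare. The sketch below is only an illustration; the function name, the criterion names, the weights, and the scores for `option_a` and `option_b` are all hypothetical placeholders, not ratings of any real product.

```python
def rank_appliances(weights, scores):
    """Return (name, weighted average score) pairs, best first.

    weights: criterion -> importance (any positive numbers)
    scores:  candidate -> {criterion -> score}; missing criteria count as 0
    """
    total_weight = sum(weights.values())
    ranked = []
    for name, per_criterion in scores.items():
        weighted = sum(w * per_criterion.get(c, 0) for c, w in weights.items())
        ranked.append((name, weighted / total_weight))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights: a first-time team that cares most about piloting.
weights = {"pilot_friendly": 3, "process_cheaper": 2, "lock_in_prevention": 1}

# Hypothetical 1-5 scores for two placeholder candidates.
scores = {
    "option_a": {"pilot_friendly": 5, "process_cheaper": 3, "lock_in_prevention": 2},
    "option_b": {"pilot_friendly": 2, "process_cheaper": 4, "lock_in_prevention": 5},
}

ranking = rank_appliances(weights, scores)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Changing the weights (for example, long term vs. short term priorities) can flip the ranking, which is the point of asking the questions first.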

Page 53: Piranha vs. mammoth   predator appliances that chew up big data

Challenges, Pitfalls, and Winning the Big Data Game

Page 54: Piranha vs. mammoth   predator appliances that chew up big data

Challenges, Pitfalls, and Winning the Big Data Game

• You can’t get much out of Big Data if you don’t know how to find useful insights (Lack of Data Scientists)

• The same abilities you needed for Data Warehouse digging, you need with Big Data, even more

• Commoditization of the data warehouse (Hadoop + Cloud) = More players and innovation

Page 55: Piranha vs. mammoth   predator appliances that chew up big data

Challenges, Pitfalls, and Winning the Big Data Game

• You can’t make use of it if you lack the innovative, quick, agile abilities to change direction and respond on time
• Privacy (implied and specific)
• Security (implied and specific)
• To pay cheap (many X86 nodes) you need a Mass Node Management APP
• Big DW Vendors embrace Hadoop through solution providers such as Cloudera and HortonWorks, but it “feels” a bit “vague”

Page 56: Piranha vs. mammoth   predator appliances that chew up big data

Where is all this leading YOU to?

Page 57: Piranha vs. mammoth   predator appliances that chew up big data

Where is all this leading YOU to?

• The Simple Stuff (I know it looks complicated)
  – Crunching More and Faster for Less
  – Optimizing the Process and Utilizing the right Tools
• The real challenge: Turning Data into an Asset
  – Finding: The Golden Nuggets
  – Deciding: What should I do now?
  – Pitching and leading: The Transformation
• Big Data does not mean Endless Capacity…
• Don’t get lost in the Technology Play Ground

Page 58: Piranha vs. mammoth   predator appliances that chew up big data

Q&A Soon… But First, I need Your Help now…

1. Please rate the Webinar
2. Download the resource attachments for future use
3. Register to my channel on BrightTalk
4. Spread the word
5. Have fun with Big Data and Enjoy Life

Page 59: Piranha vs. mammoth   predator appliances that chew up big data

Questions?

Page 60: Piranha vs. mammoth   predator appliances that chew up big data

Reminder…

1. Please rate the Webinar
2. Download the resource attachments for future use
3. Register to my channel on BrightTalk
4. Spread the word
5. Have fun with Big Data and Enjoy Life

Page 61: Piranha vs. mammoth   predator appliances that chew up big data