Hadoop in the Cloud

The Elephant in the Room Jim O’Neil Developer Evangelist, Microsoft [email protected] @jimoneil

Transcript of Hadoop in the Cloud

Page 1: Hadoop in the Cloud

The Elephant in the Room

Jim O’Neil Developer Evangelist, Microsoft

[email protected] @jimoneil

Page 2: Hadoop in the Cloud

DIY

Roll your own Hadoop cluster... welcome to DevOps

Pallet

“Isotope”

Page 3: Hadoop in the Cloud

Appliances

Oracle Big Data Appliance
– 18 servers / 12 cores each / 40 Gb InfiniBand
– Partnering with Cloudera on the distribution

Greenplum HD Data Computing Appliance
– 18 nodes, 12 cores each
– Straight-up Apache Hadoop

NetApp Open Solution for Hadoop
– Storage arrays only (E2660 and FAS2040)
– Partnership with Cloudera

Page 4: Hadoop in the Cloud

The Elephant in the Cloud

Jim O’Neil Developer Evangelist, Microsoft

[email protected] @jimoneil

Page 5: Hadoop in the Cloud

Cloud: a Notional Definition

Essential Characteristics

On-demand self-service

Broad network access

Resource Pooling

Rapid Elasticity

Measured service

Service Models

Infrastructure as a Service

Platform as a Service

Software as a Service

Deployment Models

Private Cloud

Public Cloud

Hybrid Cloud

Community Cloud

Page 6: Hadoop in the Cloud

Hadoop in the Cloud

Google App Engine

appengine-mapreduce API (not really Hadoop)

Amazon Web Services

66 public AMIs (including Cloudera)

Elastic MapReduce

Windows Azure

Hadoop on Azure

IBM SmartCloud

InfoSphere BigInsights

Page 7: Hadoop in the Cloud

Google App Engine

MapreducePipeline Class

Experimental!

Mapreduce is an experimental, innovative, and rapidly changing new feature for App Engine. Unfortunately, being on the bleeding edge means that we may make backwards-incompatible changes to Mapreduce. We will inform the community when this feature is no longer experimental.
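For readers new to the programming model that all of these services expose, here is a minimal sketch of the map, shuffle, and reduce steps as a word count in plain Python. It is not the appengine-mapreduce API and not Hadoop; the function names (map_words, shuffle, reduce_counts, word_count) are hypothetical and chosen only for illustration. It assumes Python 3.9 or later.

```python
from collections import defaultdict
from typing import Iterable, Iterator


def map_words(line: str) -> Iterator[tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[tuple[str, int]]) -> dict[str, list[int]]:
    """Shuffle step: group all emitted values by their key."""
    grouped: dict[str, list[int]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_counts(key: str, values: list[int]) -> tuple[str, int]:
    """Reduce step: collapse the values for one key into a single total."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> dict[str, int]:
    """Run map, shuffle, and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_words(line))
    return dict(reduce_counts(k, v) for k, v in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["the elephant in the room", "the elephant in the cloud"]))
```

A real cluster runs the map and reduce steps in parallel across many nodes and performs the shuffle over the network; the sketch keeps everything in one process so the data flow is easy to follow.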

Page 8: Hadoop in the Cloud

Amazon EMR

Page 9: Hadoop in the Cloud

Windows Azure

http://HadoopOnAzure.com

Currently in Customer Technology Preview

Partnership with Hortonworks

Windows updates to Apache

JavaScript framework

Hive ODBC connector

Page 10: Hadoop in the Cloud

IBM SmartCloud

InfoSphere BigInsights

IBM distribution of Hadoop (0.20.2)

Jaql query language

BigSheets

BigInsights Scheduler

“Hadoop ecosystem”

Hive, Avro, HBase, Pig, Oozie, Flume

Page 11: Hadoop in the Cloud

Jim O’Neil Developer Evangelist, Microsoft

[email protected] @jimoneil

I meant what I said, and I said what I meant.

An elephant's faithful, one hundred percent.