Bigger Data For Your Budget

Dave Porter – SproutCore Architect, Appnovation
[email protected]

How to turn your Big Data into Big Insights without breaking the bank

CANADIAN HEADQUARTERS: 152 West Hastings Street, Vancouver BC, V6B 1G8
UNITED STATES OFFICE: 3414 Peachtree Road, #1600, Atlanta Georgia, 30326-1164
UNITED KINGDOM OFFICE: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK

www.appnovation.com
[email protected]

Transcript of Bigger Data For Your Budget

Page 1: Bigger Data For Your Budget

V Dave Porter

Dave Porter – SproutCore Architect, Appnovation

[email protected]

Bigger Data For Your Budget

CANADIAN HEADQUARTERS: 152 West Hastings Street, Vancouver BC, V6B 1G8

UNITED STATES OFFICE: 3414 Peachtree Road, #1600, Atlanta Georgia, 30326-1164

UNITED KINGDOM OFFICE: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK

[email protected]

How to turn your Big Data into Big Insights without breaking the bank

Page 2: Bigger Data For Your Budget

John Kreisa, VP Marketing, Hortonworks

Dave Porter, SproutCore Architect, Appnovation Technologies

Speakers

Page 3: Bigger Data For Your Budget

Appnovation is one of the world’s TOP OPEN SOURCE DEVELOPMENT SHOPS.

Page 4: Bigger Data For Your Budget

LOCATIONS

VANCOUVER OFFICE: 152 West Hastings Street, Vancouver BC, V6B 1G8

ATLANTA OFFICE: 3414 Peachtree Road, #1600, Atlanta Georgia, 30326-1164

LONDON OFFICE: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK

Page 5: Bigger Data For Your Budget

Page 6: Bigger Data For Your Budget

Bigger Data For Your Budget

Page 7: Bigger Data For Your Budget

Databases

Server logs

Raw transactional data

Human-Quality Input

WHAT IS BIG DATA?

Page 8: Bigger Data For Your Budget

Website Traffic Patterns

Financial Transactions

Science

People

WHERE IS IT COMING FROM?

Page 9: Bigger Data For Your Budget

Page 10: Bigger Data For Your Budget

Curing Cancer

Beating XDR-TB

Finding Earth 2.0 in Outer Space

Seeing Deeper Into Your Business

THE PROMISE OF BIG DATA

Page 11: Bigger Data For Your Budget

THE PROMISE OF BIG DATA

Page 12: Bigger Data For Your Budget

Retail Inventory System

WHAT CAN BIG DATA DO FOR ME?

Page 13: Bigger Data For Your Budget

Retail Inventory System

Overnight Batch Cycle

WHAT CAN BIG DATA DO FOR ME?

Page 14: Bigger Data For Your Budget

Retail Inventory System

Hourly Cycle

WHAT CAN BIG DATA DO FOR ME?

Page 15: Bigger Data For Your Budget

Collecting & Storing

Processing & Analyzing

THE BIG DATA CHALLENGES

Page 16: Bigger Data For Your Budget

Collecting & Storing … on expensive hardware

Processing & Analyzing … with expensive software

THE BIG DATA CHALLENGES

Page 17: Bigger Data For Your Budget

Bigger Data For Your Budget

Page 18: Bigger Data For Your Budget

Open Source Software, Running on Commodity Hardware.

BIGGER DATA FOR YOUR BUDGET

Page 19: Bigger Data For Your Budget

BIGGER DATA FOR YOUR BUDGET

Page 20: Bigger Data For Your Budget

Gnomes … with flashlights (and notepads)

HADOOP: BIGGER DATA FOR YOUR BUDGET

Page 21: Bigger Data For Your Budget

HADOOP: BIGGER DATA FOR YOUR BUDGET

Page 22: Bigger Data For Your Budget

© Hortonworks Inc. 2013

A Brief History of Apache Hadoop

Timeline axis: 2004, 2006, 2008, 2010, 2012, 2013

• Focus on INNOVATION, 2005: Yahoo! creates team under E14 to work on Hadoop
• Apache Project Established
• Focus on OPERATIONS, 2008: Yahoo team extends focus to operations to support multiple projects & growing clusters
• Yahoo! begins to operate at scale
• STABILITY, 2011: Hortonworks created to focus on “Enterprise Hadoop”. Starts with 24 key Hadoop engineers from Yahoo
• Enterprise Hadoop: Hortonworks Data Platform

Page 23: Bigger Data For Your Budget

Hortonworks Snapshot

• We distribute the only 100% Open Source Enterprise Hadoop Distribution: Hortonworks Data Platform

• We engineer, test & certify HDP for enterprise usage

• We employ the core architects, builders and operators of Apache Hadoop

• We drive innovation within Apache Software Foundation projects

• We are uniquely positioned to deliver the highest quality of Hadoop support

• We enable the ecosystem to work better with Hadoop

Develop Distribute Support

We develop, distribute and support the ONLY 100% open source Enterprise Hadoop distribution

Endorsed by Strategic Partners

Headquarters: Palo Alto, CA

Employees: 180+ and growing

Investors: Benchmark, Index, Yahoo

Page 24: Bigger Data For Your Budget

Hortonworks Process for Enterprise Hadoop

Upstream Community Projects:
• Apache Hadoop
• Apache Pig
• Apache Hive
• Apache HBase
• Apache HCatalog
• Apache Ambari
• Other Apache Projects

Upstream cycle: Design & Develop, Test & Patch, Release

Downstream Enterprise Product: Hortonworks Data Platform

Downstream cycle: Design & Develop, Integrate & Test, Package & Certify, Distribute

Stable Project Releases flow downstream; Fixed Issues flow upstream

No Lock-in: Integrated, tested & certified distribution lowers risk by ensuring close alignment with Apache projects

Virtuous cycle when development & fixed issues are done upstream & stable project releases flow downstream

Page 25: Bigger Data For Your Budget

Enhancing the Core of Apache Hadoop

Deliver high-scale storage & processing with enterprise-ready platform services

Unique Focus Areas:
• Bigger, faster, more flexible: Continued focus on speed & scale and enabling near-real-time apps
• Tested & certified at scale: Run ~1300 system tests on large Yahoo clusters for every release
• Enterprise-ready services: High availability, disaster recovery, snapshots, security, …

Hortonworkers are the architects, operators, and builders of core Hadoop

Diagram layers:
• HADOOP CORE: Distributed Storage & Processing
• PLATFORM SERVICES: Enterprise Readiness
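The "Distributed Storage & Processing" layer of Hadoop core is built around the MapReduce programming model. As a rough illustration only, the sketch below runs the three MapReduce phases (map, shuffle, reduce) in a single Python process; it does not use Hadoop itself. The log layout (`path status`) and every function name here are assumptions made up for the example, not something taken from the slides.

```python
from collections import defaultdict

def map_phase(line):
    """Emit (status_code, 1) for one log line. The 'path status' layout is assumed."""
    parts = line.split()
    if len(parts) >= 2:
        yield parts[1], 1

def shuffle(pairs):
    """Group all emitted values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)

def run_job(lines):
    """Run map, shuffle and reduce in sequence over an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())

# Hypothetical sample of server log lines.
SAMPLE_LOG = [
    "/index.html 200",
    "/missing 404",
    "/index.html 200",
]

status_counts = run_job(SAMPLE_LOG)
```

On a real cluster the map and reduce functions run in parallel on many commodity machines, each working on the slice of data stored locally; the single-process version only shows the shape of the computation.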

Page 26: Bigger Data For Your Budget

Data Services for Full Data Lifecycle

Provide data services to store, process & access data in many ways

Unique Focus Areas:
• Apache HCatalog: Metadata services for consistent table access to Hadoop data
• Apache Hive: Explore & process Hadoop data via SQL & ODBC-compliant BI tools

Hortonworks enables Hadoop data to be accessed via existing tools & systems

Diagram layers:
• DATA SERVICES: Store, Process and Access Data
• HADOOP CORE: Distributed Storage & Processing
• PLATFORM SERVICES: Enterprise Readiness

Page 27: Bigger Data For Your Budget

Operational Services for Ease of Use

Include complete operational services for productive operations & management

Unique Focus Area:
• Apache Ambari: Provision, manage & monitor a cluster; complete REST APIs to integrate with existing operational tools; job & task visualizer to diagnose issues

Only Hortonworks provides a complete open source Hadoop management tool

Diagram layers:
• OPERATIONAL SERVICES: Manage & Operate at Scale
• DATA SERVICES: Store, Process and Access Data
• HADOOP CORE: Distributed Storage & Processing
• PLATFORM SERVICES: Enterprise Readiness
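To make the "REST APIs to integrate with existing operational tools" point concrete, here is a minimal sketch of how an operational script might prepare a request that lists the clusters an Ambari server manages. It assumes Ambari's `/api/v1/clusters` endpoint, HTTP basic authentication, and the default port 8080; the host name and credentials are placeholders, and the helper name is hypothetical. The request is only built, not sent, so the sketch runs without a server.

```python
import base64
import urllib.request

def build_cluster_list_request(host, user, password, port=8080):
    """Build (but do not send) a GET request for the clusters known to an Ambari server.

    Assumes the /api/v1/clusters endpoint and HTTP basic auth; adjust for your setup.
    """
    url = f"http://{host}:{port}/api/v1/clusters"
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

# Placeholder host and credentials, for illustration only.
request = build_cluster_list_request("ambari.example.com", "admin", "admin")

# Against a live server, the request could be sent with:
#     with urllib.request.urlopen(request) as response:
#         body = response.read()
```

Because the interface is plain HTTP and JSON, the same call can be made from whatever monitoring or provisioning tool is already in place.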

Page 28: Bigger Data For Your Budget

Deployable Across a Range of Options

HORTONWORKS DATA PLATFORM (HDP) layers:
• OPERATIONAL SERVICES: Manage & Operate at Scale
• DATA SERVICES: Store, Process and Access Data
• HADOOP CORE: Distributed Storage & Processing
• PLATFORM SERVICES: Enterprise Readiness

Deployment options: OS, Cloud, VM, Appliance
• Linux & Windows
• Azure, Rackspace & other clouds
• Virtual platforms
• Big data appliances

Only Hortonworks allows you to deploy seamlessly across any deployment option

Page 29: Bigger Data For Your Budget

HDP: Enterprise Hadoop Distribution

Hortonworks Data Platform (HDP): Enterprise Hadoop
• The ONLY 100% open source and complete distribution
• Enterprise grade, proven and tested at scale
• Ecosystem endorsed to ensure interoperability

HORTONWORKS DATA PLATFORM (HDP) layers:
• OPERATIONAL SERVICES: Manage & Operate at Scale
• DATA SERVICES: Store, Process and Access Data
• HADOOP CORE: Distributed Storage & Processing
• PLATFORM SERVICES: Enterprise Readiness

Deployment options: OS, Cloud, VM, Appliance

Page 30: Bigger Data For Your Budget

Existing Data Architecture

Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DEV & DATA TOOLS: BUILD & TEST
• OPERATIONAL TOOLS: MANAGE & MONITOR
• DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP)
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS

Page 31: Bigger Data For Your Budget

An Emerging Data Architecture

Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DEV & DATA TOOLS: BUILD & TEST
• OPERATIONAL TOOLS: MANAGE & MONITOR
• DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); HORTONWORKS DATA PLATFORM
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS; New Sources (web logs, email, sensor data, social media); MOBILE DATA

Page 32: Bigger Data For Your Budget

Interoperating With Your Tools

Diagram layers:
• APPLICATIONS
• DEV & DATA TOOLS
• OPERATIONAL TOOLS
• DATA SYSTEMS: TRADITIONAL REPOS; HORTONWORKS DATA PLATFORM
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); OLTP, POS SYSTEMS; New Sources (web logs, email, sensor data, social media); MOBILE DATA

Other labels: Viewpoint, Microsoft Applications

Page 33: Bigger Data For Your Budget

Hadoop Patterns of Use

Big Data: Transactions, Interactions, Observations

HORTONWORKS DATA PLATFORM: Refine, Explore, Enrich

Business Case

Page 34: Bigger Data For Your Budget

Operational Data Refinery

Pattern: Refine (of Refine, Explore, Enrich)

Collect data and apply a known algorithm to it in trusted operational process

1. Capture: Capture all data
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Push to existing data warehouse for use with existing analytic tools

Diagram layers:
• APPLICATIONS: Business Analytics, Custom Applications, Enterprise Applications
• DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); HORTONWORKS DATA PLATFORM
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)
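The three refinery steps (Capture, Process, Exchange) can be sketched as a small pipeline. This is only an in-memory illustration of the pattern, not how Hadoop or HDP implements it: the raw event format, the field names, and the choice of CSV as the hand-off format to the warehouse are all assumptions invented for the example.

```python
import csv
import io

# Hypothetical raw events; the second entry is deliberately malformed.
RAW_EVENTS = [
    "2013-05-01T10:00:00 store=12 sku=A100 qty=3",
    "garbage line",
    "2013-05-01T10:05:00 store=12 sku=B200 qty=1",
]

FIELDNAMES = ["timestamp", "store", "sku", "qty"]

def capture(source):
    """Step 1, Capture: keep every raw record, including ones that may be unusable."""
    return list(source)

def process(raw_lines):
    """Step 2, Process: parse, cleanse (drop malformed lines) and apply structure."""
    rows = []
    for line in raw_lines:
        parts = line.split()
        if len(parts) != 4:
            continue  # cleanse: skip records that do not match the assumed layout
        timestamp, *pairs = parts
        fields = dict(pair.split("=", 1) for pair in pairs)
        rows.append({
            "timestamp": timestamp,
            "store": int(fields["store"]),
            "sku": fields["sku"],
            "qty": int(fields["qty"]),
        })
    return rows

def exchange(rows):
    """Step 3, Exchange: serialize structured rows as CSV for a warehouse bulk load."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDNAMES, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()

warehouse_csv = exchange(process(capture(RAW_EVENTS)))
```

Keeping the capture step free of any filtering mirrors the "Capture all data" idea: cleansing decisions live in the process step, so they can be changed and re-run later against the same raw records.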

Page 35: Bigger Data For Your Budget

Big Data Exploration & Visualization

Pattern: Explore (of Refine, Explore, Enrich)

Collect data and perform iterative investigation for value

1. Capture: Capture all data
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Explore and visualize with analytics tools supporting Hadoop

Diagram layers:
• APPLICATIONS: Business Analytics
• DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); HORTONWORKS DATA PLATFORM
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)

Page 36: Bigger Data For Your Budget

Application Enrichment

Pattern: Enrich (of Refine, Explore, Enrich)

Collect data, analyze and present salient results for online apps

1. Capture: Capture all data
2. Process: Parse, cleanse, apply structure & transform
3. Exchange: Incorporate data directly into applications

Diagram layers:
• APPLICATIONS: Custom Applications, Enterprise Applications
• DATA SYSTEMS: TRADITIONAL REPOS (RDBMS, EDW, MPP); NOSQL; HORTONWORKS DATA PLATFORM
• DATA SOURCES: Traditional Sources (RDBMS, OLTP, OLAP); New Sources (web logs, email, sensor data, social media)

Page 37: Bigger Data For Your Budget

John Kreisa, VP Marketing, Hortonworks

Dave Porter, SproutCore Architect, Appnovation Technologies

Speakers

Page 38: Bigger Data For Your Budget

Next Steps

Hortonworks.com/sandbox

Hortonworks.com/hadoop-training

@Appnovation

[email protected] [email protected]

@hortonworks

@hortonworks_U

Appnovation.com/Blog

Blog

LEARN

Page 39: Bigger Data For Your Budget

Thank You For Your Participation!

CANADIAN HEADQUARTERS: 152 West Hastings Street, Vancouver BC, V6B 1G8

UNITED STATES OFFICE: 3414 Peachtree Road, #1600, Atlanta Georgia, 30326-1164

UNITED KINGDOM OFFICE: 3000 Hillswood Drive, Hillswood Business Park, Chertsey KT16 0RS, UK

[email protected]