Selling Big Datadownload.microsoft.com/download/8/8/1/881A3C2B-343B-48D2...MapReduce (Job...


Page 1

Greg Pedley Canadian Sales Lead

Big Data, Small Data, ALL Data with

Page 2

Agenda

• Introductions

• Handling the Data Deluge!

• Modern Data Warehouse

• Hadoop

• Polybase Demonstration

• What is SQL PDW?

• PDW Customer Use Case

• Resources

• Q & A

Page 3

Today we have more data than ever, but it has never been harder to understand it.

• CUSTOMER-DRIVEN DATA: POS data, loyalty data, etc.

• SOCIAL CHANNELS: customer preferences & brand perception

• INTERNAL SYSTEMS: profitability & segmentation data

Page 4

Data is complex, time-consuming & hard to get at…

• Quantity: an explosion of data

• Integration: data locked in silos

• Quality: data quality is not reliable

• Action: slow to get value from data

Page 5

… but at the same time, it has never been more important to understand massive amounts of data.

Page 6

Wouldn’t it be good if you could do the following?

Page 7

Traditional Data Warehouses At A Tipping Point

Page 8

Difficulties with Data Warehousing Today…

[Diagram labels: Operational Systems, Enterprise Data Warehouse, Data Marts, Business Intelligence]

1. Get the data model right, up front.

2. Load, clean, transform data fast.

3. Improve query performance from hours to seconds.

4. Manage multiple types of data.

Page 9

…and with the Modern Data Warehouse

[Diagram labels: Files, Documents, Blobs, Cube, SQL, Traditional, Relational, Business Intelligence]

Page 10

How is Microsoft Unique?

[Diagram labels: Business Intelligence, SQL Query, Polybase, Cloud, Appliance]

Page 11

Types of Big Data?

[Chart labels: Petabytes; Big Data; Data complexity: variety and velocity]

• Log files

• Spatial & GPS coordinates

• Data market feeds

• eGov feeds

• Weather

• Text/image

• Click stream

• Wikis/blogs

• Sensors/RFID/devices

• Social sentiment

• Audio/video

Page 12

What is Hadoop?

Hadoop = MapReduce + HDFS

• MapReduce (Job Scheduling/Execution System)

• HDFS (Hadoop Distributed File System)

• HBase (Column DB)

• Hive

• Mahout

• Oozie

• Sqoop

• HBase/Cassandra/Couch/MongoDB

• Avro

• ZooKeeper

• Pig

• Flume

• Cascading

• R

• Ambari

• HCatalog
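
To make "Hadoop = MapReduce + HDFS" a little more concrete, here is a minimal sketch of the MapReduce idea in plain Python. It is not Hadoop code and uses no Hadoop API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample input are made up for illustration, and the nested lists only stand in for file blocks that HDFS would spread across machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key, as a framework would."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single count."""
    return key, sum(values)


def word_count(blocks):
    # Each block stands in for one chunk of a file in distributed storage.
    mapped = chain.from_iterable(
        map_phase(line) for block in blocks for line in block
    )
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


# Hypothetical sample input: two "blocks" of one line each.
blocks = [
    ["big data small data"],
    ["all data"],
]
counts = word_count(blocks)
print(counts)
```

In a real cluster the map and reduce steps would run on many machines close to the data; here everything runs in one process purely to show the three phases.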

Page 13

SQL Polybase Demonstration….

Page 14

What is SQL Parallel Data Warehouse?

• PDW = Parallel Data Warehouse

• Massively Parallel Processing (MPP) for high performance

• Sold as an appliance with software preloaded

• Microsoft software running on HP or Dell hardware

• Based on proven MS SQL Server 2012 platform

• Lowest cost of ownership in the industry

• Integral Part of Microsoft’s BIG DATA & Cloud Strategies

• ****Dedicated Region for Hadoop****
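
As a rough illustration of the general Massively Parallel Processing idea mentioned above, the sketch below spreads rows across several workers by hashing a key, lets each worker aggregate its own share, and then merges the partial results. This is an assumption-laden toy, not how PDW is implemented: `NUM_NODES`, the function names, and the sales rows are all hypothetical, and threads in one process stand in for separate compute nodes.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
import zlib

NUM_NODES = 4  # hypothetical number of compute nodes


def node_for(key):
    """Pick a node by hashing the distribution key (deterministic CRC32)."""
    return zlib.crc32(key.encode("utf-8")) % NUM_NODES


def distribute(rows):
    """Spread (store, amount) rows across nodes by hashing the store name."""
    nodes = [[] for _ in range(NUM_NODES)]
    for store, amount in rows:
        nodes[node_for(store)].append((store, amount))
    return nodes


def local_sum(rows):
    """Work done independently on one node: total sales per store."""
    totals = Counter()
    for store, amount in rows:
        totals[store] += amount
    return totals


def parallel_sales_by_store(rows):
    nodes = distribute(rows)
    with ThreadPoolExecutor(max_workers=NUM_NODES) as pool:
        partials = list(pool.map(local_sum, nodes))
    # Final step: merge the per-node partial results into one answer.
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return dict(merged)


# Hypothetical sample data.
sales = [("Toronto", 120), ("Calgary", 80), ("Toronto", 30), ("Halifax", 50)]
result = parallel_sales_by_store(sales)
print(result)
```

Because every row for a given store hashes to the same node, each node can finish its totals without talking to the others, which is the property that lets this style of system scale by adding nodes.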

Page 15

Scale out relational data to petabytes

Scale out technologies in SQL Server Parallel Data Warehouse

Page 16

Scale out non-relational data

Scale out non-relational data in HDInsight (Azure, Windows, or PDW)

Page 17

In-memory performance

Page 18

Distributed Data Warehouse Architecture

[Diagram labels:]

• Central EDW Hub

• High-Performance Reporting

• Departmental Reporting

• SQL Server Analysis Services

• Data Files

• 3rd Party Data Integration

• ETL Tools

• 3rd Party RDBMS

• SQL Database

• Accessible from Anywhere

Page 19

Page 20

Page 21

Resources

• www.UpgradeToPDW.com for a quick video, a downloadable white paper, ROI Calculator, case studies, migration guide, etc.

• Greg Pedley – Canadian Sales Lead – [email protected]

• Tom Pizzato – PDW Technical Lead – [email protected]

Page 22

Conclusion

• Introductions

• Handling the Data Deluge!

• Modern Data Warehouse

• Hadoop

• Polybase Demonstration

• What is SQL PDW?

• PDW Customer Use Case

• Resources

• Q & A

Page 23