Barak regev

Big Data: Turning your data problem into a competitive advantage. Barak Regev, Head of Cloud Platform - EMEA

Page 1: Barak regev

Big Data Turning your data problem into a competitive advantage

Barak Regev, Head of Cloud Platform - EMEA

Page 2: Barak regev

Managing big data is hard

20 min in 1 minute

Put your data to work for you.

There is a better way

Page 3: Barak regev

How we do it - Google Infrastructure

4 billion hours of video per month

425 million Gmail users

100,000,000 GB web Index

0.25 secs to search results

Page 4: Barak regev

Defining Big Data

Practical problems & opportunities

Page 5: Barak regev

“How do hotel reservations for Spain from New York compare with this time last year?”

“Do we need to adjust our marketing campaign? Where?”

CenterParcs - European hospitality
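The year-over-year question above can be sketched as a small filter-and-count routine. This is only an illustration: the record layout, the sample bookings, and the function names `count_reservations` and `year_over_year` are assumptions made for the example, not part of the CenterParcs solution, which the slides do not describe in detail.

```python
from datetime import date

# Hypothetical reservation records: (booking_date, origin_city, destination_country)
reservations = [
    (date(2012, 5, 3), "New York", "Spain"),
    (date(2012, 5, 17), "New York", "Spain"),
    (date(2012, 5, 20), "London", "Spain"),
    (date(2013, 5, 2), "New York", "Spain"),
    (date(2013, 5, 9), "New York", "Spain"),
    (date(2013, 5, 28), "New York", "Spain"),
    (date(2013, 5, 30), "New York", "France"),
]


def count_reservations(records, origin, destination, year, month):
    """Count bookings for one origin/destination pair in a given month."""
    return sum(
        1
        for booked, org, dest in records
        if org == origin
        and dest == destination
        and booked.year == year
        and booked.month == month
    )


def year_over_year(records, origin, destination, year, month):
    """Return (this_year, last_year, relative_change) for the given month."""
    current = count_reservations(records, origin, destination, year, month)
    previous = count_reservations(records, origin, destination, year - 1, month)
    # Relative change is undefined when there were no bookings last year.
    change = (current - previous) / previous if previous else None
    return current, previous, change


current, previous, change = year_over_year(reservations, "New York", "Spain", 2013, 5)
print(current, previous, change)
```

At the data volumes discussed in this talk, the same filter and count would be expressed as a query against a hosted service rather than run in memory; the sketch only shows the shape of the comparison.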

Page 6: Barak regev

“Which users who signed up last quarter have also advanced at least 3 levels and purchased an item worth more than $5?”

Claritics - mobile & social user analytics
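The question above combines three conditions on each user. A minimal sketch of that cohort filter follows, assuming a hypothetical `User` record with a signup date, a count of levels advanced, and the price of the most expensive item bought; none of these field names come from the Claritics product.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class User:
    user_id: str
    signup_date: date
    levels_advanced: int
    max_purchase_usd: float  # price of the most expensive item bought


def in_quarter(d, year, quarter):
    """True if date d falls in the given calendar quarter (1-4) of year."""
    return d.year == year and (d.month - 1) // 3 + 1 == quarter


def select_cohort(users, year, quarter, min_levels=3, min_purchase_usd=5.0):
    """Return ids of users who signed up in the quarter, advanced at least
    min_levels levels, and bought an item worth more than min_purchase_usd."""
    return [
        u.user_id
        for u in users
        if in_quarter(u.signup_date, year, quarter)
        and u.levels_advanced >= min_levels
        and u.max_purchase_usd > min_purchase_usd
    ]


# Hypothetical sample data
users = [
    User("a", date(2012, 8, 10), 4, 9.99),   # matches all three conditions
    User("b", date(2012, 9, 1), 2, 20.00),   # too few levels
    User("c", date(2012, 7, 15), 5, 5.00),   # purchase not above $5
    User("d", date(2012, 4, 2), 6, 12.00),   # signed up in a different quarter
]

print(select_cohort(users, 2012, 3))
```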

Page 7: Barak regev

Business & IT trends driving Big Data

Opportunities

Data is a core business asset

Increasingly data is out in the Cloud (e.g. social, CRM)

New things are possible in the Cloud (unique algorithms, scale)

Greatly increased speed of sharing and iteration

Challenges

Information is growing faster than the ability to leverage it

Tough for enterprises to capture all the data they generate

Scaling traditional BI for Big Data can be hard

Skills: requires IT, analytics, software development

Page 8: Barak regev

What does Big Data look like?

Diverse industries

Retail point of sales transactions

User activity logs (mobile & social)

Mobile telemetry & smart devices

Industrial & manufacturing

Financial trading

Medical research (e.g. genomics)

Movie rendering & production

Some common characteristics

Structured, semi-structured, unstructured

Millions if not billions of rows

Too large to process on a single machine

Too large to store on a single machine

High rate of growth

More daily

Page 9: Barak regev

Put the Data to work

Google cloud services for Big Data

Page 10: Barak regev

Composable cloud services

Focus on the solution rather than on the infrastructure

Do new things that weren't possible before

Pay for what you use.

Use the cloud

Page 11: Barak regev

BIG DATA LOG ANALYSIS

Data sources: POS, Clickstream, RFID, Customer Loyalty, Ad clickthroughs; Corporate data, 3rd party data

Store all your data in the cloud: Scalable Storage

Analyze interactively (Product Affinity, Market Basket etc.): BigQuery, SQL, API

Securely share/distribute the results: Google Spreadsheets, App Engine App, Other BI Tools, Data sets for further analysis

Marketing, Merchandising, Local Stores, Partners

Page 12: Barak regev

Scaling large ads reporting

Customer load test: On-prem MySQL vs BigQuery

[Chart: latency (seconds) vs. # days of data]

Business: ads authoring tools and reporting

Data: ad serving logs for 500 websites, ~300M rows/day

Problem solved: interactively finding new trends and patterns

Page 13: Barak regev

A New Hadoop Terasort World Record

Page 14: Barak regev

What did we learn?

Store data with reliability, redundancy and consistency

Go from Data to Meaning

At Scale

...fast

Google white papers

Google File System (2003)

MapReduce: Simplified Data Processing on Large Clusters (2004)

BigTable: A Distributed Storage System for Structured Data (2006)

Dremel: Interactive Analysis of Web-Scale Datasets (2010)

Machine Translation (2004-2011)

Page 15: Barak regev

The virtuous cycle of data

Build application (GAE / GCE)

Collect Data (Cloud Storage, Datastore, Logstore)

Process Data (App Engine, GCE)

Analyze Data (BigQuery)

(improve)

Page 16: Barak regev

Build the Next Generation of Data-Centric Applications

Page 17: Barak regev

BigQuery use cases in industry

Ad Spend Attribution (online travel reservations): Mash up AdWords + Google Analytics data + customer reservations for high volume attribution analysis

Media consulting (global top-5 media agency): Analyze 20GB/day of DoubleClick display ads performance metrics for F500 clients

Ad authoring tools (online ads authoring): Deliver x-platform performance analytics dashboards to 100s of ads authoring customers

Revenue optimization (holiday/travel properties): Measure x-media campaign effectiveness to maximize occupancy rates

Social gaming (data analytics vendor): Cohort analysis on million+ gamers to monetize massive online social gaming

Business Requirements

A single place to capture growing data

Combine data from different sources

Ad hoc detection of patterns and correlations

Easily share data insights with org

Distribute data-based decision making

Page 18: Barak regev

Interactively analyze 450M rows of sales data

BIME + BigQuery

Page 19: Barak regev

Mobile & social gaming user analysis

Notice trend change

Slice user data, identify segments

Compare segments vs general population
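The slice-and-compare step above can be sketched as follows. The record fields (`platform`, `country`, `revenue`), the sample values, and the helper name `segment_vs_population` are assumptions chosen for illustration; the slides do not say which attributes or metrics were actually used.

```python
from statistics import mean

# Hypothetical per-user records
users = [
    {"platform": "ios", "country": "US", "revenue": 4.0},
    {"platform": "ios", "country": "DE", "revenue": 8.0},
    {"platform": "android", "country": "US", "revenue": 2.0},
    {"platform": "android", "country": "DE", "revenue": 2.0},
]


def segment_vs_population(records, metric, **filters):
    """Compare the mean of `metric` in a segment with the whole population.

    `filters` are field=value pairs that define the segment.
    Returns (segment_mean, population_mean, ratio).
    """
    segment = [r for r in records if all(r[k] == v for k, v in filters.items())]
    if not segment:
        raise ValueError("no records match the segment filters")
    segment_mean = mean(r[metric] for r in segment)
    population_mean = mean(r[metric] for r in records)
    # Assumes a non-zero population mean; a ratio is meaningless otherwise.
    return segment_mean, population_mean, segment_mean / population_mean


print(segment_vs_population(users, "revenue", platform="ios"))
```

A ratio above 1 means the segment outperforms the general population on the chosen metric, which is the kind of signal the trend analysis on this slide looks for.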

Page 20: Barak regev

Revenue optimization - hospitality industry

New solution for real-time decision making

Saves more than $150,00 a year

[Diagram components: Oracle DB, Netezza appliance, Cloud Storage, BigQuery, App Engine; audiences: Analysts, Execs, BI team, Regional Sales]

Page 21: Barak regev

cloud.google.com

Thank you