New Analytical Architectures for Big Data

Post on 07-Dec-2014


Why Classic Data Warehousing Architectures miss the mark with the New Analytics


New Analytical Architectures

March 21, 2013

Casey Kiernan • casey.l.kiernan@gmail.com

Blog • www.the-data-platform.com

Why Classic Data Warehousing Approaches Miss the Mark with Big Data

Doug Cutting: “Hadoop is the kernel of a new Distributed Data OS”

“The Future is Data”

Transactional

Communities

Personal

Data has Changed

> Trailing Indicators

> Reach/Influence

> Interactive

> Analytics has Changed

Can the Data Warehouse Architecture adapt?

The World as I See it

“Data” is the Platform

New DataClutch Analytics

Wink Eller

My Mountain Bike

Guidance

Performance: Rate of Climb, Calories Burned, Miles Obtained, Total Climbed, Elapsed Time

Current, Average, Max Values
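The "current, average, max" readout can be illustrated with a small running-statistics sketch. This is only an assumption about how such a display might be computed; the `RunningStat` class and the sample heart-rate values are hypothetical and do not come from the slides or from any bike-computer API.

```python
from dataclasses import dataclass


@dataclass
class RunningStat:
    """Tracks current, average, and max for one sensor channel."""
    current: float = 0.0
    total: float = 0.0
    count: int = 0
    maximum: float = float("-inf")

    def update(self, value: float) -> None:
        # Record the newest reading and fold it into the running totals.
        self.current = value
        self.total += value
        self.count += 1
        self.maximum = max(self.maximum, value)

    @property
    def average(self) -> float:
        # Mean of all readings seen so far; 0.0 before the first reading.
        return self.total / self.count if self.count else 0.0


# Hypothetical heart-rate samples in beats per minute.
heart_rate = RunningStat()
for bpm in [120, 135, 150, 141]:
    heart_rate.update(bpm)

print(heart_rate.current, heart_rate.average, heart_rate.maximum)
```

One such object per channel (speed, cadence, heart rate, altitude) would be enough to drive the readout without storing the full ride history.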

Data Collection: Speed / Trip Miles

Data Collection: Cadence / RPM

Data Collection: Heart Rate

Data Collection: Altitude, Temperature, Time

Data Architecture - on a Local Wireless Network (ANT+ Protocol)

as a Data Platform

“Personal” Ride Analytics

…is this a Data Warehouse?

Behaviors

Content

Progression of Behaviors

New Data Behaviors (individual actions) > Content > Time

Time Variance


Guidance

Data

Meaningful

Massive

New Data: More is Better…

BUSINESS INTELLIGENCE

OLAP / DATA WAREHOUSE

OLTP / TRANSACTIONS

DATA

“Business” Analytics - Classic “DW”

Answers the question: What are our most profitable Products?


What will Happen?

What did Happen?

Strategic, Tactical Trending, Operational Reporting

Months Weeks Weeks Months Years

Classic “Business” Analytics: Good for Reporting, Forecasting

Descriptive/Trending Analytics

New “Personal” Analytics

Answers the question: Show me a good movie to watch!

DATA.

SELF-SERVICE

GUIDANCE

BEHAVIOURS

Strategic, Tactical Trending, Operational Reporting


What will Happen?

What did Happen?

Months Weeks Weeks Months Years

What is Happening RIGHT NOW!

“Personal” Analytics: “Right Now” is a very important time-frame!

Predictive/Prescriptive Analytics


Ordering App

Data Warehouse: OLTP to OLAP Mapping

OLAP / Reports

Facts/Dimensions

Financial App

Master Data

Business Analyst

What are our most Profitable Products?

Staging

“Business” Analytical Architecture

Classic “DW” Data Flow - Uni-Directional, Latent,…

Business Metrics, KPI, YTD Reporting

Facts & Dimensions
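To make the facts-and-dimensions idea concrete, here is a minimal sketch of how the analyst's question ("What are our most profitable products?") is answered by aggregating a fact table and resolving names through a dimension table. The table contents, column names, and the `most_profitable` helper are hypothetical illustrations, not data or code from the talk.

```python
from collections import defaultdict

# Dimension table: product_id -> product name (hypothetical rows).
product_dim = {1: "Widget", 2: "Gadget"}

# Fact table: one row per order line (hypothetical rows).
sales_fact = [
    {"product_id": 1, "revenue": 100.0, "cost": 60.0},
    {"product_id": 2, "revenue": 80.0, "cost": 30.0},
    {"product_id": 1, "revenue": 50.0, "cost": 45.0},
]


def most_profitable(facts, products):
    """Sum profit per product and return (name, profit) pairs, best first."""
    profit = defaultdict(float)
    for row in facts:
        name = products[row["product_id"]]  # join fact row to its dimension
        profit[name] += row["revenue"] - row["cost"]
    return sorted(profit.items(), key=lambda pair: pair[1], reverse=True)


ranking = most_profitable(sales_fact, product_dim)
print(ranking)  # [('Gadget', 50.0), ('Widget', 45.0)]
```

In a real warehouse this would be a SQL aggregate over a star schema; the point is that the answer is only as fresh as the last load into the fact table.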


Application / UX

Analytics

Data

“Personal” Analytical Architecture

Data Analysts

Analytical Capabilities: Scoring/Ranking, Recommendations, Natural Language Processing, Relevancy, Classification, Optimization, Collaborative Filtering, Personalization, Digital Attribution,…

“New” Data Flow - Iterative, Specialized, Extensible, plug & play Analytics, near real-time [Some components are open-source]

What movie should I watch tonight?


Published Analytics: “Read” Performance

App Persistence: “State” Persistence

Persistence/Analytics

Mass Data Storage: Behaviors / “Write” Performance

Personalized Recommendations

Personalization, Preferences, State

End-User Experience: Browser, Tablet, Mobile,…

Self-Service Application

“Personal Analytics” Data Architecture

Analytics Engines: Pluggable

Data Scientists

“New” Data Flow – Detailed View of Components

Social Signals: RSS/Facebook/…
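The component view above can be sketched in a few lines: behaviors are appended to a write-optimized log, pluggable engines read that log, and each engine publishes its result into a read-optimized store that the application queries. Everything named here (`behavior_log`, `published`, `register`, `popular_items`, the sample behaviors) is a hypothetical stand-in used for illustration, not an API from the talk or from any of the products it mentions.

```python
from collections import Counter
from typing import Callable, Dict, List, Tuple

Behavior = Tuple[str, str, str]  # (user, action, item)
Engine = Callable[[List[Behavior]], Dict[str, object]]

behavior_log: List[Behavior] = []  # stand-in for mass storage (append-only writes)
published: Dict[str, object] = {}  # stand-in for the read-optimized published store
engines: Dict[str, Engine] = {}    # registry of pluggable analytics engines


def register(name: str):
    """Decorator that plugs an analytics engine into the registry."""
    def decorator(fn: Engine) -> Engine:
        engines[name] = fn
        return fn
    return decorator


@register("popular_items")
def popular_items(log: List[Behavior]) -> Dict[str, object]:
    # Count "view" behaviors per item and keep the two most viewed.
    counts = Counter(item for _, action, item in log if action == "view")
    return {"top": [item for item, _ in counts.most_common(2)]}


def run_engines() -> None:
    """Run every registered engine over the log and publish its output."""
    for name, engine in engines.items():
        published[name] = engine(behavior_log)


# Hypothetical behaviors written by the self-service application.
behavior_log.extend([
    ("sally", "view", "movie_a"),
    ("mary", "view", "movie_a"),
    ("sally", "view", "movie_b"),
    ("mary", "rate", "movie_b"),
])
run_engines()
print(published["popular_items"])  # {'top': ['movie_a', 'movie_b']}
```

Adding another engine is a matter of registering one more function, which is the sense in which the analytics layer is "pluggable" in this sketch.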

SALLY LIKES TACOS

HOW DO WE MODEL THIS DATA?

Let’s get personal…

Classic “DW” Data Model

SUBJECT PREDICATE (Score) OBJECT

SALLY LIKES (143) TACOS

MARY LIKES (200) TACOS

THE_TACO_SHOP MENU_ITEM TACOS

SALLY LIKES (125) THE_TACO_SHOP

SALLY CITY VENICE BEACH

THE_TACO_SHOP CITY VENICE BEACH

SALLY FRIEND (187) MARY

“Triples” - Directed (Weighted) Acyclic Graph

Modeling Social Data
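The triples above can be loaded into a small in-memory graph. The sketch below reads each row as a (subject, predicate, score, object) edge, with SALLY as the source node and TACOS as the target, and uses `None` where the slide gives no score. The helper names `subjects_with` and `reachable` are hypothetical; a production system would more likely use a triple store or graph database.

```python
from collections import defaultdict

# (subject, predicate, score, object); score is None for unweighted facts.
TRIPLES = [
    ("SALLY", "LIKES", 143, "TACOS"),
    ("MARY", "LIKES", 200, "TACOS"),
    ("THE_TACO_SHOP", "MENU_ITEM", None, "TACOS"),
    ("SALLY", "LIKES", 125, "THE_TACO_SHOP"),
    ("SALLY", "CITY", None, "VENICE BEACH"),
    ("THE_TACO_SHOP", "CITY", None, "VENICE BEACH"),
    ("SALLY", "FRIEND", 187, "MARY"),
]

# Adjacency index: subject -> list of (predicate, score, object) edges.
edges = defaultdict(list)
for subject, predicate, score, obj in TRIPLES:
    edges[subject].append((predicate, score, obj))


def subjects_with(predicate, obj):
    """Subjects that have a `predicate` edge to `obj`, highest score first."""
    matches = [(s, sc) for s, p, sc, o in TRIPLES if p == predicate and o == obj]
    matches.sort(key=lambda pair: pair[1] if pair[1] is not None else 0, reverse=True)
    return [s for s, _ in matches]


def reachable(start):
    """Every node reachable from `start` by following directed edges."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        for _, _, target in edges.get(node, []):
            if target not in seen:
                seen.add(target)
                stack.append(target)
    return seen


print(subjects_with("LIKES", "TACOS"))  # ['MARY', 'SALLY']
print(sorted(reachable("SALLY")))       # ['MARY', 'TACOS', 'THE_TACO_SHOP', 'VENICE BEACH']
```

The size of the reachable set is one crude proxy for "reach"; the edge scores would feed any weighted notion of influence.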

Reach and Influence

Collaborative Filtering

Analyzing Relationships: Reach and Influence
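As a rough illustration of collaborative filtering on this kind of data, the sketch below uses only the three LIKES triples from the Sally/Mary example. The similarity measure (sum of the smaller score on each shared item) and the `recommend` function are assumptions chosen for brevity; they are not the method used in the talk or by any particular library such as Mahout.

```python
from collections import defaultdict

# LIKES triples from the example: (user, score, item).
LIKES = [
    ("SALLY", 143, "TACOS"),
    ("MARY", 200, "TACOS"),
    ("SALLY", 125, "THE_TACO_SHOP"),
]

# user -> {item: score}
ratings = {}
for user, score, item in LIKES:
    ratings.setdefault(user, {})[item] = score


def similarity(a, b):
    """Overlap similarity: sum of the smaller score on each shared item."""
    mine, theirs = ratings.get(a, {}), ratings.get(b, {})
    return sum(min(mine[i], theirs[i]) for i in set(mine) & set(theirs))


def recommend(user):
    """Items the user has not liked yet, ranked by similar users' scores."""
    mine = ratings.get(user, {})
    scores = defaultdict(float)
    for other, their_items in ratings.items():
        if other == user:
            continue
        weight = similarity(user, other)
        if weight == 0:
            continue  # no shared taste, so this user contributes nothing
        for item, score in their_items.items():
            if item not in mine:
                scores[item] += weight * score
    return sorted(scores, key=scores.get, reverse=True)


print(recommend("MARY"))   # ['THE_TACO_SHOP']
print(recommend("SALLY"))  # []
```

Mary is pointed to THE_TACO_SHOP because Sally, who shares her taste for tacos, likes it; Sally gets nothing because Mary likes nothing that Sally has not already liked.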

How important is Social?

Install ghostery.com: Shows you who is actively watching you surf the web! Lots of people!!!

Signals – The Core of New Data

Social

Personal

Content

Time

Mixture of Proprietary and Public Data


Published Analytics: HBase

App Persistence: Cassandra, Riak,…

Persistence/Analytics

Data-Center or Cloud

Mass Data Storage: Hadoop

Personalized Recommendations

Personalization, Preferences, State

End-User Experience: Browser, Tablet, Mobile,…

Self-Service Application

Specialization of Data Technologies

Analytics: R, Mahout, Pig

The New “Analytical Application” Architecture

“New” Data Flow – Specialized Technology Choices


Self-Service Application A

Published Analytics: HBase

Persistence: Riak

Self-Service Application B

Published Analytics: MySQL

Persistence: Cassandra

Mass Data Storage: Behaviors / “Write” Performance

Hadoop / AWS

Analytics Engines (three shown): Pluggable

Data Scientists

Servicing Multiple Analytical Systems

Using Shared Analytical Mass Storage

Integrating the Architectures


Data Warehouse: OLTP to OLAP Mapping

OLAP / Reports

Business Analyst

Staging

App

Only Financial Events ($$$) cross the threshold (and are recorded into) the Data Warehouse

App

App

“Local” Events stay Local (they are analyzed locally)

“Personal” Analytics Stack + Classic “DW” Stack

Not all DATA Belongs in the Data Warehouse!
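The integration rule can be sketched as a small router: every event stays in its application's local store, and only financial events are also forwarded to the warehouse staging area. The `Event` and `EventRouter` classes, the "non-zero amount means financial" test, and the sample events are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Event:
    app: str             # which application produced the event
    kind: str            # e.g. "page_view", "purchase"
    amount: float = 0.0  # monetary value; 0.0 means non-financial (assumption)


@dataclass
class EventRouter:
    warehouse_staging: List[Event] = field(default_factory=list)
    local_stores: Dict[str, List[Event]] = field(default_factory=dict)

    def route(self, event: Event) -> None:
        # Assumption: every event is kept locally so the app can analyze it.
        self.local_stores.setdefault(event.app, []).append(event)
        # Only financial events cross the threshold into the warehouse.
        if event.amount != 0:
            self.warehouse_staging.append(event)


router = EventRouter()
for event in [
    Event("movie_app", "page_view"),
    Event("movie_app", "rating"),
    Event("ordering_app", "purchase", amount=19.99),
]:
    router.route(event)

print(len(router.warehouse_staging))          # 1
print(len(router.local_stores["movie_app"]))  # 2
```

The warehouse load stays small and financially focused, while the high-volume behavioral events never leave the application's own analytics stack.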

Dimension | Classic DW | New Analytics
Scope | Enterprise | Application
Analytics | Trailing: OLAP | Predictive: Machine Learning, Sentiment Analysis, Recommendations, Personalization, Natural Language Processing, Classification, Clustering, Optimization, Collaborative Filtering, Digital Attribution,…
Actionable? | Loosely Coupled | Tightly Coupled, Analytics Embedded in Application
Data Structures | Facts/Dimensions (Requires a DW) | Semantic Data, Graph / Triples, Observations, Direct Signals
Knowledge Expert | Business Analyst | Data Scientist
Technology Stack | Vendor Driven ($$$) | Open-Source
Architecture | Scale-Up | Scale-Out (or in the Cloud)

Classic DW vs. the New Analytics

The Shift from “Business” Analytics to “Personal” Analytics

New Signals + New Analytics = New Scenarios

Data

Signals

Social

Location

Personal

Behaviors

Transactions

Content

Time

New Analytics

Recommendations, Natural Language Processing, Relevancy, Classification, Optimization, Collaborative Filtering, Digital Attribution,…

New Scenarios

Customer Engagement, Customer Loyalty / Attrition / Retention, Fraud, Risk Analysis, Intent, Customer Personalization

Thank You!

casey.l.kiernan@gmail.com

Blog: www.the-data-platform.com