Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Calpont InfiniDB® Accelerating Data Insights. Where the Rubber Meets the Road – Analytic Platforms in the Real World. Featuring Matt Aslett, 451Research, July 18, 2012

description

Matt Aslett, 451 Research, and Bob Wilkinson, VP Engineering for Calpont, discuss the emergence of the analytic platform, its place in the new ecosystem for Big Data, considerations for selection, and applied use cases of Calpont’s analytic platform, InfiniDB, in Telco and Mobile Advertising.

Transcript of Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Page 1: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Calpont InfiniDB® Accelerating Data Insights

Where the Rubber Meets the Road – Analytic Platforms in the Real World

Featuring Matt Aslett, 451Research July 18, 2012

Page 2: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

InfiniDB® Scalable. Fast. Simple. © 2012 Calpont. All Rights Reserved.

Today’s Presenters

Matt Aslett
• Research Manager, Data Management and Analytics
• With 451 Research since 2007
• www.twitter.com/maslett

Information Management
• Operational databases
• Data warehousing
• Data caching
• Event processing

Commercial Adoption of Open Source (CAOS)
• Open source projects
• Adoption of open source software
• Vendor strategies

Page 3: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Today’s Presenters

Bob Wilkinson
• Calpont Vice President of Engineering
• Formerly CTO for Tektronix Communications
• 16 years of product development
• Responsible for design, development, and support of InfiniDB

Page 4: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Today’s Discussion

• Matt Aslett
  o Total Data and the Rise of the Analytic Platform
  o Analytic Platforms in the Big Data ecosystem
  o Defining the Analytic Platform
• Bob Wilkinson
  o InfiniDB Analytic Platform
  o InfiniDB in Action
    • Telecommunications
    • Online Advertising
• Summary and Q&A

Page 5: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

© 2012 by The 451 Group. All rights reserved

Overview

• The rise of the analytic platform: what and why
• The analytic platform’s place in the ‘big data’ ecosystem: where and when
• The key characteristics of an analytic platform: how and which

Page 6: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

The 451 Group

Page 7: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Big Data – Implications for Data Management

Velocity: The data is being produced at a rate that is beyond the performance limits of traditional systems.

Volume: The volume of data is too large for traditional database software tools to cope with.

Variety: The data lacks the structure to make it suitable for storage and analysis in traditional databases and data warehouses.

“Big data”: the realization of greater business intelligence by storing, processing and analyzing data that was previously ignored because traditional data management technologies could not handle its volume, velocity and/or variety.

Page 8: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Total Data - Beyond ‘Big Data’

Exploration: The interest in exploratory analytic approaches, in which schema is defined in response to the nature of the query.

Totality: The desire to process and analyze data in its entirety, rather than analyzing a sample of data and extrapolating the results.

Dependency: The reliance on existing technologies and skills, and the need to balance investment in those existing technologies and skills with the adoption of new techniques.

Frequency: The desire to increase the rate of analysis in order to generate more accurate and timely business intelligence.

The adoption of non-traditional data processing technologies is driven not just by the nature of the data, but also by the user’s particular data processing requirements.

Page 9: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Beyond the limitations of traditional data warehousing

The EDW is supposed to be a single source of the ‘truth’ and avoid data silos.

One of the most significant inefficiencies of data warehousing is that users have traditionally had to design their data-warehouse models to match their planned queries.

This approach is too rigid in a world of rapidly changing business requirements and real-time decision-making.

Its inflexibility encourages the growth of data silos and the very redundancy and duplication issues the EDW was apparently designed to avoid.

A business analyst or executive unable to get the answers to queries they require from the EDW is likely to find their own ways to answer these queries.

Page 10: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

The Rise of Specialist Platforms

The alternative is to embrace dispersed data, adopting not silos but specialist data platforms that complement the EDW.

‘Total Data’ describes an approach that treats the various data management components as an integrated whole.

eBay is a prime example of this approach in action, with its Singularity analytic platform, as well as an EDW and Hadoop.

[Figure labels: Structured SQL analysis; Semi-structured SQL; Unstructured analysis]

Page 11: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Defining “Analytic Platform”

Enterprises have used specialist data marts/warehouses for many years for departmental/application-specific use-cases.

Analytic platforms are designed to enable different analytic approaches that complement traditional EDW workloads.

• Large data volumes
• Raw/close-to-raw data
• Multiple dimensions
• Complex variables
• Near real-time requirements
• Columnar storage
• SQL, user-defined functions
• MapReduce
• In-database analytics
• Flexible schema

Page 12: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Flexible schema

Apply structural patterns as the data is analyzed, rather than when it is loaded into the database.

[Diagram: schema on write versus schema on read. Each pipeline shows Application, Schema, Data storage, Query and Results; the two differ in where the schema is applied.]
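The contrast between applying structure at load time and at query time can be illustrated with a minimal Python sketch. It is not taken from the slides: the event records, field names and both helper functions are hypothetical, and a real platform would do this inside the database engine rather than in application code.

```python
import json

# Hypothetical raw events; the field names are illustrative only.
RAW_EVENTS = [
    '{"user": "a", "bytes": 120, "app": "video"}',
    '{"user": "b", "bytes": 75}',
    '{"user": "a", "bytes": 30, "app": "mail", "cell": "c17"}',
]


def load_schema_on_write(lines, columns):
    """Apply the schema at load time: only the declared columns are stored."""
    table = []
    for line in lines:
        record = json.loads(line)
        table.append(tuple(record.get(col) for col in columns))
    return table


def query_schema_on_read(lines, fields):
    """Keep the raw lines untouched and apply structure only when queried."""
    for line in lines:
        record = json.loads(line)
        yield {field: record.get(field) for field in fields}


# Schema on write: "cell" was not declared, so it is dropped at load time.
warehouse = load_schema_on_write(RAW_EVENTS, ("user", "bytes"))

# Schema on read: a later question about "cell" can still be answered.
cells = [row["cell"] for row in query_schema_on_read(RAW_EVENTS, ("cell",))]
```

In the first case a new question about an undeclared field means reloading the data; in the second the raw records are still available, at the cost of parsing them on every query.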

Page 13: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

“Exploratory Analytic Platform”

The need for EAPs is not necessarily driven by the choice of storage platform (e.g., Hadoop or analytic database) or query language (e.g., SQL or MapReduce).

Instead it is driven by the nature of the query or workload, or the skills and tools employed by the person interacting with the data.

While data analysts are analyzing data to find answers to existing questions, data scientists are exploring patterns in data to prompt new questions.

E.g. customer analysis, interactive marketing, targeted advertising, churn analysis, sentiment analysis, fraud analysis.

An EAP should be flexible enough to enable the use of multiple techniques to support exploratory analysis.

Page 14: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

EAP in larger Total Data landscape

EDW retains core role for stable schema and structured SQL analytics on ERP, CRM apps etc.

Hadoop for storage and processing of raw data, analysis of unstructured, schemaless data.

EAP for flexible, exploratory analytics on rapidly updated data with evolving schema.

Page 15: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

The Spectrum of Analytic Approaches

Integration enables a ‘total data’ approach that treats the various platforms as points on a spectrum depending on the rigidity and importance of schema, rather than individual silos.

Page 17: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

The Spectrum of Analytic Approaches

Calpont InfiniDB
• Columnar MPP
• Vertical and horizontal range partitioning
• Integrated MapReduce
• Distributed user-defined functions
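As a rough illustration of what vertical and horizontal range partitioning can buy a columnar store, the Python sketch below keeps each horizontal slice of a table column by column and skips slices whose value range cannot match a filter. It is a toy model under stated assumptions, not InfiniDB's implementation: the `Partition` class, `scan_sum` function and sample data are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    """One horizontal slice of a table, stored column by column."""
    columns: dict  # column name -> list of values

    def min_max(self, name):
        values = self.columns[name]
        return min(values), max(values)


def scan_sum(partitions, filter_col, low, high, sum_col):
    """Sum sum_col over rows where low <= filter_col <= high.

    Returns (total, number_of_partitions_actually_scanned).
    """
    total = 0
    scanned = 0
    for part in partitions:
        lo, hi = part.min_max(filter_col)
        if hi < low or lo > high:
            continue  # horizontal elimination: this value range cannot match
        scanned += 1
        # Vertical elimination: only the two referenced columns are read;
        # the "app" column is never touched.
        for key, value in zip(part.columns[filter_col], part.columns[sum_col]):
            if low <= key <= high:
                total += value
    return total, scanned


# Hypothetical sample data: nine rows split into three range partitions.
PARTS = [
    Partition({"day": [1, 2, 3], "bytes": [10, 20, 30], "app": ["a", "b", "a"]}),
    Partition({"day": [4, 5, 6], "bytes": [40, 50, 60], "app": ["b", "b", "a"]}),
    Partition({"day": [7, 8, 9], "bytes": [70, 80, 90], "app": ["a", "a", "b"]}),
]

result = scan_sum(PARTS, "day", 5, 7, "bytes")
```

Here the filter on days 5 to 7 touches two of the three partitions and two of the three columns. A real engine would keep the minimum and maximum per partition as stored metadata rather than recomputing them on every scan.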

Page 18: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Considerations for Deploying an Analytic Platform

Scalability – the ability to handle large volumes of data and expand as data volumes grow

Performance – high performance processing is required to deliver rapid results

Efficiency – in-database analytics approaches that take the query to the data

Flexibility – no reliance on restrictive schema to deliver the desired performance

Variability – support for multiple query approaches and advanced functions to enable exploratory analysis

Page 19: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Calpont Corporation

• Software Company

• High-performance, high-availability (HA) analytic data platform

• Dallas HQ, Silicon Valley

• Partners in North America, Europe, Japan

• Online Media, Digital Networks, Telco

Calpont Mission: To provide a highly scalable data platform that enables analytic business decisions as timely as customers and markets dictate.

Page 20: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

What is InfiniDB?

Columnar Performance Efficiency

Widely used MySQL Interface

MPP, MapReduce style Query Execution

Simple, Powerful Platform for Big Data Analytics
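"MPP, MapReduce style" query execution generally means that each node aggregates its own slice of the data and a coordinator merges the partial results. The Python sketch below shows that pattern for a grouped sum. It is illustrative only: threads stand in for cluster nodes, and the function names and sample rows are hypothetical rather than part of any InfiniDB API.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def map_partial(rows):
    """Map step: a worker aggregates only its own slice of (app, bytes) rows."""
    partial = Counter()
    for app, nbytes in rows:
        partial[app] += nbytes
    return partial


def reduce_partials(partials):
    """Reduce step: the coordinator merges the small partial aggregates."""
    total = Counter()
    for partial in partials:
        total.update(partial)  # Counter.update adds counts key by key
    return total


def mpp_group_sum(slices):
    """Run one map task per slice in parallel, then merge the results."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(pool.map(map_partial, slices))
    return dict(reduce_partials(partials))


# Hypothetical data, already distributed across three "nodes".
SLICES = [
    [("video", 100), ("mail", 5)],
    [("video", 50), ("web", 20)],
    [("mail", 10), ("web", 30)],
]

totals = mpp_group_sum(SLICES)
```

Only the per-group partial sums travel to the coordinator, never the raw rows, which is the sense in which the query is taken to the data.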

Page 21: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Benefits of InfiniDB

Real-time, Consistent Query Performance

Linear Scale for Massive Data

Removes Limits to Dimensions and Granularity

Easy to Deploy and Maintain

Page 22: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

InfiniDB Analytic Platform – DW and Exploration

[Architecture diagram. Column headings: Big Data Sources, Data Integration, Analytic Platform, Analytic Needs. Labels shown: Data Warehouse, Hadoop, Operational, Transactional, Legacy RDBMS; ETL, MDM, Direct Load, Model; Analytic Data Store; Dimensional Analytics, Data Discovery, Predictive Analytics.]

Page 23: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

InfiniDB - Telecommunications

Page 24: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Telecommunications Market Challenges

[Chart: Global Mobile Voice and Data Revenues/ARPU – 2007-2013. Series: Voice Revenue, Data Revenue, Total ARPU. Y-axis: US $ millions per year. Source: Informa Telecoms & Media]

Macro Drivers:
• Subscriber Growth declining
• ARPU declining
• Revenue Growth vs. Cost to Carry

Do carriers?
• Attempt to control costs via throttling, etc.
• Increase revenue through monetization strategies

Page 25: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

The Telco Gold Mine

Quality
• Meets CSP expectations?
• Meets Subscriber expectations?

Location
• Where are they?
• Movement patterns, etc.

Usage
• What applications/services?
• How much, how long, etc.

Data Sources
• Element feeds
• Probe feeds
• Device agents
• Log files
• Care data

Telco data is rich – Can it be fully leveraged?

Page 26: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Challenge? or Opportunity? Multi-Dimensional Analysis

Dimensions: service, application, network, KPI, customer

Linkage?

Page 27: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Telco Success

Representative data from Customer Experience (CEM) analytics:

Metric                    Legacy       InfiniDB     Improvement
# of DRs                  15 billion   15 billion   n/a
Database size             4 TB         < 1 TB       (75%)
Load rates                30K/sec      >120K/sec    400%
Typical analytics query   300 sec.     5 sec.       (98%)

Benefits
• Game-changer for storage of and access to non-aggregated data
• Near linear scale out performance
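The improvement figures above mix two conventions, which a short calculation makes explicit. The reading used here is an assumption, not something the slide states: parenthesized values are treated as percentage reductions, and the load-rate figure as the new rate expressed as a percentage of the old one. The helper names are hypothetical, and because the slide gives "< 1 TB" and ">120K/sec" as bounds, the results are bounds too.

```python
def percent_reduction(before, after):
    """How much smaller 'after' is than 'before', as a percentage."""
    return 100.0 * (before - after) / before


def ratio_as_percent(before, after):
    """'after' expressed as a percentage of 'before' (400.0 means 4x)."""
    return 100.0 * after / before


# Database size: 4 TB down to at most 1 TB, so at least a 75% reduction.
size_reduction = percent_reduction(4.0, 1.0)

# Load rate: 30K/sec up to at least 120K/sec, i.e. 400% of the legacy rate.
# Read as an increase this is a 300% gain, which is why the convention matters.
load_ratio = ratio_as_percent(30_000, 120_000)

# Typical query: 300 seconds down to 5 seconds, roughly a 98% reduction.
query_reduction = percent_reduction(300, 5)
```

On this reading the three non-trivial rows of the table are mutually consistent: 75%, 400% and about 98.3%.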

Page 28: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

InfiniDB - Online Advertising

Page 29: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Online Advertising – Market Challenges

• Advertising Analytics (≠ Web Analytics)
  o Interactions and performance of ads on other sites
  o Attribution analysis - ad optimization, efficient targeting, and return on ad spend
• Challenges
  o Massive daily data consumption – “Billions Served”
  o Ad targeting is not real-time with traditional data tech
  o Attribution analytics effectiveness

Wide Dimensionality; Granularity

Page 30: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Mobile Advertising – Analytic Data Environment

[Diagram: pipeline stages Info Sources, Source Data, ETL, Analytic Platform, BI / Analytic Front End]

Info Sources
• Location Ads
• WiFi Captive Display
• Free WiFi Ad Share
• App Embedded Ads

Special Needs
• Latitudinal / Longitudinal Geospatial Functions
• Military Grid Ref System (MGRS) Functions
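One common example of a latitude/longitude geospatial function is great-circle distance, which a location-based advertiser might use to find ads near a device. The slides do not say which functions the customer needed, so the Python sketch below is an assumption-laden illustration: it uses the standard haversine formula, and the `ads_within` helper, ad names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def ads_within(ads, lat, lon, radius_km):
    """Names of ads whose location lies within radius_km of (lat, lon)."""
    return [name for name, ad_lat, ad_lon in ads
            if haversine_km(lat, lon, ad_lat, ad_lon) <= radius_km]


# Hypothetical ad locations as (name, latitude, longitude).
ADS = [
    ("cafe", 32.78, -96.80),
    ("mall", 37.39, -122.08),
]

nearby = ads_within(ADS, 32.7767, -96.7970, 5.0)
```

Running such a function inside the database, rather than exporting rows to an application, is what makes it usable at the data volumes described in this deck.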

Non-Calpont product names are trademarks of their respective owners

Page 31: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Online Advertising Success

Location-based Mobile Advertiser Funnels Big Data Insights

Metric                    Legacy                 InfiniDB      Improvement
# of DRs                  300 Million            300 Million   n/a
Database size             >6 TB                  3 TB          (50%)
Load rates                100K/sec               1M+/sec       1000%
Typical analytics query   20-30 min with cubes   15 sec.       (99.2%)

Benefits
• Real-time analytics about niche segments
• Simple MySQL interface for easy use of Hadoop ETL extracts
• “Mobile Audience Insights” for segment affinity and engagement strategies

[Figure: Mobile Audience Insights Report]

Page 32: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

Key Takeaways

• A spectrum of analytic platforms addresses structured and unstructured needs, complementing the traditional EDW
• Proper choice of an analytics platform should depend on rigidity and importance of schema, as well as skills and tools of users
• InfiniDB is a scalable MPP columnar platform supporting exploratory analytics for structured data
• Calpont is helping partners create transformational solutions in Telco Customer Experience and Online Advertising

Page 33: Analytic Platforms in the Real World with 451Research and Calpont_July 2012

More Info on 451 Research and Calpont

Matt Aslett, 451 Research
www.451research.com
@maslett, @451research
451 examines trends behind Big Data and the Total Data management approach

Bob Wilkinson, Calpont Corporation
www.calpont.com
@Calpont, @InfiniDB
Calpont discusses why Big Data in online marketing needs modern data technology
