C* Summit EU 2013: Blending Cassandra Data Into The Mix

#CASSANDRAEU CASSANDRASUMMIT EU: Blending Cassandra Data Into the Mix. Matt Casters | Chief Architect, Data Integration at Pentaho; Kettle Project Founder

description

Speaker: Matt Casters, Chief Architect & PDI/Kettle Project Founder at Pentaho

Video: http://www.youtube.com/watch?v=r7BEp-C60bQ&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=8

Traditionally, data is delivered to business analytics tools through a relational database. However, there are cases where that can be inconvenient, for example when the volume of data is just too high or when you can't wait until the database tables are updated.

This presentation by Pentaho Kettle founder Matt Casters will demonstrate a solution of data 'Blending', which allows a data integration user to create a transformation capable of delivering data directly to Pentaho - and other - business analytics tools. Matt will demonstrate taking data from Cassandra, and blending it with other data from both SQL and NoSQL sources, and then visualizing that data. Matt will explain how it becomes possible to create a virtual "database" with "tables" where the data actually comes from a transformation step.

Transcript of C* Summit EU 2013: Blending Cassandra Data Into The Mix

Page 1: C* Summit EU 2013: Blending Cassandra Data Into The Mix

#CASSANDRAEU CASSANDRASUMMIT EU

Blending Cassandra Data Into the mix

Matt Casters | Chief Architect, Data Integration at Pentaho
Kettle Project Founder

Page 2: C* Summit EU 2013: Blending Cassandra Data Into The Mix

What we will discuss today…

* About Pentaho
* Blended Big Data Integration
* Demo
* Takeaway & QA

Page 3: C* Summit EU 2013: Blending Cassandra Data Into The Mix

About Pentaho

Our mission and key takeaways

Page 4: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho Mission
Enabling the future of analytics

Modern unified business analytics and data integration platform
• Full spectrum of advancing analytics for all key roles
• Embeddable, cloud-ready analytics
• Big data blending for analytics in real-time environments
• Broadest and deepest big data integration

Innovation through open source
• Open, pluggable, purpose built for the future
• Early sustained leadership in big data ecosystem with technology innovation

Critical mass achieved
• Over 1,200 commercial customers
• Over 10,000 production deployments

Page 5: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho and Cassandra

* ETL and Analytics that complement Cassandra
* Create data transformations from source systems into Cassandra, and Cassandra to target systems, via drag and drop
* Quickly visualize and explore data inside Cassandra with Pentaho Data Services
* Deeper Cassandra/Pentaho integration in development
* Keep up with the latest Cassandra developments
* Provide underlying API compatibility layer

Page 6: C* Summit EU 2013: Blending Cassandra Data Into The Mix

The New Reality
Simplified Analysis for all Users

ANY Analytics
• Reports
• Dashboards
• Visualizations
• Discovery
• Predictive Analytics

ANY Environment
• Data warehouses
• Data marts
• Stack vendors
• Cloud
• Embedded

Existing & New Data

Infrastructure & Processes

ANY Data
• Relational
• Operational
• Big Data
• Data sources not yet anticipated…

[Diagram labels: Billing, Location, Social Media, Customer, Web, Network]

Page 7: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho 5.0 Architected for the Future
Simplified analytics experience for all users

Simplified Analytics Experience

Enterprise Big Data Integration

Blended Big Data

Page 8: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Basic Cassandra Use Case
• Enterprise Customer Data Store

[Diagram labels: Source Systems, Pentaho Data Integration, Enterprise Data Store, Pentaho Data Integration, Target Systems, System Scope]

Pentaho Analytics
• Reporting
• Dashboards
• Visualization
• Discovery

• Visual ETL development with Pentaho Data Integration
• Reporting, Dashboards, Visualization and Data discovery with full spectrum analytics

Page 9: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Big Data Orchestration

Page 10: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Orchestration Toolkit

Page 11: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho Visual Development

Would you rather do this?

Integrate, Manipulate, Ingest

… or this?

Schedule

Model

Page 12: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Broad Connectivity

[Diagram labels: Cassandra cluster, PDI, Analytics]

Page 13: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Blending data

When copying data all over the place stops making sense

Page 14: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Analytics on Cassandra: Two Approaches

Direct Access
[Diagram labels: Cassandra cluster, PDI Data Services, Analytics]

Access via Database
[Diagram labels: PDI ETL, RDBMS, Analytics]

Page 15: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Direct Access to Cassandra Data

PDI ETP

Extract -> Transform -> Present

Pentaho Operational Reports

Pentaho Operational Dashboards

Cassandra cluster
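
The "Extract -> Transform -> Present" flow on this slide can be pictured as a chain of streaming steps whose output goes straight to a report, with no intermediate database table. The following is only a minimal sketch of that idea in plain Python, not PDI itself: the sample rows, field names and function names are all hypothetical, and the in-memory list stands in for rows read from a Cassandra cluster.

```python
from typing import Dict, Iterable, Iterator, List

Row = Dict[str, object]

# Hypothetical sample rows standing in for the result of a Cassandra read.
SAMPLE_ROWS: List[Row] = [
    {"customer_id": 1, "event": "dropped_call", "duration_s": 12},
    {"customer_id": 2, "event": "completed_call", "duration_s": 340},
    {"customer_id": 1, "event": "dropped_call", "duration_s": 5},
]


def extract(source: Iterable[Row]) -> Iterator[Row]:
    """Extract: stream rows from the source one at a time."""
    for row in source:
        yield row


def transform(rows: Iterable[Row]) -> Iterator[Row]:
    """Transform: keep dropped calls only and add a derived flag."""
    for row in rows:
        if row["event"] == "dropped_call":
            yield {**row, "short_call": row["duration_s"] < 10}


def present(rows: Iterable[Row]) -> List[Row]:
    """Present: hand rows directly to the consumer, with no staging table."""
    return list(rows)


report_rows = present(transform(extract(SAMPLE_ROWS)))
```

Because each step is a generator, rows flow through the chain only when the presenting side asks for them, which is the property that makes "present" a reasonable replacement for "load" in this picture.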

Page 16: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho Operational Dashboards
Architected Access for Reliable Executive Insight

Page 17: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Customer Value from Big Data
Monetizing big data-driven use cases driving need to blend data

Improve operational effectiveness
• Machines/sensors: predict failures, network attacks
• Financial risk management: reduce fraud, increase security

Reduce data warehouse cost
• Integrate new data sources without increased database cost
• Provide online access to ‘dark data’

Drive incremental revenue
• Predict customer behavior across all channels
• Understand and monetize customer behavior
• Begin to monetize data as a service

Page 18: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Why Blending at the Source Matters
Customer Experience Analytics for loyalty and revenue

Blend revenue-related and quality-of-service data together to find customers at risk

Analytics

Analyze quality of service:
• Network outages
• Dropped calls
• Poor quality
• Calls to support center

For profiles of customers:
• Up for renewal
• Profitable
• Multiple agreements/services
• In competitive area

Determine best action to take:
• Billing Credit
• Customer Coupon
• No Action

[Diagram labels: EDW; Existing ETL Tool or PDI; Customer; Billing; Provisioning]

Call Detail Records from:
• Billing
• Payment
• Usage

[Diagram labels: NoSQL; PDI; Network; Location; PDI]

Call Detail Records from Network:
• Outages
• Drops
• Service Quality
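
To make the customer-experience use case concrete, here is a small sketch of blending revenue-related profile data with quality-of-service events to flag customers at risk. It is an illustration in plain Python under stated assumptions, not the PDI transformation from the talk: the record layouts, the `customers_at_risk` function and the `min_incidents` threshold are all hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical revenue-side profiles (e.g. held in an EDW).
customers: List[Dict[str, object]] = [
    {"customer_id": 1, "profitable": True, "up_for_renewal": True},
    {"customer_id": 2, "profitable": True, "up_for_renewal": False},
    {"customer_id": 3, "profitable": False, "up_for_renewal": True},
]

# Hypothetical quality-of-service events (e.g. held in a NoSQL store).
network_events: List[Dict[str, object]] = [
    {"customer_id": 1, "kind": "dropped_call"},
    {"customer_id": 1, "kind": "outage"},
    {"customer_id": 2, "kind": "dropped_call"},
    {"customer_id": 3, "kind": "dropped_call"},
    {"customer_id": 3, "kind": "dropped_call"},
]


def customers_at_risk(profiles, events, min_incidents=2):
    """Join both sources on customer_id and keep valuable, affected customers."""
    incidents = Counter(event["customer_id"] for event in events)
    at_risk = []
    for profile in profiles:
        count = incidents.get(profile["customer_id"], 0)
        if (profile["profitable"] and profile["up_for_renewal"]
                and count >= min_incidents):
            at_risk.append({**profile, "incidents": count})
    return at_risk


at_risk = customers_at_risk(customers, network_events)
```

With this sample data only customer 1 is flagged: customer 2 is not up for renewal and customer 3 is not profitable, even though both experienced service problems. Neither source alone could answer the question, which is the point of blending them.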

Page 19: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Accurate, Blended Big Data Analytics
Optimally stored data, blended when needed

• Just in time blending of data from multiple sources for a complete picture
• Connect, combine and transform data from multiple sources
• Query data directly from any transformation
• Access architected blends with the full spectrum of Pentaho Analytics
• Manage governance and security of data for on-going accuracy

[Diagram labels: EDW; Existing ETL Tool or PDI; Customer; Billing; Provisioning; NoSQL; Network; Location; PDI; PDI; Analytics; Just in time blending]
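
The idea of querying data directly from a transformation can be approximated with a short sketch: rows are produced at query time and exposed as a table that accepts SQL. This is a stand-in only. It uses Python's built-in `sqlite3` with an in-memory database, which is not how Pentaho implements its data services, and the `blended_rows` generator, the `blend` table and its columns are hypothetical.

```python
import sqlite3
from typing import Iterator, List, Tuple


def blended_rows() -> Iterator[Tuple[int, str, int]]:
    """Hypothetical transformation output: (customer_id, region, dropped_calls)."""
    yield (1, "north", 4)
    yield (2, "north", 1)
    yield (3, "south", 7)


def query_transformation(sql: str) -> List[tuple]:
    """Run the transformation on demand and answer one SQL query over its rows."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE blend ("
            "customer_id INTEGER, region TEXT, dropped_calls INTEGER)"
        )
        # Rows are generated only now, when a query arrives (just in time).
        conn.executemany("INSERT INTO blend VALUES (?, ?, ?)", blended_rows())
        return conn.execute(sql).fetchall()
    finally:
        conn.close()


totals = query_transformation(
    "SELECT region, SUM(dropped_calls) FROM blend GROUP BY region ORDER BY region"
)
```

Nothing is stored between queries in this sketch: every call re-runs the transformation, so the analytics side always sees the current blend rather than a previously loaded copy.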

Page 20: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Bring More Big Data to Life
Adaptive Big Data Layer: broadest, deepest big data support

Broadest options for storing and blending data
• New analytic use case templates for Hadoop and Splunk
• Deeper NoSQL integration to and direct reporting
• Hadoop high availability support with MapR
• Expanded big data integration
• New integrations: Redshift, Impala and Splunk
• New certifications: DataStax, Cassandra, Intel, Hortonworks, latest Cloudera, MapR, MongoDB, …

Page 21: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Demo!

Demonstrate how to easily write to and read from Cassandra
Demonstrate how to blend data

Page 22: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Takeaways…

Page 23: C* Summit EU 2013: Blending Cassandra Data Into The Mix

Pentaho 5.0 key takeaways
Meeting the demands of the big data-driven enterprise

Blended Big Data at the source for more accurate insights

Enterprise-ready data integration and simplified embedding for any environment

Simplified analytics experience with a new modern interface

[Diagram labels: Analytics, Blended Big Data, Enterprise Big Data Integration]

Page 24: C* Summit EU 2013: Blending Cassandra Data Into The Mix

THANK YOU

Any questions?

blog.pentaho.com

@Pentaho

Facebook.com/Pentaho

Pentaho Business Analytics

www.pentaho.com