Making Big Data Analytics with Hadoop fast & easy (webinar slides)
![Page 1: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/1.jpg)
Making Big Data Analytics Fast and Easy
Using Actian, Yellowfin and Hadoop
December 16, 2013
John Ryan, Marketing Manager APAC, Actian Corporation
Ryan Templeton, Snr Solutions Architect, Actian Corporation
Ivan Seow, Snr Technical Consultant, Yellowfin
![Page 2: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/2.jpg)
Take Action on Big Data
Making BI Easy
![Page 3: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/3.jpg)
Take Action on Big Data
Fastest Data Prep Engine
Fastest Hadoop Loader
Fastest Single Node Database
Fastest MPP Database
Huge library of Analytical Functions
Making BI Easy
![Page 4: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/4.jpg)
Take Action on Big Data
Fastest Data Prep Engine
Fastest Hadoop Loader
Fastest Single Node Database
Fastest MPP Database
Huge library of Analytical Functions
Making BI Easy
Ranked #1 BI Vendor: Dresner Global BI Study 2012 & 2013
#1 Dashboard Vendor: BARC BI Survey 12
#1 Enterprise Reporting Vendor: BARC BI Survey 13
Gartner: ‘Vendor to Consider’
![Page 5: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/5.jpg)
Today’s Agenda
1. Big Data Analytics with Hadoop
2. Making Analytics in Hadoop Fast & Easy
3. Customer Example (Telecom)
4. Demo: From Data to Dashboard
   • Making Hadoop Fast and Easy
   • Making BI Fast and Easy
5. Summary
![Page 6: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/6.jpg)
Confidential © 2012 Actian Corporation
Big Data Analytics With Hadoop
![Page 7: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/7.jpg)
Expect to have HDFS in production
Based on 263 respondents. Source: TDWI Best Practices Report, Q2 2013
73%
![Page 8: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/8.jpg)
Big Data Source for Analytics Most Likely to Benefit from Hadoop
Based on 263 respondents. Source: TDWI Best Practices Report, Q2 2013
71%
![Page 9: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/9.jpg)
Why is analytics inside Hadoop so hard and slow?
HDFS is a file system, not a database
Queries not standard SQL, only resemble SQL
Need a Data Scientist
MapReduce inefficient for analytic queries
![Page 10: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/10.jpg)
Making Big Data with Hadoop Fast and Easy
With Actian and Yellowfin
![Page 11: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/11.jpg)
Actian Big Data Analytic Platform
Accelerating Big Data 2.0
Platform stages: Connect, Prepare, Analyze, Optimize
Diagram labels: DATA, VALUE, Enterprise, Business Intelligence, Applications, DW, Big Data Storage
Advanced technology platform:
• Industry leading: Scale, Performance, Complexity, Cost (price/performance), Time to Value
• Multiple deployment options: On-premise, Cloud, Hybrid, Embedded
![Page 16: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/16.jpg)
Industry Leading Performance
Process Hadoop Data Faster
Dataflow vs Pig (MapReduce), DBT-3 @ 1TB: run times
Analyze Data Faster
Database benchmarks: TPC-H QphH @ 1TB (non-clustered)
![Page 17: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/17.jpg)
Today’s demonstration
Connect Hadoop
Transform Data
Parallel Load
Fast Database Queries
Fast Analysis
Actian Dataflow
Actian Vector
BI Visualization Layer: Yellowfin BI
![Page 18: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/18.jpg)
Telecom Example
Storing CDR Log Files inside Hadoop
![Page 19: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/19.jpg)
Customer Use Case
Tier two telecom provider
Planning for large growth with minimal staff impact
Business demands deeper insights
![Page 20: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/20.jpg)
IT Challenges
Collect, manage, process CDR data in Hadoop
Users are domain experts, not data scientists
Swamped with data: network switch dumps 200MB/min during peak times. Hundreds of thousands of records per drop. 170 columns.
Too hard to analyze: raw data must first be distilled and enriched to gain insight
![Page 21: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/21.jpg)
What the business was asking for
• Fastest time to decision: speed up processing by an order of magnitude
• Increased granularity of analysis: without increasing processing times or bogging down the backend
• Proactive analysis, not reactive: enable trend analysis and predictive capabilities
• Answer real business questions: e.g. visual insight for near real-time customer and vendor performance, determine routing performance optimization, etc.
• Scale for future growth: extensible for future capabilities and scalable growth
![Page 22: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/22.jpg)
Specific Business Questions - CDR Analysis
Answer Service Rate (ASR & Adjusted ASR)
• Calls completed vs. route attempts (vendor performance)
• Calls completed vs. call attempts (customer satisfaction)
Opportunity Monitor
• Calculate profit/loss per call due to routing path chosen
Post Dial Delay (PDD)
• Annoying delay until path through network selected
Analysis of near real time quality measures
• Call duration, jitter and packet loss
Trends and correlations of above metrics
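The two completion ratios listed under ASR above can be sketched in a few lines of Python. This is a minimal illustration, not the customer's implementation: the `CallRecord` fields are hypothetical, since the slides do not show the CDR schema, and the slides do not say which of the two ratios is the "Adjusted ASR", so the functions are named after the bullet descriptions instead.

```python
from dataclasses import dataclass


@dataclass
class CallRecord:
    """One call attempt; field names are hypothetical, not the real CDR schema."""
    route_attempts: int  # number of routes tried for this call
    completed: bool      # True if the call was answered


def ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def completed_vs_route_attempts(calls: list[CallRecord]) -> float:
    """Calls completed vs. route attempts (the vendor-performance view)."""
    completed = sum(c.completed for c in calls)
    attempts = sum(c.route_attempts for c in calls)
    return ratio(completed, attempts)


def completed_vs_call_attempts(calls: list[CallRecord]) -> float:
    """Calls completed vs. call attempts (the customer-satisfaction view)."""
    completed = sum(c.completed for c in calls)
    return ratio(completed, len(calls))


calls = [
    CallRecord(route_attempts=1, completed=True),
    CallRecord(route_attempts=3, completed=True),
    CallRecord(route_attempts=2, completed=False),
    CallRecord(route_attempts=2, completed=True),
]
print(completed_vs_route_attempts(calls))  # 3 completed / 8 route attempts
print(completed_vs_call_attempts(calls))   # 3 completed / 4 call attempts
```

Keeping the two denominators separate is the point of the sketch: a call that succeeds only on its third route still counts as one completed call for the customer, but as three route attempts when judging vendors.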
![Page 23: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/23.jpg)
CDR Workflow Overview
Stages: CONNECT, TRANSFORM, PARALLEL DATA LOAD
Workflow callouts:
• Filter data
• Logical functions
• Split flow for separate processing rules
• Meta-node encapsulates processing
• Extract failed routing attempts
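The filter and split steps named on this slide can be pictured with a small Python sketch. It assumes a hypothetical record layout (`call_id`, `release_cause`, `duration_s`) and a hypothetical rule for what counts as a failed routing attempt; the actual workflow operators and CDR fields are not shown in the slides.

```python
from typing import Iterable


def filter_valid(records: Iterable[dict]) -> list[dict]:
    """Filter data: drop records missing the fields later steps need."""
    return [r for r in records if "release_cause" in r and "duration_s" in r]


def split_flow(records: list[dict]) -> tuple[list[dict], list[dict]]:
    """Split the flow so each branch can apply its own processing rules.

    Assumption: release_cause == "NORMAL" marks a completed call and any
    other value is treated as a failed routing attempt.
    """
    completed = [r for r in records if r["release_cause"] == "NORMAL"]
    failed = [r for r in records if r["release_cause"] != "NORMAL"]
    return completed, failed


raw = [
    {"call_id": 1, "release_cause": "NORMAL", "duration_s": 62},
    {"call_id": 2, "release_cause": "NO_ROUTE", "duration_s": 0},
    {"call_id": 3},  # malformed record, removed by the filter
    {"call_id": 4, "release_cause": "BUSY", "duration_s": 0},
]
completed, failed = split_flow(filter_valid(raw))
print(len(completed), len(failed))
```

Each branch can then be handed to its own downstream step, for example aggregating completed calls while writing failed routing attempts to a separate table.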
![Page 24: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/24.jpg)
Data processing – Execution Plan
Compiled to a set of physical graphs
Phase 1 (four parallel pipelines): Reader → FilterRows → DeriveFields → Group(partial)
Phase 2 (four parallel pipelines): Repartition → Group(final) → Writer
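The two-phase plan on this slide (partial grouping per partition, then repartition and final grouping) can be sketched in plain Python. This is an illustration of the general technique only: it runs sequentially in one process, the row fields (`vendor`, `attempts`, `answered`) are hypothetical, and the internals of the real operators are not described in the slides.

```python
import zlib
from collections import defaultdict

NUM_PARTITIONS = 4  # the slide shows four parallel pipelines per phase


def phase1(rows):
    """Reader -> FilterRows -> DeriveFields -> Group(partial) for one partition."""
    partial = defaultdict(lambda: [0, 0])  # vendor -> [completed, attempts]
    for row in rows:
        if row["attempts"] <= 0:  # FilterRows: skip records with no attempts
            continue
        completed = 1 if row["answered"] else 0  # DeriveFields
        partial[row["vendor"]][0] += completed
        partial[row["vendor"]][1] += row["attempts"]
    return dict(partial)


def repartition(partials):
    """Route every key to exactly one phase-2 partition by hashing the key."""
    buckets = [defaultdict(list) for _ in range(NUM_PARTITIONS)]
    for partial in partials:
        for key, value in partial.items():
            index = zlib.crc32(key.encode()) % NUM_PARTITIONS
            buckets[index][key].append(value)
    return buckets


def phase2(bucket):
    """Group(final): merge the partial results for each key in one bucket."""
    return {
        key: [sum(v[0] for v in values), sum(v[1] for v in values)]
        for key, values in bucket.items()
    }


rows = [
    {"vendor": "a", "attempts": 2, "answered": True},
    {"vendor": "b", "attempts": 1, "answered": False},
    {"vendor": "a", "attempts": 1, "answered": True},
    {"vendor": "b", "attempts": 3, "answered": True},
    {"vendor": "a", "attempts": 0, "answered": False},
    {"vendor": "c", "attempts": 1, "answered": True},
]
# Reader: deal rows round-robin to the phase-1 partitions.
partitions = [rows[i::NUM_PARTITIONS] for i in range(NUM_PARTITIONS)]
partials = [phase1(p) for p in partitions]
totals = {}
for bucket in repartition(partials):
    totals.update(phase2(bucket))  # Writer: collect each partition's output
print(sorted(totals.items()))
```

Because every key lands in exactly one phase-2 bucket, the final groups never overlap, and only the small partial results, rather than the raw rows, have to move between partitions.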
![Page 25: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/25.jpg)
Demo
Making Big Data Analytics Fast and Easy
![Page 26: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/26.jpg)
Customer Takeaways – Actionable Insights
FAST: processing streaming CDR data in seconds
![Page 27: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/27.jpg)
Customer Takeaways – Analysis
Deeper Analysis: visibility at the Area Code and Exchange level
![Page 28: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/28.jpg)
Customer Takeaways – Cost Savings
20,000 updates made to routing tables during first week of collecting data
![Page 29: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/29.jpg)
Customer Takeaways – Scalability
8.9 billion rows of data collected during first 6 months
![Page 30: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/30.jpg)
Solution Architecture
End Users: Desktop & Mobile Devices
Yellowfin BI: Dashboard, Ad Hoc, Statistics, Data Mining, Analytics
Hadoop Collection
Paraccel Dataflow: Extraction, Cleansing, Enrichment, Aggregation, Analysis, Mining
Vectorwise: Very fast reporting database
Clustered Execution, Parallel Loading
OSS/BSS
Data Retention
![Page 31: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/31.jpg)
Summary – Take Action on Big Data
Accelerating Big Data 2.0
Platform stages: Connect, Prepare, Analyze, Optimize
Diagram labels: DATA, VALUE, Enterprise, Business Intelligence, Applications, DW, Big Data Storage
Advanced technology platform:
• Industry leading: Scale, Performance, Complexity, Cost (price/performance), Time to Value
• Multiple deployment options: On-premise, Cloud, Hybrid, Embedded
![Page 32: Making Big Data Analytics with Hadoop fast & easy (webinar slides)](https://reader033.fdocuments.net/reader033/viewer/2022052904/5583a44dd8b42a03088b4cc2/html5/thumbnails/32.jpg)
Questions
Ivan Seow [email protected]
John Ryan [email protected]
Ryan Templeton [email protected]
Actian: www.actian.com
Yellowfin: www.Yellowfin.bi