Left Brain, Right Brain: How to Unify Enterprise Analytics

The Briefing Room

The Briefing Room with Robin Bloor and Teradata. Live Webcast on Jan. 29, 2013. Despite its name, effective Data Science requires a certain amount of artistic flair. Analysts must be creative about how and where they find the insights that will drive business value. One classic roadblock to that kind of frictionless process? Programming. Not everyone can code Java, which makes the unstructured domain of Hadoop quite challenging for the average business analyst. Check out the slides from this episode of the Briefing Room to hear veteran Analyst Dr. Robin Bloor explain how a new generation of analytical platforms will solve the complexity of unifying structured and unstructured data. He'll be briefed by Steve Wooledge of Teradata Aster, who will tout his company's Big Analytics Appliance, which leverages the SQL-H bridge, an innovation designed to connect Hadoop with SQL. Visit: http://www.insideanalysis.com

Transcript of Left Brain, Right Brain: How to Unify Enterprise Analytics

Page 1: Left Brain, Right Brain: How to Unify Enterprise Analytics

The Briefing Room

Page 2: Left Brain, Right Brain: How to Unify Enterprise Analytics

Twitter Tag: #briefr

The Briefing Room

Welcome

Host: Eric Kavanagh

Page 3: Left Brain, Right Brain: How to Unify Enterprise Analytics

Mission

•  Reveal the essential characteristics of enterprise software, good and bad
•  Provide a forum for detailed analysis of today’s innovative technologies
•  Give vendors a chance to explain their product to savvy analysts
•  Allow audience members to pose serious questions... and get answers!

Page 4: Left Brain, Right Brain: How to Unify Enterprise Analytics

JANUARY: Big Data

February: Analytics

March: Open Source

April: Intelligence

Page 5: Left Brain, Right Brain: How to Unify Enterprise Analytics

Big Data

NEW Sources, NEW Insights, NEW Challenges

Page 6: Left Brain, Right Brain: How to Unify Enterprise Analytics

Analyst: Robin Bloor

Robin Bloor is Chief Analyst at The Bloor Group.

Page 7: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster

•  Teradata is known for its data analytics solutions with a focus on integrated data warehousing, big data analytics and business applications
•  It offers a broad suite of technology platforms and solutions; data management applications; and data mining capabilities
•  Teradata Aster is its MapReduce platform to handle big data analytics on multi-structured data

Page 8: Left Brain, Right Brain: How to Unify Enterprise Analytics

Steve Wooledge

Steve is Senior Director of Product Marketing for Teradata Aster and has 10 years of industry experience.

Page 9: Left Brain, Right Brain: How to Unify Enterprise Analytics

Steve Wooledge – Sr. Director, Product Marketing, Teradata Aster January 2013

Bringing Big Data into the Light: Teradata Big Analytics Appliance

Page 10: Left Brain, Right Brain: How to Unify Enterprise Analytics

Confidential and proprietary. Copyright © 2012 Teradata Corporation.

TOPICS

WHAT IS DIFFERENT ABOUT BIG DATA ANALYTICS?

MAKING BIG ANALYTICS & DISCOVERY FAST AND EASY

TERADATA ASTER BIG ANALYTICS APPLIANCE

Page 11: Left Brain, Right Brain: How to Unify Enterprise Analytics

What is Different about Big Analytics and Discovery?

Page 12: Left Brain, Right Brain: How to Unify Enterprise Analytics

Confidential and proprietary. Copyright © 2013 Teradata Corporation.

The Lytro and Big Data

Page 13: Left Brain, Right Brain: How to Unify Enterprise Analytics

“Interactive, Living Pictures”

Page 14: Left Brain, Right Brain: How to Unify Enterprise Analytics

See Your Business in High-Definition
Big Analytics & Discovery Unlocks Hidden Value

Classic BI: Structured & Repeatable Analysis
“Capture only what’s needed”
•  Business determines what questions to ask
•  IT structures the data to answer those questions

Big Data Analytics: Multi-structured & Iterative Analysis
“Capture in case it’s needed”
•  IT delivers a platform for storing, refining, and analyzing all data sources
•  Business explores data for questions worth answering

Page 15: Left Brain, Right Brain: How to Unify Enterprise Analytics

Iterative Analytics Accelerates Discovery

Analytical Idea

Zero-ETL Data Load/Integration

SQL and non-SQL Analysis

Evaluate Results

Operationalize or Move On (Operational DB or EDW)

5x Faster Discovery Process with Aster - Hours vs. Days

Page 16: Left Brain, Right Brain: How to Unify Enterprise Analytics

Need for a Unified Data Architecture for New Insights
Enabling Any User for Any Data Type from Data Capture to Analysis

Java, C/C++, Python, R, SAS, SQL, Excel, BI, Visualization

Discover and Explore | Reporting and Execution in the Enterprise

Capture, Store and Refine

Audio/Video | Images | Docs | Text | Web & Social | Machine Logs | CRM | SCM | ERP

Page 17: Left Brain, Right Brain: How to Unify Enterprise Analytics

Big Data Comes with BIG HEADACHES

Even free software like Hadoop is causing companies to spend more money…Many CIOs believe data is inexpensive because storage has become inexpensive. But data is inherently messy—it can be wrong, it can be duplicative, and it can be irrelevant—which means it requires handling, which is where the real expenses come in.

Source: The Wall Street Journal. “CIOs’ Big Problem with Big Data”. Aug 2012

“Through 2015, 85% of Fortune 500 organizations will be unable to exploit big data for competitive advantage.”

Source: Gartner. “Information Innovation: Innovation Key Initiative Overview”. April 2012

Page 18: Left Brain, Right Brain: How to Unify Enterprise Analytics

AUDIO & VIDEO | IMAGES | TEXT | WEB & SOCIAL | MACHINE LOGS | CRM | SCM | ERP

DISCOVERY PLATFORM

CAPTURE | STORE | REFINE

INTEGRATED DATA WAREHOUSE

UNIFIED DATA ARCHITECTURE

Big Data Analytics

Big Data Management

LANGUAGES | MATH & STATS | DATA MINING | BUSINESS INTELLIGENCE | APPLICATIONS

Engineers | Data Scientists | Business Analysts | Front-Line Workers | Customers / Partners | Quants | Operational Systems | Executives

Page 19: Left Brain, Right Brain: How to Unify Enterprise Analytics

AUDIO & VIDEO | IMAGES | TEXT | WEB & SOCIAL | MACHINE LOGS | CRM | SCM | ERP

DISCOVERY PLATFORM

CAPTURE | STORE | REFINE

INTEGRATED DATA WAREHOUSE

TERADATA UNIFIED DATA ARCHITECTURE

LANGUAGES | MATH & STATS | DATA MINING | BUSINESS INTELLIGENCE | APPLICATIONS

Engineers | Data Scientists | Business Analysts | Front-Line Workers | Customers / Partners | Quants | Operational Systems | Executives

Page 20: Left Brain, Right Brain: How to Unify Enterprise Analytics

AUDIO & VIDEO | IMAGES | TEXT | WEB & SOCIAL | MACHINE LOGS | CRM | SCM | ERP

DISCOVERY PLATFORM

CAPTURE | STORE | REFINE

INTEGRATED DATA WAREHOUSE

LANGUAGES | MATH & STATS | DATA MINING | BUSINESS INTELLIGENCE | APPLICATIONS | VIEWPOINT | SUPPORT

Engineers | Data Scientists | Business Analysts | Front-Line Workers | Customers / Partners | Quants | Operational Systems | Executives

TERADATA UNIFIED DATA ARCHITECTURE

Aster Connector for Hadoop

Teradata Connector for Hadoop

Aster Teradata Connector

SQL-H

Aster Loader Teradata Loader

SQL-H

Page 21: Left Brain, Right Brain: How to Unify Enterprise Analytics

Shift from a Single Platform to an Ecosystem

Source: “Big Data Comes of Age”. EMA and 9sight Consulting. Nov 2012.

“Big Data requirements are solved by a range of platforms including analytical databases, discovery platforms and NoSQL solutions beyond Hadoop.”

Page 22: Left Brain, Right Brain: How to Unify Enterprise Analytics

How Does Big Analytics and Discovery Add Business Value?

Page 23: Left Brain, Right Brain: How to Unify Enterprise Analytics

Customer Behavior Analysis

BI Tools Database Tools Monitoring Tools

STORE VISION PLATFORM

DATA

CALL CENTER DATA

EMAIL CORRESPONDENCE DATA

BRANCH TELLER DATA

ONLINE BANKING DATA

CUSTOMER PROFILE DATA

CUSTOMER SURVEY DATA

Page 24: Left Brain, Right Brain: How to Unify Enterprise Analytics

Events Preceding Account Closure

Page 25: Left Brain, Right Brain: How to Unify Enterprise Analytics

Interactive Analytics
Reducing the “Noise” to find the “Signal”

SELECT *
FROM npath (
  ON (
    SELECT …
    WHERE u.event_description IN (
      SELECT aper.event
      FROM attrition_paths_event_rank aper
      ORDER BY aper.count DESC
      LIMIT 10)
  )
  …
  PATTERN ('(OTHER|EVENT){1,20}$')
  SYMBOLS (…)
  RESULT (…)
) ) n;
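The query on this slide is elided, so its full semantics cannot be recovered. As a rough illustration of what a pattern such as '(OTHER|EVENT){1,20}$' expresses (the last 1 to 20 rows of each ordered sequence), here is a minimal Python sketch. It is not Aster nPath: the function name `closing_paths`, the row layout, and the single-character symbols are hypothetical, and it assumes each customer's sequence ends at the account closure.

```python
import re
from collections import Counter, defaultdict

# Hypothetical analogue of PATTERN ('(OTHER|EVENT){1,20}$'):
# one character per row, anchored at the end of the sequence.
TAIL = re.compile(r"[OE]{1,20}$")


def closing_paths(rows, top_events):
    """Count the final (up to 20) events of each customer's sequence.

    rows: iterable of (customer_id, timestamp, event_description).
    top_events: descriptions treated as EVENT; everything else is OTHER.
    Assumption (not stated on the slide): each sequence ends at closure.
    """
    by_customer = defaultdict(list)
    for customer, ts, event in rows:
        by_customer[customer].append((ts, event))

    paths = Counter()
    for events in by_customer.values():
        events.sort()  # order each customer's rows by timestamp
        symbols = "".join("E" if e in top_events else "O" for _, e in events)
        match = TAIL.search(symbols)
        if match:
            tail = events[match.start():]
            path = tuple(e if e in top_events else "OTHER" for _, e in tail)
            paths[path] += 1
    return paths
```

Counting identical paths across customers is what lets the frequent routes to closure stand out from the rest, which is the "signal" the slide refers to.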

Events Preceding Account Closure

Page 26: Left Brain, Right Brain: How to Unify Enterprise Analytics

How Do We Make Big Analytics & Discovery Possible?

Page 27: Left Brain, Right Brain: How to Unify Enterprise Analytics

Key Requirements of a Discovery Platform

1.  Highly Efficient & Performant Big Data Platform That Allows Quick Iterations
2.  Hybrid Capabilities that support SQL, statistics, and new MapReduce analytics
3.  Significant Out-of-the-Box Analytical Functions that Minimize Development

Democratize Big Data & Maximize Enterprise Adoption

Page 28: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster Big Analytics Appliance
First Deeply Integrated SQL, MapReduce and Hadoop Appliance

UNIQUE FEATURES

1.  Integrated, modular Aster Database and 100% Open-Source Hortonworks HDP
2.  First and only ANSI SQL & HCatalog integration via SQL-H™
3.  Industry’s only ANSI-standard SQL & MapReduce integration via SQL-MapReduce®
4.  Industry’s most manageable & supportable Apache Hadoop appliance via Teradata Viewpoint™ & TVI™
5.  Most complete MapReduce App Portfolio with 70+ pre-built MapReduce functions
6.  Fully engineered and supported by Teradata, with Level-4 support by Hortonworks world-class Hadoop team

Benefits

•  Leverage existing investments in standard BI, ETL tools & people with SQL skills
•  Industry’s highest performance platform for Big Analytics
•  Lowest TCO (technology + people), highest ROI, and fastest time to value

Page 29: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster Analytics Portfolio
The App Store of Big Data

•  PATH ANALYSIS: Discover Patterns in Rows of Sequential Data
•  TEXT ANALYSIS: Derive Patterns and Extract Features in Textual Data
•  STATISTICAL ANALYSIS: High-Performance Processing of Common Statistical Calculations
•  SEGMENTATION: Discover Natural Groupings of Data Points
•  MARKETING ANALYTICS: Analyze Customer Interactions to Optimize Marketing Decisions
•  DATA TRANSFORMATION: Transform Data for More Advanced Analysis

Page 30: Left Brain, Right Brain: How to Unify Enterprise Analytics

Unified Big Data Analytics Architecture Integrated Analytics and Navigation

BI Tools, SQL, ETL

Multi-Structured Data

Unstructured Data

TERADATA IDW BIG ANALYTICS APPLIANCE

Revenue Social Media

Discovery Platform

Facebook Twitter

Pinterest

Sentiments Behavior

Unified Big Analytics Architecture

Iterative Information Discovery

Operationalized Analytics

Best Decision Possible

Page 31: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster Big Analytics Appliance Solution Value Add

SQL BI Tools

Analytic SQL Apps

Hadoop Tools

•  Processing, storage, and networking designed for Big Data workloads
•  40 GB/s InfiniBand network
•  Pre-tuned HDFS and MapReduce parameters for Big Data workloads
•  Store and manage data in Apache Hadoop or Aster Database
•  Analytics Library w/ 70+ functions
•  SQL interface to MapReduce and Hadoop
•  Supports standard BI and ETL tools
•  Use Hadoop tools like Hive and Pig
•  Single vendor for lowest TCO
•  Common system management tools

Aster Database InfiniBand (40 GB/s) Interconnect Fabric

Big Analytics Appliance Hardware

Aster MapReduce Portfolio of Functions

SQL SQL-MapReduce

Common Management, Troubleshooting, and Support

NEW

NEW

NEW

Hive, Pig, …

SQL-H NEW

Page 32: Left Brain, Right Brain: How to Unify Enterprise Analytics

ESG Benchmark Report Summary
Third Party Validation of Aster and Hadoop “Fit”

FULL REPORT AVAILABLE AT www.asterdata.com/esg

Scope

•  Identical hardware for Aster and Hadoop
•  Clickstream, sentiment, and traditional retail data
•  Compare “time to insight” and “time to develop”

RESULTS

•  Discovery Cycle-Time (Development + Execution Time): Hadoop MapReduce 32 Hours; Aster SQL-MapReduce 6 Hours (Aster 5x Faster)
•  Discovery Process: Aster 5x Faster
•  Analytics: Aster 35x Faster (range: 4–416x)
•  Development: Aster 3x Faster
•  Loading: Hadoop 1.8x Faster
•  Transforms: Hadoop 1.3x Faster

Page 33: Left Brain, Right Brain: How to Unify Enterprise Analytics

Comparing Advanced Analytic Development and Execution
Example: Determine Spikes In Hourly Pageviews

Apache Hadoop
Development Time: 4 hours
Execution Time: 149 seconds

•  Write Java MR job to group records by pagename and find all pages <100 pageviews/hr
•  Sort by the yy/mm/dd and hour fields
•  Java reduce phase to place all same-keyed records into temporary arrays
•  Compute counts for low/high/low hourly page views
•  Create custom partitioner
•  Create custom grouping comparator
•  Create custom key comparator
•  Execute each Mapper and Reducer
•  Multiple passes of data
•  Save output to flat files, making it unstructured
•  No relational semantics, preventing use of DB interfaces (e.g. ODBC/JDBC)
•  Retrieve results with other tools (e.g., SSH/FTP)

Teradata Aster
Development Time: 1 hour (4x faster)
Execution Time: 3 seconds (50x faster)

•  Use Aster nPath
•  Input parameters in SQL as regular expressions
•  Single pass of the data
•  SQL handles group-by, counts, sorts
•  MapReduce performs regular pattern matching over a sequence of rows
•  Outputs written to relational table
•  Use SQL or BI tools to visualize results

“By using SQL-MapReduce, Aster takes fewer steps to develop analytics”

“This is also why the execution time in Aster is much faster.”

“Rather than using MapReduce processing for each step in the analysis, SQL is used in place of a Map (or Reduce) phase and MapReduce is used only in steps that cannot be expressed in SQL.”

“Map or Reduce requires data shuffling and produces higher latency than SQL”

Source: Enterprise Strategy Group, Lab Validation Report, September 2012
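To make the low/high/low idea in this example concrete, here is a minimal Python sketch of spike detection by regular-expression matching over a sequence of hourly counts. It is neither the Hadoop job nor the Aster nPath query that ESG benchmarked. The function name `find_spikes` and the input layout are hypothetical; the threshold of 100 pageviews per hour is taken from the slide.

```python
import re
from collections import Counter, defaultdict

# One or more high-traffic hours with a low-traffic hour on each side.
SPIKE = re.compile(r"(?<=L)H+(?=L)")


def find_spikes(records, threshold=100):
    """Return (page, first_high_hour, last_high_hour) for each spike.

    records: iterable of (page_name, hour) pairs, one per pageview, where
    hour is any sortable value. Hours with no views at all are not
    represented, which is a simplification of this sketch.
    """
    counts = Counter(records)  # pageviews per (page, hour)
    hours_by_page = defaultdict(list)
    for (page, hour), n in counts.items():
        hours_by_page[page].append((hour, n))

    spikes = []
    for page in sorted(hours_by_page):
        series = sorted(hours_by_page[page])
        # Encode each hour as one symbol so a regex can scan the sequence.
        symbols = "".join("H" if n >= threshold else "L" for _, n in series)
        for m in SPIKE.finditer(symbols):
            spikes.append((page, series[m.start()][0], series[m.end() - 1][0]))
    return spikes
```

The counting and sorting steps correspond to the work the slide assigns to SQL, and the regex scan corresponds to the pattern-matching step it assigns to MapReduce.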

Page 34: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster Big Analytics Appliance—Key Innovations

Page 35: Left Brain, Right Brain: How to Unify Enterprise Analytics

Aster SQL-H™
A Business User’s Bridge to Analyze Hadoop Data

Aster SQL-H Gives Analysts and Data Scientists a Better Way to Analyze Data Stored in Hadoop

•  Allow standard ANSI SQL access to Hadoop data
•  Leverage existing BI tools and enable self-service
•  Enable 50+ prebuilt SQL-MapReduce Apps and IDE

Hadoop Layer: HDFS

Pig

Hive

Hadoop MR

Aster: SQL-H

HCatalog

Data

Data Filtering

NEW

Page 36: Left Brain, Right Brain: How to Unify Enterprise Analytics

Hortonworks Data Platform
Enterprise-Ready Hadoop

The ONLY 100% open source data platform for Hadoop

•  Tightly aligned with core Apache code lines
•  All code committed back to open source
•  Engineered integration with Teradata Viewpoint and Ambari
•  HCatalog - centralized metadata services for easy data sharing
•  Dependable full stack high availability
•  Capacity scheduler for better multi-tenancy
•  Intuitive graphical data integration tools

Page 37: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Viewpoint Integration
Easier, Faster, and Better System Management

Common Management Console for Aster, Teradata and Apache Hadoop

Aster-Specific Portlets
•  Aster Node Monitoring
•  Aster Completed Processes

Trend/Visualization Portlets
•  Capacity Heat Map
•  Metrics Graph
•  Metrics Analysis

Query Portlets
•  Query Monitor

Admin Portlets
•  Teradata System
•  Roles Manager

Other Portlets
•  System Health
•  Canary queries
•  Aster Alerting

Page 38: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Vital Infrastructure (TVI)
Integrated hardware & software solution for systems management

PROACTIVE RELIABILITY, AVAILABILITY, AND MANAGEABILITY

1U server virtualizes system and cabinet management software

Server Management VMS
•  Cabinet Management Interface Controller (CMIC)
•  Service Work Station (SWS)
•  Automatically installed on base/first cabinet

VMS allows full rack solutions without additional cabinet for traditional SWS

Eliminates need for expansion racks, reducing customers’ floor space & energy costs

Supports Teradata hardware and Aster/Hadoop software

TVI Support for Aster and Hadoop

62–70% of Incidents Discovered through TVI

Page 39: Left Brain, Right Brain: How to Unify Enterprise Analytics

How Can You Get Started? Aster Express

Page 40: Left Brain, Right Brain: How to Unify Enterprise Analytics

Making it easy to try Aster Big Analytics Solutions
Aster Express, Aster Live, Aster Big Analytics Appliance

Aster Express | Aster Live | Aster Big Analytics Appliance

Page 41: Left Brain, Right Brain: How to Unify Enterprise Analytics

Aster Express Tutorials Make it Easy to Start
www.asterdata.com/asterexpress

Page 42: Left Brain, Right Brain: How to Unify Enterprise Analytics

Teradata Aster Big Analytics Appliance Summary
Bring Big Data to Life with Big Analytics & Discovery

INDUSTRY’S FIRST UNIFIED BIG ANALYTICS APPLIANCE

UNIFIED INTERFACES FOR ITERATIVE SQL AND MAPREDUCE ANALYTICS

TERADATA-TRUSTED RELIABILITY, AVAILABILITY & MANAGEABILITY

EASY TO DEPLOY, MANAGE & USE

Get Started Now! asterdata.com/AsterExpress

Page 43: Left Brain, Right Brain: How to Unify Enterprise Analytics
Page 44: Left Brain, Right Brain: How to Unify Enterprise Analytics

When to Use Which? The best approach by workload and data type
Processing as a Function of Schema Requirements and Stage of Data Pipeline

Stage of Data Pipeline | Stable Schema | Evolving Schema | Format, No Schema
Low Cost Storage and Fast Loading | Teradata / Hadoop | Hadoop | Hadoop
Data Pre-Processing, Refining, Cleansing | Teradata | Aster / Hadoop | Hadoop
“Simple math at scale” (Score, filter, sort, avg., count...) | Teradata | Aster / Hadoop | Hadoop
Joins, Unions, Aggregates | Teradata | Aster | Aster
Analytics (Iterative and data mining) | Teradata | Aster (SQL + MapReduce Analytics) | Aster (MapReduce Analytics)
Reporting | Teradata | Aster | Aster

Financial Analysis, Ad-Hoc/OLAP Enterprise-Wide BI and Reporting

Spatial/Temporal Active Execution

Interactive Data Discovery Web Clickstream, Set-Top Box Analysis

CDRs, Sensor Logs, JSON

Social Feeds, Text, Image Processing Audio/Video Storage and Refining

Storage and Batch Transformations

Page 46: Left Brain, Right Brain: How to Unify Enterprise Analytics

Ease of Development and Reuse
Analytic Foundation: 70+ out-of-the-box modules

Modules: Business-ready SQL-MapReduce Functions

Path Analysis
Discover patterns in rows of sequential data

•  nPath: complex sequential analysis for time series analysis and behavioral pattern analysis

•  Sessionization: identifies sessions from time series data in a single pass over the data

•  Attribution: operator to help ad networks and websites to distribute “credit”
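As an illustration of the sessionization idea described in this list, here is a minimal Python sketch that assigns session numbers in one pass over time-ordered events. It is not the Aster function: the name `sessionize`, the 1800-second inactivity timeout, and the tuple layout are all assumptions made for this example, and the initial sort is only there so the single pass sees each user's events in order.

```python
def sessionize(events, timeout=1800):
    """Assign a per-user session number to each event.

    events: iterable of (user_id, timestamp_in_seconds).
    A new session starts when the gap since the user's previous event
    exceeds `timeout` (an assumed default of 30 minutes).
    Returns a list of (user_id, timestamp, session_number).
    """
    last_seen = {}   # user -> timestamp of the previous event
    session_no = {}  # user -> current session number
    result = []
    for user, ts in sorted(events):  # order by user, then by time
        if user not in last_seen or ts - last_seen[user] > timeout:
            session_no[user] = session_no.get(user, -1) + 1
        last_seen[user] = ts
        result.append((user, ts, session_no[user]))
    return result
```

Because only the previous timestamp per user is kept, the assignment needs no second scan of the data.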

Statistical Analysis
High-performance processing of common statistical calculations

•  Histogram: function to provide capability of generating
•  Decision Trees: Native implementation of parallel random forests.
•  Approximate percentiles and distinct counts: calculate percentiles and counts within specific variance
•  Correlation: calculation that characterizes the strength of the relation between different columns
•  Regression: performs linear or logistic regression between an output variable and a set of input variables
•  Averages: calculate moving, weighted, exponential or volume-weighted averages over a window of data
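For readers unfamiliar with the windowed averages named in this list, the following Python sketch shows a simple moving average and an exponential moving average. The function names and the smoothing parameter `alpha` are illustrative choices, not the signatures of the Aster functions.

```python
from collections import deque


def moving_average(values, window):
    """Simple moving average over the last `window` values seen so far."""
    buf = deque(maxlen=window)
    total = 0.0
    out = []
    for v in values:
        if len(buf) == window:
            total -= buf[0]  # the oldest value is about to drop out
        buf.append(v)
        total += v
        out.append(total / len(buf))
    return out


def exponential_average(values, alpha):
    """Exponential moving average; alpha in (0, 1] weights the newest value."""
    out = []
    ema = None
    for v in values:
        ema = v if ema is None else alpha * v + (1 - alpha) * ema
        out.append(ema)
    return out
```

Both functions emit one output per input row, so they can be applied as a single scan over an ordered window of data.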

Relational Analysis
Discover important relationships among data

•  Graph analysis: finds shortest path from a distinct node to all other nodes in a graph

•  Tokenization: splits strings into individual words to assist text processing
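The graph analysis bullet describes single-source shortest paths. The slide does not say which algorithm Aster uses, so the sketch below swaps in Dijkstra's algorithm as one standard way to compute them, assuming non-negative edge weights. The function name and the adjacency-list layout are hypothetical.

```python
import heapq


def shortest_paths(graph, source):
    """Distances from `source` to every reachable node (Dijkstra).

    graph: dict mapping node -> list of (neighbor, non_negative_weight).
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry; a shorter path was already found
        for neighbor, weight in graph.get(node, []):
            candidate = d + weight
            if candidate < dist.get(neighbor, float("inf")):
                dist[neighbor] = candidate
                heapq.heappush(heap, (candidate, neighbor))
    return dist
```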

Page 47: Left Brain, Right Brain: How to Unify Enterprise Analytics

Modules SQL-MapReduce Analytic Functions

Text Analysis
Derive patterns in textual data

•  Text Processing: counts occurrences of words, identifies roots, & tracks relative positions of words & multi-word phrases
•  Text Partition: analyzes text data over multiple rows
•  Levenshtein Distance: computes the distance between two words
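Levenshtein distance is the minimum number of single-character insertions, deletions, and substitutions needed to turn one word into another. A minimal Python sketch of the standard dynamic-programming formulation follows; it illustrates the metric itself and says nothing about how the Aster function is implemented.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b, keeping one row at a time."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ca in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]
```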

Cluster Analysis
Discover natural groupings of data points

•  k-Means: clusters data into a specified number of groupings
•  Canopy: partitions data into overlapping subsets within which k-means is performed
•  Minhash: buckets highly-dimensional items for cluster analysis
•  Basket analysis: creates configurable groupings of related items from transaction records in single pass
•  Collaborative Filter: predicts the interests of a user by collecting interest information from many users
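As a small illustration of k-means, the first item in this list, here is a plain-Python sketch of the usual assign-then-update loop (Lloyd's algorithm). The function names are hypothetical, and taking the starting centroids as an argument with a fixed iteration count is a simplification chosen to keep the example deterministic. It is not the parallel implementation the slide refers to.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, initial_centroids, iterations=10):
    """Cluster `points` around len(initial_centroids) centroids.

    Returns (centroids, clusters), where clusters[i] lists the points
    assigned to centroid i in the final iteration.
    """
    centroids = [tuple(c) for c in initial_centroids]
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]
            for i, members in enumerate(clusters)
        ]
    return centroids, clusters
```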

Data Transformation
Transform data for more advanced analysis

•  Unpack: extracts nested data for further analysis
•  Pack: compresses multi-column data into a single column
•  Antiselect: returns all columns except for specified column
•  Multicase: case statement that supports row match for multiple cases

Ease of Development and Reuse
Analytic Foundation: 50+ out-of-the-box modules

Page 48: Left Brain, Right Brain: How to Unify Enterprise Analytics

Analyst: Robin Bloor

Perceptions & Questions

Page 49: Left Brain, Right Brain: How to Unify Enterprise Analytics

The Bloor Group

Page 50: Left Brain, Right Brain: How to Unify Enterprise Analytics

Big Data Is About Analytics

DATA AIN’T WHAT IT USED TO BE

•  Machine generated data (logs)
•  Web data
•  Social media data
•  Public data services
•  Supply chain data
•  Real-time data flows

THE ANALOGY OF STRIP-MINING IS RELEVANT BECAUSE THE SCALE OF DATA ANALYTICS HAS EXPANDED DRAMATICALLY

Page 51: Left Brain, Right Brain: How to Unify Enterprise Analytics

The Data Analytics Issue

Page 52: Left Brain, Right Brain: How to Unify Enterprise Analytics

What Hadoop Is NOT

•  A MULTIUSER HIGHLY TUNED ENGINE
•  AN ANALYTICS PLATFORM
•  A SOLUTION

But it IS:

•  A USEFUL, FLEXIBLE AND VERY ECONOMIC DATA STORE – WITH PLUG-INS

Page 53: Left Brain, Right Brain: How to Unify Enterprise Analytics

About Data Analytics

It is all about TIME TO INSIGHT – as long as that is followed by action

Fast time to insight requires FLEXIBLE management of high performance data flows - for the benefit of the data analyst

The data analyst needs to be able to MARSHAL the data

Then maybe, just maybe, he will deserve the title of DATA SCIENTIST

Page 54: Left Brain, Right Brain: How to Unify Enterprise Analytics

Clearly the Teradata Aster Big Analytics Appliance is a powerful data flow engine, so:

•  How does Aster Data achieve its performance lift with MapReduce?
•  How is it most usually deployed?
•  Can it do data cleansing in flight?
•  Can it perform analytic tasks?

Page 55: Left Brain, Right Brain: How to Unify Enterprise Analytics

•  Why an appliance? What is gained and what is sacrificed?
•  Which sectors/businesses do you expect to be able to make best use of this technology?
•  Which companies/products do you regard as competitors (either direct or near)?
•  Which companies/products do you partner with?
•  How does the appliance fit in the cloud?

Page 56: Left Brain, Right Brain: How to Unify Enterprise Analytics

Page 57: Left Brain, Right Brain: How to Unify Enterprise Analytics

Upcoming Topics

This month: Big Data

February: Analytics

March: Open Source

April: Intelligence

www.insideanalysis.com

Page 58: Left Brain, Right Brain: How to Unify Enterprise Analytics

Thank You for Your Attention