Product forecastingwebinar 20130417

Showcasing Data Science Lab functionality Welcome from Kognitio www.kognitio.com

description

Kognitio Webinar: Showcasing the Data Scientist Lab functionality with External Scripting and how it can be used to run ‘R’ in an MPP environment.

April 18, 8:00am PST, 11:00am EST, 4pm BST, 5pm CEST. Duration: 45 mins plus Q&A.

Dr. Sharon Kirkham, Principal, Kognitio Analytics Center of Excellence, showcases the power of external scripting with a demonstration of the ‘R’ statistical language running in the massively parallel Kognitio Analytical Platform environment.

Transcript of Product forecastingwebinar 20130417

Page 1: Product forecastingwebinar 20130417

Showcasing Data Science Lab functionality

Welcome from Kognitio

www.kognitio.com

Page 2: Product forecastingwebinar 20130417

Today’s Web Seminar -

Presenters

Host: Michael Hiskey, Vice President, Marketing & Business Development

Format & Agenda

Keynote Presenters

Dr. Sharon Kirkham, Data Scientist, Kognitio Analytics Center of Excellence

• Big Data and Complexity – the need for Data Scientists

Question Break #1

• Data Manipulation – functional demonstration

Question Break #2

• Product forecasting with parallel R - practical demonstration

Question Break #3

Page 3: Product forecastingwebinar 20130417

Kognitio

Kognitio is focused on providing the premier high-performance analytical platform to power business insight around the world

• Kognitio invented the in‐memory analytical platform, first taking it to market in 1989

• Privately held

• Labs in the UK - HQ in New York, NY

Page 4: Product forecastingwebinar 20130417

The Data Science Lab

Data Scientists & Staff

Mathematic Algorithms

MPP Computing

BIG DATA


Page 5: Product forecastingwebinar 20130417

What do business users want to do?

Find patterns

Track life time journeys

Predict behavior

Forecast scenarios

Allocate scarce resources

Model value

Characterize groups

Visualize discovery

Respond, trigger, manage, promote

Page 6: Product forecastingwebinar 20130417

I’m a data scientist! Are you?

Entry level skills and development - aspiration

Machine Learning

Graduates

Page 7: Product forecastingwebinar 20130417

I’m a data scientist! Are you?

Business Expertise

Machine Learning

Interpretation skills

= Insight

Graduates

Need guidance

Data Scientist

Page 8: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – traditionally…

Database

Page 9: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – direct data preparation

Database

SQL processing

Page 10: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – produces analytical data set

Database

SQL processing

Data Set

Page 11: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – run analytics from server

Database

SQL processing

Data Set

???

Page 12: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – data samples often used

Database

SQL processing

Data Set

???

Data Samples

Process run iteratively = slow

Page 13: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – modelling process is honed

Database

SQL processing

Data Set

???

Data Samples

Process run iteratively = slow

Page 14: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – model is complete

Database

Data Set

???

Page 15: Product forecastingwebinar 20130417

Supporting the data scientist

Typical process – score full data (Ouch!)

Database

Data Set

???

Full data to score

Page 16: Product forecastingwebinar 20130417

Supporting the data scientist

Push processes to DB – still produce analytical data set

Analytical Platform

SQL processing

Data Set

Page 17: Product forecastingwebinar 20130417

Supporting the data scientist

Push processes to DB – translate specific processes

Analytical Platform

SQL processing

Data Set

???

Translation

Page 18: Product forecastingwebinar 20130417

Supporting the data scientist

Push processes to DB – results passed back

Analytical Platform

SQL processing

Data Set

???

Translation

Result Data Set

Page 19: Product forecastingwebinar 20130417

Supporting the data scientist

Push processes to DB – modelling process is honed

Analytical Platform

SQL processing

Data Set

???

Translation

Result Data Set

Page 20: Product forecastingwebinar 20130417

Supporting the data scientist

Push processes to DB – model scoring done in DB

Analytical Platform

SQL processing

Data Set

???

Result Data Set

Page 21: Product forecastingwebinar 20130417

Supporting the data scientist

But we always want more! Complex data structure

Analytical Platform

Data Set

???

Result Data Set

SQL cannot handle data complexity. How do I integrate into my model?

Page 22: Product forecastingwebinar 20130417

Supporting the data scientist

But we always want more! Non-standard processes

Database

SQL processing

Data Set

???

Data Samples

Back where we started

Page 23: Product forecastingwebinar 20130417

Supporting the data scientist

Bring Analytics to data – still produce analytical data set

SQL processing

SQL processing

Page 24: Product forecastingwebinar 20130417

Supporting the data scientist

Bring Analytics to data – can use other code for data prep

SQL processing

Kognitio scripting

Code executed using MPP

Data held in memory. Fast access to CPUs

Page 25: Product forecastingwebinar 20130417

Supporting the data scientist

Bring Analytics to data – run analytics natively in Kognitio

SQL processing

Kognitio scripting

Code executed using MPP

Data held in memory. Fast access to CPUs

One platform, flexible working from data prep through analytical process

Page 26: Product forecastingwebinar 20130417

New! Kognitio version 8: Enabling and extending the Analytical Platform

External Tables

External Functions

Not Only SQL

Hadoop Connector

Other Connectors

Kognitio Storage as an External table

General Availability: June 2013

Page 27: Product forecastingwebinar 20130417

External Scripting – Data Transformation

Assembly: converting structured data into XML format, i.e. furnishing personalised content

Disassembly: converting XML into structured data

Parsing: extracting complex information from URLs; pulling words from large text fields, i.e. sentiment analysis

Transposition: converting row-based information into columns for data mining, i.e. supporting classification or segmentation

e.g. using Perl

Examples where SQL is typically complex and extensive
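The parsing and transposition cases above can be sketched in a few lines of script code. This is a minimal illustration in Python rather than the Perl mentioned on the slide. The function names, the `(id, url)` input layout and the `(entity, attribute, value)` row shape are assumptions made for this example and do not come from the webinar. A real external script would presumably read its rows from the platform and write results back, and that interface is not shown here.

```python
import csv
from collections import defaultdict
from urllib.parse import parse_qs, urlsplit


def parse_url_rows(lines):
    """Parsing: turn (id, url) CSV rows into (id, host, path, key, value) rows."""
    out = []
    for row_id, url in csv.reader(lines):
        parts = urlsplit(url)
        # parse_qs maps each query-string key to a list of its values
        for key, values in parse_qs(parts.query).items():
            for value in values:
                out.append((row_id, parts.netloc, parts.path, key, value))
    return out


def transpose_rows(rows):
    """Transposition: pivot (entity, attribute, value) rows into one record per entity."""
    wide = defaultdict(dict)
    for entity, attribute, value in rows:
        wide[entity][attribute] = value
    return dict(wide)


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only
    for parsed in parse_url_rows(["1,http://example.com/shop/item?sku=A12&colour=red"]):
        print(parsed)
    print(transpose_rows([("c1", "age", "34"), ("c1", "region", "UK")]))
```

Both functions handle each input row independently (parsing) or group rows by a single key (transposition), which is what makes this kind of work easy to spread across many nodes. The equivalent SQL tends to need string functions and conditional aggregation for every attribute.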

Page 28: Product forecastingwebinar 20130417

Data Manipulation Small Demo

Page 29: Product forecastingwebinar 20130417

Product Forecasting – with parallel R

Forecasting Requirements

Forecast Inputs

Page 30: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Page 31: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Kognitio platform specification: 16 servers, 462GB Kognitio RAM, 128 cores

This is old kit

2.9 billion rows of EPOS

184 day time series for 12K products

Page 32: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Page 33: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

1 output table in RAM

128 parallel instances of R
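The idea behind running many R instances side by side is that each product's time series can be forecast independently, so the work divides cleanly by product. The sketch below illustrates that partition-then-forecast pattern in Python. It is not the presenter's R code: the thread pool stands in for the parallel R instances, simple exponential smoothing stands in for whatever forecasting model was used in the demo, and all function names and the `alpha` value are assumptions chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_by_product(series_by_product, n_partitions):
    """Assign each product's whole series to exactly one partition."""
    partitions = [dict() for _ in range(n_partitions)]
    for index, product in enumerate(sorted(series_by_product)):
        partitions[index % n_partitions][product] = series_by_product[product]
    return partitions


def ses_forecast(series, alpha=0.3):
    """One-step-ahead forecast by simple exponential smoothing (a stand-in model)."""
    level = series[0]
    for value in series[1:]:
        level = alpha * value + (1 - alpha) * level
    return level


def forecast_partition(partition):
    """Forecast every product in one partition; needs no data from other partitions."""
    return {product: ses_forecast(series) for product, series in partition.items()}


def forecast_all(series_by_product, n_workers=4):
    """Run one worker per partition and merge the results into a single table."""
    partitions = partition_by_product(series_by_product, n_workers)
    results = {}
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for partial in pool.map(forecast_partition, partitions):
            results.update(partial)
    return results


if __name__ == "__main__":
    # Hypothetical daily sales for three products
    demo = {"p1": [5, 5, 5], "p2": [0, 10], "p3": [2.0]}
    print(forecast_all(demo, n_workers=2))
```

Because no partition needs data from another, adding workers should scale the job roughly with the number of cores available, and the merged dictionary plays the role of the single output table.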

Page 34: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Application & Client Layer

Excel

All BI Tools

Page 35: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Application & Client Layer

Excel

All BI Tools

13 views of different analytical output

Page 36: Product forecastingwebinar 20130417

R running in an MPP environment

Persistence Layer

Analytical Platform Layer

Application & Client Layer

Excel

All BI Tools

Result set contained # rows

12K forecasts and stats calculated in # seconds

2.9B EPOS items collated into time series in # seconds
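Collating EPOS items into time series means summing the quantity sold per product per day and filling days without sales with zero, so that every product ends up with a series of the same length. The sketch below shows that step in Python on a few rows. The row layout `(product, sale_date, quantity)` and the function name are assumptions for illustration; on the platform itself this step would more likely be an aggregation run in SQL.

```python
from collections import defaultdict
from datetime import date


def collate_time_series(epos_rows, start, n_days):
    """Sum quantity per product per day; days without sales stay at 0."""
    totals = defaultdict(lambda: [0] * n_days)
    for product, sale_date, quantity in epos_rows:
        offset = (sale_date - start).days
        # Ignore sales that fall outside the requested window
        if 0 <= offset < n_days:
            totals[product][offset] += quantity
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical EPOS rows: (product, sale_date, quantity)
    rows = [
        ("A", date(2013, 1, 1), 2),
        ("A", date(2013, 1, 1), 1),
        ("A", date(2013, 1, 3), 4),
        ("B", date(2013, 1, 2), 5),
    ]
    print(collate_time_series(rows, start=date(2013, 1, 1), n_days=3))
```

A fixed-length series per product is a convenient input for a per-product forecasting step, since each series can then be handed to a separate worker.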

Page 37: Product forecastingwebinar 20130417

Product Forecasting using parallel R Demo

Page 38: Product forecastingwebinar 20130417

Thank you for your participation today

• More information on today’s topic can be found at:

  • kognitio.com/mpp_r

  • kognitio.com/product-forecasting

• FREE TO USE – perpetual license

  – www.kognitio.com/free

  – Contact us for the pre-release version 8

• Analyst White Papers

  – EMA Comparative Analysis – In-memory database platforms

  – www.kognitio.com/emacompinmem

• Today’s slides (and more): www.slideshare.net/Kognitio

Page 39: Product forecastingwebinar 20130417

connect

www.kognitio.com

twitter.com/kognitio

linkedin.com/companies/kognitio

tinyurl.com/kognitio

youtube.com/kognitio

NA: +1 855 KOGNITIO

EMEA: +44 1344 300 770