SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Presentation delivered at SQL Saturday Atlanta GA -- April 22, 2010

SS2008 Data Mining with Excel 2010 and PowerPivot

Mark Tabladillo Ph.D.

http://marktab.net

April 22, 2010

W. Edwards Deming

2 © 2010 Mark Tabladillo Ph.D.

W. Edwards Deming

[Diagram: production viewed as a system. Labels: Suppliers of materials and equipment (A, B, C, D); Receipt and test of materials; Production, assembly, inspection; Tests of processes, machines, methods, costs; Distribution; Consumers; Consumer research; Design and redesign]

CRISP-DM Version 1.0

Jeff Hawkins

Outline

What is Data Mining

What is PowerPivot

Demos

Technology

Outline

What is Data Mining

What is PowerPivot

Demos

Data Mining Definitions

• Data mining

• Machine Learning

• Data mining algorithms typically use estimation or optimization to achieve results (as opposed to only calculations).
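
To make the distinction in the last bullet concrete, here is a minimal sketch in Python. It contrasts a direct calculation (a mean) with an estimate reached by optimization (a one-variable logistic regression fitted by gradient descent). The data set, the function names, and the learning-rate and step settings are all assumptions chosen for illustration; this is not how the SQL Server algorithms are implemented.

```python
import math


def mean(values):
    # A direct calculation: one pass over the data gives the exact answer.
    return sum(values) / len(values)


def predict(w, b, x):
    # Logistic model: probability that the outcome is 1 for input x.
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def fit_logistic(xs, ys, learning_rate=0.1, steps=2000):
    # An estimate reached by optimization: start from a guess and
    # repeatedly adjust w and b to reduce the average log loss.
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict(w, b, x) - y
            grad_w += error * x
            grad_b += error
        w -= learning_rate * grad_w / n
        b -= learning_rate * grad_b / n
    return w, b


# Hypothetical toy data: hours of product usage and whether the customer renewed.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
renewed = [0, 0, 0, 1, 1, 1]

w, b = fit_logistic(hours, renewed)
print("mean hours:", mean(hours))
print("P(renew | 1 hour): ", round(predict(w, b, 1.0), 3))
print("P(renew | 6 hours):", round(predict(w, b, 6.0), 3))
```

The mean has a single exact answer, while the fitted weights depend on the starting point, the step size, and how long the search runs. That dependence is what separates estimation from plain calculation.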

Data Mining Tasks

• Supervised

– Answer known, what is correlated?

• Unsupervised

– Answer unknown (unspecified), what are the groups?

• Forecasting

– Given a trend, what is next?
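
The three tasks above can be sketched with small stand-ins in Python: a Pearson correlation for the supervised question, a two-group split (one-dimensional k-means with k = 2) for the unsupervised question, and a least-squares trend line for forecasting. The function names and the toy data are assumptions for illustration only, and each stand-in is far simpler than the corresponding SQL Server algorithm.

```python
def pearson(xs, ys):
    # Supervised-style question: how strongly does x move with the known answer y?
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (spread_x * spread_y)


def two_groups(values, iterations=10):
    # Unsupervised question: no answer column, so look for natural groups.
    # Assumes the values contain at least two distinct numbers.
    low, high = min(values), max(values)
    for _ in range(iterations):
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


def linear_forecast(series, steps_ahead=1):
    # Forecasting question: fit a straight trend line and extend it.
    n = len(series)
    mean_t = (n - 1) / 2
    mean_y = sum(series) / n
    slope = sum((t - mean_t) * (y - mean_y) for t, y in enumerate(series)) / sum(
        (t - mean_t) ** 2 for t in range(n)
    )
    intercept = mean_y - slope * mean_t
    return intercept + slope * (n - 1 + steps_ahead)


# Hypothetical toy data for each task.
ad_spend = [1, 2, 3, 4, 5]
sales = [2, 4, 6, 8, 10]
order_sizes = [1, 2, 3, 10, 11, 12]
monthly_units = [10, 12, 14, 16]

print("correlation:", pearson(ad_spend, sales))
print("groups:", two_groups(order_sizes))
print("next month:", linear_forecast(monthly_units))
```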

Data Mining Add-In for Excel

• Version 10.00.2531.00 (April 2009)

• 32-Bit Add-In

• Microsoft .NET Framework 2.0 (32-bit)

• Office 2007 (Professional, Professional Plus, Ultimate, Enterprise)

• SQL Server 2008 or higher (Enterprise, Standard, or Developer edition)

The Analyze Tab

The Analyze Tab

Menu Option                      Data Mining Algorithm
-------------------------------  ---------------------
Analyze Key Influencers          Naïve Bayes
Detect Categories                Clustering
Fill from Example                Logistic Regression
Forecast                         Time Series
Highlight Exceptions             Clustering
Scenario Analysis (Goal Seek)    Logistic Regression
Scenario Analysis (What If)      Logistic Regression
Prediction Calculator            Logistic Regression
Shopping Basket Analysis         Association Rules
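
As a rough illustration of what an association-rules approach measures in Shopping Basket Analysis, the sketch below counts item pairs across baskets and reports support (the share of baskets containing both items) and confidence (the share of baskets with the first item that also contain the second). The function name, the threshold, and the baskets are hypothetical, and the code is a simplified stand-in, not the Microsoft Association Rules algorithm.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    # Count how often each item and each pair of items appears.
    n = len(baskets)
    item_counts = Counter()
    pair_counts = Counter()
    for basket in baskets:
        items = sorted(set(basket))
        item_counts.update(items)
        pair_counts.update(combinations(items, 2))

    # Turn frequent pairs into rules "lhs -> rhs" in both directions.
    rules = []
    for (a, b), count in pair_counts.items():
        support = count / n
        if support < min_support:
            continue
        rules.append((a, b, support, count / item_counts[a]))
        rules.append((b, a, support, count / item_counts[b]))
    return rules


# Hypothetical baskets, one list of items per transaction.
baskets = [
    ["bread", "milk"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
    ["milk"],
]

rules = {
    (lhs, rhs): (support, confidence)
    for lhs, rhs, support, confidence in pair_rules(baskets)
}
for (lhs, rhs), (support, confidence) in sorted(rules.items()):
    print(f"{lhs} -> {rhs}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy data every basket with butter also contains bread, so the rule butter -> bread has confidence 1.0, while the reverse rule is weaker because bread also sells without butter.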

Data Mining Tab

Data Mining Tab

Data Mining Tab

Outline

What is Data Mining

What is PowerPivot

Demos

PowerPivot for Excel

• Take advantage of familiar Excel tools and features

• Process massive amounts of data in seconds

• Load even the largest data sets from virtually any source

• Use powerful new analytical capabilities, such as Data Analysis Expressions (DAX)

• Make the most of multi-core processors and gigabytes of memory

PowerPivot for Excel

• SQL Server

• SQL Azure

• Oracle, Teradata, Sybase, Informix, IBM DB2

• OLEDB/ODBC

• Analysis Services (SSAS)

• Reporting Services (SSRS)

• Excel, Text File

What is it?

What is it?

PowerPivot Reference

• http://www.powerpivot.com (Product Site)

• http://www.powerpivotpro.com (Blog Site)

Outline

What is Data Mining

What is PowerPivot

Demos

W. Edwards Deming

Resources

• MarkTab.NET: links, video resources, and information for data mining

Regroup and Conclusion

• Main Points from this Presentation

Contact Information

• Mark Tabladillo

Twitter @marktabnet

• Also on:

LinkedIn

Facebook
