SQL Server 2008 Data Mining with PowerPivot and Excel 2010

27
SS2008 Data Mining with Excel 2010 and PowerPivot Mark Tabladillo Ph.D. http://marktab.net April 22, 2010

description

Presentation delivered at SQL Saturday Atlanta GA -- April 22, 2010

Transcript of SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Page 1: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

SS2008 Data Mining with

Excel 2010 and PowerPivot

Mark Tabladillo Ph.D.

http://marktab.net

April 22, 2010

Page 2: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

W. Edwards Deming

2 © 2010 Mark Tabladillo Ph.D.

Page 3: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

W. Edwards Deming

3 © 2010 Mark Tabladillo Ph.D.

Production, assembly, inspection

Distribution

Consumers

Consumer

research

Design and

Redesign

Receipt and

test of materials

Tests of processes,

machines, methods,

costs

Suppliers of

materials and

equipment

A

B

C

D

Page 4: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

CRISP-DM Version 1.0

4 © 2010 Mark Tabladillo Ph.D.

Page 5: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Jeff Hawkins

5 © 2010 Mark Tabladillo Ph.D.

Page 6: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Outline

What is Data Mining

What is PowerPivot

Demos

6 © 2010 Mark Tabladillo Ph.D.

Page 7: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Technology

7 © 2010 Mark Tabladillo Ph.D.

Page 8: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Outline

What is Data Mining

What is PowerPivot

Demos

8 © 2010 Mark Tabladillo Ph.D.

Page 9: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Definitions

• Data mining

• Machine Learning

• Data mining algorithms typically use

estimation or optimization to achieve

results (as opposed to only calculations).

9 © 2010 Mark Tabladillo Ph.D.

Page 10: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Tasks

• Supervised

– Answer known, what is correlated?

• Unsupervised

– Answer unknown (unspecified), what are the

groups?

• Forecasting

– Given a trend, what is next?

10

Value

Slide

© 2010 Mark Tabladillo Ph.D.

Page 11: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Add-In for Excel

• Version 10.00.2531.00 (April 2009)

• 32-Bit Add-In

• Microsoft .NET Framework 2.0 (32-bit)

• Office 2007 (Professional, Professional

Plus, Ultimate, Enterprise)

• SQL Server Enterprise or Standard (or

Developer) 2008 or higher

11 © 2010 Mark Tabladillo Ph.D.

Page 12: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

The Analyze Tab

12 © 2010 Mark Tabladillo Ph.D.

Page 13: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

The Analyze Tab

13 © 2010 Mark Tabladillo Ph.D.

Menu Option Data Mining Algorithm

Analyze Key Influencers Naïve Bayes

Detect Categories Clustering

Fill from Example Logistic Regression

Forecast Time Series

Highlight Exceptions Clustering

Scenario Analysis (Goal Seek) Logistic Regression

Scenario Analysis (What If) Logistic Regression

Prediction Calculator Logistic Regression

Shopping Basket Analysis Association Rules

Page 14: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Tab

14 © 2010 Mark Tabladillo Ph.D.

Page 15: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Tab

15 © 2010 Mark Tabladillo Ph.D.

Page 16: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Data Mining Tab

16 © 2010 Mark Tabladillo Ph.D.

Page 17: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Outline

What is Data Mining

What is PowerPivot

Demos

17 © 2010 Mark Tabladillo Ph.D.

Page 18: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

PowerPivot for Excel

• Take advantage of familiar Excel tools and

features

• Process massive amounts of data in seconds

• Load even the largest data sets from virtually

any source

• Use powerful new analytical capabilities, such as

Data Analysis Expressions (DAX)

• Make the most of multi-core processors and

gigabytes of memory

18 © 2010 Mark Tabladillo Ph.D.

Page 19: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

PowerPivot for Excel

• SQL Server

• SQL Azure

• Oracle, Teradata, Sybase, Informix, IBM DB2

• OLEDB/ODBC

• Analysis Services (SSAS)

• Reporting Services (SSRS)

• Excel, Text File

19 © 2010 Mark Tabladillo Ph.D.

Page 20: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

What is it?

20 © 2010 Mark Tabladillo Ph.D.

Page 21: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

What is it?

21 © 2010 Mark Tabladillo Ph.D.

Page 22: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

PowerPivot Reference

• http://www.powerpivot.com (Product Site)

• http://www.powerpivotpro.com (Blog Site)

22 © 2010 Mark Tabladillo Ph.D.

Page 23: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Outline

What is Data Mining

What is PowerPivot

Demos

23 © 2010 Mark Tabladillo Ph.D.

Page 24: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

W. Edwards Deming

24 © 2010 Mark Tabladillo Ph.D.

Page 25: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Resources

• MarkTab.NET Links, video resources and information for data mining

25 © 2010 Mark Tabladillo Ph.D.

Page 26: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Regroup and Conclusion

• Main Points from this Presentation

26 © 2010 Mark Tabladillo Ph.D.

Page 27: SQL Server 2008 Data Mining with PowerPivot and Excel 2010

Contact Information

• Mark Tabladillo

Twitter @marktabnet

• Also on:

Linked In

Facebook

27 © 2010 Mark Tabladillo Ph.D.