An overview of The IBM Intelligent Miner for Data

By: Neeraja Rudrabhatla

11/04/1999

Mining Features supported by the Data Miner:

• Association Rules

• Clustering - Demographic, Neural networks

• Predicting classifications - Neural Networks, Decision Trees

• Predicting values

• Discovering sequential patterns

• Discovering similar time sequences

Steps for mining data using the Data Miner:

• Create the data

• Analyze and prepare data for mining

• Mine the data using one or a combination of mining techniques

• Visualize mining results using advanced graphical techniques

Main Window of the Data Miner:

Database used for mining association rules:

Store ID  Customer #  Date (yymmdd)  Transaction #  Item ID
001       0000007     950109         00982          122
001       0000007     950109         00982          125
001       0000007     950109         00982          133
001       0000007     950109         00982          150
001       0000003     950109         00983          153
001       0000003     950109         00983          154
001       0000003     950109         00983          162
001       0000003     950109         00983          166
001       0000005     950109         00984          147
001       0000005     950109         00984          174
001       0000005     950109         00984          191
001       0000005     950109         00984          198
001       0000008     950109         00985          147
001       0000008     950109         00985          174
001       0000008     950109         00985          182
001       0000008     950109         00985          184
001       0000006     950109         00986          174
001       0000006     950109         00986          186
001       0000006     950109         00986          187
001       0000006     950109         00986          188
001       0000002     950109         00987          109

Name Mapping:

101 Cream
102 A-Beer
103 B-Beer
104 C-Beer
105 Stout
107 Export
108 Cider
109 Milk
110 Antifreeze
111 Port wine
112 White German wine
113 Red German wine
114 White French wine
115 Red French wine
116 White Italian wine
117 Red Italian wine
118 Sherry
119 Champagne
120 Sekt
121 Asti Spumante
122 Crackers
123 Salty biscuits
124 Crisps
125 Cheddar Cheese
126 Gouda Cheese
127 Cottage cheese
128 Irish Butter
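The slides do not show how the Data Miner computes its rules, so the following is only a minimal sketch of the two standard measures behind association rules, support and confidence, applied to the transaction rows listed above. The function names and the 0.3 support threshold are assumptions made for illustration and are not part of the product.

```python
from collections import defaultdict
from itertools import combinations

# Rows from the excerpt above: (transaction number, item ID).
ROWS = [
    (982, 122), (982, 125), (982, 133), (982, 150),
    (983, 153), (983, 154), (983, 162), (983, 166),
    (984, 147), (984, 174), (984, 191), (984, 198),
    (985, 147), (985, 174), (985, 182), (985, 184),
    (986, 174), (986, 186), (986, 187), (986, 188),
    (987, 109),
]


def baskets(rows):
    """Group item IDs into one set per transaction number."""
    grouped = defaultdict(set)
    for transaction, item in rows:
        grouped[transaction].add(item)
    return list(grouped.values())


def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    wanted = set(itemset)
    hits = sum(1 for t in transactions if wanted <= t)
    return hits / len(transactions)


def confidence(body, head, transactions):
    """Support of body and head together, divided by the support of body."""
    both = set(body) | set(head)
    return support(both, transactions) / support(body, transactions)


def frequent_pairs(transactions, min_support):
    """Item pairs whose support reaches min_support (an assumed threshold)."""
    items = sorted(set().union(*transactions))
    return [pair for pair in combinations(items, 2)
            if support(pair, transactions) >= min_support]


if __name__ == "__main__":
    tx = baskets(ROWS)
    for a, b in frequent_pairs(tx, min_support=0.3):
        print(f"{a} => {b}: support={support((a, b), tx):.2f}, "
              f"confidence={confidence([a], [b], tx):.2f}")
```

In this excerpt only items 147 and 174 occur together in more than one transaction, so they form the only pair that passes the assumed threshold. Neither ID appears in the part of the name mapping shown above.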

Results of mining for associations:

Results on the automobile Database:

Another view:

Database used for Clustering:

Gender  Age    Siblings  Income  Type   Product
female  18.02  1         97      red    2
female  13.03  6         490     green  3
male    11.0   3         647     red    4
female  47.5   2         3192    green  5
male    11.07  5         736     blue   6
female  24.0   3         22358   blue   7
female  62.1   0         3936    green  8
female  04.08  1         516     pink   1
female  40.1   0         9478    red    2
female  04.08  0         193     pink   3
female  45.8   5         16984   green  4
male    21.07  0         10428   blue   5
male    07.02  0         960     blue   6
female  42.5   0         10835   pink   7
female  36.9   2         37083   green  8
male    10.03  3         877     blue   1
male    02.03  0         10      blue   2
female  20.0   0         15432   green  3

Clustering - Demographic:

Max #clusters: 9

Accuracy: 5%

Details of Cluster 7:

Detailed pie-chart for attribute Type:

Detailed bar-graph of attribute Age:

Output obtained with Clustering using Neural Networks:

Details of Cluster 6:

Database used for Classification:

Day Outlook Temperature Humidity Wind PlayTennis

D1 Sunny Hot High Weak No

D2 Sunny Hot High Strong No

D3 Overcast Hot High Weak Yes

D4 Rain Mild High Weak Yes

D5 Rain Cool Normal Weak Yes

D6 Rain Cool Normal Strong No

D7 Overcast Cool Normal Strong Yes

D8 Sunny Mild High Weak No

D9 Sunny Cool Normal Weak Yes

D10 Rain Mild Normal Weak Yes

D11 Sunny Mild Normal Strong Yes

D12 Overcast Mild High Strong Yes

D13 Overcast Hot Normal Weak Yes

D14 Rain Mild High Strong No

Classification using Decision Tree:
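The slides do not say which tree-induction method the Data Miner uses. As an illustration only, the sketch below applies the classic information-gain criterion (as in ID3) to the PlayTennis table above to show how a decision-tree learner might choose its first split. The helper names are assumptions.

```python
from collections import Counter
from math import log2

ATTRIBUTES = ["Outlook", "Temperature", "Humidity", "Wind"]

# The 14 PlayTennis rows from the table above (Day column omitted).
# The last field of each row is the class label.
DATA = [
    ("Sunny", "Hot", "High", "Weak", "No"),
    ("Sunny", "Hot", "High", "Strong", "No"),
    ("Overcast", "Hot", "High", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Cool", "Normal", "Strong", "No"),
    ("Overcast", "Cool", "Normal", "Strong", "Yes"),
    ("Sunny", "Mild", "High", "Weak", "No"),
    ("Sunny", "Cool", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "Normal", "Weak", "Yes"),
    ("Sunny", "Mild", "Normal", "Strong", "Yes"),
    ("Overcast", "Mild", "High", "Strong", "Yes"),
    ("Overcast", "Hot", "Normal", "Weak", "Yes"),
    ("Rain", "Mild", "High", "Strong", "No"),
]


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total)
                for n in Counter(labels).values())


def information_gain(rows, column):
    """Entropy reduction obtained by splitting rows on the given column."""
    labels = [row[-1] for row in rows]
    remainder = 0.0
    for value in {row[column] for row in rows}:
        subset = [row[-1] for row in rows if row[column] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return entropy(labels) - remainder


def gains(rows):
    """Information gain of every attribute, keyed by attribute name."""
    return {name: information_gain(rows, i)
            for i, name in enumerate(ATTRIBUTES)}


if __name__ == "__main__":
    for name, gain in sorted(gains(DATA).items(), key=lambda kv: -kv[1]):
        print(f"{name}: {gain:.3f}")
```

Under this criterion Outlook has the largest gain, so an ID3-style learner would split on it first. The tree the Data Miner actually builds may differ.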

A view of a leaf node of the decision tree:

Classification using neural network:

In-sample: 4

Out-sample: 1

Accuracy: 80

Error: 10

Learning Rate: 0.1

Momentum: 0.9
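The slides list a learning rate and a momentum value but do not describe the training procedure itself. As a hedged illustration of what these two parameters usually control, the sketch below applies the classical momentum update to a toy one-dimensional error function. The function names and the toy error are assumptions and do not represent the Data Miner's network.

```python
def toy_gradient(w):
    """Gradient of the toy error (w - 3)**2, a stand-in for a network's error."""
    return 2.0 * (w - 3.0)


def minimize_with_momentum(gradient, start, learning_rate=0.1,
                           momentum=0.9, steps=300):
    """Gradient descent with classical momentum.

    The velocity keeps a decaying memory of earlier steps (scaled by
    momentum), and the learning rate scales each new gradient step.
    """
    weight = start
    velocity = 0.0
    for _ in range(steps):
        velocity = momentum * velocity - learning_rate * gradient(weight)
        weight += velocity
    return weight


if __name__ == "__main__":
    print(minimize_with_momentum(toy_gradient, start=0.0))
```

With the values from the slide (0.1 and 0.9) the weight oscillates around the minimum at 3 before settling there. A smaller momentum would damp the oscillation but also slow progress across flat regions.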

Viewing the results in bar-graphs:

Database for Value Prediction:

Day Outlook Temperature Humidity Wind PlayTennis

D1 Sunny 80 High Weak No

D2 Sunny 75 High Strong No

D3 Overcast 70 High Weak Yes

D4 Rain 55 High Weak Yes

D5 Rain 32 Normal Weak Yes

D6 Rain 35 Normal Strong No

D7 Overcast 40 Normal Strong Yes

D8 Sunny 60 High Weak No

D9 Sunny 20 Normal Weak Yes

D10 Rain 67 Normal Weak Yes

D11 Sunny 62 Normal Strong Yes

D12 Overcast 58 High Strong Yes

D13 Overcast 74 Normal Weak Yes

D14 Rain 61 High Strong No

Results of PlayTennis:

In-sample: 2

Out-sample: 1

One partition of the PlayTennis-Prediction:

Textual Representation of a single partition:

Sequential Patterns Mining and Time Sequence Mining:

• Sequential patterns are used to find predictable patterns of behavior over a period of time.

(A certain behavior at a given time is likely to produce another behavior or a sequence of behaviors within a certain time-span)

• Time sequences help find all occurrences of similar subsequences in a database of time sequences.
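To make the idea of a sequential pattern concrete, here is a minimal sketch that counts how many customers show one item followed later by another. The purchase histories, the function names, and the support threshold are all hypothetical. The slides do not describe the Data Miner's own algorithm, which this does not attempt to reproduce.

```python
from collections import Counter

# Hypothetical purchase histories: one time-ordered item list per customer.
HISTORIES = {
    "c1": ["printer", "ink", "paper"],
    "c2": ["printer", "paper", "ink"],
    "c3": ["laptop", "printer", "ink"],
    "c4": ["ink", "paper"],
}


def ordered_pairs(history):
    """All distinct (earlier, later) item pairs in one customer's history."""
    pairs = set()
    for i, earlier in enumerate(history):
        for later in history[i + 1:]:
            if earlier != later:
                pairs.add((earlier, later))
    return pairs


def sequential_patterns(histories, min_support):
    """Ordered pairs found in at least min_support of the customers.

    Returns a dict mapping (earlier, later) to the fraction of customers
    whose history contains that ordering.
    """
    counts = Counter()
    for history in histories.values():
        counts.update(ordered_pairs(history))
    total = len(histories)
    return {pair: n / total for pair, n in counts.items()
            if n / total >= min_support}


if __name__ == "__main__":
    for (first, then), share in sequential_patterns(HISTORIES, 0.5).items():
        print(f"{first} -> {then}: {share:.2f}")
```

In this made-up data, three of the four customers buy ink some time after a printer, which is the kind of "one behavior is likely to lead to another" finding that the first bullet describes.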

Sequences:

• Combine several objects into a single object that you can run

• The benefit is that you can combine several steps into one step

• If you combine several functions into a sequence, you need to run only the sequence, which then runs each of the objects within it
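The idea of a sequence object can be sketched as follows: a container that holds several runnable objects and runs them in order when it is run itself. The class names `Step` and `Sequence`, and the step names in the example, are hypothetical and are not the Data Miner's actual interface.

```python
class Step:
    """A runnable object (hypothetical stand-in for a mining or
    preprocessing function defined in the tool)."""

    def __init__(self, name):
        self.name = name

    def run(self, log):
        # A real step would do its work here; the sketch only records
        # that it ran.
        log.append(self.name)


class Sequence:
    """A single runnable object that runs each contained object in order."""

    def __init__(self, name, objects):
        self.name = name
        self.objects = list(objects)

    def run(self, log):
        for obj in self.objects:
            obj.run(log)


def build_example():
    """Combine three hypothetical steps into one sequence."""
    return Sequence("weekly analysis", [
        Step("prepare data"),
        Step("mine associations"),
        Step("visualize results"),
    ])


if __name__ == "__main__":
    log = []
    build_example().run(log)  # one call runs all three steps
    print(log)
```

Running the sequence once is enough: each contained step runs in the order in which it was added.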

Applications:

The Intelligent Miner offerings are intended for data analysts and business technologists who need to:

• Perform database marketing

• Streamline business and manufacturing processes

• Detect potential cases of fraud

• Manage customer relationships