Data Mining
SAS Enterprise Miner
User: sasdemo1, sasdemo2, … , sasdemo24
Password: aboi0rajee
Server: asas2
Data:
Use your SGH login as the project name.
Process flow diagram
The data mining process is driven by a process flow diagram that you create by dragging nodes from a toolbar organized by SEMMA categories and dropping them onto a diagram workspace.
The SAS EM Graphical User Interface
1. Toolbar Shortcut Buttons
2. Project Panel
3. Properties Panel
4. Property Help Panel
5. Toolbar
6. Diagram Workspace
7. Diagram Navigation Toolbar
Toolbar Shortcut Buttons: perform common computer functions and frequently used SAS Enterprise Miner operations. Move the mouse pointer over any shortcut button to see its name. Click a shortcut button to use it.
Project Panel: manage and view data sources, diagrams, results, and project users.
Properties Panel: view and edit the settings of data sources, diagrams, nodes, and users.
Property Help Panel: displays a short description of any property that you select in the Properties Panel. Extended help is available from the Help main menu.
Toolbar: a graphic set of node icons that you use to build process flow diagrams in the Diagram Workspace. Drag a node icon into the Diagram Workspace to use it. The icon remains in place in the Toolbar, and the node in the Diagram Workspace is ready to be connected and configured for use in the process flow diagram.
Diagram Workspace: build, edit, run, and save process flow diagrams. In this workspace, you graphically build, order, sequence, and connect the nodes that you use to mine your data and generate reports.
Diagram Navigation Toolbar: organize and navigate the process flow diagram.
http://support.sas.com/documentation/onlinedoc/miner/
ROC Curves
[Figure: ROC curve built from the empirical conditional distribution functions Fn(X|Y=0) and Fn(X|Y=1). Horizontal axis: F(S|Y=0) = 1 − specificity. Vertical axis: F(S|Y=1) = sensitivity. Both axes run from 0 to 1. Each point on the curve is (F(S|Y=0); F(S|Y=1)). The tangent to the ROC curve is marked.]
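\[ \mathrm{AUROC} = \int_0^1 f_{ROC}(x)\,dx \]
\[ \mathrm{AUROC} = \frac{1}{2}\sum_{i=0}^{k} \bigl[F_n(X_{i+1}\mid Y=0) - F_n(X_i\mid Y=0)\bigr]\bigl[F_n(X_{i+1}\mid Y=1) + F_n(X_i\mid Y=1)\bigr] \]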
The ROC curve visualizes the "separation" of the conditional distributions of the variable. The area under the ROC curve can be treated as a measure of stochastic dependence.
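The trapezoidal approximation of the area under the ROC curve can be illustrated with a short Python sketch. It is not part of the SAS material: the function names and the toy scores and labels are made up for illustration, and it assumes that a higher score indicates Y=1 and that a case is flagged when its score is at or above the cutoff. The last function computes the share of correctly ordered (Y=1, Y=0) pairs, which is one way to read the area as a measure of how well the two conditional distributions are separated.

```python
def roc_points(scores, labels):
    """Return ROC points (1 - specificity, sensitivity), one per cutoff.

    Assumption: a case is classified as Y=1 when score >= cutoff.
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]  # cutoff above every score: nothing is flagged
    for cutoff in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points


def auroc_trapezoid(points):
    """Trapezoidal rule: 1/2 * sum of (x step) * (sum of the two heights)."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += 0.5 * (x1 - x0) * (y1 + y0)
    return area


def auroc_rank(scores, labels):
    """Share of (Y=1, Y=0) pairs ordered correctly; ties count as 1/2."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data, not taken from the German credit data set.
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

area = auroc_trapezoid(roc_points(scores, labels))
print(area, auroc_rank(scores, labels))
```

On this toy sample both calculations give the same value, because the trapezoidal area under the empirical ROC curve equals the proportion of correctly ranked pairs.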
Classification error
ROC Curve
Classification Errors
SAS/BASE & SAS/STAT
PROC step
libname data "path";
libname data "C:\Users\Andrzej\Desktop";
proc logistic data=data.German_credit desc;
  model default = duration credit_amt instalment age / outroc=roc;
run;

proc gplot data=roc;
  title "ROC Curve";
  symbol i=join;
  plot _sensit_ * _1mspec_;
run;
Dimension Reduction – PROC VARCLUS
The VARCLUS procedure divides a set of numeric variables into disjoint or hierarchical clusters. Associated with each cluster is a linear combination of the variables in the cluster. This linear combination can be either the first principal component (the default) or the centroid component (if you specify the CENTROID option).
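proc varclus data=data.German_credit outtree=Tree maxclusters=10 noprint;
  var duration credit_amt instalment age;
run;

/* Default tree diagram */
proc tree data=Tree;
run;

/* Line-printer version of the tree */
proc tree data=Tree lineprinter;
run;

/* Horizontal tree, height axis = proportion of variance explained */
axis1 order=(0 to 1 by 0.2);
proc tree data=Tree horizontal haxis=axis1;
  height _PROPOR_;
run;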
Variables in the OUTTREE= data set:
_NAME_: the name of the cluster
_PARENT_: the parent of the cluster
_NCL_: the number of clusters
_VAREXP_: the amount of variance explained by the cluster
_PROPOR_: the proportion of variance explained by the clusters at the current level of the tree diagram
_MINPRO_: the minimum proportion of variance explained by a cluster
_MAXEIGEN_: the maximum second eigenvalue of a cluster
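To make the idea of a cluster component concrete, here is a minimal Python sketch (not SAS, and not how PROC VARCLUS is implemented) that finds the first principal component of a single cluster of standardized variables by power iteration. The function names and the two toy variables are hypothetical. For one cluster analyzed on the correlation matrix, the leading eigenvalue is the variance explained by the cluster component, and dividing it by the number of variables gives the proportion explained.

```python
import math


def correlation_matrix(columns):
    """Correlation matrix of a list of equally long numeric columns."""
    n = len(columns[0])
    standardized = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
        standardized.append([(v - mean) / sd for v in col])
    return [[sum(a * b for a, b in zip(ci, cj)) / (n - 1)
             for cj in standardized] for ci in standardized]


def first_component(corr, iterations=500):
    """Power iteration: leading eigenvalue and unit eigenvector of corr.

    Assumption: the start vector is not orthogonal to the leading
    eigenvector, which holds for the toy data below.
    """
    p = len(corr)
    vec = [1.0 / (i + 1) for i in range(p)]  # deliberately non-uniform start
    for _ in range(iterations):
        nxt = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in nxt))
        vec = [v / norm for v in nxt]
    eigenvalue = sum(vec[i] * corr[i][j] * vec[j]
                     for i in range(p) for j in range(p))
    return eigenvalue, vec


# Hypothetical toy cluster of two variables (correlation 0.8).
x1 = [1.0, 2.0, 3.0, 4.0, 5.0]
x2 = [2.0, 1.0, 4.0, 3.0, 5.0]

corr = correlation_matrix([x1, x2])
var_explained, weights = first_component(corr)
proportion = var_explained / len(corr)
print(var_explained, proportion, weights)
```

With two variables the eigenvalues are 1 + r and 1 − r, so a correlation of 0.8 gives a cluster component explaining 1.8 of the 2 units of standardized variance, and a second eigenvalue of 0.2.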