Chapter 6: Model Assessment


6.1 Model Fit Statistics

6.2 Statistical Graphics

6.3 Adjusting for Separate Sampling

6.4 Profit Matrices

6.1 Model Fit Statistics

Summary Statistics

Prediction Type    Summary Statistic
Decisions          Accuracy/Misclassification; Profit/Loss; Inverse prior threshold
Rankings           ROC Index (concordance); Gini coefficient
Estimates          Average squared error; SBC/Likelihood
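As a rough illustration of two of the statistics listed above, the sketch below computes a misclassification rate (a decision statistic) and the average squared error (an estimate statistic) in plain Python. The function names, the 0.5 cutoff, and the toy data are assumptions made for this example; they are not taken from SAS Enterprise Miner.

```python
def misclassification_rate(y_true, p_hat, threshold=0.5):
    """Fraction of cases whose decision (p_hat >= threshold) disagrees with the outcome."""
    wrong = sum((p >= threshold) != bool(y) for y, p in zip(y_true, p_hat))
    return wrong / len(y_true)


def average_squared_error(y_true, p_hat):
    """Mean squared difference between the 0/1 outcome and the predicted probability."""
    return sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / len(y_true)


# Hypothetical toy data: 1 = primary outcome, 0 = secondary outcome.
y = [1, 0, 1, 0, 0]
p = [0.9, 0.2, 0.4, 0.6, 0.1]

print("misclassification:", misclassification_rate(y, p))
print("average squared error:", average_squared_error(y, p))
```

The misclassification rate depends on the chosen cutoff, while the average squared error uses the probability estimates directly, which is why the two can rank models differently.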

Comparing Models with Summary Statistics

This demonstration illustrates the use of the Model Comparison tool, which collects assessment information from attached modeling nodes and enables you to easily compare model performance measures.

6.2 Statistical Graphics

Statistical Graphics – ROC Chart

[Figure: ROC chart. x-axis: false positive fraction (1 - specificity), 0.0 to 1.0. y-axis: captured response fraction (sensitivity), 0.0 to 1.0. A point marks the top 40% of cases.]

The ROC chart illustrates a tradeoff between a captured response fraction and a false positive fraction.

Each point on the ROC chart corresponds to a specific fraction of cases, ordered by their predicted value.

For example, this point on the ROC chart corresponds to the 40% of cases with the highest predicted values.

The y-coordinate shows the fraction of primary outcome cases captured in the top 40% of all cases.

The x-coordinate shows the fraction of secondary outcome cases captured in the top 40% of all cases.

Repeat for all selection fractions.

[Figure: ROC curves for a weak model and a strong model.]
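The construction described above can be sketched in a few lines of Python. This is a minimal illustration, not the Enterprise Miner implementation: the function name `roc_point` and the ten-case data set are invented for the example, and tied predicted values are not handled specially.

```python
def roc_point(y_true, p_hat, fraction):
    """Return (false positive fraction, captured response fraction) for the
    top `fraction` of cases ranked by predicted value."""
    # Order cases from highest to lowest predicted value.
    ranked = sorted(zip(p_hat, y_true), key=lambda pair: pair[0], reverse=True)
    n_selected = round(fraction * len(ranked))
    selected = [y for _, y in ranked[:n_selected]]

    n_primary = sum(y_true)
    n_secondary = len(y_true) - n_primary

    captured = sum(selected) / n_primary                      # y-coordinate
    false_pos = (len(selected) - sum(selected)) / n_secondary  # x-coordinate
    return false_pos, captured


# Hypothetical data: 4 primary (1) and 6 secondary (0) outcome cases.
y = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
p = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

# Repeat for several selection fractions to trace the curve.
for fraction in (0.0, 0.2, 0.4, 0.6, 0.8, 1.0):
    print(fraction, roc_point(y, p, fraction))
```

In this toy data, the top 40% contains three of the four primary outcome cases and one of the six secondary outcome cases, so the point for that selection fraction sits at x = 1/6, y = 0.75.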

Statistical Graphics – ROC Index

[Figure: ROC curves for a weak model and a strong model, axes from 0.0 to 1.0.]

weak model: ROC Index < 0.6

strong model: ROC Index > 0.7
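Because the ROC index is also called the concordance statistic, one way to compute it is to count how often a primary outcome case receives a higher predicted value than a secondary outcome case. The sketch below assumes that definition, counts ties as half, and uses the common relationship Gini = 2 × ROC index − 1; the function names and data are hypothetical.

```python
def roc_index(y_true, p_hat):
    """Concordance: share of (primary, secondary) pairs ranked correctly, ties counted as 0.5."""
    primary = [p for y, p in zip(y_true, p_hat) if y == 1]
    secondary = [p for y, p in zip(y_true, p_hat) if y == 0]
    score = 0.0
    for a in primary:
        for b in secondary:
            if a > b:
                score += 1.0
            elif a == b:
                score += 0.5
    return score / (len(primary) * len(secondary))


def gini_coefficient(y_true, p_hat):
    """Gini coefficient derived from the ROC index (assumed relationship: 2 * ROC - 1)."""
    return 2 * roc_index(y_true, p_hat) - 1


# Hypothetical data: 1 = primary outcome, 0 = secondary outcome.
y = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
p = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

print("ROC index:", roc_index(y, p))
print("Gini:", gini_coefficient(y, p))
```

Here 20 of the 24 primary/secondary pairs are ordered correctly, giving a ROC index of about 0.83, which falls in the "strong model" range quoted above.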

Comparing Models with ROC Charts

This demonstration illustrates the use of ROC charts to compare models.

Statistical Graphics – Response Chart

[Figure: response chart. x-axis: percent selected, 0% to 100%. y-axis: cumulative percent response, with ticks at 50% and 100%. A point marks the top 40% of cases.]

The response chart shows the expected response rate for various selection percentages.

Each point on the response chart corresponds to a specific fraction of cases, ordered by their predicted values.

For example, this point on the response chart corresponds to the 40% of cases with the highest predicted values.

The x-coordinate shows the percentage of selected cases (here, 40%).

The y-coordinate shows the percentage of cases in the top 40% that have the primary outcome.
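A single point on the response chart can be computed as sketched below. The sketch reads "cumulative percent response" as the response rate among the selected cases, consistent with the chart caption; the function name and the toy data are assumptions made for illustration.

```python
def cumulative_percent_response(y_true, p_hat, percent_selected):
    """Percentage of primary outcome cases among the top `percent_selected` percent
    of cases, ranked by predicted value."""
    ranked = sorted(zip(p_hat, y_true), key=lambda pair: pair[0], reverse=True)
    n_selected = round(percent_selected / 100 * len(ranked))
    if n_selected == 0:
        raise ValueError("selection percentage is too small to select any case")
    selected = [y for _, y in ranked[:n_selected]]
    return 100 * sum(selected) / n_selected


# Hypothetical data: 1 = primary outcome, 0 = secondary outcome.
y = [1, 1, 0, 1, 0, 0, 1, 0, 0, 0]
p = [0.95, 0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10]

for pct in (20, 40, 60, 80, 100):
    print(pct, cumulative_percent_response(y, p, pct))
```

At 100% selected the value equals the overall primary outcome rate of the data (40% in this toy example), so a useful model shows higher values at small selection percentages.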


6.01 Poll

In practice, modelers often use several tools, sometimes both graphical and numerical, to choose a best model.

True

False

6.01 Poll – Correct Answer

True

Comparing Models with Score Rankings Plots

This demonstration illustrates comparing models with Score Rankings plots.


Adjusting for Separate Sampling

This demonstration illustrates how to adjust for separate sampling in SAS Enterprise Miner.

6.3 Adjusting for Separate Sampling

Outcome Overrepresentation

A common predictive modeling practice is to build models from a sample whose primary outcome proportion differs from that of the original population.

Separate Sampling

Target-based samples are created by considering the primary outcome cases separately from the secondary outcome cases.

primary outcome: Select all cases.

secondary outcome: Select some cases.

The Modeling Sample

+ Similar predictive power with smaller case count

− Must adjust assessment statistics and graphics

− Must adjust prediction estimates for bias
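To make the bias adjustment concrete, the sketch below applies a standard prior correction to a probability estimated on a separately sampled data set. This is a generic formula, not necessarily the exact computation Enterprise Miner performs; the function name, the 50% sample proportion, and the 5% population proportion are all assumptions for the example.

```python
def adjust_for_priors(p1_sample, sample_prior, population_prior):
    """Rescale a primary outcome probability from the sample scale to the population scale.

    p1_sample        -- predicted primary outcome probability from the sampled data
    sample_prior     -- primary outcome proportion in the modeling sample
    population_prior -- primary outcome proportion in the original population
    """
    # Weight each outcome by the ratio of its population to sample proportion.
    w1 = population_prior / sample_prior
    w0 = (1 - population_prior) / (1 - sample_prior)
    numerator = p1_sample * w1
    return numerator / (numerator + (1 - p1_sample) * w0)


# Hypothetical setting: 50% primary outcome in the sample, 5% in the population.
for p in (0.2, 0.5, 0.8):
    print(p, adjust_for_priors(p, sample_prior=0.5, population_prior=0.05))
```

When the sample and population proportions are equal, the weights are both 1 and the estimate is returned unchanged, which is a quick sanity check on the formula.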


Adjusting for Separate Sampling (continued)

This demonstration illustrates how to adjust for separate sampling in SAS Enterprise Miner.


Creating a Profit Matrix

This demonstration illustrates how to create a profit matrix.

6.4 Profit Matrices

Profit Matrices

                     solicit    ignore
primary outcome        15.14         0
secondary outcome      -0.68         0

The solicit column is the profit distribution for the solicit decision.

Decision Expected Profits

Expected Profit (Solicit) = $15.14\,\hat{p}_1 - 0.68\,\hat{p}_0$

Expected Profit (Ignore) = $0$

Choose the larger.

Decision Threshold

$\hat{p}_1 \ge 0.68 / 15.82$: Solicit

$\hat{p}_1 < 0.68 / 15.82$: Ignore
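The threshold follows from the expected profits: substituting $\hat{p}_0 = 1 - \hat{p}_1$ into $15.14\,\hat{p}_1 - 0.68\,\hat{p}_0 \ge 0$ gives $15.82\,\hat{p}_1 \ge 0.68$. The sketch below encodes the profit matrix as a dictionary and picks the decision with the larger expected profit; the names `PROFIT`, `expected_profits`, and `decide` are illustrative, not part of any SAS interface.

```python
# Profit matrix from the slides: rows are decisions, columns are outcomes.
PROFIT = {
    "solicit": {"primary": 15.14, "secondary": -0.68},
    "ignore": {"primary": 0.0, "secondary": 0.0},
}

# Equivalent cutoff on the primary outcome probability (about 0.043).
THRESHOLD = 0.68 / 15.82


def expected_profits(p1):
    """Expected profit of each decision given the primary outcome probability p1."""
    p0 = 1 - p1
    return {
        decision: row["primary"] * p1 + row["secondary"] * p0
        for decision, row in PROFIT.items()
    }


def decide(p1):
    """Choose the decision with the larger expected profit (ties go to solicit)."""
    profits = expected_profits(p1)
    return "solicit" if profits["solicit"] >= profits["ignore"] else "ignore"


for p1 in (0.01, 0.04, 0.05, 0.50):
    print(p1, decide(p1), expected_profits(p1))
```

Because a response is worth far more than a wasted solicitation costs, the cutoff is much lower than 0.5: cases with only a small chance of the primary outcome are still worth soliciting.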

Average Profit

Average profit = $(15.14\,N_{PS} - 0.68\,N_{SS}) / N$

$N_{PS}$ = number of solicited primary outcome cases

$N_{SS}$ = number of solicited secondary outcome cases

$N$ = total number of assessment cases
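The average profit formula can be evaluated on assessment data as sketched below. The sketch assumes cases are solicited when their predicted probability meets the 0.68 / 15.82 threshold; the function name and the five-case data set are hypothetical.

```python
def average_profit(y_true, p_hat, threshold=0.68 / 15.82):
    """Average profit per assessment case when cases with p_hat >= threshold are solicited."""
    # Count solicited cases by actual outcome (1 = primary, 0 = secondary).
    n_ps = sum(1 for y, p in zip(y_true, p_hat) if p >= threshold and y == 1)
    n_ss = sum(1 for y, p in zip(y_true, p_hat) if p >= threshold and y == 0)
    return (15.14 * n_ps - 0.68 * n_ss) / len(y_true)


# Hypothetical assessment data.
y = [1, 0, 0, 0, 1]
p = [0.30, 0.10, 0.02, 0.05, 0.01]

print("average profit:", average_profit(y, p))
```

In this toy data, three cases are solicited (one primary, two secondary), so the average profit is (15.14 − 2 × 0.68) / 5. Ignored cases contribute zero profit but still count in N.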


Evaluating Model Profit

This demonstration illustrates viewing the consequences of incorporating a profit matrix.


Viewing Additional Assessments

This demonstration illustrates several other assessments of possible interest.


Optimizing with Profit (Self-Study)

This demonstration illustrates optimizing your model strictly on profit.


Exercises

This exercise reinforces the concepts discussed previously.


Assessment Tools Review

Compare model summary statistics and statistical graphics.

Create decision data; add prior probabilities and profit matrices.

Tune models with average squared error or appropriate profit matrix.

Obtain means and other statistics on data source variables.
