Transcript of Trade-offs in Explanatory Model Learning

Page 1: Trade-offs in Explanatory  Model Learning 

Trade-offs in Explanatory Model Learning 

DCAP Meeting, Madalina Fiterau

22nd of February 2012

1

Page 2: Trade-offs in Explanatory  Model Learning 

Outline

•Motivation: need for interpretable models

•Overview of data analysis tools

•Model evaluation – accuracy vs complexity

•Model evaluation – understandability

•Example applications

•Summary

2

Page 3: Trade-offs in Explanatory  Model Learning 

Example Application: Nuclear Threat Detection

•Border control: vehicles are scanned

•Human in the loop interpreting results

[Diagram: vehicle scan → prediction → feedback]

3

Page 4: Trade-offs in Explanatory  Model Learning 

Boosted Decision Stumps

•Accurate, but hard to interpret

How is the prediction derived from the input?

4

Page 5: Trade-offs in Explanatory  Model Learning 

Decision Tree – More Interpretable

Radiation > x%
 no → Clear
 yes → Payload type = ceramics
   no → Consider balance of Th232, Ra226 and Co60
   yes → Uranium level > max. admissible for ceramics
     no → Clear
     yes → Threat

5
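The contrast with boosted stumps can be made concrete: the tree above reads as a handful of nested conditions, so every prediction comes with a human-readable path. A minimal sketch of one plausible reading of the slide-5 tree (the numeric thresholds and argument names are illustrative, not from the talk):

```python
# One plausible reading of the slide-5 tree; feature names and the numeric
# thresholds (threshold, max_admissible) are illustrative, not from the talk.
def classify_cargo(radiation_pct, payload_type, uranium_level,
                   threshold=5.0, max_admissible=0.3):
    """Each prediction is a readable path through the tree."""
    if radiation_pct <= threshold:          # Radiation > x%? no
        return "Clear"
    if payload_type == "ceramics":          # Payload type = ceramics? yes
        if uranium_level > max_admissible:  # above admissible level for ceramics
            return "Threat"
        return "Clear"
    # other payloads: defer to the isotope balance
    return "Consider balance of Th232, Ra226 and Co60"
```

Unlike a weighted vote over many stumps, the operator can read the condition that triggered the alarm.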

Page 6: Trade-offs in Explanatory  Model Learning 

Motivation

Many users are willing to trade accuracy for a better understanding of the results the system yields.

Need: simple, interpretable model

Need: explanatory prediction process

6

Page 7: Trade-offs in Explanatory  Model Learning 

Analysis Tools – Black-box

Random Forests
•Very accurate tree ensemble
•L. Breiman, ‘Random Forests’, 2001

Boosting
•Guarantee: decreases training error
•R. Schapire, ‘The boosting approach to machine learning’

Multi-boosting
•Bagged boosting
•G. Webb, ‘MultiBoosting: A Technique for Combining Boosting and Weighted Bagging’

7

Page 8: Trade-offs in Explanatory  Model Learning 

Analysis Tools – White-box

CART
•Decision tree based on the Gini impurity criterion

Feating
•Decision tree with leaf classifiers
•K. Ting, G. Webb, ‘FaSS: Ensembles for Stable Learners’

Subspacing
•Ensemble: each discriminator trained on a random subset of features
•R. Bryll, ‘Attribute bagging’

EOP
•Builds a decision list that selects the classifier to deal with a query point

8

Page 9: Trade-offs in Explanatory  Model Learning 

Explanation-Oriented Partitioning

[Figure: (X,Y) scatter plot of sample data — 2 Gaussians and a uniform cube]

9

Page 10: Trade-offs in Explanatory  Model Learning 

EOP Execution Example – 3D data

Step 1: Select a projection - (X1,X2)

10

Page 11: Trade-offs in Explanatory  Model Learning 

Step 1: Select a projection - (X1,X2)

11

EOP Execution Example – 3D data

Page 12: Trade-offs in Explanatory  Model Learning 

Step 2: Choose a good classifier - call it h1

h1

12

EOP Execution Example – 3D data

Page 13: Trade-offs in Explanatory  Model Learning 

Step 2: Choose a good classifier - call it h1

13

EOP Execution Example – 3D data

Page 14: Trade-offs in Explanatory  Model Learning 

Step 3: Estimate accuracy of h1 at each point

[Legend: OK / NOT OK]

14

EOP Execution Example – 3D data

Page 15: Trade-offs in Explanatory  Model Learning 

Step 3: Estimate accuracy of h1 for each point

15

EOP Execution Example – 3D data

Page 16: Trade-offs in Explanatory  Model Learning 

Step 4: Identify high accuracy regions

16

EOP Execution Example – 3D data

Page 17: Trade-offs in Explanatory  Model Learning 

Step 4: Identify high accuracy regions

17

EOP Execution Example – 3D data

Page 18: Trade-offs in Explanatory  Model Learning 

Step 5: Training points removed from consideration

18

EOP Execution Example – 3D data

Page 19: Trade-offs in Explanatory  Model Learning 

19

Step 5: Training points removed from consideration

EOP Execution Example – 3D data

Page 20: Trade-offs in Explanatory  Model Learning 

Finished first iteration

20

EOP Execution Example – 3D data

Page 21: Trade-offs in Explanatory  Model Learning 

21

EOP Execution Example – 3D data

Finished second iteration

Page 22: Trade-offs in Explanatory  Model Learning 

Iterate until all data is accounted for or error cannot be decreased

22

EOP Execution Example – 3D data
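The steps above can be sketched as a training loop. This is a simplified reading, not the authors' implementation: the projection search, the per-point accuracy estimate, and the hyper-rectangle regions are all reduced to their simplest form, and `make_classifier` stands in for whichever base learner is used.

```python
import itertools
import numpy as np

def train_eop(X, y, make_classifier, accuracy_threshold=0.9, max_iters=10):
    """Sketch of the EOP loop on slides 10-22 (all names are illustrative):
    pick a 2-D projection, fit a simple classifier on it, keep a bounding
    box around the points it classifies correctly, remove those points."""
    model = []                                   # list of (features, clf, box)
    remaining = np.ones(len(y), dtype=bool)
    for _ in range(max_iters):
        if not remaining.any():                  # all data accounted for
            break
        best = None
        for feats in itertools.combinations(range(X.shape[1]), 2):
            P = X[remaining][:, feats]           # step 1: select a projection
            clf = make_classifier().fit(P, y[remaining])   # step 2: classifier
            correct = clf.predict(P) == y[remaining]       # step 3: accuracy
            acc = correct.mean()
            if acc < accuracy_threshold or not correct.any():
                continue
            # step 4: hyper-rectangle enclosing the correctly classified points
            box = (P[correct].min(axis=0), P[correct].max(axis=0))
            if best is None or acc > best[0]:
                best = (acc, feats, clf, box)
        if best is None:                         # error cannot be decreased
            break
        _, feats, clf, (lo, hi) = best
        model.append((feats, clf, (lo, hi)))
        inside = ((X[:, feats] >= lo) & (X[:, feats] <= hi)).all(axis=1)
        remaining &= ~inside                     # step 5: remove covered points
    return model
```

Each entry of the returned list is one iteration's (projection, classifier, region) triple, mirroring the decision list built on the slides.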

Page 23: Trade-offs in Explanatory  Model Learning 

Learned Model – Processing query [x1 x2 x3]

[x1 x2] in R1 ? — yes → h1(x1, x2)
 no ↓
[x2 x3] in R2 ? — yes → h2(x2, x3)
 no ↓
[x1 x3] in R3 ? — yes → h3(x1, x3)
 no ↓
Default Value

23
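This routing takes only a few lines of code — a sketch under the assumption that the learned model is an ordered list of (projected features, classifier, bounding box) entries; all names here are illustrative:

```python
# Hypothetical decision-list inference for slide 23: try each region in
# order; the first region containing the query's projection answers it.
def predict(model, x, default="Default Value"):
    for feats, clf, (lo, hi) in model:
        proj = [x[f] for f in feats]              # e.g. [x1, x2] for R1
        if all(l <= v <= h for l, v, h in zip(lo, proj, hi)):
            return clf.predict([proj])[0]         # h_i handles this query
    return default                                # no region matched
```

The cost of a query is the number of region tests performed before one fires, which is what makes the model's complexity easy to measure.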

Page 24: Trade-offs in Explanatory  Model Learning 

Parametric / Nonparametric Regions

Bounding Polyhedra
•Enclose points in convex shapes (hyper-rectangles / spheres)
•Easy to test inclusion
•Visually appealing
•Inflexible

Nearest-neighbor Score
•Consider the k nearest neighbors; Region: { X | Score(X) > t }, t a learned threshold
•Easy to test inclusion
•Can look insular
•Deals with irregularities

24

[Figure: a query point scored against neighbors n1–n5; correctly and incorrectly classified points marked]
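The nearest-neighbor score region can be sketched as follows — our reading of slide 24, in which a point's score is the fraction of its k nearest training neighbours that the region's classifier got right, and the region is { X | Score(X) > t }:

```python
import numpy as np

def knn_score_region(train_pts, correct_mask, k=5, t=0.6):
    """Return a membership test for the region {X | Score(X) > t}.
    correct_mask[i] says whether training point i was classified correctly."""
    def contains(x):
        dists = np.linalg.norm(train_pts - x, axis=1)
        nearest = np.argsort(dists)[:k]          # k nearest training points
        score = correct_mask[nearest].mean()     # fraction classified correctly
        return score > t                         # t is a learned threshold
    return contains
```

Unlike a hyper-rectangle, this membership test bends around irregular pockets of the data, at the price of keeping the training points around.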

Page 25: Trade-offs in Explanatory  Model Learning 

EOP in context

Method                   | Similarities                        | Differences
CART                     | Decision structure                  | Default classifiers in leaves
Subspacing               | Low-d projections                   | Keeps all data points
Boosting / Multiboosting | Committee decision; gradually deals with difficult data | Ensemble learner
Feating                  | Local models                        | Models trained on all features

25

Page 26: Trade-offs in Explanatory  Model Learning 

Outline

•Motivation: need for interpretable models

•Overview of data analysis tools

•Model evaluation – accuracy vs complexity

•Model evaluation – understandability

•Example applications

•Summary

26

Page 27: Trade-offs in Explanatory  Model Learning 

Overview of datasets

•Real-valued features, binary output
•Artificial data – 10 features
 ▫Low-d Gaussians / uniform cubes
•UCI repository
•Application-related datasets
•Results by k-fold cross validation
 ▫Accuracy
 ▫Complexity = expected number of vector operations performed for a classification task
 ▫Understandability, w.r.t. acknowledged metrics

27
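The complexity metric can be made concrete for a decision list — a sketch with illustrative names: the cost of a query is the number of region-membership tests (vector operations) performed before a region fires, averaged over a sample:

```python
def expected_complexity(regions, sample):
    """regions: ordered membership predicates (one per decision-list entry).
    Returns the mean number of tests performed per query over the sample."""
    total = 0
    for x in sample:
        tests = len(regions)              # falls through the whole list
        for i, contains in enumerate(regions, start=1):
            if contains(x):
                tests = i                 # stopped at the i-th test
                break
        total += tests
    return total / len(sample)
```

A short list whose early regions absorb most queries therefore scores much lower than a deep tree or a large ensemble.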

Page 28: Trade-offs in Explanatory  Model Learning 

EOP vs AdaBoost – SVM base classifiers

•EOP is often less accurate, but not significantly so
•The reduction in complexity is statistically significant

p-value of paired t-test (accuracy): 0.832; p-value of paired t-test (complexity): 0.003

[Figure: per-dataset accuracy (0.84–1) and complexity (0–250) of Boosting vs EOP (nonparametric)]

28

mean diff in accuracy: 0.5%; mean diff in complexity: 85

Page 29: Trade-offs in Explanatory  Model Learning 

EOP (stumps as base classifiers) vs CART on data from the UCI repository

[Figure: accuracy (0–1) and complexity (0–35) of CART, EOP N. and EOP P. on the four datasets]

29

Dataset       | # of Features | # of Points
Breast Tissue | 10            | 1006
Vowel         | 9             | 990
MiniBOONE     | 10            | 5000
Breast Cancer | 10            | 596

CART is the most accurate.

Parametric EOP yields the simplest models.

Page 30: Trade-offs in Explanatory  Model Learning 

Why are EOP models less complex?

Typical XOR dataset

30

Page 31: Trade-offs in Explanatory  Model Learning 

Why are EOP models less complex?

Typical XOR dataset

CART
•is accurate
•takes many iterations
•does not uncover or leverage structure of data

31

Page 32: Trade-offs in Explanatory  Model Learning 

Why are EOP models less complex?

Typical XOR dataset

CART
•is accurate
•takes many iterations
•does not uncover or leverage structure of data

EOP
•equally accurate
•uncovers structure (Iteration 1, Iteration 2)

[Figure: XOR layout — ‘+’ and ‘o’ in alternating quadrants]

32
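The XOR argument can be checked numerically. A toy sketch (the data layout and thresholds are ours): two axis-aligned regions, each with a single one-split classifier, classify the XOR pattern perfectly, which is exactly the structure a single greedy axis-aligned split cannot see.

```python
import random

random.seed(0)
# XOR layout as on the slide: '+' where x and y have opposite signs, 'o' otherwise
data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(400)]
label = lambda x, y: '+' if x * y < 0 else 'o'

def eop_predict(x, y):
    if x < 0:                        # iteration 1: region x < 0, stump on y
        return '+' if y > 0 else 'o'
    return '+' if y < 0 else 'o'     # iteration 2: remaining half, stump flipped

acc = sum(eop_predict(x, y) == label(x, y) for x, y in data) / len(data)
print(acc)
```

Two regions, two stumps — a far shallower model than the tree CART grows for the same data.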

Page 33: Trade-offs in Explanatory  Model Learning 

•At low complexities, EOP is typically more accurate

Error Variation With Model Complexity for EOP and CART

[Figure: EOP_Error - CART_Error vs depth of decision tree/list (1–8), for BCW, BT, MB and Vowel]

34

Page 34: Trade-offs in Explanatory  Model Learning 

UCI data – Accuracy

[Figure: accuracy of R-EOP, N-EOP, CART, Feating, Sub-spacing, Multiboosting and Random Forests on BCW, MB, BT and Vowel]

35

White-box models – including EOP – do not lose much accuracy.

Page 35: Trade-offs in Explanatory  Model Learning 

UCI data – Model complexity

[Figure: complexity of R-EOP, N-EOP, CART, Feating, Sub-spacing and Multiboosting on BCW, MB, BT and Vowel]

36

The complexity of Random Forests is huge – thousands of nodes.

White-box models – especially EOP – are less complex.

Page 36: Trade-offs in Explanatory  Model Learning 

Robustness

•Accuracy-targeting EOP
 ▫Identifies which portions of the data can be confidently classified at a given rate
 ▫Allows setting aside the noisy part of the data

37

[Figure: accuracy of AT-EOP on 3D data vs max allowed expected error]

Page 37: Trade-offs in Explanatory  Model Learning 

Outline

•Motivation: need for interpretable models

•Overview of data analysis tools

•Model evaluation – accuracy vs complexity

•Model evaluation – understandability

•Example applications

•Summary

38

Page 38: Trade-offs in Explanatory  Model Learning 

Metrics of Explainability*

39

Lift

Bayes Factor

J-Score

Normalized Mutual

Information

* L. Geng and H. J. Hamilton, ‘Interestingness measures for data mining: A survey’

Page 39: Trade-offs in Explanatory  Model Learning 

Evaluation with usefulness metrics

•For 3 out of 4 metrics, EOP beats CART

      |        CART                  |        EOP
      | BF     L      J      NMI     | BF     L      J      NMI
MB    | 1.982  0.004  0.389  0.040   | 1.889  0.007  0.201  0.502
BCW   | 1.057  0.007  0.004  0.011   | 2.204  0.069  0.150  0.635
BT    | 0.000  0.009  0.210  0.000   | Inf    0.021  0.088  0.643
V     | Inf    0.020  0.210  0.010   | 2.166  0.040  0.177  0.383
Mean  | 1.520  0.010  0.203  0.015   | 2.047  0.034  0.154  0.541

BF = Bayes Factor. L = Lift. J = J-score. NMI = Normalized Mutual Info.

40

Higher values are better.
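For reference, two of these metrics are easy to compute from a rule's 2×2 contingency table (tp = rule fires and class is positive, fp = fires and negative, fn = does not fire and positive, tn = neither). These are textbook formulas, not code from the talk, and normalizing mutual information by the class entropy is one common choice among several:

```python
from math import log2

def lift(tp, fp, fn, tn):
    """Lift of the rule 'region -> positive': P(B|A) / P(B).
    Assumes the rule fires at least once (tp + fp > 0)."""
    n = tp + fp + fn + tn
    p_b_given_a = tp / (tp + fp)
    p_b = (tp + fn) / n
    return p_b_given_a / p_b

def nmi(tp, fp, fn, tn):
    """Mutual information between rule firing and the class,
    normalized here by the class entropy H(B)."""
    n = tp + fp + fn + tn
    joint = [[tp / n, fp / n], [fn / n, tn / n]]     # rows: A, not-A; cols: B, not-B
    pa = [joint[0][0] + joint[0][1], joint[1][0] + joint[1][1]]
    pb = [joint[0][0] + joint[1][0], joint[0][1] + joint[1][1]]
    mi = sum(p * log2(p / (pa[i] * pb[j]))
             for i, row in enumerate(joint)
             for j, p in enumerate(row) if p > 0)
    hb = -sum(p * log2(p) for p in pb if p > 0)
    return mi / hb if hb else 0.0
```

A rule perfectly aligned with the class gives nmi = 1.0, and lift grows with how much more often the class appears inside the region than overall.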

Page 40: Trade-offs in Explanatory  Model Learning 

Outline

•Motivation: need for interpretable models

•Overview of data analysis tools

•Model evaluation – accuracy vs complexity

•Model evaluation – understandability

•Example applications

•Summary

41

Page 41: Trade-offs in Explanatory  Model Learning 

Spam Detection (UCI ‘SPAMBASE’)

•10 features: frequencies of misc. words in e-mails
•Output: spam or not

[Figure: accuracy (0.65–0.9) and complexity (number of splits, 0–100) of Multiboosting, Sub-spacing, Feating, CART, N-EOP, R-EOP and Adaboost]

42

Page 42: Trade-offs in Explanatory  Model Learning 

Spam Detection – Iteration 1

▫The classifier labels everything as spam
▫High-confidence regions do enclose mostly spam, where:
 - incidence of the word ‘your’ is low
 - length of text in capital letters is high

43

Page 43: Trade-offs in Explanatory  Model Learning 

Spam Detection – Iteration 2

▫The required incidence of capitals is increased
▫The square region on the left also encloses examples that will be marked as ‘not spam’

44

Page 44: Trade-offs in Explanatory  Model Learning 

Spam Detection – Iteration 3

▫The classifier marks everything as spam
▫The frequency of ‘our’ and ‘hi’ (word_frequency_hi) determines the regions

45

Page 45: Trade-offs in Explanatory  Model Learning 

46

Example Applications

Stem Cells: explain relationships between treatment and cell evolution (sparse features, class imbalance)

ICU Medication: identify combinations of drugs that correlate with readmissions

Fuel Consumption: identify causes of excess fuel consumption among driving behaviors (adapted regression problem)

Nuclear Threats: support interpretation of radioactivity detected in cargo containers (high-d data; train/test from different distributions)

Page 46: Trade-offs in Explanatory  Model Learning 

Effects of Cell Treatment

•Monitored population of cells
•7 features: cycle time, area, perimeter, ...
•Task: determine which cells were treated

[Figure: accuracy (0.7–0.8) and complexity (number of splits, 0–25) of Multiboosting, Sub-spacing, Feating, CART, N-EOP, R-EOP and Adaboost]

47

Page 47: Trade-offs in Explanatory  Model Learning 

48

Page 48: Trade-offs in Explanatory  Model Learning 

MIMIC Medication Data

•Information about administered medication
•Features: dosage for each drug
•Task: predict patient return to ICU

[Figure: accuracy (0.9915–0.9945) and complexity (number of splits, 0–25) of Multiboosting, Sub-spacing, Feating, CART, N-EOP, R-EOP and Adaboost]

49

Page 49: Trade-offs in Explanatory  Model Learning 

50

Page 50: Trade-offs in Explanatory  Model Learning 

Predicting Fuel Consumption

•10 features: vehicle and driving style characteristics
•Output: fuel consumption level (high/low)

[Figure: accuracy (0–0.8) and complexity (number of splits, 0–25) of Multiboosting, Sub-spacing, Feating, CART, N-EOP, R-EOP and Adaboost]

51

Page 51: Trade-offs in Explanatory  Model Learning 

52

Page 52: Trade-offs in Explanatory  Model Learning 

Nuclear threat detection data – 325 features

•Random Forests accuracy: 0.94
•Rectangular EOP accuracy: 0.881 ... but:

Regions found in 1st iteration for Fold 0:
 ▫incident.riidFeatures.SNR in [2.90, 9.2]
 ▫Incident.riidFeatures.gammaDose in [0, 1.86]×10^-8

Regions found in 1st iteration for Fold 1:
 ▫incident.rpmFeatures.gamma.sigma in [2.5, 17.381]
 ▫incident.rpmFeatures.gammaStatistics.skewdose in [1.31, …]

No match between the folds.

53

Page 53: Trade-offs in Explanatory  Model Learning 

Summary

•In most cases EOP:
 ▫maintains accuracy
 ▫reduces complexity
 ▫identifies useful aspects of the data

•EOP typically wins in terms of expressiveness

•Open questions:
 ▫What if no good low-dimensional projections exist?
 ▫What to do with inconsistent models across folds of CV?
 ▫What is the best metric of explainability?

55