Building Cost-sensitive Classifiers
TNM033 - Data mining
Daniel Eriksson
Sven Glansberg
Johan Jörtsö
Outline
• Cost-sensitive classifiers
• MetaCost
• Other techniques
• Applications
Cost-sensitive classifiers
Evaluating a model
• Reminder: different measures of quality can be used: accuracy, sensitivity, specificity, precision and recall.
• Another way is to calculate the cost by defining a cost matrix and combining it with the confusion matrix.
Repetition: What is cost?
We create a classifier M with algorithm L from training set S.
We want to evaluate the model...
We test the model M on test set T...
...obtaining a confusion matrix.

Confusion matrix (model M)
            Predicted +   Predicted -
Actual +        150            40
Actual -         60           250
Cost matrix C(i,j)
            Predicted +   Predicted -
Actual +         -1           100
Actual -          1             0
You (or an expert) have to define this! It is application-dependent.
Total cost
Total cost = -1·150 + 100·40 + 1·60 + 0·250 = 3910
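The sum above multiplies each cell of the confusion matrix by the matching cell of the cost matrix. A minimal sketch of that calculation, assuming both matrices are stored as nested lists with rows for the actual class and columns for the predicted class (the function name `total_cost` is only illustrative):

```python
# Confusion matrix of model M and cost matrix C, as given in the slides.
# Rows: actual class (+, -). Columns: predicted class (+, -).
confusion = [[150, 40],
             [60, 250]]
cost = [[-1, 100],
        [1, 0]]


def total_cost(confusion, cost):
    """Sum count * cost over every (actual, predicted) cell."""
    return sum(
        n * c
        for count_row, cost_row in zip(confusion, cost)
        for n, c in zip(count_row, cost_row)
    )


print(total_cost(confusion, cost))  # 3910
```

Almost all of the cost comes from the 40 positive records predicted as negative (100·40 = 4000).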
MetaCost
Some definitions
• S - Training set
• L - Classification learning algorithm
• M - The model (classifier) we want to build
• i,j - Class indices
• x - A record in S
• C(i,j) - Cost matrix
• P(j|x) - Probability that x belongs to class j (compare the predicted confusion matrix)
• R(i|x) - “Expected cost of predicting that x belongs to class i”
P(j|x)
“Confusion matrix” (model M)
            Predicted +   Predicted -
Actual +        150            40
Actual -         60           250
R(i|x)
“Expected cost of predicting that x belongs to class i”
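In Domingos [1], this expected cost is R(i|x) = Σ_j P(j|x)·C(i,j), where C(i,j) is the cost of predicting class i when the true class is j. A minimal sketch under two assumptions: the cost table is stored as in the slides (first key = actual class, second key = predicted class), and the probabilities for x are made-up values for illustration only.

```python
CLASSES = ["+", "-"]

# COST[actual][predicted], the same numbers as the first cost matrix in the slides.
COST = {"+": {"+": -1, "-": 100},
        "-": {"+": 1, "-": 0}}


def expected_cost(predicted, probs):
    """R(predicted | x): cost of this prediction, weighted by P(actual | x)."""
    return sum(probs[actual] * COST[actual][predicted] for actual in CLASSES)


def min_cost_class(probs):
    """The class i that minimizes R(i | x)."""
    return min(CLASSES, key=lambda i: expected_cost(i, probs))


# Hypothetical P(j|x): x is probably negative.
probs = {"+": 0.1, "-": 0.9}
for i in CLASSES:
    print(i, round(expected_cost(i, probs), 3))
print("minimum-cost class:", min_cost_class(probs))  # +
```

Although "-" is the more probable class, predicting "+" is cheaper here because a missed positive costs 100.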
The algorithm – Parameters
• S - Training set
• L - The classification learning algorithm
• C - Cost matrix
• m - Number of resamples to generate
• n - Number of examples in each resample (number of different x), n ≤ |S|
• p - “Does L produce class probabilities?”
• q - Should all resamples be used for each example...
[Diagram: training set S containing a record x]
The MetaCost algorithm
1. Create m resamples Si from S
2. Create m models Mi by applying the learning algorithm L to each Si
3. For each x in S:
   1. For each class j:
      1. Calculate P(j|x)
   2. Let the class of x be the class i that minimizes R(i|x)
4. Let M be the model produced by applying L to the relabeled S
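The four steps can be sketched in plain Python. This is an illustration under stated assumptions, not the WEKA implementation: the base learner `learn_nearest_mean`, the toy data set and the cost values are all hypothetical, L is assumed to output a single class label (p is false), all resamples are used for every record (q is true), and P(j|x) is estimated as the fraction of models voting for class j.

```python
import random
from collections import Counter


def learn_nearest_mean(examples):
    """Hypothetical base learner L: predict the class whose mean x is closest."""
    sums, counts = Counter(), Counter()
    for x, y in examples:
        sums[y] += x
        counts[y] += 1
    means = {y: sums[y] / counts[y] for y in counts}
    return lambda x: min(means, key=lambda y: abs(x - means[y]))


def metacost(S, learn, cost, classes, m=10, n=None, seed=0):
    """S is a list of (x, label); cost[actual][predicted] is the cost table."""
    rng = random.Random(seed)
    n = len(S) if n is None else n
    # Steps 1 and 2: m resamples (drawn with replacement), one model per resample.
    models = [learn([rng.choice(S) for _ in range(n)]) for _ in range(m)]
    relabeled = []
    for x, _ in S:
        # Step 3.1: P(j|x) as the fraction of models that vote for class j.
        votes = Counter(model(x) for model in models)
        p = {j: votes[j] / m for j in classes}

        # Step 3.2: relabel x with the class i that minimizes R(i|x).
        def risk(i):
            return sum(p[j] * cost[j][i] for j in classes)

        relabeled.append((x, min(classes, key=risk)))
    # Step 4: the final model M is L applied to the relabeled training set.
    return learn(relabeled), relabeled


CLASSES = ["+", "-"]
# Made-up costs: missing a "+" is ten times worse than a false alarm.
COST = {"+": {"+": 0, "-": 10}, "-": {"+": 1, "-": 0}}
# Toy one-dimensional data: "-" for x in 0..9, "+" for x in 7..12 (overlapping).
S = [(x, "-") for x in range(0, 10)] + [(x, "+") for x in range(7, 13)]

model, relabeled = metacost(S, learn_nearest_mean, COST, CLASSES)
print(sum(1 for _, y in relabeled if y == "+"), "records labeled + after relabeling")
```

Only the training labels change; the learner itself is untouched, which is why any L can be plugged in.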
[Diagram: steps 1 and 2. Resamples S1 … Sm are drawn from S, and L is applied to each Si to produce a model Mi.]
Step 3: relabel the class of each x so that the expected cost R(i|x) is minimized.
q? What?
p? What?
p deserves some explanation.
It only takes into account whether L outputs a class or class probabilities. We want probabilities P(j|x, Mi).
If L outputs a class:
set P(j|x, Mi) = 1 for that class and 0 for all others
If L outputs probabilities:
take the probabilities as they are
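The two cases can be sketched as follows. The function names are illustrative only; the final averaging over the models Mi follows Domingos [1], where P(j|x) is the mean of P(j|x, Mi) over the models used.

```python
def model_probabilities(output, classes, p):
    """Return P(j|x, Mi) as a dict, given one model's output for x.

    p is True when L produces class probabilities (output is a dict),
    and False when L produces a single class label (output is a label).
    """
    if p:
        return dict(output)
    return {j: 1.0 if j == output else 0.0 for j in classes}


def average_probabilities(outputs, classes, p):
    """P(j|x): average of P(j|x, Mi) over the outputs of all models."""
    per_model = [model_probabilities(o, classes, p) for o in outputs]
    return {j: sum(d[j] for d in per_model) / len(per_model) for j in classes}


classes = ["+", "-"]
# Four models that output labels: two vote "+", two vote "-".
print(average_probabilities(["+", "-", "-", "+"], classes, p=False))
# Two models that output probabilities directly.
print(average_probabilities([{"+": 0.25, "-": 0.75}, {"+": 0.75, "-": 0.25}],
                            classes, p=True))
```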
Classifier L?
• What kind of classification algorithm?
(It does not matter.)
MetaCost
• Is available in WEKA
• Pros:
  • Independent of L (“wrapper algorithm”)
  • Works with multiclass problems (better than, for example, stratification)
• Cons:
  • Takes more time to compute
  • Accuracy may go down, since cost rather than error rate is minimized
Other Techniques
• Stratification
• Oversampling
• Undersampling
• Decision trees with minimal costs
Applications
Medicine
• Comparison between C4.5 (J48) and MetaCost + C4.5 in WEKA on the heart-c.arff data set [4]
Cost matrix C(i,j)
            Predicted +   Predicted -
Actual +         0             1
Actual -         4             0
C4.5 Confusion matrix
            Predicted +   Predicted -
Actual +        138            27
Actual -         40            98
MetaCost Confusion matrix
            Predicted +   Predicted -
Actual +        104            61
Actual -         21           117
Comparison
MetaCost total cost: 145
C4.5 total cost: 187
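Both totals follow from the confusion matrices and the medicine cost matrix given earlier. A small sketch that also breaks the cost down per cell, assuming rows are the actual class and columns the predicted class (the helper name `cost_breakdown` is illustrative):

```python
# Cost matrix and confusion matrices from the heart-c.arff comparison.
# Rows: actual class (+, -). Columns: predicted class (+, -).
COST = [[0, 1],
        [4, 0]]
C45 = [[138, 27],
       [40, 98]]
METACOST = [[104, 61],
            [21, 117]]


def cost_breakdown(confusion, cost):
    """Cost contributed by each (actual, predicted) cell."""
    return [[n * c for n, c in zip(count_row, cost_row)]
            for count_row, cost_row in zip(confusion, cost)]


for name, confusion in [("C4.5", C45), ("MetaCost", METACOST)]:
    cells = cost_breakdown(confusion, COST)
    print(name, cells, "total:", sum(map(sum, cells)))
# C4.5 [[0, 27], [160, 0]] total: 187
# MetaCost [[0, 61], [84, 0]] total: 145
```

MetaCost makes more of the cheap errors (61 instead of 27) and fewer of the expensive ones (21 instead of 40), which lowers the total.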
Comparison – Cost
[Bar chart: total cost. C4.5: 187, MetaCost: 145]
[Bar chart: cost relative to C4.5. C4.5: 100 %, MetaCost: 77.5 %]
[Bar chart: cost relative to MetaCost. MetaCost: 100 %, C4.5: 129 %]
Comparison – Classifications
[Bar chart: Correct: C4.5 77.9 %, MetaCost 72.9 %. Incorrect: C4.5 22.1 %, MetaCost 27.1 %]
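The percentages in these charts can be re-derived from the confusion matrices and total costs given earlier. A minimal sketch, assuming rows are the actual class and columns the predicted class (the function name `accuracy` is illustrative):

```python
C45 = [[138, 27],
       [40, 98]]
METACOST = [[104, 61],
            [21, 117]]
C45_COST, METACOST_COST = 187, 145  # total costs from the comparison slide


def accuracy(confusion):
    """Fraction of records on the diagonal (correctly classified)."""
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


print(f"C4.5 correct:     {accuracy(C45):.1%}")       # 77.9%
print(f"MetaCost correct: {accuracy(METACOST):.1%}")  # 72.9%
print(f"MetaCost cost relative to C4.5: {METACOST_COST / C45_COST:.1%}")  # 77.5%
print(f"C4.5 cost relative to MetaCost: {C45_COST / METACOST_COST:.0%}")  # 129%
```

On this data set MetaCost gives up about five percentage points of accuracy in exchange for roughly 22 % lower cost.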
Conclusions
References
[1] Pedro Domingos. MetaCost: A general method for making classifiers cost-sensitive. In KDD, pages 155–164, 1999.
[2] Charles X. Ling, Qiang Yang, Jianning Wang, and Shichao Zhang. Decision trees with minimal costs. In ICML ’04: Proceedings of the twenty-first international conference on Machine learning, page 69, New York, NY, USA, 2004. ACM.
[3] Pang-Ning Tan, Michael Steinbach, and Vipin Kumar. Introduction to Data Mining (First Edition). Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA, 2005.
[4] Heart data set. http://staffwww.itn.liu.se/~aidvi/courses/06/dm/labs/heart-c.arff, accessed: 2009-12-02.
?