Techniques for Interpreting Machine Learning Models
Bagus Sartono
Statistics and Data Science Study Program
Online Seminar – 2 Dec 2020
Outline
• Introduction: the trade-off between predictive accuracy and interpretability
• Why is interpretation needed?
• How can a black-box model be interpreted?
• Closing
The Desired Model
can be interpreted easily
has high predictive accuracy
Models that are “easy” to interpret
lm(formula = harga ~ luasbangunan + kamarmandi + dekattol
+ umur, data = data)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept)  224.1412    20.4052  10.984 1.08e-15 ***
luasbangunan   1.1485     0.1025  11.205 4.93e-16 ***
kamarmandi     3.0094     6.7102   0.448   0.6555
dekattol      23.2252    12.4622   1.864   0.0675 .
umur          -3.7378     0.5510  -6.784 7.29e-09 ***
LINEAR REGRESSION
Cartoon: © 1998 G. Meixner
http://www.vias.org/science_cartoons/regression.html
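What makes this model easy to read is that the prediction is a plain weighted sum: each estimate is the change in predicted harga (price) for a one-unit change in that variable, holding the others fixed. A minimal sketch using the estimates from the output above; the example house is hypothetical, and the 0/1 coding of dekattol (near a toll road) and the units of each variable are assumptions, since the slide does not state them.

```python
# Coefficient estimates copied from the lm() output above.
COEF = {
    "(Intercept)": 224.1412,
    "luasbangunan": 1.1485,   # building area
    "kamarmandi": 3.0094,     # number of bathrooms
    "dekattol": 23.2252,      # near a toll road (assumed coded 0/1)
    "umur": -3.7378,          # age of the house
}

def predict_harga(house):
    """Linear prediction: intercept plus coefficient times value for each variable."""
    return COEF["(Intercept)"] + sum(COEF[name] * value for name, value in house.items())

# Hypothetical house; these values are invented for illustration only.
house = {"luasbangunan": 100, "kamarmandi": 2, "dekattol": 1, "umur": 10}
older = dict(house, umur=house["umur"] + 1)

base = predict_harga(house)
diff = predict_harga(older) - base  # one extra unit of umur shifts the prediction by its coefficient

print(f"predicted harga: {base:.4f}")
print(f"effect of one more unit of umur: {diff:.4f}")
```

The second printed number matches the umur estimate, which is exactly the sense in which the coefficients of a linear model are its interpretation.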
Models that are “easy” to interpret
glm(formula = (myopic == "Yes") ~ sporthr + readhr + comphr +
spheq + dadmy, family = "binomial", data = data)
Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.032236   0.392439   0.082  0.93453
sporthr     -0.054138   0.020361  -2.659  0.00784 **
readhr       0.051944   0.045394   1.144  0.25250
comphr       0.002205   0.041148   0.054  0.95727
spheq       -3.864433   0.435405  -8.875  < 2e-16 ***
dadmyYes     0.798310   0.301098   2.651  0.00802 **
LOGISTIC REGRESSION
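Logistic regression coefficients are on the log-odds scale, so the usual way to read them is to exponentiate: exp(estimate) is the factor by which the odds of myopic == "Yes" are multiplied for a one-unit increase in the variable, holding the others fixed. A minimal sketch using the estimates from the output above; the helper name prob_myopic is hypothetical.

```python
import math

# Coefficient estimates copied from the glm() output above (log-odds scale).
COEF = {
    "(Intercept)": 0.032236,
    "sporthr": -0.054138,
    "readhr": 0.051944,
    "comphr": 0.002205,
    "spheq": -3.864433,
    "dadmyYes": 0.798310,
}

# Odds ratio for each predictor: exp(coefficient).
odds_ratio = {name: math.exp(b) for name, b in COEF.items() if name != "(Intercept)"}

def prob_myopic(x):
    """Predicted probability: logistic function applied to the linear predictor."""
    eta = COEF["(Intercept)"] + sum(COEF[name] * value for name, value in x.items())
    return 1.0 / (1.0 + math.exp(-eta))

for name, ratio in odds_ratio.items():
    print(f"{name:10s} odds ratio = {ratio:.3f}")
```

Under this reading, dadmyYes multiplies the odds by roughly 2.2, while each additional unit of sporthr multiplies them by a factor slightly below 1.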
Models that are “easy” to interpret
• Tend to be linear
• Do not fit well enough in many situations
Machine Learning
• Able to capture nonlinear relationships
• Used when predictive accuracy is the priority
difficult to interpret
tends to be complex and behaves as a BLACK BOX
The trade-off between predictive accuracy and interpretability
[Figure: models placed on axes of interpretability versus accuracy: linear model, classification/regression tree, random forest, boosting, neural network, SVM]
Why Is Interpretation Needed?
• The predictions and decisions a model produces often need to be explained
• It increases confidence that the model will work well
• It meets the needs of regulators and other ethics-related requirements
• It improves collaboration between machines and humans
Two people are both predicted to be “interested in buying” our product…
…but possibly for different reasons
prospective customer because their income is fairly high
prospective customer because their demographic factors match the product profile
The trade-off between predictive accuracy and interpretability
[Figure: models placed on axes of interpretability versus accuracy: linear model, classification/regression tree, random forest, boosting, neural network, SVM, with a marked target point labeled “how can we get here?”]
2 paradigms
Some approaches
• Global Surrogate Model – Shadow Model
• Partial dependence plot
• Feature importance
• Perturbation/Permutation
• SHAP - SHapley Additive exPlanations
• LIME
Surrogate Model: a shadow model
[Diagram: Training Data → Complex Model → Output; Artificial Training Data → Simple Model → Output]
Perform the following steps to obtain a surrogate model:
1. Select a dataset X. This can be the same dataset that was used for training the black box model or a new dataset from the same distribution. You could even select a subset of the data or a grid of points, depending on your application.
2. For the selected dataset X, get the predictions of the black box model.
3. Select an interpretable model type (linear model, decision tree, ...).
4. Train the interpretable model on the dataset X and its predictions.
5. Congratulations! You now have a surrogate model.
6. Measure how well the surrogate model replicates the predictions of the black box model.
7. Interpret the surrogate model.
Source: Molnar (2019)
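The surrogate recipe can be sketched in a few lines. This is only an illustration under stated assumptions: black_box is a hypothetical stand-in for a trained complex model, the dataset is a one-dimensional grid, and the interpretable model is a straight line fitted by ordinary least squares. R-squared between the two sets of predictions is used here as the fidelity measure in step 6.

```python
import math

def black_box(x):
    """Hypothetical stand-in for a trained complex model (an assumption, not from the slides)."""
    return 3.0 * x + 2.0 * math.sin(2.0 * x)

# Step 1: select a dataset X (here a grid of points).
X = [i / 10 for i in range(51)]
# Step 2: get the black-box predictions for X.
y_bb = [black_box(x) for x in X]

# Steps 3-4: choose a linear model and fit it to (X, y_bb) by least squares.
n = len(X)
mean_x = sum(X) / n
mean_y = sum(y_bb) / n
sxx = sum((x - mean_x) ** 2 for x in X)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, y_bb))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

def surrogate(x):
    """Step 5: the fitted surrogate model."""
    return intercept + slope * x

# Step 6: R-squared of the surrogate against the black-box predictions.
sse = sum((y - surrogate(x)) ** 2 for x, y in zip(X, y_bb))
sst = sum((y - mean_y) ** 2 for y in y_bb)
r_squared = 1.0 - sse / sst

# Step 7: interpret the surrogate through its slope and intercept.
print(f"surrogate: y = {intercept:.3f} + {slope:.3f} * x, R^2 = {r_squared:.3f}")
```

Note that the surrogate is trained on the black box's predictions, not on the true labels, so a high R-squared says the simple model mimics the complex one well, not that either is accurate.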
Measuring “feature importance” using the perturbation technique
Source: Mike Lee Williams - Cloudera
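One common form of the perturbation idea is permutation importance: shuffle one feature column, re-run the model, and record how much the error grows. The following is a minimal sketch, not the method shown in the cited figure; black_box, the synthetic data, and the choice of mean squared error are all assumptions made for illustration.

```python
import random

def black_box(row):
    """Hypothetical trained model (an assumption): relies on x0 strongly, x1 weakly, ignores x2."""
    x0, x1, x2 = row
    return 5.0 * x0 + 0.5 * x1 * x1

def mse(model, X, y):
    """Mean squared error of the model's predictions against the targets."""
    return sum((model(row) - target) ** 2 for row, target in zip(X, y)) / len(y)

def permutation_importance(model, X, y, n_repeats=10, seed=0):
    """Average increase in MSE when a single feature column is shuffled."""
    rng = random.Random(seed)
    baseline = mse(model, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the target
            X_perm = [row[:j] + (v,) + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(model, X_perm, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances

# Synthetic data, invented for illustration only.
data_rng = random.Random(42)
X = [tuple(data_rng.uniform(-1, 1) for _ in range(3)) for _ in range(200)]
y = [black_box(row) + data_rng.gauss(0, 0.1) for row in X]

importances = permutation_importance(black_box, X, y)
for j, value in enumerate(importances):
    print(f"x{j}: increase in MSE = {value:.4f}")
```

A feature the model never uses, such as x2 here, gets an importance of zero because shuffling it leaves every prediction unchanged.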
Local interpretable model-agnostic explanations (LIME)
• LIME focuses on training local surrogate models to explain individual predictions.
• The recipe for training local surrogate models:
  • Select your instance of interest for which you want to have an explanation of its black box prediction.
  • Perturb your dataset and get the black box predictions for these new points.
  • Weight the new samples according to their proximity to the instance of interest.
  • Train a weighted, interpretable model on the dataset with the variations.
  • Explain the prediction by interpreting the local model.
Source: Molnar (2019)
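The local surrogate recipe can be sketched for two numeric features as follows. This is a simplified illustration, not the lime library: black_box is a hypothetical model, perturbations are drawn from a Gaussian centered on the instance, proximity weights use a Gaussian kernel, and the local model is a weighted linear regression solved directly from its normal equations. The function name lime_sketch and all parameter values are assumptions.

```python
import math
import random

def black_box(x1, x2):
    """Hypothetical complex model (an assumption, not from the slides)."""
    return math.sin(3.0 * x1) + x2 ** 2

def lime_sketch(model, instance, n_samples=500, scale=0.3, kernel_width=0.3, seed=0):
    """Fit a proximity-weighted linear model around `instance`; return its two slopes."""
    rng = random.Random(seed)
    c1, c2 = instance
    # Perturb around the instance and get the black-box predictions.
    x1s = [c1 + rng.gauss(0, scale) for _ in range(n_samples)]
    x2s = [c2 + rng.gauss(0, scale) for _ in range(n_samples)]
    preds = [model(a, b) for a, b in zip(x1s, x2s)]
    # Weight each sample by its proximity to the instance (Gaussian kernel).
    w = [math.exp(-((a - c1) ** 2 + (b - c2) ** 2) / (2 * kernel_width ** 2))
         for a, b in zip(x1s, x2s)]
    total = sum(w)

    def wmean(v):
        return sum(wi * vi for wi, vi in zip(w, v)) / total

    def wcov(u, mu, v, mv):
        return sum(wi * (ui - mu) * (vi - mv) for wi, ui, vi in zip(w, u, v))

    m1, m2, my = wmean(x1s), wmean(x2s), wmean(preds)
    s11, s22 = wcov(x1s, m1, x1s, m1), wcov(x2s, m2, x2s, m2)
    s12 = wcov(x1s, m1, x2s, m2)
    s1y, s2y = wcov(x1s, m1, preds, my), wcov(x2s, m2, preds, my)
    # Solve the 2x2 weighted least-squares system with Cramer's rule.
    det = s11 * s22 - s12 ** 2
    b1 = (s1y * s22 - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2

b1, b2 = lime_sketch(black_box, (0.0, 1.0))
print(f"local effect of x1: {b1:.2f}, local effect of x2: {b2:.2f}")
```

The two slopes describe the model only near the chosen instance; a different instance would generally yield different slopes, which is how two customers with the same prediction can receive different explanations.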
Source: Mike Lee Williams - Cloudera
Closing
• Machine learning models
  • Have complex forms
  • Are black boxes that are not easy to explain
• Yet an interpretation of the model is needed in many situations
• Various techniques have been proposed in the literature and implemented in software to provide interpretations of black-box models or of their predictions
Thank you
Bagus Sartono