Techniques for Interpreting Machine Learning Models

Bagus Sartono

[email protected]

Statistics and Data Science Study Program

Online Seminar, 2 Dec 2020


Outline

• Introduction: the trade-off between predictive accuracy and interpretability
• Why do we need interpretation?
• How do we interpret a black box model?
• Closing


The Model We Want

easy to interpret

high predictive accuracy


Models that are “easy” to interpret

lm(formula = harga ~ luasbangunan + kamarmandi + dekattol
    + umur, data = data)

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept)   224.1412    20.4052  10.984 1.08e-15 ***
luasbangunan    1.1485     0.1025  11.205 4.93e-16 ***
kamarmandi      3.0094     6.7102   0.448   0.6555
dekattol       23.2252    12.4622   1.864   0.0675 .
umur           -3.7378     0.5510  -6.784 7.29e-09 ***

LINEAR REGRESSION

Cartoon: © 1998 G. Meixner, http://www.vias.org/science_cartoons/regression.html
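To make “easy to interpret” concrete, the coefficient estimates from the lm() output can be turned into a prediction by hand. The following is a minimal Python sketch, not part of the original slides. The example house is invented for illustration, dekattol is assumed to be a 0/1 indicator, and the units of harga (price) are not stated in the source.

```python
# Coefficient estimates copied from the lm() output.
coef = {
    "(Intercept)": 224.1412,
    "luasbangunan": 1.1485,   # building area
    "kamarmandi": 3.0094,     # number of bathrooms
    "dekattol": 23.2252,      # near a toll road (assumed 0/1 indicator)
    "umur": -3.7378,          # age of the house
}

def predict_price(house):
    """Linear prediction: intercept plus coefficient times feature value."""
    return coef["(Intercept)"] + sum(coef[name] * value for name, value in house.items())

# Hypothetical house, invented for illustration.
house = {"luasbangunan": 100, "kamarmandi": 2, "dekattol": 1, "umur": 10}
older = dict(house, umur=11)  # same house, one unit older

price = predict_price(house)
effect_of_age = predict_price(older) - price  # equals the umur coefficient

print(round(price, 4), round(effect_of_age, 4))
```

Raising umur by one unit while holding everything else fixed changes the prediction by exactly the umur coefficient. This direct reading is what makes a linear model easy to interpret.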



glm(formula = (myopic == "Yes") ~ sporthr + readhr + comphr +
    spheq + dadmy, family = "binomial", data = data)

Coefficients:
             Estimate Std. Error z value Pr(>|z|)
(Intercept)  0.032236   0.392439   0.082  0.93453
sporthr     -0.054138   0.020361  -2.659  0.00784 **
readhr       0.051944   0.045394   1.144  0.25250
comphr       0.002205   0.041148   0.054  0.95727
spheq       -3.864433   0.435405  -8.875  < 2e-16 ***
dadmyYes     0.798310   0.301098   2.651  0.00802 **

LOGISTIC REGRESSION
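Logistic regression coefficients are on the log-odds scale, so a common way to read them is to exponentiate them into odds ratios. The following is a minimal Python sketch, not part of the original slides, using the estimates from the glm() output. The variable meanings are not given in the source; dadmyYes presumably marks a myopic father.

```python
import math

# Coefficient estimates (log-odds scale) copied from the glm() output.
coef = {
    "sporthr": -0.054138,
    "readhr": 0.051944,
    "comphr": 0.002205,
    "spheq": -3.864433,
    "dadmyYes": 0.798310,
}

# exp(coefficient) is the multiplicative change in the odds of myopia
# for a one-unit increase in the predictor, holding the others fixed.
odds_ratio = {name: math.exp(b) for name, b in coef.items()}

for name, value in odds_ratio.items():
    print(f"{name:10s} {value:.3f}")
```

Under this reading, the odds of myopia are roughly 2.2 times higher when dadmy is "Yes", and each additional unit of sporthr multiplies the odds by roughly 0.95.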



CLASSIFICATION TREE



• Tend to be linear
• Do not fit well enough in many situations


Machine Learning

XGBoost, Neural Network, Random Forest


Machine Learning

• Can capture nonlinear relationships
• Used when high predictive accuracy is the goal

hard to interpret

tend to be complex and behave as a BLACK BOX


The Trade-off Between Predictive Accuracy and Interpretability

[Figure: models placed by interpretability and accuracy: linear model, classification/regression tree, random forest, boosting, neural network, SVM]


Why Do We Need Interpretation?

• Predictions and decisions produced by a model often need to be explained
• It increases confidence that the model will work well
• It meets the needs of regulators and other ethics-related requirements
• It improves collaboration between machines and humans


Two people are both predicted to be “interested in buying” our product…

…but possibly for different reasons

a prospective customer because their income is fairly high

a prospective customer because their demographics match the product profile


The bottom line…

interpretation is valuable and necessary


The Trade-off Between Predictive Accuracy and Interpretability

[Figure: models placed by interpretability and accuracy: linear model, classification/regression tree, random forest, boosting, neural network, SVM. Annotation: how can we get here?]


2 paradigms


Some approaches

• Global Surrogate Model – Shadow Model
• Partial dependence plot
• Feature importance
• Perturbation/Permutation
• SHAP - SHapley Additive exPlanations
• LIME
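Of the approaches listed, the partial dependence plot is perhaps the simplest to sketch: fix one feature at a grid value in every row, average the model's predictions, and repeat across the grid. The Python below is a minimal illustration, not part of the original slides. The model function and the four data rows are hypothetical stand-ins for a fitted black box and its data.

```python
def model(row):
    """Stand-in for a fitted black box (hypothetical), with an interaction term."""
    x0, x1 = row
    return x0 ** 2 + x0 * x1

# Small dataset invented for illustration.
X = [[0.0, 1.0], [1.0, -1.0], [2.0, 0.5], [3.0, 2.0]]

def partial_dependence(rows, feature, grid):
    """For each grid value, overwrite the feature in every row and average the predictions."""
    curve = []
    for value in grid:
        modified = [r[:feature] + [value] + r[feature + 1:] for r in rows]
        curve.append(sum(model(r) for r in modified) / len(modified))
    return curve

grid = [0.0, 1.0, 2.0]
pd_curve = partial_dependence(X, 0, grid)
print(pd_curve)
```

Plotting pd_curve against grid gives the partial dependence plot for the first feature. The averaging hides interactions, so the curve describes the average effect only.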


Surrogate Model: a shadow model

[Diagram labels: Training Data, Complex Model, Output; Artificial Training Data, Simple Model, Output]

Perform the following steps to obtain a surrogate model:
1. Select a dataset X. This can be the same dataset that was used for training the black box model or a new dataset from the same distribution. You could even select a subset of the data or a grid of points, depending on your application.
2. For the selected dataset X, get the predictions of the black box model.
3. Select an interpretable model type (linear model, decision tree, ...).
4. Train the interpretable model on the dataset X and its predictions.
5. Congratulations! You now have a surrogate model.
6. Measure how well the surrogate model replicates the predictions of the black box model.
7. Interpret the surrogate model.

Source: Molnar (2019)
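The numbered steps can be sketched in a few lines of Python. This is a minimal illustration, not part of the original slides. The black_box function is a hypothetical stand-in for any complex model's predict function, the surrogate is a simple linear regression fitted by least squares, and R-squared is used as the measure of fit.

```python
import math

def black_box(x):
    """Stand-in for a complex model (hypothetical; any predict function works)."""
    return math.sin(x) + 0.5 * x

# Step 1: select a dataset X (here, a grid of points from 0.0 to 3.0).
X = [i / 10 for i in range(0, 31)]
# Step 2: get the black box predictions.
y_bb = [black_box(x) for x in X]

# Steps 3-4: fit an interpretable model (simple linear regression) to them.
n = len(X)
mean_x = sum(X) / n
mean_y = sum(y_bb) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(X, y_bb))
sxx = sum((x - mean_x) ** 2 for x in X)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Step 6: measure how well the surrogate replicates the black box (R-squared).
y_sur = [intercept + slope * x for x in X]
ss_res = sum((yb - ys) ** 2 for yb, ys in zip(y_bb, y_sur))
ss_tot = sum((yb - mean_y) ** 2 for yb in y_bb)
r_squared = 1 - ss_res / ss_tot

# Step 7: interpret the surrogate through its intercept and slope.
print(f"surrogate: y = {intercept:.3f} + {slope:.3f} * x, R^2 = {r_squared:.3f}")
```

A low R-squared would signal that the surrogate does not mimic the black box well, in which case its coefficients should not be read as an explanation of the black box.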


Measuring “feature importance” with the perturbation technique

Source: Mike Lee Williams - Cloudera
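One common way to perturb a feature is to shuffle its column and measure how much the prediction error grows. The Python below is a minimal sketch of that idea, not the procedure from the cited source. The model function and the synthetic data are hypothetical, invented for illustration.

```python
import random

random.seed(0)

def model(row):
    """Stand-in for a fitted black box (hypothetical): strong use of x0, weak use of x1, ignores x2."""
    return 3.0 * row[0] + 0.5 * row[1]

# Synthetic data, invented for illustration; y is the model output plus small noise.
X = [[random.gauss(0, 1) for _ in range(3)] for _ in range(500)]
y = [model(row) + random.gauss(0, 0.1) for row in X]

def mse(rows, targets):
    """Mean squared error of the model on the given rows."""
    return sum((model(r) - t) ** 2 for r, t in zip(rows, targets)) / len(rows)

def permutation_importance(rows, targets, feature):
    """Increase in error after shuffling one feature column."""
    shuffled = [r[feature] for r in rows]
    random.shuffle(shuffled)
    permuted = [r[:feature] + [v] + r[feature + 1:] for r, v in zip(rows, shuffled)]
    return mse(permuted, targets) - mse(rows, targets)

importance = [permutation_importance(X, y, j) for j in range(3)]
print([round(v, 3) for v in importance])
```

Shuffling a feature the model relies on breaks its link to the target and inflates the error. Shuffling a feature the model ignores leaves the error unchanged, so its importance is zero.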


Local interpretable model-agnostic explanations (LIME)

• LIME focuses on training local surrogate models to explain individual predictions.

• The recipe for training local surrogate models:

• Select your instance of interest for which you want to have an explanation of its black box prediction.

• Perturb your dataset and get the black box predictions for these new points.

• Weight the new samples according to their proximity to the instance of interest.

• Train a weighted, interpretable model on the dataset with the variations.

• Explain the prediction by interpreting the local model.

Source: Molnar (2019)
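The recipe can be sketched for a single numeric feature in plain Python. This is a minimal illustration of the idea, not the API of the lime package. The black_box function, the Gaussian proximity kernel, and the parameters spread and kernel_width are all assumptions made for the example.

```python
import math
import random

random.seed(0)

def black_box(x):
    """Stand-in for a complex model (hypothetical)."""
    return x ** 2

def local_surrogate(x0, n_samples=2000, spread=1.0, kernel_width=0.3):
    """Fit a proximity-weighted straight line around the instance x0."""
    # Perturb: draw new points around the instance of interest.
    xs = [x0 + random.gauss(0, spread) for _ in range(n_samples)]
    # Get the black box predictions for the new points.
    ys = [black_box(x) for x in xs]
    # Weight each sample by its proximity to x0 (Gaussian kernel, an assumption).
    ws = [math.exp(-((x - x0) ** 2) / (2 * kernel_width ** 2)) for x in xs]
    # Train a weighted, interpretable model: a weighted least-squares line.
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    slope = sxy / sxx
    intercept = my - slope * mx
    return intercept, slope

# Explain: the local slope says how the prediction changes near each instance.
_, slope_at_2 = local_surrogate(2.0)
_, slope_at_minus_1 = local_surrogate(-1.0)
print(round(slope_at_2, 2), round(slope_at_minus_1, 2))
```

The same black box gets a positive local slope near 2 and a negative one near -1. A local surrogate therefore explains an individual prediction and not the model as a whole.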


Source: Mike Lee Williams - Cloudera


Closing

• Machine learning models
  • Complex in form
  • Black box, not easy to explain
• Yet interpretation of a model is needed in many situations
• Various techniques are available in the literature and in software for interpreting a black box model or its predictions

Thank you

Bagus Sartono

[email protected]
