Incremental Approach to Interpretable Classification Rule ...

Incremental Approach to Interpretable Classification Rule Learning. Bishwamittra Ghosh and Kuldeep S. Meel, School of Computing, National University of Singapore. CP 2019.

Transcript of Incremental Approach to Interpretable Classification Rule ...

Page 1: Incremental Approach to Interpretable Classification Rule ...

Incremental Approach to Interpretable Classification Rule Learning

Bishwamittra Ghosh and Kuldeep S. Meel
School of Computing, National University of Singapore

CP 2019

Page 2: Incremental Approach to Interpretable Classification Rule ...

Introduction

Practical applications of machine learning

- Hiring employees
- Giving a loan to a person
- Predicting recidivism: likelihood of a person convicted of a crime to offend again
- . . .

Should we believe the prediction of machine learning models?

Interpretable classification model

Page 5: Incremental Approach to Interpretable Classification Rule ...

Introduction

Example Dataset

Page 6: Incremental Approach to Interpretable Classification Rule ...

Introduction

Representation of an interpretable model and a black box model

A sample is predicted as Iris Versicolor if
(sepal length > 6.3 OR sepal width > 3 OR petal width ≤ 1.5)
AND (sepal width ≤ 2.7 OR petal length > 4 OR petal width > 1.2)
AND (petal length ≤ 5)

Panel titles: Interpretable Model, Black Box Model
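
As a rough illustration, the Iris rule on this slide can be written as an ordinary Boolean function. Only the thresholds come from the slide; the function name and the two sample measurements are made up for this sketch.

```python
def is_versicolor(sepal_length, sepal_width, petal_length, petal_width):
    """Evaluate the three-clause CNF rule from the slide on one sample."""
    clause1 = sepal_length > 6.3 or sepal_width > 3 or petal_width <= 1.5
    clause2 = sepal_width <= 2.7 or petal_length > 4 or petal_width > 1.2
    clause3 = petal_length <= 5
    # A CNF rule fires only if every clause is satisfied.
    return clause1 and clause2 and clause3


# Hypothetical measurements, not taken from the slides.
sample_a = dict(sepal_length=6.0, sepal_width=2.9, petal_length=4.5, petal_width=1.4)
sample_b = dict(sepal_length=6.5, sepal_width=3.0, petal_length=5.8, petal_width=2.2)

print(is_versicolor(**sample_a))  # all three clauses hold
print(is_versicolor(**sample_b))  # third clause fails (petal length > 5)
```

Each clause can be read on its own, which is what makes the rule easy to inspect.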

Page 7: Incremental Approach to Interpretable Classification Rule ...

Introduction

Formula

- A CNF (Conjunctive Normal Form) formula is a conjunction of clauses where each clause is a disjunction of literals

(a ∨ ¬b ∨ c) ∧ (d ∨ e)

- A DNF (Disjunctive Normal Form) formula is a disjunction of clauses where each clause is a conjunction of literals

(a ∧ b ∧ ¬c) ∨ (d ∧ e)

- Decision rules in CNF and DNF are highly interpretable [Malioutov’18; Lakkaraju’19]
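
The two example formulas can be evaluated with a few lines of code. This is a minimal sketch: the representation of a literal as a `(name, is_positive)` pair and the chosen truth assignment are assumptions made for illustration.

```python
def literal_value(literal, assignment):
    """A literal is (variable_name, is_positive); negated literals flip the value."""
    name, is_positive = literal
    return assignment[name] if is_positive else not assignment[name]


def eval_cnf(clauses, assignment):
    # Conjunction of clauses, each clause a disjunction of literals.
    return all(any(literal_value(l, assignment) for l in clause) for clause in clauses)


def eval_dnf(clauses, assignment):
    # Disjunction of clauses, each clause a conjunction of literals.
    return any(all(literal_value(l, assignment) for l in clause) for clause in clauses)


# (a ∨ ¬b ∨ c) ∧ (d ∨ e)
cnf = [[("a", True), ("b", False), ("c", True)], [("d", True), ("e", True)]]
# (a ∧ b ∧ ¬c) ∨ (d ∧ e)
dnf = [[("a", True), ("b", True), ("c", False)], [("d", True), ("e", True)]]

# Hypothetical truth assignment.
assignment = {"a": False, "b": False, "c": False, "d": True, "e": False}
cnf_result = eval_cnf(cnf, assignment)
dnf_result = eval_dnf(dnf, assignment)
print(cnf_result, dnf_result)
```

Under this assignment the CNF formula holds (¬b and d are true) while the DNF formula does not (neither conjunction is fully satisfied).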

Page 9: Incremental Approach to Interpretable Classification Rule ...

Preliminaries

Definition of interpretability in rule-based classifiers

- There exist different notions of interpretability of rules

R = (a ∨ b ∨ ¬c ∨ d ∨ e) ∧ (f ∨ g ∨ h ∨ ¬i) ∧ (j ∨ k ∨ ¬l) ∧ (¬m ∨ n ∨ o ∨ p ∨ q) ∧

R = (a ∨ b ∨ ¬c) ∧ (f ∨ g)

- Rules with fewer terms are considered interpretable in medical domains [Letham’15]
- We use rule size as a proxy for interpretability in rule-based classifiers
- For rules expressed as CNF/DNF, rule size = number of literals
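
Counting literals is straightforward once a rule is stored as a list of clauses. The sketch below uses an assumed string encoding (a leading "-" marks negation). The longer rule is cut off after its fourth clause on the slide, so only the four visible clauses are counted.

```python
def rule_size(clauses):
    """Rule size as defined on the slide: total number of literals."""
    return sum(len(clause) for clause in clauses)


# Visible part of the longer rule (the slide truncates it after the fourth clause).
long_rule = [
    ["a", "b", "-c", "d", "e"],
    ["f", "g", "h", "-i"],
    ["j", "k", "-l"],
    ["-m", "n", "o", "p", "q"],
]
# The shorter rule from the slide.
short_rule = [["a", "b", "-c"], ["f", "g"]]

print(rule_size(long_rule), rule_size(short_rule))
```

The visible clauses of the longer rule already contain 17 literals, against 5 for the shorter rule.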

Page 12: Incremental Approach to Interpretable Classification Rule ...

Design of an interpretable rule-based classifier

Outline

1 Introduction

2 Preliminaries

3 Design of an interpretable rule-based classifier

4 Incremental learning

5 Experimental Evaluation

6 Conclusion

Page 13: Incremental Approach to Interpretable Classification Rule ...

Design of an interpretable rule-based classifier

Design of an interpretable classifier [Malioutov’18]

- We design an objective function to
  - minimize prediction error
  - minimize rule size (i.e., maximize interpretability)
- Consider decision variables:
  - feature variables $b_i^j$ = 1{j-th feature is selected in i-th clause}
  - noise variables $\eta_q$ = 1{sample q is misclassified}

\[ \min \sum_{i,j} b_i^j + \lambda \sum_q \eta_q \]

- Constraints:
  - a positive labeled sample satisfies the rule
  - a negative labeled sample does not satisfy the rule
  - otherwise the sample is considered as noise
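
To make the objective concrete, the sketch below evaluates it for one fixed candidate CNF rule, setting the noise variable of a sample to 1 exactly when the rule misclassifies it. This is only an evaluation of the objective, not the optimization itself; the function names, the toy data, and the value of λ are assumptions.

```python
def predict(rule, x):
    """rule[i][j] == 1 means feature j is selected in clause i (CNF over Boolean features)."""
    return all(any(b and xj for b, xj in zip(clause, x)) for clause in rule)


def objective(rule, X, y, lam):
    # Rule size: number of selected feature variables.
    size = sum(sum(clause) for clause in rule)
    # Noise: one unit per sample whose prediction disagrees with its label.
    errors = sum(1 for x, label in zip(X, y) if predict(rule, x) != bool(label))
    return size + lam * errors


# Toy rule over three Boolean features: x0 AND (x1 OR x2).
rule = [[1, 0, 0], [0, 1, 1]]
X = [[1, 1, 0], [1, 0, 0], [0, 1, 1], [1, 0, 1]]
y = [1, 0, 0, 0]

value = objective(rule, X, y, lam=10)
print(value)
```

Here the rule has size 3 and misclassifies one sample (the last one), so with λ = 10 the objective is 3 + 10 · 1 = 13. A larger λ favors accuracy over short rules.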

Page 15: Incremental Approach to Interpretable Classification Rule ...

Design of an interpretable rule-based classifier

MaxSAT

In MaxSAT

- Hard clause: must always be satisfied, weight = ∞
- Soft clause: can be falsified, weight ∈ R+

MaxSAT finds an assignment that satisfies all hard clauses and maximizes the total weight of the satisfied soft clauses
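
The definition can be illustrated with a brute-force solver for toy instances. This is a sketch for intuition only: it enumerates all assignments, so it is exponential in the number of variables, and it is not the solver used in the talk. Literals are encoded as nonzero integers (v for a variable, -v for its negation), which is an assumed convention.

```python
from itertools import product


def satisfied(clause, assignment):
    """A clause is satisfied if at least one of its literals is true."""
    return any(assignment[abs(lit)] == (lit > 0) for lit in clause)


def solve_maxsat(num_vars, hard, soft):
    """soft is a list of (weight, clause). Returns (best_weight, assignment) or None."""
    best = None
    for bits in product([False, True], repeat=num_vars):
        assignment = {v: bits[v - 1] for v in range(1, num_vars + 1)}
        # Hard clauses must all hold; otherwise the assignment is rejected.
        if not all(satisfied(c, assignment) for c in hard):
            continue
        weight = sum(w for w, c in soft if satisfied(c, assignment))
        if best is None or weight > best[0]:
            best = (weight, assignment)
    return best


# Hard: (x1 ∨ x2). Soft: ¬x1 with weight 3, ¬x2 with weight 2.
result = solve_maxsat(2, hard=[[1, 2]], soft=[(3, [-1]), (2, [-2])])
print(result)
```

The hard clause forces at least one variable to be true, so both soft clauses cannot hold together; the solver keeps the heavier one and sets x1 to false and x2 to true.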

Page 16: Incremental Approach to Interpretable Classification Rule ...

Design of an interpretable rule-based classifier

MaxSAT-based approach for interpretable rule-based classification

- the objective function is encoded as soft clauses
- the constraints are encoded as hard clauses

Analysis

- To generate a k-clause CNF rule for a dataset of n samples over m boolean features, the number of clauses of the MaxSAT instance is O(n · m · k)

Page 17: Incremental Approach to Interpretable Classification Rule ...

Incremental learning

An Incremental Rule-learning Approach [Ghosh’19]

- We attribute the poor scalability to the large formula size of the MaxSAT instance
- We propose mini-batch incremental learning
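
The intuition behind mini-batching can be sketched with the n · m · k growth term: each batch sees only a fraction of the n samples, so each MaxSAT instance is correspondingly smaller. The contiguous split, the helper names, and all numbers below are assumptions for illustration, and the product is only the asymptotic term, not an exact clause count.

```python
def split_into_batches(num_samples, num_batches):
    """Partition sample indices into contiguous batches of near-equal size."""
    base, extra = divmod(num_samples, num_batches)
    batches, start = [], 0
    for b in range(num_batches):
        size = base + (1 if b < extra else 0)
        batches.append(range(start, start + size))
        start += size
    return batches


def growth_term(n, m, k):
    # The n * m * k term from the O(n · m · k) bound, without constants.
    return n * m * k


# Illustrative sizes: 30000 samples, 334 Boolean features, 2 clauses, 16 batches.
n, m, k, p = 30000, 334, 2, 16
batches = split_into_batches(n, p)
full = growth_term(n, m, k)
per_batch = growth_term(len(batches[0]), m, k)
print(full, per_batch)
```

With these numbers each batch holds 1875 samples, so the growth term per instance drops by a factor of 16.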

Page 18: Incremental Approach to Interpretable Classification Rule ...

Incremental learning

Solution Technique

- We propose a mini-batch incremental learning framework with the following objective function on batch t

\[ \min \sum_{i,j} b_i^j \cdot I(b_i^j) + \lambda \sum_q \eta_q \]

where the indicator function I(·) is defined as follows.

\[ I(b_i^j) = \begin{cases} -1 & \text{if } b_i^j = 1 \text{ in the } (t-1)\text{-th batch } (t \neq 1) \\ 1 & \text{otherwise} \end{cases} \]
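
A small sketch may help in reading the indicator: keeping a feature that was selected in the previous batch lowers the objective by 1, while selecting a new feature raises it by 1. The flattened variable lists, the function names, and the example values are assumptions for illustration.

```python
def indicator(prev_value, t):
    """I(b): -1 if b was 1 in batch t-1 (and t != 1), otherwise 1."""
    return -1 if (t != 1 and prev_value == 1) else 1


def incremental_objective(b, prev_b, eta, lam, t):
    # b and prev_b are flattened lists of the feature variables
    # for the current and previous batch.
    rule_term = sum(bv * indicator(pv, t) for bv, pv in zip(b, prev_b))
    return rule_term + lam * sum(eta)


prev_b = [0, 1, 0, 1]   # assignment learned on batch t-1
b = [0, 1, 1, 0]        # candidate assignment on batch t
eta = [0, 1, 0]         # one misclassified sample in the current batch

second_batch = incremental_objective(b, prev_b, eta, lam=10, t=2)
first_batch = incremental_objective(b, prev_b, eta, lam=10, t=1)
print(second_batch, first_batch)
```

For t = 2 the kept feature contributes -1 and the newly added one +1, giving a rule term of 0 and a total of 10. For t = 1 the indicator is always 1, so the objective reduces to the non-incremental one and the total is 12.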

Page 19: Incremental Approach to Interpretable Classification Rule ...

Incremental learning

Continued. . .

In the (t − 1)-th batch we learn the assignment

- b1 = 0
- b2 = 1
- b3 = 0
- b4 = 1

In the t-th batch we construct the soft unit clauses

- ¬b1
- b2
- ¬b3
- b4
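
The mapping from the previous assignment to soft unit clauses can be sketched as follows. Clauses use an assumed integer encoding (v for b_v, -v for ¬b_v); the slide does not state the weights of these clauses, so none are attached here.

```python
def soft_unit_clauses(prev_assignment):
    """Map {variable_index: 0 or 1} to unit clauses: [v] if it was 1, [-v] if it was 0."""
    return [[v] if value == 1 else [-v] for v, value in sorted(prev_assignment.items())]


# Assignment learned on batch t-1, as on the slide.
prev = {1: 0, 2: 1, 3: 0, 4: 1}
units = soft_unit_clauses(prev)
print(units)  # [[-1], [2], [-3], [4]]
```

Because the clauses are soft, the solver may still flip a variable on the new batch when doing so reduces the classification error enough to pay for it.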

Page 20: Incremental Approach to Interpretable Classification Rule ...

Experimental Evaluation

Experimental Results

Page 21: Incremental Approach to Interpretable Classification Rule ...

Experimental Evaluation

Accuracy and training time of different classifiers

Dataset          Size n   Features m   LR              SVC               RIPPER           IMLI
PIMA             768      134          75.32 (0.3s)    75.32 (0.37s)     75.32 (2.58s)    73.38 (0.74s)
Credit-default   30000    334          80.81 (6.87s)   80.69 (847.93s)   80.97 (20.37s)   79.41 (32.58s)
Twitter          49999    1050         95.67 (3.99s)   Timeout           95.56 (98.21s)   94.69 (59.67s)

Table: Each cell in the last 4 columns gives test accuracy (%) and, in parentheses, training time (s).

IMLI achieves better training time at the cost of a small loss in accuracy

Page 22: Incremental Approach to Interpretable Classification Rule ...

Experimental Evaluation

Size of rules of different rule-based classifiers

Dataset   RIPPER   IMLI
PIMA      8.25     3.5
Twitter   21.6     6
Credit    14.25    3

Table: Average size of the rules of different rule-based models.

IMLI generates shorter rules compared to other rule-based models

Page 23: Incremental Approach to Interpretable Classification Rule ...

Conclusion

Conclusion

- Interpretable ML model ensures reliability of prediction models in practice
- We propose an incremental learning approach of classification rules
- IMLI achieves up to three orders of magnitude improvement in training time by sacrificing a bit of accuracy
- The generated rules appear to be more interpretable

Python library:

$ pip install rulelearning

Source code: https://github.com/meelgroup/MLIC

Thank You !!