Incremental Approach to Interpretable Classification Rule Learning
Bishwamittra Ghosh and Kuldeep S. Meel
School of Computing, National University of Singapore
CP 2019
Bishwamittra Ghosh Incremental Approach to Interpretable Classification Rule Learning CP 2019 1
Introduction
Practical applications of machine learning
- Hiring employees
- Giving a loan to a person
- Predicting recidivism: the likelihood of a person convicted of a crime to offend again
- . . .
Should we believe the prediction of machine learning models?
Interpretable classification model
Introduction
Example Dataset
Introduction
Representation of an interpretable model and a black box model
A sample is predicted as Iris Versicolor if
(sepal length > 6.3 OR sepal width > 3 OR petal width ≤ 1.5)
AND (sepal width ≤ 2.7 OR petal length > 4 OR petal width > 1.2)
AND (petal length ≤ 5)

(Figure: Interpretable Model vs. Black Box Model)
Introduction
Formula
- A CNF (Conjunctive Normal Form) formula is a conjunction of clauses, where each clause is a disjunction of literals:

  (a ∨ ¬b ∨ c) ∧ (d ∨ e)

- A DNF (Disjunctive Normal Form) formula is a disjunction of clauses, where each clause is a conjunction of literals:

  (a ∧ b ∧ ¬c) ∨ (d ∧ e)

- Decision rules in CNF and DNF are highly interpretable [Malioutov'18; Lakkaraju'19]
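The two definitions can be made concrete with a small evaluator, a minimal sketch for illustration only (the helper names and the literal representation are ours, not from the talk):

```python
# Illustrative helpers: a literal is (variable, is_negated); a formula is a
# list of clauses, each clause a list of literals.

def eval_cnf(clauses, assignment):
    """CNF is satisfied iff every clause has at least one true literal."""
    return all(
        any(assignment[var] != negated for var, negated in clause)
        for clause in clauses
    )

def eval_dnf(clauses, assignment):
    """DNF is satisfied iff some clause has all of its literals true."""
    return any(
        all(assignment[var] != negated for var, negated in clause)
        for clause in clauses
    )

# (a ∨ ¬b ∨ c) ∧ (d ∨ e)
cnf = [[("a", False), ("b", True), ("c", False)], [("d", False), ("e", False)]]
# (a ∧ b ∧ ¬c) ∨ (d ∧ e)
dnf = [[("a", False), ("b", False), ("c", True)], [("d", False), ("e", False)]]

assignment = {"a": True, "b": True, "c": False, "d": True, "e": False}
print(eval_cnf(cnf, assignment))  # True: both clauses have a true literal
print(eval_dnf(dnf, assignment))  # True: the first conjunct is fully true
```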
Preliminaries
Definition of interpretability in rule-based classifiers
- There exist different notions of interpretability of rules

  R = (a ∨ b ∨ ¬c ∨ d ∨ e) ∧ (f ∨ g ∨ h ∨ ¬i) ∧ (j ∨ k ∨ ¬l) ∧ (¬m ∨ n ∨ o ∨ p ∨ q) ∧ . . .

  R = (a ∨ b ∨ ¬c) ∧ (f ∨ g)

- Rules with fewer terms are considered interpretable in medical domains [Letham'15]
- We refer to rule size as a proxy for interpretability in rule-based classifiers
- For rules expressed as CNF/DNF, rule size = number of literals
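Under this definition, rule size is a one-line computation. The helper below is a hypothetical sketch (literals written as plain strings), not part of any library:

```python
def rule_size(clauses):
    """Rule size = total number of literals across all clauses (CNF or DNF)."""
    return sum(len(clause) for clause in clauses)

# R = (a ∨ b ∨ ¬c) ∧ (f ∨ g): 3 + 2 = 5 literals
short_rule = [["a", "b", "~c"], ["f", "g"]]
# The longer rule above has clauses of 5, 4, 3, and 5 literals
long_rule = [["a", "b", "~c", "d", "e"], ["f", "g", "h", "~i"],
             ["j", "k", "~l"], ["~m", "n", "o", "p", "q"]]
print(rule_size(short_rule))  # 5
print(rule_size(long_rule))   # 17
```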
Design of an interpretable rule-based classifier
Outline
1 Introduction
2 Preliminaries
3 Design of an interpretable rule-based classifier
4 Incremental learning
5 Experimental Evaluation
6 Conclusion
Design of an interpretable rule-based classifier
Design of an interpretable classifier [Malioutov’18]
- We design the objective function to
  - minimize prediction error
  - minimize rule size (i.e., maximize interpretability)
- Consider decision variables:
  - feature variables b_i^j = 1{j-th feature is selected in i-th clause}
  - noise variables η_q = 1{sample q is misclassified}

  min Σ_{i,j} b_i^j + λ Σ_q η_q

- Constraints:
  - a positively labeled sample satisfies the rule
  - a negatively labeled sample does not satisfy the rule
  - otherwise the sample is considered noise
Design of an interpretable rule-based classifier
MaxSAT
In MaxSAT:
- Hard clause: must always be satisfied, weight = ∞
- Soft clause: can be falsified, weight ∈ R+

MaxSAT finds an assignment that satisfies all hard clauses and maximizes the total weight of satisfied soft clauses
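These semantics can be illustrated with a toy brute-force solver. This is purely for intuition under our own clause representation; real MaxSAT solvers (as used in this line of work) are far more sophisticated:

```python
from itertools import product

def brute_force_maxsat(n_vars, hard, soft):
    """hard: list of clauses; soft: list of (clause, weight).
    A clause is a list of ints: +v means variable v, -v its negation.
    Returns (best_weight, assignment) over assignments satisfying all
    hard clauses, or None if the hard clauses are unsatisfiable."""
    best = None
    for bits in product([False, True], repeat=n_vars):
        def sat(clause):
            # A clause holds if at least one literal is true.
            return any(bits[abs(l) - 1] == (l > 0) for l in clause)
        if not all(sat(c) for c in hard):
            continue  # hard clauses have infinite weight: never violated
        weight = sum(w for c, w in soft if sat(c))
        if best is None or weight > best[0]:
            best = (weight, bits)
    return best

# Hard: (x1 ∨ x2); soft: ¬x1 (weight 1), ¬x2 (weight 1), x2 (weight 3)
best = brute_force_maxsat(2, hard=[[1, 2]],
                          soft=[([-1], 1), ([-2], 1), ([2], 3)])
print(best)  # (4, (False, True)): x1 = False, x2 = True satisfies ¬x1 and x2
```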
Design of an interpretable rule-based classifier
MaxSAT-based approach for interpretable rule-based classification
- The objective function is encoded as soft clauses
- The constraints are encoded as hard clauses

Analysis
- To generate a k-clause CNF rule for a dataset of n samples over m boolean features, the number of clauses of the MaxSAT instance is O(n · m · k)
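The O(n · m · k) growth is easy to make concrete using the dataset sizes from the experiments later in the talk. The choice k = 2 below is our own assumption for illustration, and constant factors are ignored:

```python
def maxsat_clause_count(n, m, k):
    """Asymptotic O(n·m·k) clause count, ignoring constant factors."""
    return n * m * k

# n, m taken from the experiments table; k = 2 is an assumed clause count.
for name, n, m in [("PIMA", 768, 134), ("Twitter", 49999, 1050)]:
    print(name, maxsat_clause_count(n, m, 2))
# For Twitter this is already on the order of 10^8 clauses, which is why
# solving one monolithic instance scales poorly.
```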
Incremental learning
An Incremental Rule-learning Approach [Ghosh’19]
- We attribute the poor scalability to the large formula size of the MaxSAT instance
- We propose mini-batch incremental learning
Incremental learning
Solution Technique
- We propose a mini-batch incremental learning framework with the following objective function on batch t:

  min Σ_{i,j} b_i^j · I(b_i^j) + λ Σ_q η_q

  where the indicator function I(·) is defined as follows:

  I(b_i^j) = { −1  if b_i^j = 1 in the (t − 1)-th batch (t ≠ 1)
             {  1  otherwise
Incremental learning
Continued. . .
In the (t − 1)-th batch, we learn the assignment:
- b1 = 0
- b2 = 1
- b3 = 0
- b4 = 1

In the t-th batch, we construct the soft unit clauses:
- ¬b1
- b2
- ¬b3
- b4
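The step above can be sketched in a few lines of Python. Variable names follow the slide; the string-based clause representation is our own assumption, not IMLI's actual API:

```python
def soft_unit_clauses(prev_assignment):
    """Map a learned assignment {var: 0/1} from batch t-1 to soft unit
    clauses for batch t: the variable itself if it was 1, its negation
    (written "~var") otherwise. Satisfying these clauses rewards the
    solver for keeping the previous batch's rule."""
    return [var if value == 1 else "~" + var
            for var, value in prev_assignment.items()]

prev = {"b1": 0, "b2": 1, "b3": 0, "b4": 1}
print(soft_unit_clauses(prev))  # ['~b1', 'b2', '~b3', 'b4']
```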
Experimental Evaluation
Experimental Results
Experimental Evaluation
Accuracy and training time of different classifiers
Dataset         Size n   Features m   LR               SVC               RIPPER           IMLI
PIMA            768      134          75.32 (0.3s)     75.32 (0.37s)     75.32 (2.58s)    73.38 (0.74s)
Credit-default  30000    334          80.81 (6.87s)    80.69 (847.93s)   80.97 (20.37s)   79.41 (32.58s)
Twitter         49999    1050         95.67 (3.99s)    Timeout           95.56 (98.21s)   94.69 (59.67s)

Table: Each cell in the classifier columns reports test accuracy (%) and training time (s).
IMLI achieves better training time at the cost of a small drop in accuracy
Experimental Evaluation
Size of rules of different rule-based classifiers
Dataset RIPPER IMLI
PIMA 8.25 3.5
Twitter 21.6 6
Credit 14.25 3
Table: Average size of the rules of different rule-based models.
IMLI generates shorter rules compared to other rule-based models
Conclusion
Conclusion
- Interpretable ML models ensure the reliability of prediction models in practice
- We propose an incremental learning approach for classification rules
- IMLI achieves up to three orders of magnitude improvement in training time while sacrificing a small amount of accuracy
- The generated rules appear to be more interpretable
Python library:

  $ pip install rulelearning
Thank You !!
Source code: https://github.com/meelgroup/MLIC