Lecture 21 Decision Tree - Nakul Gopalan


Transcript of Lecture 21 Decision Tree - Nakul Gopalan

Page 1: Lecture 21 Decision Tree - Nakul Gopalan

Nakul Gopalan

Georgia Tech

Decision Tree

Machine Learning CS 4641

These slides are adopted from Polo, Vivek Srikumar, Mahdi Roozbahani and Chao Zhang.

Page 2:

(Figure: data points plotted on axes X₁ and X₂.)

Page 3:

Visual Introduction to Decision Tree

Building a tree to distinguish homes in New York from homes in San Francisco

Interpretable decisions for verifiable outcomes!!!

Page 4:

Decision Tree: Example (2)


Will I play tennis today?

Page 5:

Decision trees (DT)

The classifier f_T(x): the majority class in the leaf of tree T that contains x.

Model parameters: the tree structure and size.

(Figure: example tree with root test "Outlook?".)

Page 6:

Flow charts - xkcd

Page 7:

Decision trees

Pieces:

1. Find the best attribute to split on
2. Find the best split on the chosen attribute
3. Decide when to stop splitting
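The three pieces can be sketched as a minimal ID3-style tree builder for categorical attributes. This is an illustrative sketch, not the lecture's exact algorithm: function names are made up, attribute selection uses information gain (step 1), categorical values define the split (step 2), and splitting stops at pure nodes or when attributes run out (step 3).

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def info_gain(rows, labels, attr):
    """Entropy reduction from splitting on one categorical attribute."""
    n = len(labels)
    split = {}
    for row, y in zip(rows, labels):
        split.setdefault(row[attr], []).append(y)
    remainder = sum(len(ys) / n * entropy(ys) for ys in split.values())
    return entropy(labels) - remainder

def build_tree(rows, labels, attrs):
    # Step 3: stop splitting when the node is pure or no attributes remain.
    if len(set(labels)) == 1 or not attrs:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    # Step 1: pick the attribute with the highest information gain.
    best = max(attrs, key=lambda a: info_gain(rows, labels, a))
    tree = {"attr": best, "children": {}}
    # Step 2: split on each observed value of the chosen attribute.
    for value in {row[best] for row in rows}:
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        tree["children"][value] = build_tree(
            [rows[i] for i in idx], [labels[i] for i in idx],
            [a for a in attrs if a != best])
    return tree
```

For example, on four days where play depends only on the outlook, the root splits on "Outlook" and each branch becomes a pure leaf.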

Page 8:

Label

Categorical or Discrete attributes

Page 10:

Attribute

Page 11:

Continuous attributes or ordered attributes

Page 12:

Test data

Page 18:

Information Content

Coin flip

Which coin will give us the purest information? Entropy measures uncertainty: lower uncertainty means higher information gain.

(Figure: heads/tails outcomes for three coins C₁, C₂, and C₃.)
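The entropy of a coin flip can be computed directly. This small sketch (the function name is illustrative) shows that a fair coin has maximum uncertainty, while a heavily biased coin has low uncertainty and therefore yields more information gain when used as a split.

```python
import math

def coin_entropy(p_heads):
    """Shannon entropy in bits of a coin with P(heads) = p_heads."""
    return -sum(p * math.log2(p) for p in (p_heads, 1 - p_heads) if p > 0)

print(coin_entropy(0.5))  # fair coin: 1.0 bit, maximum uncertainty
print(coin_entropy(0.9))  # biased coin: about 0.47 bits
print(coin_entropy(1.0))  # always heads: no uncertainty at all
```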

Page 28:

Announcements

• Touchpoint deliverables due today. Video + PPT

• Touchpoint next class. Individual BlueJeans calls with mentors.

• Please make sure to attend your touchpoints for feedback!!!

• Physical touchpoint survey needed back today

• Questions??

Page 33:

Left branch for smaller values, right branch for bigger values
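For a continuous attribute, a common approach (sketched here with an illustrative function name) is to try each midpoint between consecutive sorted values as a candidate threshold, send smaller values left and bigger values right, and keep the threshold with the highest information gain.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_threshold(values, labels):
    """Candidate thresholds are midpoints between consecutive distinct
    sorted values; smaller values go left, bigger values go right.
    Returns (threshold, information gain) of the best split."""
    pairs = sorted(zip(values, labels))
    base = entropy(labels)
    best_t, best_gain = None, -1.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal values
        t = (pairs[i - 1][0] + pairs[i][0]) / 2
        left = [y for x, y in pairs if x < t]
        right = [y for x, y in pairs if x >= t]
        gain = base - (len(left) * entropy(left)
                       + len(right) * entropy(right)) / len(labels)
        if gain > best_gain:
            best_t, best_gain = t, gain
    return best_t, best_gain
```

For instance, with values [1, 2, 8, 9] and labels a, a, b, b, the midpoint 5.0 separates the classes perfectly and achieves the full 1.0 bit of gain.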

Page 42:

different

Page 47:

What will happen if a tree is too large?

• Overfitting
• High variance
• Instability in predicting test data

Page 48:

How to avoid overfitting?

• Acquire more training data
• Remove irrelevant attributes (manual process – not always possible)
• Grow full tree, then post-prune
• Ensemble learning

Page 49:

Reduced-Error Pruning

Split data into training and validation sets.

Grow tree based on the training set.

Do until further pruning is harmful:

1. Evaluate the impact on the validation set of pruning each possible node (plus those below it)
2. Greedily remove the node that most improves validation set accuracy

Page 50:

How to decide whether to remove a node when pruning

• Pruning of the decision tree is done by replacing a whole subtree by a leaf node.

• The replacement takes place if a decision rule establishes that the expected error rate in the subtree is greater than in the single leaf.

3 training data points
Actual label: 1 positive, 2 negative
Predicted label: 1 positive, 2 negative
3 correct, 0 incorrect

6 validation data points
Actual label: 2 positive, 4 negative
Predicted label: 4 positive, 2 negative
2 correct, 4 incorrect

If we had simply predicted the majority class (negative), we would make 2 errors instead of 4.

Pruned!
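The error counts in this example can be checked directly. Only the counts (2 positive / 4 negative actual, 4 positive / 2 negative predicted) come from the slide; the specific ordering of labels below is an assumed arrangement consistent with those counts.

```python
# Validation set from the slide: 2 positive and 4 negative actual labels.
actual       = ["+", "-", "-", "-", "-", "+"]
# Hypothetical subtree predictions: 4 positive, 2 negative, 2 of them correct.
subtree_pred = ["-", "+", "+", "+", "-", "+"]
# A single leaf predicts the training-time majority class: negative.
leaf_pred    = ["-"] * 6

subtree_errors = sum(p != a for p, a in zip(subtree_pred, actual))
leaf_errors    = sum(p != a for p, a in zip(leaf_pred, actual))
print(subtree_errors, leaf_errors)  # prints "4 2": the leaf is better, so prune
```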