Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels...

25
Lirong Xia Perceptrons Friday, April 11, 2014

Transcript of Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels...

Page 1: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Lirong Xia

Perceptrons

Friday, April 11, 2014

Page 2: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

• Classification

– Given inputs x, predict labels (classes) y

• Examples

– Spam detection. input: documents; classes:

spam/ham

– OCR. input: images; classes: characters

– Medical diagnosis. input: symptoms; classes:

diseases

– Autograder. input: codes; output: grades2

Classification

Page 3: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Important Concepts

3

• Data: labeled instances, e.g. emails marked spam/ham– Training set– Held out set (we will give examples today)– Test set

• Features: attribute-value pairs that characterize each x

• Experimentation cycle– Learn parameters (e.g. model probabilities) on training set– (Tune hyperparameters on held-out set)– Compute accuracy of test set– Very important: never “peek” at the test set!

• Evaluation– Accuracy: fraction of instances predicted correctly

• Overfitting and generalization– Want a classifier which does well on test data– Overfitting: fitting the training data very closely, but not

generalizing well

Page 4: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

General Naive Bayes

4

• A general naive Bayes model:

• We only specify how each feature depends on the class

• Total number of parameters is linear in n

Page 5: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Tuning on Held-Out Data

5

• Now we’ve got two kinds of unknowns– Parameters: the probabilities p(Y|X),

p(Y)– Hyperparameters, like the amount of

smoothing to do: k,α

• Where to learn?– Learn parameters from training data– For each value of the

hyperparameters, train and test on the held-out data

– Choose the best value and do a final test on the test data

Page 6: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Today’s schedule

6

• Generative vs. Discriminative

• Binary Linear Classifiers

• Perceptron

• Multi-class Linear Classifiers

• Multi-class Perceptron

Page 7: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Classification: Feature Vectors

7

x f x y

Page 8: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Generative vs. Discriminative

8

• Generative classifiers:– E.g. naive Bayes– A causal model with evidence variables– Query model for causes given evidence

• Discriminative classifiers:– No causal model, no Bayes rule, often no probabilities at

all!– Try to predict the label Y directly from X– Robust, accurate with varied features– Loosely: mistake driven rather than model driven

Page 9: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Outline

9

• Generative vs. Discriminative

• Binary Linear Classifiers

• Perceptron

• Multi-class Linear Classifiers

• Multi-class Perceptron

Page 10: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Some (Simplified) Biology

10

Very loose inspiration: human neurons

Page 11: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Linear Classifiers

11

• Inputs are feature values• Each feature has a weight• Sum is the activation

• If the activation is:– Positive: output +1– Negative, output -1

Page 12: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Classification: Weights

12

• Binary case: compare features to a weight vector• Learning: figure out the weight vector from examples

Page 13: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Linear classifiers Mini Exercise

13

• 1.Draw the 4 feature vectors and the weight vector w• 2.Which feature vectors are classified as +? As -?• 3.Draw the line separating feature vectors being

classified + and -.

1

# free : 2

YOUR_NAME : 0f x

3

# free : 1

YOUR_NAME : 1f x

1

2w

Page 14: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Linear classifiers Mini Exercise 2---Bias Term

14

• 1.Draw the 4 feature vectors and the weight vector w• 2.Which feature vectors are classified as +? As -?• 3.Draw the line separating feature vectors being

classified + and -.

1 # free : 2

YOUR_NAME :

Bias :

0

1

f x

1

3

2

w

2 # free : 4

YOUR_NAME :

Bias :

1

1

f x

3 # free : 1

YOUR_NAME :

Bias :

1

1

f x

Page 15: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Linear classifiers Mini Exercise 3---adding features

15

• 1.Draw the 4 1-dimensional feature vectors – f(x1)=-2, f(x2)=-1, f(x3)=1, f(x4)=2

• 2.Is there a separating vector w that– classifies x1 and x4 as +1 and

– classifies x2 and x3 as -1?

• 3. can you add one more feature to make this happen?

Page 16: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Binary Decision Rule

16

• In the space of feature vectors– Examples are points– Any weight vector is a hyperplane– One side corresponds to Y = +1– Other corresponds to Y = -1

Page 17: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Outline

17

• Generative vs. Discriminative

• Binary Linear Classifiers

• Perceptron: how to find the weight vector w from data.

• Multi-class Linear Classifiers

• Multi-class Perceptron

Page 18: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Learning: Binary Perceptron

18

• Start with weights = 0• For each training instance:

– Classify with current weights

– If correct (i.e. y=y*), no change!– If wrong: adjust the weight vector

by adding or subtracting the feature vector. Subtract if y* is -1.

Page 19: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Outline

19

• Generative vs. Discriminative

• Binary Linear Classifiers

• Perceptron

• Multi-class Linear Classifiers

• Multi-class Perceptron

Page 20: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Multiclass Decision Rule

20

• If we have multiple classes:– A weight vector for each class:

– Score (activation) of a class y:

– Prediction highest score wins

yw

Binary = multiclass where the negative class has weight zero

Page 21: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Example Exercise --- Which Category is Chosen?

21

SPORTSw POLITICSw TECHw

“win the vote”

Page 22: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Exercise: Multiclass linear classifier for 2 classes and binary linear

classifier

22

• Consider the multiclass linear classifier for two classes with

• Is there an equivalent binary linear classifier, i.e., one that classifies all points x = (x1,x2) the same way?

1 2

1 1

2 2w w

Page 23: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Outline

23

• Generative vs. Discriminative

• Binary Linear Classifiers

• Perceptron

• Multi-class Linear Classifiers

• Multi-class Perceptron: learning the weight vectors wi from data

Page 24: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Learning: Multiclass Perceptron

24

• Start with all weights = 0• Pick up training examples one by one• Predict with current weights

• If correct, no change!• If wrong: lower score of wrong

answer, raise score of right answer

* *

y y

y y

w w f xw w f x

Page 25: Lirong Xia Perceptrons Friday, April 11, 2014. Classification –Given inputs x, predict labels (classes) y Examples –Spam detection. input: documents;

Example: Multiclass Perceptron

25

SPORTSw POLITICSw TECHw

“win the vote”“win the election”“win the game”