Natural Language Processingdemo.clab.cs.cmu.edu/NLP/S20/files/slides/09...Natural Language...
Natural Language Processing
Lecture 9: Classification
Classification
Notation
• Training examples: x = (x1, x2, ..., xN)
• Their categories: y = (y1, y2, ..., yN)
• A classifier C seeks to map xi to yi
• A learner L infers C from (x, y)
(x, y) → L → C
x → C → y
Probabilistic Classifiers
x → C → y
C: return argmax_{y'} p(y' | x)
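As a minimal sketch of this decision rule, assume the posterior p(y' | x) for one input is already available as a dictionary (the name `posterior` and its values are made up for illustration):

```python
def classify(posterior):
    """Return the label y' with the highest probability p(y' | x)."""
    return max(posterior, key=posterior.get)

# Hypothetical posterior for a single email x.
posterior = {"spam": 0.7, "not spam": 0.3}
label = classify(posterior)
```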
Noisy Channel Model (General)
source → y → channel → x → decode
• Source model: p(y)
• Channel model: p(x | y)
What proportion of emails are expected to be spam vs. not spam?
What proportion of product reviews are expected to get 1, 2, 3, 4, 5 stars?
Noisy Channel Classifiers
x → C → y
C: return argmax_{y} p(y) × p(x | y)
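The noisy channel rule picks the same label as the probabilistic classifier. A short derivation via Bayes' rule, assuming p(x) > 0 (the symbol ŷ for the returned label is introduced here for illustration):

```latex
\begin{align*}
\hat{y} &= \arg\max_{y} \; p(y \mid x) \\
        &= \arg\max_{y} \; \frac{p(y)\, p(x \mid y)}{p(x)} \\
        &= \arg\max_{y} \; p(y)\, p(x \mid y)
\end{align*}
```

The last step holds because p(x) does not depend on y, so dividing by it cannot change which y attains the maximum.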
Representing Text: Features
• Any object you might be given to classify can be represented as a vector in a vector space
– Vectors representing text are often sparse and high-dimensional
• Designing Φ (“Feature engineering”)
– What information do you need to solve the problem?
– What information do you need to avoid mistakes?
– Very common: bag-of-words
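A bag-of-words Φ can be sketched as a sparse map from tokens to counts. This is only an illustration: the function name `bag_of_words`, the lowercasing, and the whitespace tokenization are assumptions, not choices made in the slides.

```python
from collections import Counter

def bag_of_words(text):
    """Map a text to a sparse feature vector: token -> count."""
    tokens = text.lower().split()  # assumption: whitespace tokenization
    return Counter(tokens)

phi = bag_of_words("the movie was good , the acting was great")
print(phi["the"], phi["good"], phi["bad"])  # absent tokens count as 0
```

Only tokens that actually occur are stored, which is what makes the representation sparse even when the vocabulary (the dimensionality) is very large.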
Naïve Bayes Classifier
x → C → y
C: ϕj ← [Φ(x)]j
   return argmax_{y'} p(y') × Π_j p(ϕj | y')
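The classifier above can be sketched as follows, assuming the parameters are stored in plain dictionaries (`prior[y]` for p(y) and `likelihood[y][j][f]` for p(ϕj = f | y); both layouts and all numbers are hypothetical). Scores are summed in log space, a common way to avoid underflow when many probabilities are multiplied; it does not change the argmax.

```python
import math

def nb_classify(features, prior, likelihood):
    """Return argmax_y p(y) * prod_j p(phi_j | y), computed in log space."""
    best_label, best_score = None, -math.inf
    for y, p_y in prior.items():
        score = math.log(p_y)
        for j, f in enumerate(features):
            score += math.log(likelihood[y][j][f])
        if score > best_score:
            best_label, best_score = y, score
    return best_label

# Hypothetical hand-set parameters for two binary features.
prior = {"spam": 0.4, "ham": 0.6}
likelihood = {
    "spam": [{0: 0.2, 1: 0.8}, {0: 0.7, 1: 0.3}],
    "ham":  [{0: 0.9, 1: 0.1}, {0: 0.4, 1: 0.6}],
}
label = nb_classify([1, 0], prior, likelihood)
```

For the input [1, 0] the unnormalized scores are 0.4 × 0.8 × 0.7 = 0.224 for "spam" and 0.6 × 0.1 × 0.4 = 0.024 for "ham", so "spam" is returned.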
Naïve Bayes Learner
(x, y) → L → p, where p consists of:
∀y, p(y)
∀y, ∀j, ∀f, p(ϕj(x) = f | y)
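The slide lists which probabilities the learner must produce but not how to estimate them. One common choice, assumed here, is relative-frequency (maximum likelihood) estimation with no smoothing. In this sketch each training input is already a feature tuple Φ(x), and the name `nb_learn` is hypothetical.

```python
from collections import Counter, defaultdict

def nb_learn(xs, ys):
    """Estimate p(y) and p(phi_j = f | y) by relative frequency."""
    n = len(ys)
    label_counts = Counter(ys)
    prior = {y: c / n for y, c in label_counts.items()}
    # feat_counts[y][j][f] = number of examples with label y and phi_j = f
    feat_counts = defaultdict(lambda: defaultdict(Counter))
    for x, y in zip(xs, ys):
        for j, f in enumerate(x):
            feat_counts[y][j][f] += 1
    likelihood = {
        y: {j: {f: c / label_counts[y] for f, c in counts.items()}
            for j, counts in per_feature.items()}
        for y, per_feature in feat_counts.items()
    }
    return prior, likelihood

# Toy data: two binary features per example.
xs = [(1, 0), (1, 1), (0, 1), (0, 0)]
ys = ["spam", "spam", "ham", "ham"]
prior, likelihood = nb_learn(xs, ys)
```

Without smoothing, any feature value never seen with a label has no entry at all, which is equivalent to probability zero; practical implementations usually add smoothing to avoid this.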
Linear Classifiers
C:
1. Use Φ(x) to map x onto a real-valued feature space.
2. Calculate the linear score z = w ᵀ Φ(x).
3. If z > 0, then return y = YES, else y = NO.
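The three steps above can be sketched as follows, assuming Φ(x) has already been computed as a list of floats (the weights and feature values are made up for illustration):

```python
def linear_classify(w, phi_x):
    """Binary linear classifier: YES if w . phi_x > 0, else NO."""
    z = sum(wi * fi for wi, fi in zip(w, phi_x))  # linear score z = w^T Phi(x)
    return "YES" if z > 0 else "NO"

w = [1.0, -2.0, 0.5]                              # hypothetical weights
first = linear_classify(w, [1.0, 0.0, 1.0])       # z = 1.0 + 0.5 = 1.5
second = linear_classify(w, [0.0, 1.0, 1.0])      # z = -2.0 + 0.5 = -1.5
```

Note that a score of exactly zero falls on the NO side, since the rule uses a strict inequality.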
x → C → y
[Figure: the decision boundary is the hyperplane {u : wᵀu = 0}; the weight vector w is normal to it.]
Linear Classifiers (> 2 Classes)
x → C → y
C: return argmax_{y} wᵀ Φ(x, y)
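The slide does not say how the joint feature function Φ(x, y) is built. One common construction, assumed in this sketch, copies the input features into a block reserved for label y and leaves the other blocks at zero, so that w effectively holds one weight vector per class. The names `joint_features` and `predict` are hypothetical.

```python
def joint_features(phi_x, y, labels):
    """Assumed construction: put phi_x in the block for label y, zeros elsewhere."""
    d = len(phi_x)
    vec = [0.0] * (d * len(labels))
    start = labels.index(y) * d
    vec[start:start + d] = phi_x
    return vec

def predict(w, phi_x, labels):
    """Return argmax_y w^T Phi(x, y)."""
    def score(y):
        return sum(wi * fi for wi, fi in zip(w, joint_features(phi_x, y, labels)))
    return max(labels, key=score)

labels = ["pos", "neg", "neutral"]
# Hypothetical weights: one block of two weights per label.
w = [1.0, 0.0,   0.0, 1.0,   0.5, 0.5]
a = predict(w, [2.0, 0.0], labels)  # scores: pos 2.0, neg 0.0, neutral 1.0
b = predict(w, [0.0, 2.0], labels)  # scores: pos 0.0, neg 2.0, neutral 1.0
```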
Perceptron Learner
(x, y) → L → w
w ← 0
for t = 1 ... T:
    select (xt, yt)
    # run current classifier
    y ← argmax_{y'} wᵀ Φ(xt, y')
    if y != yt then  # mistake
        w ← w + α [Φ(xt, yt) − Φ(xt, y)]
return w
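A runnable sketch of the perceptron learner under a few assumptions not fixed by the slides: Φ(x, y) uses a block construction (input features copied into the block for label y), examples are selected by cycling through the data in order, and ties in the argmax go to the first label. The names `joint_features`, `dot`, `predict`, and `perceptron` are hypothetical.

```python
def joint_features(phi_x, y, labels):
    """Assumed construction: put phi_x in the block for label y, zeros elsewhere."""
    d = len(phi_x)
    vec = [0.0] * (d * len(labels))
    start = labels.index(y) * d
    vec[start:start + d] = phi_x
    return vec

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def predict(w, phi_x, labels):
    """Run the current classifier: argmax_y w^T Phi(x, y)."""
    return max(labels, key=lambda y: dot(w, joint_features(phi_x, y, labels)))

def perceptron(data, labels, T=10, alpha=1.0):
    """data: list of (phi_x, y) pairs; returns the learned weight vector w."""
    w = [0.0] * (len(data[0][0]) * len(labels))
    for t in range(T):
        phi_x, y_t = data[t % len(data)]   # assumption: cycle through the data
        y_hat = predict(w, phi_x, labels)
        if y_hat != y_t:                   # mistake: move toward the true label
            good = joint_features(phi_x, y_t, labels)
            bad = joint_features(phi_x, y_hat, labels)
            w = [wi + alpha * (g - b) for wi, g, b in zip(w, good, bad)]
    return w

labels = ["a", "b"]
data = [([1.0, 0.0], "a"), ([0.0, 1.0], "b")]  # toy, linearly separable
w = perceptron(data, labels)
```

On this toy data the weights stop changing after a single update, because every training example is then classified correctly and the update only fires on mistakes.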