Natural Language Processing
Lecture 13: More on CFG Parsing
Probabilistic/Weighted Parsing
Example: ambiguous parse
Probabilistic CFG
Ambiguous parse w/ probabilities
P(left) = 2.2 × 10^-6    P(right) = 6.1 × 10^-7
[Figure: the two parse trees of the ambiguous sentence, with rule probabilities (0.05, 0.10, 0.20, 0.30, 0.40, 0.60, 0.75) attached to their edges; multiplying the probabilities along each tree gives the two totals above.]
Review: Context-Free Grammars
• Vocabulary of terminal symbols, Σ
• Set of nonterminal symbols (a.k.a. variables), N
• Special start symbol S ∈ N
• Production rules of the form X → α, where
  X ∈ N and α ∈ (N ∪ Σ)*  (in CNF: α ∈ N² ∪ Σ)
Probabilistic Context-Free Grammars
• Vocabulary of terminal symbols, Σ
• Set of nonterminal symbols (a.k.a. variables), N
• Special start symbol S ∈ N
• Production rules of the form X → α, each with a positive weight p(X → α), where
  X ∈ N and α ∈ (N ∪ Σ)*  (in CNF: α ∈ N² ∪ Σ)
• ∀X ∈ N: ∑_α p(X → α) = 1
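The definition above can be made concrete as a small data structure. The toy grammar and probabilities below are illustrative (not from the slides); the loop at the end checks the PCFG constraint that each nonterminal's rule probabilities sum to 1.

```python
# A toy PCFG in CNF, written as {lhs: {rhs_tuple: probability}}.
# Grammar and probabilities are made up for illustration.
pcfg = {
    "S":  {("NP", "VP"): 1.0},
    "NP": {("DT", "N"): 0.8, ("book",): 0.2},
    "VP": {("V", "NP"): 0.7, ("book",): 0.3},
    "DT": {("the",): 1.0},
    "N":  {("flight",): 1.0},
    "V":  {("book",): 1.0},
}

# Check: for every nonterminal X, sum over alpha of p(X -> alpha) = 1.
for lhs, rules in pcfg.items():
    total = sum(rules.values())
    assert abs(total - 1.0) < 1e-9, f"{lhs} rules sum to {total}"
```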
CKY Algorithm: Review
for i = 1 ... n
  C[i-1, i] = { V | V → w_i }
for ℓ = 2 ... n  // width
  for i = 0 ... n - ℓ  // left boundary
    k = i + ℓ  // right boundary
    for j = i + 1 ... k - 1  // midpoint
      C[i, k] = C[i, k] ∪ { V | V → Y Z, Y ∈ C[i, j], Z ∈ C[j, k] }
return true if S ∈ C[0, n]
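The pseudocode above translates almost line-for-line into Python. This is a sketch of the Boolean CKY recognizer; the grammar encoding (a `unary` dict for preterminal rules and a `binary` list of rule triples) is an assumption of this sketch, not notation from the slides.

```python
def cky_recognize(words, unary, binary, start="S"):
    """CKY recognizer for a grammar in Chomsky normal form.

    unary:  dict mapping a terminal w to the set of nonterminals V with V -> w
    binary: list of (V, Y, Z) triples for rules V -> Y Z
    """
    n = len(words)
    # C[i][k] holds the set of nonterminals deriving words[i:k]
    C = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i in range(1, n + 1):
        C[i - 1][i] = set(unary.get(words[i - 1], ()))
    for length in range(2, n + 1):           # width of span
        for i in range(0, n - length + 1):   # left boundary
            k = i + length                   # right boundary
            for j in range(i + 1, k):        # midpoint
                for V, Y, Z in binary:
                    if Y in C[i][j] and Z in C[j][k]:
                        C[i][k].add(V)
    return start in C[0][n]
```

For example, with a toy grammar where S → V NP and NP → DT N, `cky_recognize(["book", "the", "flight"], ...)` accepts.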
Weighted CKY Algorithm
for i = 1 ... n, V ∈ N
  C[V, i-1, i] = p(V → w_i)
for ℓ = 2 ... n  // width of span
  for i = 0 ... n - ℓ  // left boundary
    k = i + ℓ  // right boundary
    for j = i + 1 ... k - 1  // midpoint
      for each binary rule V → Y Z
        C[V, i, k] = max{ C[V, i, k], C[Y, i, j] × C[Z, j, k] × p(V → Y Z) }
return C[S, 0, n]  // probability of the best parse (0 if none)
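As a sketch, the weighted (Viterbi) variant replaces set union with a max over rule applications. The grammar encoding here (dicts keyed by `(V, w)` and `(V, Y, Z)`) is this sketch's assumption.

```python
from collections import defaultdict

def weighted_cky(words, unary_p, binary_p, start="S"):
    """Weighted (Viterbi) CKY for a PCFG in CNF.

    unary_p:  dict (V, w) -> p(V -> w)
    binary_p: dict (V, Y, Z) -> p(V -> Y Z)
    Returns the probability of the best parse, or 0.0 if there is none.
    """
    n = len(words)
    C = defaultdict(float)  # C[V, i, k] = best score of V over words[i:k]
    for i in range(1, n + 1):
        for (V, w), p in unary_p.items():
            if w == words[i - 1]:
                C[V, i - 1, i] = p
    for length in range(2, n + 1):           # width of span
        for i in range(0, n - length + 1):   # left boundary
            k = i + length                   # right boundary
            for j in range(i + 1, k):        # midpoint
                for (V, Y, Z), p in binary_p.items():
                    score = C[Y, i, j] * C[Z, j, k] * p
                    if score > C[V, i, k]:
                        C[V, i, k] = score
    return C[start, 0, n]
```

With rule probabilities p(NP → DT N) = 0.5 and p(S → V NP) = 0.4 and all preterminal probabilities 1.0, the best parse of "book the flight" scores 1 × 1 × 1 × 0.5 × 0.4 = 0.2.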
P-CKY algorithm from book
Parsing as (Weighted) Deduction
Earley’s Algorithm
Example Grammar (same for CKY)
02/26/2019 Speech and Language Processing, Jurafsky and Martin
Earley Parsing
• Allows arbitrary CFGs
• Top-down control
• Fills a table (or chart) in a single sweep over the input
  – Table is length N+1; N is the number of words
  – Table entries represent:
    • Completed constituents and their locations
    • In-progress constituents
    • Predicted constituents
States
• The table entries are called states and are represented with dotted rules.
S → • VP         A VP is predicted
NP → Det • Nominal   An NP is in progress
VP → V NP •      A VP has been found
States/Locations
• S → • VP [0,0]
• NP → Det • Nominal [1,2]
• VP → V NP • [0,3]
A VP is predicted at the start of the sentence
An NP is in progress; the Det goes from 1 to 2
A VP has been found starting at 0 and ending at 3
Earley top-level
• As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
• In this case, there should be an S state in the final column that spans from 0 to N and is complete. That is,
S → α • [0,N]
• If that’s the case, you’re done.
Earley top-level (2)
• So sweep through the table from 0 to N…
  – New predicted states are created by starting top-down from S
  – New incomplete states are created by advancing existing states as new constituents are discovered
  – New complete states are created in the same way.
Earley top-level (3)
• More specifically…
  1. Predict all the states you can up front
  2. Read a word
     1. Extend states based on matches
     2. Generate new predictions
     3. Go to step 2
  3. When you’re out of words, look at the chart to see if you have a winner
Earley code: top-level
Earley code: 3 main functons
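The code from the two slides above did not survive in the transcript. As a stand-in, here is a minimal Python sketch of an Earley recognizer in which PREDICT, SCAN, and COMPLETE are the three cases of the inner loop. The dict-based grammar encoding and the ROOT wrapper symbol are assumptions of this sketch, and it assumes a grammar with no ε-rules.

```python
def earley_recognize(words, grammar, start="S"):
    """Earley recognizer. grammar maps lhs -> list of rhs tuples;
    terminals are symbols that never appear as a lhs. Assumes no
    epsilon-rules. A ROOT symbol wraps the start symbol so that
    completing ROOT -> start . [0, n] signals success."""
    ROOT = "ROOT"
    rules = dict(grammar)
    rules[ROOT] = [(start,)]
    n = len(words)
    # chart[j] = set of states (lhs, rhs, dot, i): lhs -> rhs with the
    # dot at position `dot`, spanning words[i:j]
    chart = [set() for _ in range(n + 1)]
    chart[0].add((ROOT, (start,), 0, 0))
    for j in range(n + 1):
        agenda = list(chart[j])
        while agenda:
            lhs, rhs, dot, i = agenda.pop()
            if dot < len(rhs):
                nxt = rhs[dot]
                if nxt in rules:                    # PREDICT
                    for gamma in rules[nxt]:
                        s = (nxt, gamma, 0, j)
                        if s not in chart[j]:
                            chart[j].add(s)
                            agenda.append(s)
                elif j < n and nxt == words[j]:     # SCAN
                    chart[j + 1].add((lhs, rhs, dot + 1, i))
            else:                                   # COMPLETE
                for (l2, r2, d2, i2) in list(chart[i]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        s = (l2, r2, d2 + 1, i2)
                        if s not in chart[j]:
                            chart[j].add(s)
                            agenda.append(s)
    return (ROOT, (start,), 1, 0) in chart[n]
```

Running it on "book that flight" with a toy grammar containing S → VP, VP → V NP, NP → DT N produces the completed S from 0 to 3 described on the next slide.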
Extended Earley Example
• Book that flight
• We should find: an S from 0 to 3 that is a completed state
Earley’s Algorithm in equations
• We can look at this from the declarative programming point of view too.
ROOT → • S [0,0]        goal: ROOT → S • [0,n]
book the flight through Chicago
Earley’s Algorithm: PREDICT
Given V → α • X β [i, j] and the rule X → γ, create X → • γ [j, j]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

book the flight through Chicago

From ROOT → • S [0,0] and the rule S → VP, create S → • VP [0,0]
Earley’s Algorithm: SCAN
Given V → α • T β [i, j] and the rule T → w_{j+1},
create T → w_{j+1} • [j, j+1]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0,1]

book the flight through Chicago

From VP → • V NP [0,0] and the rule V → book, create V → book • [0,1]
Earley’s Algorithm: COMPLETE
Given V → α • X β [i, j] and X → γ • [j, k], create V → α X • β [i, k]

ROOT → • S [0,0]
S → • VP [0,0]
S → • NP VP [0,0]
...
VP → • V NP [0,0]
...
NP → • DT N [0,0]
...

V → book • [0,1]
VP → V • NP [0,1]

book the flight through Chicago

From VP → • V NP [0,0] and V → book • [0,1], create VP → V • NP [0,1]
Thought Questions
• Runtime?
  – O(n³)
• Memory?
  – O(n²)
• Can we make it faster?
• Recovering trees?
Make it an Earley Parser
Parsing as Search, Again
Implementing Recognizers as Search
Agenda = { state0 }
while (Agenda not empty)
  s = pop a state from Agenda
  if s is a success-state: return s  // valid parse tree
  else if s is not a failure-state:
    generate new states from s
    push new states onto Agenda
return nil // no parse!
Agenda-Based Probabilistic Parsing
Agenda = { (item, value) : initial updates from equations }
// items take the form [X, i, j]; values are reals
while (Agenda not empty)
  u = pop an update from Agenda
  if u.item is the goal: return u.value  // probability of the best parse
  else if u.value > Chart[u.item]:
    store Chart[u.item] ← u.value
    if u.item combines with other Chart items:
      generate new updates from u and items stored in Chart
      push new updates onto Agenda
return nil  // no parse!
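The agenda loop above can be sketched in Python for a CNF PCFG, using a max-heap so that updates pop in best-first order; since rule probabilities are at most 1, item values only shrink as spans combine, so the first time the goal item pops its value is optimal (a Dijkstra-style argument). The grammar encoding matches the weighted-CKY sketch earlier and is an assumption, not the slides' notation.

```python
import heapq

def agenda_parse(words, unary_p, binary_p, start="S"):
    """Agenda-based (best-first) probabilistic recognizer for a CNF PCFG.
    Items are (V, i, j); values are Viterbi probabilities."""
    n = len(words)
    chart = {}
    agenda = []  # max-heap via negated probabilities
    for i, w in enumerate(words):
        for (V, t), p in unary_p.items():
            if t == w:
                heapq.heappush(agenda, (-p, (V, i, i + 1)))
    while agenda:
        neg, item = heapq.heappop(agenda)
        value = -neg
        if item == (start, 0, n):
            return value                  # probability of the best parse
        if value <= chart.get(item, 0.0):
            continue                      # a better value is already stored
        chart[item] = value
        V, i, j = item
        # Generate new updates by combining with compatible chart items.
        for (X, Y, Z), p in binary_p.items():
            if Y == V:   # item is the left child of X -> Y Z
                for (W, j2, k), v in list(chart.items()):
                    if W == Z and j2 == j:
                        heapq.heappush(agenda, (-(value * v * p), (X, i, k)))
            if Z == V:   # item is the right child of X -> Y Z
                for (W, h, i2), v in list(chart.items()):
                    if W == Y and i2 == i:
                        heapq.heappush(agenda, (-(value * v * p), (X, h, j)))
    return 0.0  # no parse
```

On the toy grammar used earlier (p(NP → DT N) = 0.5, p(S → V NP) = 0.4), this returns the same best-parse probability as exhaustive weighted CKY, but can stop as soon as the goal item pops.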
Catalog of CF Parsing Algorithms
• Recognition/Boolean vs. parsing/probabilistic
• Chomsky normal form/CKY vs. general/Earley’s
• Exhaustive vs. agenda