CS460/626 : Natural Language Processing/Speech, NLP and the Web
(Lectures 23, 24 – Parsing Algorithms; Parsing in the Case of Ambiguity; Probabilistic Parsing)
Pushpak Bhattacharyya, CSE Dept., IIT Bombay
8th, 10th March, 2011
(Lectures 21 and 22 were on Sentiment Analysis, by Aditya Joshi)
A note on Language Modeling

Example sentence: "^ The tortoise beat the hare in the race." (^ marks the start of the sentence.)

Guided by frequency (language knowledge): an N-gram model (n = 3), e.g.

  ^ the tortoise        5 * 10^-3
  the tortoise beat     3 * 10^-2
  tortoise beat the     7 * 10^-5
  beat the hare         5 * 10^-6

Guided by world knowledge: CFG, Probabilistic CFG, Dependency Grammar, Probabilistic DG, e.g.

  S  → NP VP      1.0
  NP → DT N       0.5
  VP → V NP PP    0.4
  PP → P NP       1.0

Semantic roles: agt, obj, scn, etc., possibly with probabilities. Semantic relations always hold between "heads".
Parse Tree

[S [NP [DT The] [N tortoise]]
   [VP [V beat]
       [NP [DT the] [N hare]]
       [PP [P in] [NP [DT the] [N race]]]]]
UNL Expression

beat@past
  agt → tortoise@def
  obj → hare@def
  scn (scene) → race@def
Purpose of LM
- Prediction of the next word (Speech Processing)
- Language identification (for the same script)
- Belongingness check (parsing)

P(NP → DT N) means: what is the probability that the 'YIELD' of the non-terminal NP is DT N?
Need for Deep Parsing
- Sentences are linear structures.
- But there is a hierarchy, a tree, hidden behind the linear structure.
- There are constituents and branches.
PPs are at the same level: flat with respect to the head word "book".

[NP The [AP big] [N book] [PP of poems] [PP with the blue cover]]

[The big book of poems with the blue cover] is on the table.

In this flat structure there is no distinction among the PPs in terms of dominance or c-command.
The "constituency test of replacement" runs into problems.

One-replacement: I bought the big [book of poems with the blue cover], not the small [one].
Here one-replacement targets "book of poems with the blue cover".

Another one-replacement: I bought the big [book of poems] with the blue cover, not the small [one] with the red cover.
Here one-replacement targets "book of poems".
More deeply embedded structure:

[NP The [N'1 [AP big] [N'2 [N'3 [N book] [PP of poems]] [PP with the blue cover]]]]
Grammar and Parsing Algorithms

A simplified grammar:
  S  → NP VP
  NP → DT N | N
  VP → V ADV | V

Example sentence:
  People  laugh
  1       2      3     (these are word positions)

Lexicon:
  People – N, V
  Laugh  – N, V

This indicates that both noun and verb are possible for the word "People".
Top-Down Parsing

State              Backup State     Action
-----------------------------------------------------------
1.  ((S) 1)         –               –
2.  ((NP VP) 1)     –               –
3a. ((DT N VP) 1)   ((N VP) 1)      –
3b. ((N VP) 1)      –               –
4.  ((VP) 2)        –               Consume "People"
5a. ((V ADV) 2)     ((V) 2)         –
6.  ((ADV) 3)       ((V) 2)         Consume "laugh"
5b. ((V) 2)         –               –
6.  ((.) 3)         –               Consume "laugh"

Termination condition: all input consumed, no symbols remaining.
Note: input symbols can be pushed back. The number after each symbol list is the position of the input pointer.
Discussion of Top-Down Parsing
- This kind of search is goal-driven.
- It gives importance to textual precedence (rule precedence).
- It has no regard for the data, a priori (useless expansions are made).
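The trace above can be sketched as a small backtracking recursive-descent parser. This is a minimal sketch, not the lecture's exact algorithm; the grammar and lexicon are the toy ones from the slide, and the function name is illustrative.

```python
# A minimal top-down backtracking parser for the toy grammar
# S -> NP VP, NP -> DT N | N, VP -> V ADV | V.
GRAMMAR = {
    "S":  [["NP", "VP"]],
    "NP": [["DT", "N"], ["N"]],
    "VP": [["V", "ADV"], ["V"]],
}
LEXICON = {"people": {"N", "V"}, "laugh": {"N", "V"}}

def parse(symbols, pos, words):
    """Return True iff `symbols` can derive words[pos:] exactly."""
    if not symbols:                      # all symbols expanded:
        return pos == len(words)         # succeed iff all input is consumed
    head, rest = symbols[0], symbols[1:]
    if head in GRAMMAR:                  # goal-driven: expand the non-terminal
        return any(parse(alt + rest, pos, words) for alt in GRAMMAR[head])
    # head is a pre-terminal: try to consume the next word
    if pos < len(words) and head in LEXICON.get(words[pos], set()):
        return parse(rest, pos + 1, words)
    return False                         # mismatch -> backtrack

print(parse(["S"], 0, ["people", "laugh"]))   # True
print(parse(["S"], 0, ["laugh"]))             # False: no word left for VP
```

Note how the useless expansion (DT N for "People") is tried and abandoned, exactly as in state 3a of the trace.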
Bottom-Up Parsing

Some conventions:
- N12: the subscripts represent positions; this is an N spanning positions 1 to 2.
- S1? → NP12 ° VP2?: the end position is unknown ("?"); work to the left of the dot (°) is done, while work to the right remains.
Bottom-Up Parsing (pictorial representation)

People  laugh
1       2      3

N12     N23
V12     V23

NP12 → N12 °     NP23 → N23 °
VP12 → V12 °     VP23 → V23 °
S1? → NP12 ° VP2?
S → NP12 VP23 °
Problem with Top-Down Parsing
• Left recursion
• Suppose you have a rule A → A B. Then the expansion proceeds as follows:
• ((A) K) → ((A B) K) → ((A B B) K) → … and never terminates.
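The loop above can be removed by the standard direct left-recursion elimination transform: A → Aα | β becomes A → βA', A' → αA' | ε. A sketch, with an illustrative function name; the grammar is represented as a dict from a non-terminal to a list of right-hand sides:

```python
def eliminate_direct_left_recursion(grammar):
    """Rewrite A -> A alpha | beta as A -> beta A', A' -> alpha A' | eps."""
    out = {}
    for nt, rhss in grammar.items():
        recursive = [rhs[1:] for rhs in rhss if rhs and rhs[0] == nt]
        others    = [rhs for rhs in rhss if not rhs or rhs[0] != nt]
        if not recursive:                    # nothing to do for this symbol
            out[nt] = rhss
            continue
        new = nt + "'"                       # fresh non-terminal A'
        out[nt]  = [beta + [new] for beta in others]
        out[new] = [alpha + [new] for alpha in recursive] + [[]]  # [] = epsilon
    return out

g = eliminate_direct_left_recursion({"A": [["A", "B"], ["C"]]})
print(g)   # {'A': [['C', "A'"]], "A'": [['B', "A'"], []]}
```

After the transform, a top-down parser always consumes something before re-entering A, so the expansion terminates.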
Combining top-down and bottom-up strategies: Chart Parsing
- Combines advantages of top-down & bottom-up parsing.
- Does not work in the case of left recursion.

e.g. "People laugh"; People – noun, verb; Laugh – noun, verb

Grammar:
  S  → NP VP
  NP → DT N | N
  VP → V ADV | V
Transitive Closure

People  laugh
1       2      3

Position 1: S → ° NP VP, NP → ° DT N, NP → ° N
Position 2 (after "People"): NP → N °, S → NP ° VP, VP → ° V ADV, VP → ° V
Position 3 (after "laugh"): VP → V °, VP → V ° ADV, S → NP VP ° → success

Arcs in Parsing
Each arc represents a chart which records:
- Completed work (left of °)
- Expected work (right of °)

Example: People laugh loudly
         1      2     3      4

Position 1: S → ° NP VP, NP → ° DT N, NP → ° N
Position 2 (after "People"): NP → N °, S → NP ° VP, VP → ° V ADV, VP → ° V
Position 3 (after "laugh"): VP → V °, VP → V ° ADV, S → NP VP °
Position 4 (after "loudly"): VP → V ADV °, S → NP VP ° → success
Dealing With Structural Ambiguity

Multiple parses for a sentence:
- The man saw the boy with a telescope.
- The man saw the mountain with a telescope.
- The man saw the boy with the ponytail.

At the level of syntax, all these sentences are ambiguous. But semantics can disambiguate the 2nd & 3rd sentences.

Prepositional Phrase (PP) Attachment Problem
V – NP1 – P – NP2   (here P means preposition)
Does NP2 attach to NP1, or does NP2 attach to V?
Parse Trees for a Structurally Ambiguous Sentence

Let the grammar be:
  S  → NP VP
  NP → DT N | DT N PP
  PP → P NP
  VP → V NP PP | V NP

For the sentence "I saw a boy with a telescope":

Parse Tree 1 (PP attached to the VP):
[S [NP [N I]]
   [VP [V saw]
       [NP [Det a] [N boy]]
       [PP [P with] [NP [Det a] [N telescope]]]]]

Parse Tree 2 (PP attached to the NP):
[S [NP [N I]]
   [VP [V saw]
       [NP [Det a] [N boy]
           [PP [P with] [NP [Det a] [N telescope]]]]]]
Parsing Structural Ambiguity

Parsing for structurally ambiguous sentences. Sentence: "I saw a boy with a telescope". Grammar:
  S    → NP VP
  NP   → ART N | ART N PP | PRON
  VP   → V NP PP | V NP
  ART  → a | an | the
  N    → boy | telescope
  PRON → I
  V    → saw

Ambiguous Parses – two possible parses:

PP attached to the verb (i.e., I used a telescope to see):
(S (NP (PRON "I"))
   (VP (V "saw")
       (NP (ART "a") (N "boy"))
       (PP (P "with") (NP (ART "a") (N "telescope")))))

PP attached to the noun (i.e., the boy had a telescope):
(S (NP (PRON "I"))
   (VP (V "saw")
       (NP (ART "a") (N "boy")
           (PP (P "with") (NP (ART "a") (N "telescope"))))))
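The fact that exactly two parses exist can be verified mechanically with a CKY-style chart that counts derivations. This is a sketch of our own: the slide's grammar is first put into Chomsky Normal Form by hand, and the helper non-terminals X1 (= ART N) and X2 (= V NP) are our additions, not part of the lecture's grammar.

```python
# Counting parses of "I saw a boy with a telescope" with a CKY-style chart.
BINARY = [
    ("S", "NP", "VP"),
    ("NP", "ART", "N"),
    ("X1", "ART", "N"), ("NP", "X1", "PP"),   # NP -> ART N PP, binarized
    ("VP", "V", "NP"),
    ("X2", "V", "NP"), ("VP", "X2", "PP"),    # VP -> V NP PP, binarized
    ("PP", "P", "NP"),
]
UNARY = [("NP", "PRON")]                       # NP -> PRON
LEXICON = {"I": "PRON", "saw": "V", "a": "ART",
           "boy": "N", "with": "P", "telescope": "N"}

def count_parses(words):
    n = len(words)
    # chart[i][j] maps non-terminal -> number of derivations over words[i:j]
    chart = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1][LEXICON[w]] = 1
        for lhs, rhs in UNARY:                 # unary closure at the lexical level
            if rhs in chart[i][i + 1]:
                chart[i][i + 1][lhs] = chart[i][i + 1][rhs]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            cell = chart[i][j]
            for k in range(i + 1, j):          # try every split point
                for lhs, b, c in BINARY:
                    if b in chart[i][k] and c in chart[k][j]:
                        cell[lhs] = cell.get(lhs, 0) + chart[i][k][b] * chart[k][j][c]
    return chart[0][n].get("S", 0)

print(count_parses("I saw a boy with a telescope".split()))   # 2
```

The two derivations of S are exactly the verb-attachment and noun-attachment parses shown above.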
Top Down Parse

Step  State                Backup State               Action                 Comments
1     ((S) 1)              –                          –                      Use S → NP VP
2     ((NP VP) 1)          –                          –                      Use NP → ART N | ART N PP | PRON
3     ((ART N VP) 1)       (a) ((ART N PP VP) 1)      –                      ART does not match "I";
                           (b) ((PRON VP) 1)                                 backup state (b) used
3B    ((PRON VP) 1)        –                          –
4     ((VP) 2)             –                          Consumed "I"
5     ((V NP PP) 2)        ((V NP) 2)                 –                      Verb attachment rule used
6     ((NP PP) 3)          –                          Consumed "saw"
7     ((ART N PP) 3)       (a) ((ART N PP PP) 3)      –
                           (b) ((PRON PP) 3)
8     ((N PP) 4)           –                          Consumed "a"
9     ((PP) 5)             –                          Consumed "boy"
10    ((P NP) 5)           –                          –
11    ((NP) 6)             –                          Consumed "with"
12    ((ART N) 6)          (a) ((ART N PP) 6)         –
                           (b) ((PRON) 6)
13    ((N) 7)              –                          Consumed "a"
14    ((–) 8)              –                          Consume "telescope"

Finish parsing.
Top Down Parsing - Observations
- Top-down parsing gave us the verb-attachment parse tree (i.e., I used a telescope).
- To obtain the alternate parse tree, the backup state in step 5 will have to be invoked.
- Is there an efficient way to obtain all parses?

I  saw  a  boy  with  a  telescope
1  2    3  4    5     6  7         8

Colour scheme (in the original slides): blue for normal parse, green for verb-attachment parse, purple for noun-attachment parse, red for invalid parse.
Bottom Up Parse

I  saw  a  boy  with  a  telescope
1  2    3  4    5     6  7         8

Chart entries, in the order they are added (° marks the boundary between completed and expected work; "?" marks an as-yet-unknown end position):

NP12 → PRON12 °
S1?  → NP12 ° VP2?
VP2? → V23 ° NP3? PP??
VP2? → V23 ° NP3?
NP35 → ART34 N45 °
NP3? → ART34 N45 ° PP5?
VP25 → V23 NP35 °
S15  → NP12 VP25 °
VP2? → V23 NP35 ° PP5?
PP5? → P56 ° NP6?
NP68 → ART67 N78 °
NP6? → ART67 N78 ° PP8?
PP58 → P56 NP68 °
NP38 → ART34 N45 PP58 °
VP28 → V23 NP38 °
VP28 → V23 NP35 PP58 °
S18  → NP12 VP28 °
Bottom Up Parsing - Observations
- Both the noun-attachment and verb-attachment parses are obtained by simply systematically applying the rules.
- The numbers in subscript help in verifying the parse and in getting chunks from the parse.

Exercise
For the sentence "The man saw the boy with a telescope" & the grammar given previously, compare the performance of top-down, bottom-up & top-down chart parsing.
Start of Probabilistic Parsing

Example of sentence labeling: parsing
[S1 [S [S [VP [VB Come] [NP [NNP July]]]] [, ,] [CC and] [S [NP [DT the] [JJ IIT] [NN campus]] [VP [AUX is] [ADJP [JJ abuzz] [PP [IN with] [NP [ADJP [JJ new] [CC and] [VBG returning]] [NNS students]]]]]] [. .]]
Noisy Channel Modeling

Source: sentence S → Noisy Channel → Target: parse T

T* = argmax_T P(T | S)
   = argmax_T P(T) · P(S | T)
   = argmax_T P(T), since given the parse the sentence is completely determined and P(S | T) = 1
Corpus
- A collection of text, called a corpus, is used for collecting various language data.
- With annotation: more information, but manual-labor intensive.
- Practice: label automatically, then correct manually.
- The famous Brown Corpus contains 1 million tagged words.
- Switchboard: a very famous corpus; 2400 conversations, 543 speakers, many US dialects, annotated with orthography and phonetics.
Discriminative vs. Generative Model

W* = argmax_W P(W | SS)

Discriminative model: compute P(W | SS) directly.
Generative model: compute it from P(W) · P(SS | W).
Language Models
- N-grams: sequences of n consecutive words/characters.
- Probabilistic / Stochastic Context Free Grammars: simple probabilistic models capable of handling recursion; a CFG with probabilities attached to rules.
- Rule probabilities: how likely is it that a particular rewrite rule is used?
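An N-gram model of the kind used in the tortoise example can be estimated by simple counting. A minimal sketch using maximum-likelihood estimates over a one-sentence toy corpus (the probabilities in the earlier slide came from a real corpus; the function name here is illustrative):

```python
from collections import Counter

def train_trigrams(sentences):
    """MLE trigram model: P(w3 | w1 w2) = c(w1 w2 w3) / c(w1 w2).
    '^' marks the sentence beginning, as in the earlier slide."""
    tri, bi = Counter(), Counter()
    for s in sentences:
        words = ["^", "^"] + s.lower().split()
        for a, b, c in zip(words, words[1:], words[2:]):
            tri[(a, b, c)] += 1
            bi[(a, b)] += 1
    return lambda w1, w2, w3: tri[(w1, w2, w3)] / bi[(w1, w2)] if bi[(w1, w2)] else 0.0

p = train_trigrams(["The tortoise beat the hare in the race ."])
print(p("the", "tortoise", "beat"))   # 1.0: the only continuation ever seen
```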
PCFGs - Why PCFGs?
- Intuitive probabilistic models for tree-structured languages.
- Algorithms are extensions of HMM algorithms.
- Better than the n-gram model for language modeling.
Formal Definition of PCFG

A PCFG consists of:
- A set of terminals {w_k}, k = 1,…,V; e.g. {w_k} = {child, teddy, bear, played, …}
- A set of non-terminals {N^i}, i = 1,…,n; e.g. {N^i} = {NP, VP, DT, …}
- A designated start symbol N^1
- A set of rules {N^i → ζ^j}, where ζ^j is a sequence of terminals & non-terminals, e.g. NP → DT NN
- A corresponding set of rule probabilities

Rule Probabilities
Rule probabilities are such that, for every non-terminal N^i,

    Σ_j P(N^i → ζ^j) = 1

E.g., P(NP → DT NN) = 0.2, P(NP → NN) = 0.5, P(NP → NP PP) = 0.3.
P(NP → DT NN) = 0.2 means that 20% of the parses in the training data use the rule NP → DT NN.
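The normalization condition above can be checked mechanically. A small sketch; the rule table is the NP example from this slide:

```python
import math
from collections import defaultdict

RULES = {("NP", ("DT", "NN")): 0.2,
         ("NP", ("NN",)): 0.5,
         ("NP", ("NP", "PP")): 0.3}

def is_normalized(rules):
    """Check that rule probabilities sum to 1 for every LHS non-terminal."""
    totals = defaultdict(float)
    for (lhs, _rhs), p in rules.items():
        totals[lhs] += p
    return all(math.isclose(t, 1.0) for t in totals.values())

print(is_normalized(RULES))   # True: 0.2 + 0.5 + 0.3 = 1.0
```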
Probabilistic Context Free Grammars

S  → NP VP     1.0        DT  → the       1.0
NP → DT NN     0.5        NN  → gunman    0.5
NP → NNS       0.3        NN  → building  0.5
NP → NP PP     0.2        VBD → sprayed   1.0
PP → P NP      1.0        NNS → bullets   1.0
VP → VP PP     0.6
VP → VBD NP    0.4
Example Parse t1

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]]
      [VP0.6 [VP0.4 [VBD1.0 sprayed]
                    [NP0.5 [DT1.0 the] [NN0.5 building]]]
             [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]

P(t1) = 1.0 × 0.5 × 1.0 × 0.5 × 0.6 × 0.4 × 1.0 × 0.5 × 1.0 × 0.5 × 1.0 × 1.0 × 0.3 × 1.0 = 0.0045
Another Parse t2

The gunman sprayed the building with bullets.

[S1.0 [NP0.5 [DT1.0 The] [NN0.5 gunman]]
      [VP0.4 [VBD1.0 sprayed]
             [NP0.2 [NP0.5 [DT1.0 the] [NN0.5 building]]
                    [PP1.0 [P1.0 with] [NP0.3 [NNS1.0 bullets]]]]]]

P(t2) = 1.0 × 0.5 × 1.0 × 0.5 × 0.4 × 1.0 × 0.2 × 0.5 × 1.0 × 0.5 × 1.0 × 1.0 × 0.3 × 1.0 = 0.0015
Probability of a Sentence

Notation:
- w_ab : the subsequence w_a … w_b
- N^j dominates w_a … w_b, i.e. yield(N^j) = w_a … w_b (e.g. an NP dominating "the..sweet..teddy..bear")

Probability of a sentence = P(w_1m):

P(w_1m) = Σ_t P(w_1m, t)
        = Σ_t P(t) · P(w_1m | t)
        = Σ_{t : yield(t) = w_1m} P(t)

where t ranges over parse trees of the sentence, since P(w_1m | t) = 1 if t is a parse tree for the sentence w_1m.
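The sum over parses can be computed without enumerating trees, by a probabilistic CKY that accumulates inside probabilities (the β quantities defined later in this lecture). A sketch using the gunman grammar from this lecture; the lexical rule P → with (probability 1.0) is not in the grammar table above but is implied by the example trees, and the function name is illustrative:

```python
# Probabilistic CKY: total inside probability for the gunman grammar.
BINARY = {("S", "NP", "VP"): 1.0, ("NP", "DT", "NN"): 0.5,
          ("NP", "NP", "PP"): 0.2, ("PP", "P", "NP"): 1.0,
          ("VP", "VP", "PP"): 0.6, ("VP", "VBD", "NP"): 0.4}
UNARY = {("NP", "NNS"): 0.3}
LEXICON = {"the": ("DT", 1.0), "gunman": ("NN", 0.5), "building": ("NN", 0.5),
           "sprayed": ("VBD", 1.0), "with": ("P", 1.0), "bullets": ("NNS", 1.0)}

def sentence_probability(words):
    n = len(words)
    beta = [[dict() for _ in range(n + 1)] for _ in range(n + 1)]  # inside probs
    for i, w in enumerate(words):
        tag, p = LEXICON[w.lower()]
        beta[i][i + 1][tag] = p
        for (lhs, rhs), q in UNARY.items():   # e.g. NP -> NNS
            if rhs in beta[i][i + 1]:
                beta[i][i + 1][lhs] = q * beta[i][i + 1][rhs]
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):
                for (lhs, b, c), q in BINARY.items():
                    if b in beta[i][k] and c in beta[k][j]:
                        beta[i][j][lhs] = beta[i][j].get(lhs, 0.0) + \
                            q * beta[i][k][b] * beta[k][j][c]
    return beta[0][n].get("S", 0.0)

p = sentence_probability("The gunman sprayed the building with bullets".split())
print(round(p, 6))   # 0.006, the probabilities of the two parses summed
```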
Assumptions of the PCFG model

Consider the tree [S [NP The child] [VP … [NP The toy]]], with the first NP at location 1 and the second NP at location 2.

- Place invariance: P(NP → DT NN) is the same at locations 1 and 2.
- Context-free: P(NP → DT NN | anything outside "The child") = P(NP → DT NN).
- Ancestor-free: at location 2, P(NP → DT NN | its ancestor is VP) = P(NP → DT NN).
Probability of a Parse Tree

Domination: we say N^j dominates from k to l, symbolized as N^j_{kl}, if w_k … w_l is derived from N^j.

P(tree | sentence) = P(tree | S_{1,l}), where S_{1,l} means that the start symbol S dominates the word sequence w_1 … w_l.

P(t | s) approximately equals the joint probability of the constituent non-terminals dominating the sentence fragments (next slide).
Probability of a parse tree (cont.)

Consider the tree
[S_{1,l} [NP_{1,2} [DT_{1,1} w1] [N_{2,2} w2]]
         [VP_{3,l} [V_{3,3} w3]
                   [PP_{4,l} [P_{4,4} w4] [NP_{5,l} w5 … wl]]]]

P(t | s) = P(t | S_{1,l})
         = P(NP_{1,2}, DT_{1,1}, w1, N_{2,2}, w2, VP_{3,l}, V_{3,3}, w3, PP_{4,l}, P_{4,4}, w4, NP_{5,l}, w5…l | S_{1,l})
         = P(NP_{1,2}, VP_{3,l} | S_{1,l}) × P(DT_{1,1}, N_{2,2} | NP_{1,2}) × P(w1 | DT_{1,1}) × P(w2 | N_{2,2})
           × P(V_{3,3}, PP_{4,l} | VP_{3,l}) × P(w3 | V_{3,3}) × P(P_{4,4}, NP_{5,l} | PP_{4,l}) × P(w4 | P_{4,4}) × P(w5…l | NP_{5,l})

(using the chain rule, context-freeness and ancestor-freeness)
HMM ↔ PCFG
- O observed sequence ↔ w_1m sentence
- X state sequence ↔ t parse tree
- μ model ↔ G grammar

Three fundamental questions:

- How likely is a certain observation given the model? ↔ How likely is a sentence given the grammar?
    P(O | μ)  ↔  P(w_1m | G)
- How to choose a state sequence which best explains the observations? ↔ How to choose a parse which best supports the sentence?
    argmax_X P(X | O, μ)  ↔  argmax_t P(t | w_1m, G)
- How to choose the model parameters that best explain the observed data? ↔ How to choose rule probabilities which maximize the probabilities of the observed sentences?
    argmax_μ P(O | μ)  ↔  argmax_G P(w_1m | G)
Interesting Probabilities

The gunman sprayed the building with bullets
 1     2      3      4    5       6    7

For the NP spanning (4,5), i.e. "the building":
- Inside probability β_NP(4,5): what is the probability of having an NP at this position such that it will derive "the building"?
- Outside probability α_NP(4,5): what is the probability of starting from N^1 and deriving "The gunman sprayed", an NP, and "with bullets"?

Random variables to be considered:
- The non-terminal being expanded, e.g. NP.
- The word-span covered by the non-terminal; e.g. (4,5) refers to the words "the building".

While calculating probabilities, consider:
- The rule to be used for expansion, e.g. NP → DT NN.
- The probabilities associated with the RHS non-terminals, e.g. the DT subtree's inside/outside probabilities & the NN subtree's inside/outside probabilities.
Outside Probability

α_j(p,q): the probability of beginning with N^1 and generating the non-terminal N^j_{pq} and all words outside w_p … w_q:

    α_j(p,q) = P(w_{1(p-1)}, N^j_{pq}, w_{(q+1)m} | G)

(picture: N^1 at the root, N^j spanning w_p … w_q, with w_1 … w_{p-1} and w_{q+1} … w_m outside it)
Inside Probability

β_j(p,q): the probability of generating the words w_p … w_q starting with the non-terminal N^j_{pq}:

    β_j(p,q) = P(w_{pq} | N^j_{pq}, G)
Outside & Inside Probabilities: example

The gunman sprayed the building with bullets
 1     2      3      4    5       6    7

For "the building":
    α_NP(4,5) = P(The gunman sprayed, NP_{4,5}, with bullets | G)
    β_NP(4,5) = P(the building | NP_{4,5}, G)
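β_NP(4,5) can be read off directly here: the only way NP derives "the building" in the gunman grammar is via NP → DT NN, so β_NP(4,5) = P(NP → DT NN) · P(the | DT) · P(building | NN). A small check, reusing the rule probabilities from this lecture's grammar:

```python
# Inside probability of NP over "the building" (span (4,5)):
# beta_NP(4,5) = P(NP -> DT NN) * P(the | DT) * P(building | NN)
p_rule = 0.5          # NP -> DT NN
p_the = 1.0           # DT -> the
p_building = 0.5      # NN -> building
beta_np = p_rule * p_the * p_building
print(beta_np)        # 0.25
```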