Context-Free Grammar Parsing by Message Passing
Paper by Dekang Lin and Randy Goebel
Presented by Matt Watkins
Context-Free Grammars
A context-free grammar is represented by a 4-tuple:
Vt → A set of terminals
Vn → A set of non-terminals
P → A set of production rules
S → A member of Vn, representing the starting non-terminal
Context-free grammars are used to describe the syntax of programming languages and natural languages, among other applications.
Context-Free Grammars
Example:
<Rs:S> → <NP><VP>
<Rnp1:NP> → <n>
<Rnp2:NP> → <d><n>
<Rnp3:NP> → <NP><PP>
<Rvp1:VP> → <VP><PP>
<Rvp2:VP> → <v><NP>
<Rpp:PP> → <p><NP>
<n> → "I"
<n> → "saw"
<n> → "man"
<n> → "park"
<v> → "saw"
<d> → "a"
<d> → "the"
<p> → "in"
[Figure: parse tree for "I saw a man in the park", rooted at <S>, with internal nodes <NP>, <VP> and <PP> above the pre-terminals <n> <v> <d> <n> <p> <d> <n>]
Parsing Context-Free Grammars
Given only the definition of a context-free grammar, determine if a particular expression is a valid output of the grammar, and if so, how it is generated.
Earley’s parser
CYK parser
Message passing parser
Grammar Representation
The message passing algorithm represents a CFG as a 6-tuple <N, O, T, s, P, L>
N → set of non-terminals
O → set of pre-terminals
T → set of terminals
s → start symbol
P → production rules
L → a lexicon consisting of pairs (w, p), w ∈ T and p ⊆ O
Grammar Representation
N = {S, NP, VP, PP, n, v, p, d}
O = {n, v, p, d}
T = {I, saw, a, man, in, the, park}
s = S
P = <Rs:S> → <NP><VP>
<Rnp1:NP> → <n>
<Rnp2:NP> → <d><n>
<Rnp3:NP> → <NP><PP>
<Rvp1:VP> → <VP><PP>
<Rvp2:VP> → <v><NP>
<Rpp:PP> → <p><NP>
L = { (I, {n}), (saw, {v, n}), (a, {d}), (man, {n}),(in, {p}), (the, {d}), (park, {n}) }
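The 6-tuple above can be written down directly as a data structure. The following is a minimal sketch, assuming Python; the class name `Grammar`, the dictionary encoding of P and L, and the helper `is_well_formed` are illustrative choices, not part of the paper.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    N: frozenset  # non-terminals
    O: frozenset  # pre-terminals
    T: frozenset  # terminals
    s: str        # start symbol
    P: dict       # rule name -> (head, body), body being a tuple of symbols
    L: dict       # word -> set of pre-terminals


EXAMPLE = Grammar(
    N=frozenset({"S", "NP", "VP", "PP", "n", "v", "p", "d"}),
    O=frozenset({"n", "v", "p", "d"}),
    T=frozenset({"I", "saw", "a", "man", "in", "the", "park"}),
    s="S",
    P={
        "Rs": ("S", ("NP", "VP")),
        "Rnp1": ("NP", ("n",)),
        "Rnp2": ("NP", ("d", "n")),
        "Rnp3": ("NP", ("NP", "PP")),
        "Rvp1": ("VP", ("VP", "PP")),
        "Rvp2": ("VP", ("v", "NP")),
        "Rpp": ("PP", ("p", "NP")),
    },
    L={
        "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
        "in": {"p"}, "the": {"d"}, "park": {"n"},
    },
)


def is_well_formed(g):
    """Check that the six components are mutually consistent (hypothetical helper)."""
    symbols_ok = all(
        head in g.N and all(sym in g.N for sym in body)
        for head, body in g.P.values()
    )
    lexicon_ok = all(w in g.T and tags <= g.O for w, tags in g.L.items())
    return g.O <= g.N and g.s in g.N and symbols_ok and lexicon_ok
```

Here `is_well_formed(EXAMPLE)` holds because every symbol used in P is in N and every lexicon entry maps a word in T to a subset of O.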
Message Passing Rules
Non-terminal nodes are called NT nodes
Phrase structure rule nodes are called PSR nodes
Messages that are passed are integer pairs representing an interval in the expression being parsed
0 I 1 saw 2 a 3 man 4 in 5 the 6 park 7
NT nodes and PSR nodes have different rules for receiving and sending messages
NT nodes:
• Never send the same message twice
• Always send all unique messages to parents
PSR nodes:
• Have a memory bank of pairs (I, n) where I is an interval and n is a link number
• Store pairs where n = 0 in memory bank
• Combine pairs where applicable if n ≠ 0
Message Passing Rules
Pair (I, n) is combined with (I´, n´) iff:
• ∃ i, j, k such that I = {i, j} and I´ = {j, k}
• n´ = n + 1
The combined pair is ({i, k}, n´). If n´ is the last link, send a message to parents
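The combination test can be sketched as a small function. This is an illustration in Python, assuming intervals are stored as (start, end) tuples; the name `combine` is hypothetical.

```python
def combine(stored, incoming):
    """Combine a stored pair (I, n) with an incoming pair (I', n').

    Returns the merged pair ({i, k}, n') when I = {i, j}, I' = {j, k}
    and n' = n + 1; returns None when the pairs do not combine.
    """
    (i, j), n = stored
    (j2, k), n2 = incoming
    if j == j2 and n2 == n + 1:
        return ((i, k), n2)
    return None
```

For example, `combine(((0, 1), 0), ((1, 4), 1))` gives `((0, 4), 1)`, while two pairs whose intervals are not adjacent give `None`.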
Use T to identify the locations of terminals in the expression to be parsed
Use the lexicon to determine the terminal’s part of speech.
Pass a message to all parts of speech indicating the starting and ending position of all the terminals in the expression.
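These three start-up steps can be sketched as follows. This is a minimal Python illustration that assumes the expression is split on whitespace and that the lexicon is a dictionary from words to sets of pre-terminals; `initial_messages` is a hypothetical name.

```python
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}


def initial_messages(expression, lexicon):
    """Return (pre-terminal, (start, end)) for every word of the expression."""
    messages = []
    for position, word in enumerate(expression.split()):
        # A word at index `position` spans the interval {position, position + 1}.
        for tag in sorted(lexicon.get(word, ())):
            messages.append((tag, (position, position + 1)))
    return messages
```

An ambiguous word such as "saw" produces one message per part of speech, so both the n node and the v node receive the interval {1, 2}.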
Message Passing Example
[Figure: message passing network. NT nodes: S, NP, VP, PP, n, v, d, p. PSR nodes: Rs, Rnp1, Rnp2, Rnp3, Rvp1, Rvp2, Rpp. Each link into a PSR node is labelled with its link number, 0 or 1.]
Parsing “I” (four successive frames of the figure):
1. The message {0,1} is passed
2. The pair ({0,1}, 0) is stored
3. The message {0,1} is passed on
4. Two further pairs ({0,1}, 0) are stored
Completion
Each node will contain a set of intervals that represent where in the expression the non-terminals can be found.
After message passing has completed, if the expression is represented by the grammar, then the network will contain a packed parse forest.
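The whole procedure can be put together in a short recognizer. What follows is a sketch under stated assumptions, not the authors' implementation: it runs sequentially with a single message queue, and each PSR node also keeps an `inbox` of messages received on links n > 0 so that the order of arrival does not matter (an addition made for this sketch). The names `recognize`, `bank`, `inbox` and `seen` are hypothetical.

```python
from collections import deque

RULES = {  # rule name -> (head, body)
    "Rs": ("S", ("NP", "VP")), "Rnp1": ("NP", ("n",)),
    "Rnp2": ("NP", ("d", "n")), "Rnp3": ("NP", ("NP", "PP")),
    "Rvp1": ("VP", ("VP", "PP")), "Rvp2": ("VP", ("v", "NP")),
    "Rpp": ("PP", ("p", "NP")),
}
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}


def recognize(expression, rules=RULES, lexicon=LEXICON, start="S"):
    words = expression.split()
    # links[X]: every (rule name, link number) where symbol X occurs in a body.
    links = {}
    for name, (_, body) in rules.items():
        for n, symbol in enumerate(body):
            links.setdefault(symbol, []).append((name, n))

    seen = {}                                # NT node -> intervals already sent
    bank = {name: set() for name in rules}   # PSR node -> stored (interval, link)
    inbox = {name: set() for name in rules}  # PSR node -> messages on links n > 0
    queue = deque()                          # pending (NT node, interval) messages

    def add_pair(rule, interval, n):
        """Links 0..n of `rule` jointly cover `interval`."""
        head, body = rules[rule]
        if n == len(body) - 1:               # last link: message to the parent
            queue.append((head, interval))
        elif (interval, n) not in bank[rule]:
            bank[rule].add((interval, n))
            # Combine with messages that arrived earlier on the next link.
            for (j, k), m in list(inbox[rule]):
                if m == n + 1 and j == interval[1]:
                    add_pair(rule, (interval[0], k), m)

    # Initial messages: one per (word, part of speech).
    for position, word in enumerate(words):
        for tag in lexicon.get(word, ()):
            queue.append((tag, (position, position + 1)))

    while queue:
        nt, interval = queue.popleft()
        if interval in seen.setdefault(nt, set()):
            continue                         # never send the same message twice
        seen[nt].add(interval)
        for rule, n in links.get(nt, ()):
            if n == 0:
                add_pair(rule, interval, 0)
            else:
                inbox[rule].add((interval, n))
                for (i, j), m in list(bank[rule]):
                    if m == n - 1 and j == interval[0]:
                        add_pair(rule, (i, interval[1]), n)

    # Accepted iff the start node holds the interval covering every word.
    return (0, len(words)) in seen.get(start, set()), seen
```

Calling `recognize("I saw a man in the park")` returns an accept flag together with the interval sets held by each NT node. The sketch records only those intervals; recovering the packed parse forest would additionally require keeping, for each interval, the pairs that produced it.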
[Figure: Parse Time Distribution. x-axis: Expression Length (0 to 40); y-axis: Time (s) (0 to 0.3)]
Analysis
Strengths
• O(|G|n³) time complexity
• Easily parallelizable
• Can handle empty rules
Weaknesses
• Must convert some grammars in Backus Naur form