Context-Free Grammar Parsing by Message Passing Paper by Dekang Lin and Randy Goebel Presented by Matt Watkins
  • date post

    20-Dec-2015

Context-Free Grammar Parsing by Message Passing

Paper by Dekang Lin and Randy Goebel

Presented by Matt Watkins

Context-Free Grammars

A context-free grammar is represented by a 4-tuple:

Vt → A set of terminals

Vn → A set of non-terminals

P → A set of production rules

S → A member of Vn, representing the starting non-terminal

Context-free grammars are used to represent the syntax of both programming languages and natural languages.

Context-Free Grammars

Example:

<Rs:S> → <NP><VP>

<Rnp1:NP> → <n>

<Rnp2:NP> → <d><n>

<Rnp3:NP> → <NP><PP>

<Rvp1:VP> → <VP><PP>

<Rvp2:VP> → <v><NP>

<Rpp:PP> → <p><NP>

<n> → "I"

<n> → "saw"

<n> → "man"

<n> → "park"

<v> → "saw"

<d> → "a"

<d> → "the"

<p> → "in"

[Figure: parse tree for "I saw a man in the park", rooted at <S>. The words are tagged <n> <v> <d> <n> <p> <d> <n>, and the tree is built from <NP>, <VP> and <PP> nodes.]

Parsing Context-Free Grammars

Given only the definition of a context-free grammar, determine if a particular expression is a valid output of the grammar, and if so, how it is generated.

Earley’s parser

CYK parser

Message passing parser

Grammar Representation

The message passing algorithm represents a CFG as a 6-tuple <N, O, T, s, P, L>

N → set of non-terminals

O → set of pre-terminals

T → set of terminals

s → start symbol

P → production rules

L → a lexicon consisting of pairs (w, p), w ∈ T and p ∈ O

Grammar Representation

N = {S, NP, VP, PP, n, v, p, d}

O = {n, v, p, d}

T = {I, saw, a, man, in, the, park}

s = S

P =

<Rs:S> → <NP><VP>

<Rnp1:NP> → <n>

<Rnp2:NP> → <d><n>

<Rnp3:NP> → <NP><PP>

<Rvp1:VP> → <VP><PP>

<Rvp2:VP> → <v><NP>

<Rpp:PP> → <p><NP>

L = { (I, {n}), (saw, {v, n}), (a, {d}), (man, {n}),(in, {p}), (the, {d}), (park, {n}) }
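The 6-tuple above maps naturally onto plain data structures. The following is a minimal sketch, assuming Python sets, tuples and dictionaries as the encoding; the names `P`, `L` and `check_grammar` are illustrative choices, not taken from the paper.

```python
# Hypothetical encoding of the 6-tuple <N, O, T, s, P, L> from the slide.
N = {"S", "NP", "VP", "PP", "n", "v", "p", "d"}   # non-terminals
O = {"n", "v", "p", "d"}                          # pre-terminals
T = {"I", "saw", "a", "man", "in", "the", "park"} # terminals
s = "S"                                           # start symbol

# Each rule name maps to (left-hand side, right-hand side).
P = {
    "Rs":   ("S",  ("NP", "VP")),
    "Rnp1": ("NP", ("n",)),
    "Rnp2": ("NP", ("d", "n")),
    "Rnp3": ("NP", ("NP", "PP")),
    "Rvp1": ("VP", ("VP", "PP")),
    "Rvp2": ("VP", ("v", "NP")),
    "Rpp":  ("PP", ("p", "NP")),
}

# The lexicon maps each terminal to its set of pre-terminals.
L = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}


def check_grammar(N, O, T, s, P, L):
    """Return True if the six components are consistent with each other."""
    rules_ok = all(
        lhs in N and all(sym in N for sym in rhs) for lhs, rhs in P.values()
    )
    lexicon_ok = all(w in T and tags <= O for w, tags in L.items())
    return s in N and O <= N and rules_ok and lexicon_ok
```

Keeping the lexicon separate from the production rules means the rules never mention individual words, only pre-terminals.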

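Message Passing Network

[Figure: the message passing network for the example grammar. NT nodes S, NP, VP, PP, n, d, p and v are connected to PSR nodes Rs, Rnp1, Rnp2, Rnp3, Rvp1, Rvp2 and Rpp by links numbered 0 and 1.]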

Message Passing Rules

Non-terminal nodes are called NT nodes

Phrase structure rule nodes are called PSR nodes

Messages that are passed are integer pairs representing an interval in the expression being parsed

0 I 1 saw 2 a 3 man 4 in 5 the 6 park 7

NT nodes and PSR nodes have different rules for receiving and sending messages

NT nodes:
• Never send the same message twice
• Always send all unique messages to parents

PSR nodes:
• Have a memory bank of pairs (I, n) where I is an interval and n is a link number
• Store pairs where n = 0 in memory bank
• Combine pairs where applicable if n ≠ 0

Message Passing Rules

Pair (I, n) is combined with (I´, n´) iff:
• ∃ i, j, k such that I = {i, j} and I´ = {j, k}
• n´ = n + 1

The combined pair covers the interval {i, k} and carries the link number n´. If n´ is the last link, send a message to parents
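The combination condition can be written as a small function. This is a sketch, assuming intervals are encoded as `(start, end)` tuples and pairs as `(interval, link)` tuples; the name `combine` is illustrative.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]
Pair = Tuple[Interval, int]


def combine(stored: Pair, incoming: Pair) -> Optional[Pair]:
    """Combine a stored pair (I, n) with an incoming pair (I', n').

    Returns the merged pair if I = {i, j}, I' = {j, k} and n' = n + 1,
    otherwise None.
    """
    (i, j), n = stored
    (j2, k), n2 = incoming
    if j == j2 and n2 == n + 1:
        return ((i, k), n2)
    return None
```

For example, at the PSR node for `VP → v NP`, a stored `((1, 2), 0)` for the verb and an incoming `((2, 4), 1)` for the noun phrase combine into `((1, 4), 1)`.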

Use T to identify the locations of terminals in the expression to be parsed.

Use the lexicon to determine each terminal’s part of speech.

For each terminal, pass a message to each of its part-of-speech nodes indicating the terminal’s starting and ending position in the expression.
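These initialization steps can be sketched as follows, assuming the lexicon is a dictionary from words to sets of pre-terminals and that word number `pos` spans the interval `(pos, pos + 1)`. The names `LEXICON` and `initial_messages` are illustrative.

```python
# Lexicon from the example grammar: word -> set of pre-terminals.
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}


def initial_messages(words, lexicon=LEXICON):
    """Return (pre-terminal, interval) messages for every word."""
    messages = []
    for pos, word in enumerate(words):
        if word not in lexicon:
            raise KeyError(f"word not in lexicon: {word!r}")
        # An ambiguous word such as "saw" produces one message per tag.
        for tag in sorted(lexicon[word]):
            messages.append((tag, (pos, pos + 1)))
    return messages
```

Calling `initial_messages("I saw".split())` yields one message for "I" and two for "saw", since "saw" is listed as both a noun and a verb.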


Message Passing Example

Parsing “I”:

• The message {0,1} is sent to n
• Rnp1 receives it on link 0 and stores ({0,1}, 0)
• Link 0 is the last link of Rnp1, so {0,1} is sent to NP
• NP sends {0,1} to its parents; Rs and Rnp3 receive it on link 0 and store ({0,1}, 0)

Completion

Each node will contain a set of intervals that represent where in the expression the non-terminals can be found.

After message passing has completed, if the expression can be generated by the grammar, then the network will contain a packed parse forest.
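The whole procedure can be sketched end to end on the example grammar. This is a minimal sequential sketch, not the authors' implementation. It makes two assumptions that are not from the slides: words are fed in one at a time from left to right, so that a link-0 pair is always stored before its right-hand neighbour arrives, and the grammar has no empty rules. All names (`RULES`, `LEXICON`, `parse`) are illustrative.

```python
from collections import defaultdict, deque

RULES = {  # rule name -> (left-hand side, right-hand side)
    "Rs":   ("S",  ("NP", "VP")),
    "Rnp1": ("NP", ("n",)),
    "Rnp2": ("NP", ("d", "n")),
    "Rnp3": ("NP", ("NP", "PP")),
    "Rvp1": ("VP", ("VP", "PP")),
    "Rvp2": ("VP", ("v", "NP")),
    "Rpp":  ("PP", ("p", "NP")),
}
LEXICON = {
    "I": {"n"}, "saw": {"v", "n"}, "a": {"d"}, "man": {"n"},
    "in": {"p"}, "the": {"d"}, "park": {"n"},
}


def parse(words, rules=RULES, lexicon=LEXICON):
    """Return a dict mapping each NT node to the intervals it spans."""
    # parents[X] lists (rule, link number) for each place X occurs in a rule.
    parents = defaultdict(list)
    for name, (_, rhs) in rules.items():
        for link, sym in enumerate(rhs):
            parents[sym].append((name, link))

    sent = defaultdict(set)    # NT node -> intervals already sent
    memory = defaultdict(set)  # PSR node -> stored (interval, link) pairs

    for pos, word in enumerate(words):  # assumption: left-to-right feeding
        queue = deque((tag, (pos, pos + 1)) for tag in sorted(lexicon[word]))
        while queue:
            node, interval = queue.popleft()
            if interval in sent[node]:
                continue  # NT nodes never send the same message twice
            sent[node].add(interval)
            for rule, link in parents[node]:
                lhs, rhs = rules[rule]
                if link == 0:
                    new_pairs = {(interval, 0)}
                else:
                    # Combine with stored pairs that end where this one starts.
                    new_pairs = {
                        ((i, interval[1]), link)
                        for (i, j), n in memory[rule]
                        if n == link - 1 and j == interval[0]
                    }
                for pair in new_pairs - memory[rule]:
                    memory[rule].add(pair)
                    if link == len(rhs) - 1:  # last link: notify the parent
                        queue.append((lhs, pair[0]))
    return sent


words = "I saw a man in the park".split()
spans = parse(words)
accepted = (0, len(words)) in spans["S"]
```

In this sketch the expression is accepted when the S node holds the interval covering the whole input. The VP interval (1, 7) is reached through both Rvp1 and Rvp2, which reflects the two attachments of "in the park"; the NT node forwards it only once.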

Tested on SPARCstation SLC

[Figure: Parse Time Distribution. Time (s), from 0 to 0.3, plotted against expression length, from 0 to 40.]

Analysis

Strengths
• O(|G|n³) time complexity
• Easily parallelizable
• Can handle empty rules

Weaknesses
• Must convert some grammars in Backus-Naur form

Questions?