CKY and Earley Algorithms, Chapter 13


Transcript of CKY and Earley Algorithms, Chapter 13


1

CKY and Earley Algorithms, Chapter 13

October 2012

Lecture #8

2

Review

• Top-Down vs. Bottom-Up Parsers
  – Both generate too many useless trees
  – Combine the two to avoid over-generation: Top-Down Parsing with Bottom-Up look-ahead

• Left-corner table provides more efficient look-ahead
  – Pre-compute all POS that can serve as the leftmost POS in the derivations of each non-terminal category

• More problems remain

3

Left Recursion

• Depth-first search will never terminate if the grammar is left recursive (e.g., NP → NP PP)


4

Left-Recursion

• What happens in the following situation?
  – S → NP VP
  – S → Aux NP VP
  – NP → NP PP
  – NP → Det Nominal
  – …
  – With the sentence starting with:

• "Did the flight…"

5

Rule Ordering

• S → Aux NP VP
• S → NP VP
• NP → Det Nominal
• NP → NP PP

• The key for the NP is that you want the recursive option after any base case. Duh!

6

Rule Ordering

• S → Aux NP VP
• S → NP VP
• NP → Det Nominal
• NP → NP PP

• What happens with:
  – "Book that flight"

7

• Solutions:
  – Rewrite the grammar (automatically) to a weakly equivalent one which is not left-recursive
    e.g., "The man on the hill with the telescope…"

    NP → NP PP
    NP → Nom PP
    NP → Nom

    …becomes…

    NP → Nom NP′
    NP′ → PP NP′
    NP′ → ε

• This may make rules unnatural
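To make the rewrite concrete, here is a minimal sketch of the standard elimination of immediate left recursion (the function and naming are illustrative, not from the slides; it assumes rules are given as lists of symbols):

```python
def eliminate_immediate_left_recursion(nt, rules):
    """Rewrite A -> A alpha | beta  as  A -> beta A'  and  A' -> alpha A' | epsilon."""
    recursive = [rhs[1:] for rhs in rules if rhs[:1] == [nt]]   # the alphas
    base = [rhs for rhs in rules if rhs[:1] != [nt]]            # the betas
    if not recursive:
        return {nt: rules}
    new_nt = nt + "'"
    return {
        nt: [beta + [new_nt] for beta in base],                    # A  -> beta A'
        new_nt: [alpha + [new_nt] for alpha in recursive] + [[]],  # A' -> alpha A' | epsilon
    }

# NP -> NP PP | Nom PP | Nom   becomes   NP -> Nom PP NP' | Nom NP' ; NP' -> PP NP' | epsilon
print(eliminate_immediate_left_recursion("NP", [["NP", "PP"], ["Nom", "PP"], ["Nom"]]))
```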

8

– Harder to eliminate non-immediate left recursion:
    NP → Nom PP
    Nom → NP
– Fix depth of search explicitly
– Rule ordering: non-recursive rules first
    NP → Det Nom
    NP → NP PP

9

Structural ambiguity

• Multiple legal structures
  – Attachment (e.g., "I saw a man on a hill with a telescope")
  – Coordination (e.g., "younger cats and dogs")
  – NP bracketing (e.g., "Spanish language teachers")

10

Ambiguity

11

• Solution:
  – Return all possible parses and disambiguate using "other methods"

12

Avoiding Repeated Work

• Parsing is hard and slow. It's wasteful to redo stuff over and over and over.

• Consider an attempt to top-down parse the following as an NP:

"A flight from India to Houston on TWA"

[Slides 13–17: figures showing repeated top-down parse attempts in which the same subtrees (e.g., the NP over "flight") are built again and again.]

Dynamic Programming

• We need a method that fills a table with partial results that:
  – Does not do (avoidable) repeated work
  – Does not fall prey to left-recursion
  – Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

• Create a table of solutions to sub-problems (e.g., subtrees) as the parse proceeds
• Look up subtrees for each constituent rather than re-parsing
• Since all parses are implicitly stored, all are available for later disambiguation
• Examples: the Cocke-Younger-Kasami (CYK, 1960), Graham-Harrison-Ruzzo (GHR, 1980), and Earley (1970) algorithms

19

Dynamic Programming

DP search methods fill tables with partial results and thereby:
• Avoid doing avoidable repeated work
• Solve exponential problems in polynomial time (well, no, not really)
• Efficiently store ambiguous structures with shared sub-parts
We'll cover two approaches that roughly correspond to bottom-up and top-down search: CKY and Earley

20

CKY Parsing

First we'll limit our grammar to epsilon-free, binary rules (more later).
Consider the rule A → B C:
• If there is an A somewhere in the input, then there must be a B followed by a C in the input.
• If the A spans from i to j in the input, then there must be some k s.t. i < k < j. I.e., the B splits from the C someplace.

21

Problem

What if your grammar isn't binary? As in the case of the TreeBank grammar.
Convert it to binary… any arbitrary CFG can be rewritten into Chomsky Normal Form automatically.
What does this mean?
• The resulting grammar accepts (and rejects) the same set of strings as the original grammar.
• But the resulting derivations (trees) are different.

22

Problem

More specifically, we want our rules to be of the form:

A → B C
or
A → w

That is, rules can expand to either 2 non-terminals or to a single terminal.

23

Binarization Intuition

• Eliminate chains of unit productions.
• Introduce new intermediate non-terminals into the grammar that distribute rules with length > 2 over several rules. So…
  S → A B C turns into
  S → X C and
  X → A B
  where X is a symbol that doesn't occur anywhere else in the grammar.
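As a sketch of just this intermediate-symbol step (the X1, X2, … naming is my own; a full CNF converter would also remove unit productions and ε-rules):

```python
def binarize(rules):
    """Replace every rule whose RHS is longer than 2 with a chain of binary rules:
    S -> A B C  becomes  X1 -> A B  and  S -> X1 C."""
    out, fresh = [], 0
    for lhs, rhs in rules:
        rhs = list(rhs)
        while len(rhs) > 2:
            fresh += 1
            new_sym = f"X{fresh}"           # must not occur anywhere else in the grammar
            out.append((new_sym, rhs[:2]))  # X1 -> A B
            rhs = [new_sym] + rhs[2:]       # continue with S -> X1 C
        out.append((lhs, rhs))
    return out

print(binarize([("S", ["A", "B", "C"])]))
# [('X1', ['A', 'B']), ('S', ['X1', 'C'])]
```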

24

Sample L1 Grammar

25

CNF Conversion

26

CKY

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table.
So a non-terminal spanning an entire string will sit in cell [0,n]. Hopefully an S.
If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j, for some k.

27

CKY

Meaning that for a rule like A → B C we should look for a B in [i,k] and a C in [k,j].
In other words, if we think there might be an A spanning i,j in the input… AND
A → B C is a rule in the grammar, THEN
there must be a B in [i,k] and a C in [k,j] for some i < k < j.

28

CKY

So to fill the table, loop over the cell [i,j] values in some systematic way. What constraint should we put on that systematic search?
For each cell, loop over the appropriate k values to search for things to add.

29

CKY Algorithm

30
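The algorithm figure on this slide did not survive the transcript. As a stand-in, here is a minimal CKY recognizer sketch; the grammar encoding (RHS tuple → set of LHS symbols) is an assumption of the sketch, not the original figure:

```python
from collections import defaultdict

def cky_recognize(words, grammar, start="S"):
    """CKY recognizer for a CNF grammar. grammar maps RHS tuples to the set of
    LHS symbols that produce them, e.g. {("Det", "Nom"): {"NP"}, ("book",): {"V", "N"}}."""
    n = len(words)
    table = defaultdict(set)                 # table[(i, j)]: non-terminals spanning words i..j
    for j in range(1, n + 1):                # one column at a time, left to right
        table[(j - 1, j)] |= grammar.get((words[j - 1],), set())
        for i in range(j - 2, -1, -1):       # fill the column bottom to top
            for k in range(i + 1, j):        # every split point i < k < j
                for B in table[(i, k)]:
                    for C in table[(k, j)]:
                        table[(i, j)] |= grammar.get((B, C), set())
    return start in table[(0, n)]
```

Filling columns left to right and rows bottom to top guarantees that cells [i,k] and [k,j] are ready before [i,j], which is exactly the point of the Note slide below.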

Note

We arranged the loops to fill the table a column at a time, from left to right, bottom to top. This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below).
It's somewhat natural in that it processes the input left to right, a word at a time. Known as online.

31

Example

[Slides 32–37: figures showing the CKY table for an example sentence being filled column by column; one panel is labeled "Filling column 5".]

CKY Notes

Since it's bottom-up, CKY populates the table with a lot of phantom constituents: segments that by themselves are constituents but cannot really occur in the context in which they are being suggested.
To avoid this, we can switch to a top-down control strategy.
Or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis.

38

Earley's Algorithm

• Uses dynamic programming to do parallel top-down search in (worst case) O(N³) time
• First left-to-right (L2R) pass fills out a chart with N+1 entries (N = the number of words in the input)
  – Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
  – For each word position, the chart contains the set of states representing all partial parse trees generated to date. E.g., chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

• Fills a table in a single sweep over the input words
  – Table is length N+1; N is the number of words
  – Table entries represent:
    • Completed constituents and their locations
    • In-progress constituents
    • Predicted constituents

40

States

• The table entries are called states and are represented with dotted rules:

S → • VP                A VP is predicted
NP → Det • Nominal      An NP is in progress
VP → V NP •             A VP has been found

41

States/Locations

• It would be nice to know where these things are in the input, so… [x,y] tells us where the state begins (x) and where the dot lies (y) w.r.t. the input:

S → • VP [0,0]               A VP is predicted at the start of the sentence
NP → Det • Nominal [1,2]     An NP is in progress; the Det goes from 1 to 2
VP → V NP • [0,3]            A VP has been found starting at 0 and ending at 3
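One convenient way to encode such dotted states in code (a sketch; the class and field names are mine, not the lecture's):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str      # left-hand side, e.g. "NP"
    rhs: tuple    # right-hand side, e.g. ("Det", "Nominal")
    dot: int      # how many RHS symbols have been recognized so far
    start: int    # x: where the constituent begins
    end: int      # y: where the dot lies w.r.t. the input

    def is_complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.is_complete() else self.rhs[self.dot]

# NP -> Det . Nominal [1,2]: an NP in progress; the Det spans positions 1 to 2
in_progress = State("NP", ("Det", "Nominal"), dot=1, start=1, end=2)
```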

42

S → • VP [0,0]
  – First 0 means the S constituent begins at the start of the input
  – Second 0 means the dot is there too
  – So this is a top-down prediction

NP → Det • Nom [1,2]
  – the NP begins at position 1
  – the dot is at position 2
  – so Det has been successfully parsed
  – Nom is predicted next

0 Book 1 that 2 flight 3

43

VP → V NP • [0,3]
  – Successful VP parse of the entire input

44

Successful Parse

• Final answer found by looking at the last entry in the chart
• If that entry resembles S → α • [0,N], then the input was parsed successfully
• But note that the chart will also contain a record of all possible parses of the input string given the grammar -- not just the successful one(s)

45

Earley

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place
• In this case, there should be an S state in the final column that spans from 0 to N and is complete
• If that's the case, you're done:
  – S → α • [0,N]

46

Earley

• So sweep through the table from 0 to N…
  – New predicted states are created by starting top-down from S. In each case, use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol. A new prediction is made for every rule that has that non-terminal as its left-hand side.
  – New incomplete states are created by advancing existing states as new constituents are discovered.
  – New complete states are created in the same way.

47

Earley

• More specifically…
  1. Predict all the states you can upfront
  2. Read a word
     1. Extend states based on matches
     2. Add new predictions
     3. Go to 2
  3. When you are out of words, look in the chart at N to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

• Move through each set of states in order, applying one of three operators to each state:
  – predictor: add predictions to the chart
  – scanner: read input and add corresponding state to chart
  – completer: move dot to the right when a new constituent is found
• Results (new states) are added to the current or next set of states in the chart
• No backtracking and no states removed: keep the complete history of the parse

49

Core Earley Code

50

Earley Code

51
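The code figures on these two slides are images that did not survive the transcript. As a stand-in, here is a compact recognizer sketch built from the three operators described on the next slides (predictor, scanner, completer); the state encoding and names are mine, though the GAMMA dummy start state follows the textbook's γ → • S seed:

```python
GAMMA = "GAMMA"  # dummy start symbol for the seed state GAMMA -> . S

def earley_recognize(words, grammar, parts_of_speech, start="S"):
    """grammar: dict from non-terminal to a list of RHS tuples, lexical rules
    included, e.g. "V": [("book",), ("include",)].
    parts_of_speech: the pre-terminal categories (Det, N, V, ...)."""
    n = len(words)
    # A state is (lhs, rhs, dot, origin); chart[k] holds states whose dot sits at position k.
    chart = [[] for _ in range(n + 1)]

    def enqueue(state, k):            # never add a state twice: no backtracking,
        if state not in chart[k]:     # and the complete history of the parse is kept
            chart[k].append(state)

    def predictor(state, k):          # top-down expectation for a non-POS non-terminal
        _, rhs, dot, _ = state
        for expansion in grammar.get(rhs[dot], []):
            enqueue((rhs[dot], expansion, 0, k), k)

    def scanner(state, k):            # match a predicted POS against the next word
        _, rhs, dot, _ = state
        if k < n and (words[k],) in grammar.get(rhs[dot], []):
            enqueue((rhs[dot], (words[k],), 1, k), k + 1)   # e.g. V -> book . [k,k+1]

    def completer(state, k):          # advance every state waiting for this constituent
        done, _, _, origin = state
        for lhs, rhs, dot, begin in list(chart[origin]):
            if dot < len(rhs) and rhs[dot] == done:
                enqueue((lhs, rhs, dot + 1, begin), k)

    enqueue((GAMMA, (start,), 0, 0), 0)    # dummy start state
    for k in range(n + 1):
        for state in chart[k]:             # chart[k] may grow while we scan it
            _, rhs, dot, _ = state
            if dot == len(rhs):
                completer(state, k)
            elif rhs[dot] in parts_of_speech:
                scanner(state, k)
            else:
                predictor(state, k)
    return (GAMMA, (start,), 1, 0) in chart[n]
```

Note that left recursion (e.g., Nom → Nom PP) is harmless here: the duplicate check in enqueue stops the predictor from looping.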

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

• New states for predicted parts of speech
• Applicable when a part of speech is to the right of a dot:
  VP → • V NP [0,0]    'Book…'
• Looks at the current word in the input
• If it matches, adds state(s) to the next chart:
  VP → V • NP [0,1]

53

Completer

• Intuition: the parser has discovered a constituent, so it must find and advance all states that were waiting for this
• Applied when the dot has reached the right end of a rule:
  NP → Det Nom • [1,3]
• Find all states w/ dot at 1 and expecting an NP:
  VP → V • NP [0,1]
• Adds new (completed) state(s) to the current chart:
  VP → V NP • [0,3]

54

Core Earley Code

55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart[0])

• Seed the chart with top-down predictions for S from the grammar:

γ → • S [0,0] Dummy start state
S → • NP VP [0,0] Predictor
S → • Aux NP VP [0,0] Predictor
S → • VP [0,0] Predictor
NP → • Det Nom [0,0] Predictor
NP → • PropN [0,0] Predictor
VP → • V [0,0] Predictor
VP → • V NP [0,0] Predictor

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP
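For concreteness, the same grammar encoded for the recognizer sketch above (the dict encoding and lowercased terminals are this transcript's illustration, not the slide's):

```python
GRAMMAR = {
    "S":     [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP":    [("Det", "Nom"), ("PropN",)],
    "Nom":   [("N",), ("N", "Nom"), ("Nom", "PP")],
    "VP":    [("V",), ("V", "NP")],
    "PP":    [("Prep", "NP")],
    "Det":   [("that",), ("this",), ("a",)],
    "N":     [("book",), ("flight",), ("meal",), ("money",)],
    "V":     [("book",), ("include",), ("prefer",)],
    "Aux":   [("does",)],
    "PropN": [("houston",), ("twa",)],
    "Prep":  [("from",), ("to",), ("on",)],
}
POS = {"Det", "N", "V", "Aux", "PropN", "Prep"}

print(earley_recognize("book that flight".split(), GRAMMAR, POS))  # True
```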

58

• When the dummy start state is processed, it's passed to the Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0]
• When VP → • V [0,0] is reached, the Scanner is called, which consults the first word of the input, "Book", and adds the first state to Chart[1]: V → Book • [0,1]
• Note: when VP → • V NP [0,0] is reached in Chart[0], the Scanner expands the rule, yielding VP → V • NP [0,1], but does not put V → Book • [0,1] in again

59

Example

60

Chart[1]

V → book • [0,1] Scanner
VP → V • [0,1] Completer
VP → V • NP [0,1] Completer
S → VP • [0,1] Completer
NP → • Det Nom [1,1] Predictor
NP → • PropN [1,1] Predictor

V → book • is passed to the Completer, which finds 2 states in Chart[0] whose left corner is V and adds them to Chart[1], moving the dots to the right

61

• When VP → V • is itself processed by the Completer, S → VP • is added to Chart[1], since VP is a left corner of S
• The last 2 rules in Chart[1] are added by the Predictor when VP → V • NP is processed
• And so on…

62

Example

63

Example

64

What is it?

• What kind of parser did we just describe? (trick question)
  – Earley parser… yes
  – Not a parser – a recognizer!
• The presence of an S state with the right attributes in the right place indicates a successful recognition
• But no parse tree… no parser

65

How do we retrieve the parses at the end?

• Augment the Completer to add a pointer to the prior states it advances, as a field in the current state
  – I.e., what state did we advance here?
  – Read the pointers back from the final state
• Do we NEED the pointers?
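A sketch of that augmentation, replacing the completer in the recognizer sketch above (the backpointers table and its naming are mine): each advanced state records the completed state that licensed the advance, so a tree can be read off the final state.

```python
def completer_with_pointers(state, k, chart, backpointers):
    """Like the plain completer, but records which completed constituent
    advanced each dot. backpointers maps (state, position) -> list of children."""
    done, _, _, origin = state
    for waiting in list(chart[origin]):
        lhs, rhs, dot, begin = waiting
        if dot < len(rhs) and rhs[dot] == done:
            advanced = (lhs, rhs, dot + 1, begin)
            if advanced not in chart[k]:
                chart[k].append(advanced)
            # children so far, plus this constituent; for ambiguous inputs you
            # would keep a list of alternatives rather than overwriting
            backpointers[(advanced, k)] = backpointers.get((waiting, origin), []) + [(state, k)]
```

As for whether the pointers are strictly needed: no; the finished chart already contains enough information to rebuild the trees by searching it again, so the pointers trade a little memory for direct retrieval.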

66

67

Useful Properties

• Error handling
• Alternative control strategies

68

Error Handling

• What happens when we look at the contents of the last table column and don't find an S → α • [0,N] state?
  – Is it a total loss? No!
  – The chart contains every constituent and combination of constituents possible for the input, given the grammar
• Also useful for partial parsing or shallow parsing, used in information extraction

69

Alternative Control Strategies

• Change Earley's top-down strategy to bottom-up, or
• Change to a best-first strategy based on the probabilities of constituents
  – Compute and store probabilities of constituents in the chart as you parse
  – Then, instead of expanding states in fixed order, allow probabilities to control the order of expansion

70

Summing Up

• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers
• Solutions:
  – Combine top-down predictions with bottom-up look-ahead
  – Use dynamic programming
  – Example: the Earley algorithm
• Next time: Read Ch. 14

Page 2: CKY and Earley Algorithms Chapter 13

2

Review

bull Top-Down vs Bottom-Up Parsersndash Both generate too many useless treesndash Combine the two to avoid over-generation Top-Down

Parsing with Bottom-Up look-ahead

bull Left-corner table provides more efficient look-aheadndash Pre-compute all POS that can serve as the leftmost POS in

the derivations of each non-terminal category

bull More problems remain

3

Left Recursion

bull Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

)(

4

Left-Recursion

bull What happens in the following situationndash S -gt NP VPndash S -gt Aux NP VPndash NP -gt NP PPndash NP -gt Det Nominalndash hellipndash With the sentence starting with

bull Did the flighthellip

5

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull The key for the NP is that you want the recursive option after any base case Duhh

6

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull What happens withndash Book that flight

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 3: CKY and Earley Algorithms Chapter 13

3

Left Recursion

bull Depth-first search will never terminate if grammar is left recursive (eg NP --gt NP PP)

)(

4

Left-Recursion

bull What happens in the following situationndash S -gt NP VPndash S -gt Aux NP VPndash NP -gt NP PPndash NP -gt Det Nominalndash hellipndash With the sentence starting with

bull Did the flighthellip

5

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull The key for the NP is that you want the recursive option after any base case Duhh

6

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull What happens withndash Book that flight

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 4: CKY and Earley Algorithms Chapter 13

4

Left-Recursion

bull What happens in the following situationndash S -gt NP VPndash S -gt Aux NP VPndash NP -gt NP PPndash NP -gt Det Nominalndash hellipndash With the sentence starting with

bull Did the flighthellip

5

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull The key for the NP is that you want the recursive option after any base case Duhh

6

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull What happens withndash Book that flight

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

0 Book 1 that 2 flight 3

S → • VP [0,0]
 – First 0 means the S constituent begins at the start of the input
 – Second 0 means the dot is there too
 – So this is a top-down prediction

NP → Det • Nom [1,2]
 – the NP begins at position 1
 – the dot is at position 2
 – so Det has been successfully parsed
 – Nom is predicted next

VP → V NP • [0,3]
 – Successful VP parse of the entire input

Successful Parse

• Final answer found by looking at the last entry in the chart.
• If the entry resembles S → α • [0,N], then the input was parsed successfully.
• But note that the chart will also contain a record of all possible parses of the input string given the grammar, not just the successful one(s).

Earley

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place.
• In this case, there should be an S state in the final column that spans from 0 to n+1 and is complete.
• If that's the case, you're done.
 – S → α • [0, n+1]

Earley

• So sweep through the table from 0 to n+1…
 – New predicted states are created by starting top-down from S. In each case, use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol. A new prediction is made for every rule that has that non-terminal as its left-hand side.
 – New incomplete states are created by advancing existing states as new constituents are discovered.
 – New complete states are created in the same way.

Earley

• More specifically…
 1. Predict all the states you can upfront.
 2. Read a word:
  1. Extend states based on matches.
  2. Add new predictions.
  3. Go to step 2.
 3. When you are out of words, look in the chart at N+1 to see if you have a winner.

Parsing Procedure for the Earley Algorithm

• Move through each set of states in order, applying one of three operators to each state:
 – predictor: add predictions to the chart
 – scanner: read input and add corresponding state to chart
 – completer: move dot to the right when a new constituent is found
• Results (new states) are added to the current or next set of states in the chart.
• No backtracking and no states removed: keep a complete history of the parse.

Core Earley Code / Earley Code

(The book's pseudocode figures are not reproduced in the transcript.)
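In place of the missing figures, here is a hedged Python sketch of the driver loop over the State records above. The book's EARLEY-PARSE differs in details (e.g., how parts of speech are detected), so treat this as the control structure only; predictor, scanner, and completer are sketched after their slides below.

```python
def earley_recognize(words, grammar, pos_tags, start='S'):
    """grammar:  dict mapping a non-terminal to a list of RHS tuples
    pos_tags: dict mapping a word to the set of its possible POS tags"""
    n = len(words)
    chart = [[] for _ in range(n + 1)]

    def enqueue(state, position):
        if state not in chart[position]:      # never add duplicates,
            chart[position].append(state)     # never remove anything

    enqueue(State('GAMMA', (start,), 0, 0, 0), 0)   # dummy start state

    for i in range(n + 1):
        for state in chart[i]:                # chart[i] may grow as we iterate
            if state.complete():
                completer(state, chart, enqueue)
            elif state.next_symbol() in grammar:    # non-POS non-terminal
                predictor(state, grammar, enqueue)
            elif i < n:                             # POS expected: try the input
                scanner(state, words, pos_tags, enqueue)

    return any(s.lhs == 'GAMMA' and s.complete() for s in chart[n])
```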

Predictor

• Intuition: new states represent top-down expectations.
• Applied when non-part-of-speech non-terminals are to the right of a dot:
 S → • VP [0,0]
• Adds new states to the current chart.
 – One new state for each expansion of the non-terminal in the grammar:
 VP → • V [0,0]
 VP → • V NP [0,0]
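Continuing the sketch, the predictor adds one zero-width state at the dot's position for each expansion of the expected non-terminal:

```python
def predictor(state, grammar, enqueue):
    # S -> . VP [0,0] triggers a prediction for every VP rule at position 0:
    # VP -> . V [0,0], VP -> . V NP [0,0], ...
    B = state.next_symbol()
    for rhs in grammar[B]:
        enqueue(State(B, tuple(rhs), 0, state.end, state.end), state.end)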

Scanner

• New states for predicted parts of speech.
• Applicable when a part of speech is to the right of a dot:
 VP → • V NP [0,0]  'Book…'
• Looks at the current word in the input.
• If match, adds state(s) to the next chart:
 VP → V • NP [0,1]
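A matching sketch of the scanner. Here it enters the completed part-of-speech state (e.g., V → book •) into the next chart entry, which is how Chart[1] below records Book; the dot over the VP rule is then moved by the completer rather than by the scanner itself.

```python
def scanner(state, words, pos_tags, enqueue):
    # VP -> . V NP [0,0] over 'Book ...': if 'book' can be a V, add the
    # POS state V -> book . [0,1] to the next chart entry
    expected = state.next_symbol()
    word = words[state.end]
    if expected in pos_tags.get(word, ()):
        enqueue(State(expected, (word,), 1, state.end, state.end + 1),
                state.end + 1)
```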

Completer

• Intuition: the parser has discovered a constituent, so it must find and advance all states that were waiting for this.
• Applied when the dot has reached the right end of a rule:
 NP → Det Nom • [1,3]
• Find all states with a dot at 1 and expecting an NP:
 VP → V • NP [0,1]
• Adds new (completed) state(s) to the current chart:
 VP → V NP • [0,3]
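And the completer, which walks back to the completed constituent's start position and advances everything that was waiting there for its category:

```python
def completer(state, chart, enqueue):
    # NP -> Det Nom . [1,3] advances VP -> V . NP [0,1] to VP -> V NP . [0,3]
    for waiting in list(chart[state.start]):   # snapshot: enqueue may append
        if not waiting.complete() and waiting.next_symbol() == state.lhs:
            enqueue(State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                          waiting.start, state.end),
                    state.end)
```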


0 Book 1 that 2 flight 3 (Chart[0])

• Seed the chart with top-down predictions for S from the grammar:

γ → • S [0,0]  Dummy start state
S → • NP VP [0,0]  Predictor
S → • Aux NP VP [0,0]  Predictor
S → • VP [0,0]  Predictor
NP → • Det Nom [0,0]  Predictor
NP → • PropN [0,0]  Predictor
VP → • V [0,0]  Predictor
VP → • V NP [0,0]  Predictor

CFG for Fragment of English

S → NP VP
S → Aux NP VP
S → VP
NP → Det Nom
NP → PropN
Nom → N
Nom → Nom N
Nom → Nom PP
VP → V
VP → V NP
PP → Prep NP
Det → that | this | a
N → book | flight | meal | money
V → book | include | prefer
Aux → does
Prep → from | to | on
PropN → Houston | TWA
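For concreteness, here is this fragment in the dictionary format the sketches above assume, with the preterminal (lexical) rules split out into pos_tags; the encoding itself is an illustrative choice, not the book's.

```python
grammar = {
    'S':   [('NP', 'VP'), ('Aux', 'NP', 'VP'), ('VP',)],
    'NP':  [('Det', 'Nom'), ('PropN',)],
    'Nom': [('N',), ('Nom', 'N'), ('Nom', 'PP')],
    'VP':  [('V',), ('V', 'NP')],
    'PP':  [('Prep', 'NP')],
}
pos_tags = {
    'that': {'Det'}, 'this': {'Det'}, 'a': {'Det'},
    'book': {'N', 'V'}, 'flight': {'N'}, 'meal': {'N'}, 'money': {'N'},
    'include': {'V'}, 'prefer': {'V'}, 'does': {'Aux'},
    'from': {'Prep'}, 'to': {'Prep'}, 'on': {'Prep'},
    'Houston': {'PropN'}, 'TWA': {'PropN'},
}

# earley_recognize(['book', 'that', 'flight'], grammar, pos_tags)  # -> True
```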

• When the dummy start state is processed, it's passed to the Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0].
• When VP → • V [0,0] is reached, the Scanner is called; it consults the first word of the input, Book, and adds the first state to Chart[1]: V → Book • [0,1].
• Note: when VP → • V NP [0,0] is reached in Chart[0], the Scanner expands the rule, yielding VP → V • NP [0,1], but does not put V → Book • [0,1] in again.

Example


Chart[1]

V → book • [0,1]  Scanner
VP → V • [0,1]  Completer
VP → V • NP [0,1]  Completer
S → VP • [0,1]  Completer
NP → • Det Nom [1,1]  Predictor
NP → • PropN [1,1]  Predictor

V → book • is passed to the Completer, which finds 2 states in Chart[0] whose left corner is V and adds them to Chart[1], moving the dots to the right.

• When VP → V • is itself processed by the Completer, S → VP • is added to Chart[1], since VP is a left corner of S.
• The last 2 states in Chart[1] are added by the Predictor when VP → V • NP is processed.
• And so on…

Example

What is it?

• What kind of parser did we just describe? (trick question)
 – Earley parser… yes
 – Not a parser: a recognizer!
• The presence of an S state with the right attributes in the right place indicates a successful recognition.
• But no parse tree… no parser.

How do we retrieve the parses at the end?

• Augment the Completer to add pointers to the prior states it advances, as a field in the current state.
 – I.e., what state did we advance here?
 – Read the pointers back from the final state.
• Do we NEED the pointers?
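One way to sketch that augmentation while keeping State frozen and hashable is a side table keyed by state; whether the pointers live in the state or beside it is an implementation choice, not something the slides fix.

```python
from collections import defaultdict

backpointers = defaultdict(list)   # advanced state -> (waiting, completed) pairs

def completer_with_pointers(state, chart, enqueue):
    for waiting in list(chart[state.start]):
        if not waiting.complete() and waiting.next_symbol() == state.lhs:
            advanced = State(waiting.lhs, waiting.rhs, waiting.dot + 1,
                             waiting.start, state.end)
            backpointers[advanced].append((waiting, state))
            enqueue(advanced, state.end)

# Walking backpointers from the final state recovers the tree(s); without
# them the chart only tells us *that* the input is in the language.
```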

Useful Properties

• Error handling
• Alternative control strategies

Error Handling

• What happens when we look at the contents of the last table column and don't find an S → α • rule?
 – Is it a total loss? No…
 – The chart contains every constituent and combination of constituents possible for the input, given the grammar.
• Also useful for partial parsing or shallow parsing, used in information extraction.

Alternative Control Strategies

• Change the Earley top-down strategy to bottom-up, or
• Change to a best-first strategy based on the probabilities of constituents:
 – Compute and store probabilities of constituents in the chart as you parse
 – Then, instead of expanding states in fixed order, allow probabilities to control the order of expansion

Summing Up

• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers.
• Solutions:
 – Combine top-down predictions with bottom-up look-ahead
 – Use dynamic programming
 – Example: the Earley algorithm
• Next time: Read Ch. 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 5: CKY and Earley Algorithms Chapter 13

5

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull The key for the NP is that you want the recursive option after any base case Duhh

6

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull What happens withndash Book that flight

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 6: CKY and Earley Algorithms Chapter 13

6

Rule Ordering

bull S -gt Aux NP VPbull S -gt NP VPbull NP -gt Det Nominalbull NP -gt NP PP

bull What happens withndash Book that flight

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 7: CKY and Earley Algorithms Chapter 13

7

bull Solutionsndash Rewrite the grammar (automatically) to a weakly equivalent

one which is not left-recursiveeg The man on the hill with the telescopehellip

NP NP PP

NP Nom PP

NP Nom

hellipbecomeshellip

NP Nom NPrsquo

NPrsquo PP NPrsquo

NPrsquo e

bull This may make rules unnatural

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 8: CKY and Earley Algorithms Chapter 13

8

ndash Harder to eliminate non-immediate left recursionndash NP --gt Nom PPndash Nom --gt NP

ndash Fix depth of search explicitlyndash Rule ordering non-recursive rules first

NP --gt Det Nom

NP --gt NP PP

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

26

CKY

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table

So a non-terminal spanning an entire string will sit in cell [0,n]. Hopefully an S

If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j, for some k

27

CKY

Meaning that for a rule like A → B C we should look for a B in [i,k] and a C in [k,j]

In other words, if we think there might be an A spanning i,j in the input… AND

A → B C is a rule in the grammar, THEN

there must be a B in [i,k] and a C in [k,j] for some i < k < j

28

CKY

So to fill the table, loop over the cell [i,j] values in some systematic way. What constraint should we put on that systematic search?

For each cell, loop over the appropriate k values to search for things to add

29

CKY Algorithm
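The pseudocode figure for this slide is not reproduced in the transcript. As a stand-in, here is a small CKY recognizer sketch in Python; the grammar encoding (a dict from right-hand-side pairs (B, C) to the set of left-hand sides A) and the lexicon encoding are assumptions made for illustration.

from collections import defaultdict

def cky_recognize(words, grammar, lexicon, start="S"):
    # grammar: {(B, C): {A, ...}} for rules A -> B C (CNF assumed)
    # lexicon: {word: {A, ...}} for rules A -> word
    # table[(i, j)] holds the non-terminals spanning words i..j
    n = len(words)
    table = defaultdict(set)
    for j in range(1, n + 1):                      # columns left to right
        table[(j - 1, j)] |= lexicon.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):             # shorter spans before longer
            for k in range(i + 1, j):              # all split points
                for B in table[(i, k)]:
                    for C in table[(k, j)]:
                        table[(i, j)] |= grammar.get((B, C), set())
    return start in table[(0, n)]

The loop order mirrors the Note slide that follows: a column at a time, left to right, bottom to top.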

30

Note

We arranged the loops to fill the table a column at a time, from left to right, bottom to top. This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below)

It's somewhat natural in that it processes the input left to right, a word at a time. Known as online

31

Example

Filling column 5

[A sequence of figure-only slides showing the CKY chart for the example being filled in, cell by cell]

37

CKY Notes

Since it's bottom-up, CKY populates the table with a lot of phantom constituents: segments that by themselves are constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis

38

Earley's Algorithm

• Uses dynamic programming to do parallel top-down search in (worst case) O(N³) time

• First, L2R pass fills out a chart with N+1 state sets (N = the number of words in the input)
– Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
– For each word position, chart contains a set of states representing all partial parse trees generated to date. E.g., chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

• Fills a table in a single sweep over the input words
– Table is length N+1; N is number of words
– Table entries represent:

• Completed constituents and their locations

• In-progress constituents

• Predicted constituents

40

States

• The table entries are called states and are represented with dotted rules:

S -> • VP        A VP is predicted

NP -> Det • Nominal    An NP is in progress

VP -> V NP •      A VP has been found
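One way to make this concrete: a dotted rule plus its positions fits in a small record. A sketch (the field and method names are ours, not the book's); the [x,y] positions are explained on the next slide.

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    # State("VP", ("V", "NP"), 1, 0, 1) encodes "VP -> V . NP, [0,1]"
    lhs: str      # left-hand side of the rule
    rhs: tuple    # right-hand-side symbols
    dot: int      # how many rhs symbols have been recognized so far
    start: int    # where the constituent begins
    end: int      # where the dot lies

    def complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.complete() else self.rhs[self.dot]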

41

States/Locations

• It would be nice to know where these things are in the input, so… [x,y] tells us where the state begins (x) and where the dot lies (y) w.r.t. the input

S -> • VP [0,0]    A VP is predicted at the start of the sentence

NP -> Det • Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2

VP -> V NP • [0,3]    A VP has been found starting at 0 and ending at 3

42

S --> • VP [0,0]
– First 0 means S constituent begins at the start of the input
– Second 0 means the dot is there too
– So this is a top-down prediction

NP --> Det • Nom [1,2]
– the NP begins at position 1
– the dot is at position 2
– so Det has been successfully parsed
– Nom predicted next

0 Book 1 that 2 flight 3

43

VP --> V NP • [0,3]
– Successful VP parse of entire input

44

Successful Parse

• Final answer found by looking at last entry in chart

• If entry resembles S --> α • [0,N] then input parsed successfully

• But note that chart will also contain a record of all possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place

• In this case, there should be an S state in the final column that spans from 0 to n+1 and is complete

• If that's the case, you're done
– S → α • [0,n+1]

46

Earley

• So sweep through the table from 0 to n+1…
– New predicted states are created by starting top-down from S. In each case, use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol. A new prediction is made for every rule that has that non-terminal as its left-hand side
– New incomplete states are created by advancing existing states as new constituents are discovered
– New complete states are created in the same way

47

Earley

• More specifically…
1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. When you are out of words, look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

• Move through each set of states in order, applying one of three operators to each state:
– predictor: add predictions to the chart
– scanner: read input and add corresponding state to chart
– completer: move dot to right when new constituent found

• Results (new states) added to current or next set of states in chart

• No backtracking and no states removed: keep complete history of parse

49

Core Earley Code

50

Earley Code

51

Predictor

• Intuition: new states represent top-down expectations

• Applied when non-part-of-speech non-terminals are to the right of a dot:
S --> • VP [0,0]

• Adds new states to current chart
– One new state for each expansion of the non-terminal in the grammar:

VP --> • V [0,0]

VP --> • V NP [0,0]

52

Scanner

• New states for predicted part of speech
• Applicable when part of speech is to the right of a dot:

VP --> • V NP [0,0] 'Book…'

• Looks at current word in input
• If match, adds state(s) to next chart:

VP --> V • NP [0,1]

53

Completer

• Intuition: parser has discovered a constituent, so must find and advance all states that were waiting for this

• Applied when dot has reached right end of rule:
NP --> Det Nom • [1,3]

• Find all states w/ dot at 1 and expecting an NP:
VP --> V • NP [0,1]

• Adds new (completed) state(s) to current chart:
VP --> V NP • [0,3]

54

Core Earley Code

55

Earley Code
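The code figures on the Core Earley Code and Earley Code slides are not reproduced in the transcript. Here is a compact sketch of the three operators and the outer loop, built on the State record sketched earlier. GRAMMAR, POS_TAGS, and parts_of_speech are assumed inputs; one concrete encoding is shown after the grammar fragment on the next slide.

def earley(words, GRAMMAR, POS_TAGS, parts_of_speech):
    chart = [[] for _ in range(len(words) + 1)]

    def enqueue(state, position):
        if state not in chart[position]:          # never add a duplicate state
            chart[position].append(state)

    def predictor(state):                         # top-down expectations
        for rhs in GRAMMAR[state.next_symbol()]:
            enqueue(State(state.next_symbol(), tuple(rhs), 0,
                          state.end, state.end), state.end)

    def scanner(state):                           # consult the input word
        j = state.end
        if j < len(words) and state.next_symbol() in POS_TAGS(words[j]):
            enqueue(State(state.next_symbol(), (words[j],), 1, j, j + 1), j + 1)

    def completer(state):                         # advance everyone waiting
        for old in chart[state.start]:
            if old.next_symbol() == state.lhs:
                enqueue(State(old.lhs, old.rhs, old.dot + 1,
                              old.start, state.end), state.end)

    enqueue(State("gamma", ("S",), 0, 0, 0), 0)   # dummy start state
    for i in range(len(words) + 1):
        for state in chart[i]:                    # chart[i] grows as we sweep it
            if state.complete():
                completer(state)
            elif state.next_symbol() in parts_of_speech:
                scanner(state)
            else:
                predictor(state)
    return chart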

56

0 Book 1 that 2 flight 3 (Chart[0])
• Seed chart with top-down predictions for S from grammar

γ → • S [0,0] Dummy start state

S → • NP VP [0,0] Predictor

S → • Aux NP VP [0,0] Predictor

S → • VP [0,0] Predictor

NP → • Det Nom [0,0] Predictor

NP → • PropN [0,0] Predictor

VP → • V [0,0] Predictor

VP → • V NP [0,0] Predictor

57

CFG for Fragment of English

PropN → Houston | TWA

Prep → from | to | on

NP → Det Nom

S → VP

S → Aux NP VP

S → NP VP

Nom → N
Nom → Nom N

Det → that | this | a

N → book | flight | meal | money

V → book | include | prefer

Aux → does

VP → V NP

VP → V

NP → PropN

Nom → Nom PP

PP → Prep NP
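The same fragment can be written down as plain Python dicts so the Earley sketch above has something concrete to run on; this encoding, and the lower-casing of input words, are illustrative assumptions.

GRAMMAR = {
    "S":   [["NP", "VP"], ["Aux", "NP", "VP"], ["VP"]],
    "NP":  [["Det", "Nom"], ["PropN"]],
    "Nom": [["N"], ["Nom", "N"], ["Nom", "PP"]],
    "VP":  [["V"], ["V", "NP"]],
    "PP":  [["Prep", "NP"]],
}
LEXICON = {
    "Det":   {"that", "this", "a"},
    "N":     {"book", "flight", "meal", "money"},
    "V":     {"book", "include", "prefer"},
    "Aux":   {"does"},
    "Prep":  {"from", "to", "on"},
    "PropN": {"Houston", "TWA"},
}
PARTS_OF_SPEECH = set(LEXICON)

def POS_TAGS(word):
    return {tag for tag, entries in LEXICON.items() if word.lower() in entries}

chart = earley("Book that flight".split(), GRAMMAR, POS_TAGS, PARTS_OF_SPEECH)
print(any(s.lhs == "S" and s.complete() and s.start == 0 for s in chart[-1]))
# True: a complete S spans the whole input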

58

• When the dummy start state is processed, it's passed to Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0]

• When VP --> • V [0,0] is reached, Scanner is called, which consults the first word of the input, Book, and adds the first state to Chart[1]: V --> Book • [0,1]

• Note: when VP --> • V NP [0,0] is reached in Chart[0], Scanner expands the rule, yielding VP --> V • NP [0,1], but does not put V --> Book • [0,1] in again

59

Example

60

Chart[1]

V → book • [0,1] Scanner

VP → V • [0,1] Completer

VP → V • NP [0,1] Completer

S → VP • [0,1] Completer

NP → • Det Nom [1,1] Predictor

NP → • PropN [1,1] Predictor

V --> book • is passed to Completer, which finds 2 states in Chart[0] whose left corner is V and adds them to Chart[1], moving dots to the right

61

• When VP → V • is itself processed by the Completer, S → VP • is added to Chart[1], since VP is a left corner of S

• Last 2 rules in Chart[1] are added by Predictor when VP → V • NP is processed

• And so on…

62

Example

63

Example

64

What is it?

• What kind of parser did we just describe? (trick question)
– Earley parser… yes
– Not a parser – a recognizer

• The presence of an S state with the right attributes in the right place indicates a successful recognition

• But no parse tree… no parser

65

How do we retrieve the parses at the end?

• Augment the Completer to add a ptr to the prior state it advances, as a field in the current state
– I.e., what state did we advance here?
– Read the ptrs back from the final state

• Do we NEED the pointers?
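A sketch of that augmentation, reusing the State and chart representations from the earlier sketch; the backpointers table and its layout are our own invention for illustration.

def completer_with_pointers(state, chart, enqueue, backpointers):
    for old in chart[state.start]:
        if old.next_symbol() == state.lhs:
            new = State(old.lhs, old.rhs, old.dot + 1, old.start, state.end)
            enqueue(new, state.end)
            # record how we got here: `old` was advanced over completed `state`
            backpointers.setdefault(new, []).append((old, state))

Reading the recorded (old, state) pairs back from the final S state reconstructs the tree(s).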

66

67

Useful Properties

• Error handling
• Alternative control strategies

68

Error Handling

• What happens when we look at the contents of the last table column and don't find an S → α • state?
– Is it a total loss? No
– Chart contains every constituent and combination of constituents possible for the input, given the grammar

• Also useful for partial parsing or shallow parsing used in information extraction (a small sketch follows)
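For instance, with the chart built by the earlier Earley sketch, every complete state is a constituent that was found somewhere in the input, so partial parses can be read off directly:

def complete_constituents(chart):
    # every complete state, including lexical ones, is a found constituent
    return [(s.lhs, s.start, s.end)
            for column in chart for s in column if s.complete()]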

69

Alternative Control Strategies

• Change Earley top-down strategy to bottom-up, or
• Change to best-first strategy based on the probabilities of constituents
– Compute and store probabilities of constituents in the chart as you parse
– Then, instead of expanding states in fixed order, allow probabilities to control order of expansion (sketched below)
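A minimal sketch of the best-first idea, assuming a score(state) function that returns a constituent probability and an operators(state) function bundling predictor/scanner/completer; both names are ours.

import heapq

def best_first(start_state, score, operators):
    # agenda ordered by probability; scores are negated because heapq
    # is a min-heap, and seq breaks ties so states are never compared
    agenda, seq, seen = [(-score(start_state), 0, start_state)], 0, set()
    while agenda:
        _, _, state = heapq.heappop(agenda)
        if state in seen:
            continue
        seen.add(state)
        for new in operators(state):   # whatever the three operators yield
            seq += 1
            heapq.heappush(agenda, (-score(new), seq, new))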

70

Summing Up

• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers

• Solutions:
– Combine top-down predictions with bottom-up look-ahead
– Use dynamic programming
– Example: the Earley algorithm

• Next time: Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 9: CKY and Earley Algorithms Chapter 13

9

Structural ambiguity

bull Multiple legal structuresndash Attachment (eg I saw a man on a hill with a telescope)ndash Coordination (eg younger cats and dogs)ndash NP bracketing (eg Spanish language teachers)

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 10: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 10

Ambiguity

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 11: CKY and Earley Algorithms Chapter 13

11

bull Solution ndash Return all possible parses and disambiguate using ldquoother

methodsrdquo

12

Avoiding Repeated Work

bull Parsing is hard and slow Itrsquos wasteful to redo stuff over and over and over

bull Consider an attempt to top-down parse the following as an NP

A flight from India to Houston on TWA

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 13: CKY and Earley Algorithms Chapter 13

13

flight

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 14: CKY and Earley Algorithms Chapter 13

14

flight

flight

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 15: CKY and Earley Algorithms Chapter 13

15

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 16: CKY and Earley Algorithms Chapter 13

16

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

25

CNF Conversion

26

CKY

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table.

So a non-terminal spanning an entire string will sit in cell [0,n]. Hopefully an S.

If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j, for some k.

27

CKY

Meaning that for a rule like A → B C we should look for a B in [i,k] and a C in [k,j].

In other words, if we think there might be an A spanning i,j in the input… AND

A → B C is a rule in the grammar, THEN

there must be a B in [i,k] and a C in [k,j] for some i < k < j.

28

CKY

So to fill the table, loop over the cell [i,j] values in some systematic way. What constraint should we put on that systematic search?

For each cell, loop over the appropriate k values to search for things to add.

29

CKY Algorithm

30
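The algorithm figure itself does not survive in this transcript, so here is a stand-in: a minimal CKY recognizer sketch in Python. The grammar encoding (dicts keyed by words and by right-hand-side pairs) and the toy CNF grammar are assumptions made for illustration, not the textbook's code:

from collections import defaultdict

def cky_recognize(words, lexical, binary, start="S"):
    """CKY recognition for a CNF grammar.
    lexical: word -> set of non-terminals (A -> w rules)
    binary:  (B, C) -> set of non-terminals (A -> B C rules)"""
    n = len(words)
    table = defaultdict(set)              # table[(i, j)] = non-terminals spanning i..j
    for j in range(1, n + 1):             # fill a column at a time, left to right
        table[(j - 1, j)] = set(lexical.get(words[j - 1], ()))
        for i in range(j - 2, -1, -1):    # within a column, bottom to top
            for k in range(i + 1, j):     # split point: B in [i,k], C in [k,j]
                for B in table[(i, k)]:
                    for C in table[(k, j)]:
                        table[(i, j)] |= binary.get((B, C), set())
    return start in table[(0, n)]         # is there an S spanning the whole input?

# Toy CNF grammar for "book that flight" (unit productions already folded in)
lexical = {"book": {"V", "N", "VP", "S"}, "that": {"Det"}, "flight": {"N", "Nom"}}
binary = {("Det", "Nom"): {"NP"}, ("V", "NP"): {"VP", "S"}}
print(cky_recognize("book that flight".split(), lexical, binary))   # True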

Note

We arranged the loops to fill the table a column at a time, from left to right, bottom to top. This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below).

It's somewhat natural in that it processes the input left to right, a word at a time. Known as online.

31

Example

32

Example

Filling column 5

33

Example

34

Example

35

Example

36

Example

37

CKY Notes

Since it's bottom-up, CKY populates the table with a lot of phantom constituents: segments that by themselves are constituents but cannot really occur in the context in which they are being suggested.

To avoid this we can switch to a top-down control strategy.

Or we can add some kind of filtering that blocks constituents where they cannot happen in a final analysis.

38

Earley's Algorithm

• Uses dynamic programming to do parallel top-down search in (worst case) O(N³) time

• A first left-to-right pass fills out a chart with N+1 state sets (N = the number of words in the input)
– Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
– For each word position, the chart contains the set of states representing all partial parse trees generated to date. E.g., chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

• Fills a table in a single sweep over the input words
– Table is length N+1; N is the number of words
– Table entries represent:
• Completed constituents and their locations
• In-progress constituents
• Predicted constituents

40

States

• The table entries are called states and are represented with dotted rules:

S → • VP (a VP is predicted)

NP → Det • Nominal (an NP is in progress)

VP → V NP • (a VP has been found)

41

States/Locations

• It would be nice to know where these things are in the input, so… [x,y] tells us where the state begins (x) and where the dot lies (y) w.r.t. the input:

S → • VP [0,0] (a VP is predicted at the start of the sentence)

NP → Det • Nominal [1,2] (an NP is in progress; the Det goes from 1 to 2)

VP → V NP • [0,3] (a VP has been found starting at 0 and ending at 3)

42
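To make the state representation concrete, here is a minimal Python sketch (the field names are assumptions of this sketch, not a prescribed data structure) pairing a dotted rule with its [x,y] span:

from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    lhs: str        # left-hand side of the rule, e.g. "NP"
    rhs: tuple      # right-hand-side symbols, e.g. ("Det", "Nominal")
    dot: int        # how many RHS symbols have been recognized so far
    start: int      # x: where the state begins in the input
    end: int        # y: where the dot lies in the input

    def complete(self):
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.complete() else self.rhs[self.dot]

# NP -> Det . Nominal, [1,2]: the Det spans 1..2; a Nominal is expected next
s = State("NP", ("Det", "Nominal"), dot=1, start=1, end=2)
print(s.next_symbol())   # "Nominal"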

S → • VP [0,0]
– First 0 means the S constituent begins at the start of the input
– Second 0 means the dot is there too
– So this is a top-down prediction

NP → Det • Nom [1,2]
– the NP begins at position 1
– the dot is at position 2
– so Det has been successfully parsed
– Nom predicted next

0 Book 1 that 2 flight 3

43

VP → V NP • [0,3]
– Successful VP parse of entire input

44

Successful Parse

• Final answer found by looking at the last entry in the chart
• If an entry resembles S → α • [0,N], then the input was parsed successfully
• But note that the chart will also contain a record of all possible parses of the input string given the grammar, not just the successful one(s)

45

Earley

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place

• In this case there should be an S state in the final column that spans from 0 to n and is complete

• If that's the case, you're done:
– S → α • [0,n]

46

Earley

• So sweep through the table from 0 to n…
– New predicted states are created by starting top-down from S. In each case, use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol. A new prediction is made for every rule that has that non-terminal as its left-hand side.
– New incomplete states are created by advancing existing states as new constituents are discovered.
– New complete states are created in the same way.

47

Earley

• More specifically…
1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. When you are out of words, look in the final chart entry, chart[N], to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

• Move through each set of states in order, applying one of three operators to each state:
– predictor: add predictions to the chart
– scanner: read input and add the corresponding state to the chart
– completer: move the dot to the right when a new constituent is found
• Results (new states) are added to the current or next set of states in the chart
• No backtracking and no states removed: keep the complete history of the parse

49

Core Earley Code

50

Earley Code

51

Predictor

• Intuition: new states represent top-down expectations

• Applied when non-part-of-speech non-terminals are to the right of a dot:
S → • VP [0,0]

• Adds new states to the current chart
– One new state for each expansion of the non-terminal in the grammar:
VP → • V [0,0]
VP → • V NP [0,0]

52

Scanner

• New states for predicted parts of speech
• Applicable when a part of speech is to the right of a dot:
VP → • V NP [0,0] 'Book…'

• Looks at the current word in the input
• If it matches, adds state(s) to the next chart:
VP → V • NP [0,1]

53

Completer

• Intuition: the parser has discovered a constituent, so it must find and advance all the states that were waiting for this
• Applied when the dot has reached the right end of a rule:
NP → Det Nom • [1,3]
• Find all states w/ dot at 1 and expecting an NP:
VP → V • NP [0,1]
• Adds new (completed) state(s) to the current chart:
VP → V NP • [0,3]

54

Core Earley Code

55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart[0])
• Seed the chart with top-down predictions for S from the grammar:

γ → • S [0,0] Dummy start state
S → • NP VP [0,0] Predictor
S → • Aux NP VP [0,0] Predictor
S → • VP [0,0] Predictor
NP → • Det Nom [0,0] Predictor
NP → • PropN [0,0] Predictor
VP → • V [0,0] Predictor
VP → • V NP [0,0] Predictor

57

CFG for Fragment of English

S → NP VP
S → Aux NP VP
S → VP
NP → Det Nom
NP → PropN
Nom → N
Nom → Nom N
Nom → Nom PP
VP → V
VP → V NP
PP → Prep NP
Det → that | this | a
N → book | flight | meal | money
V → book | include | prefer
Aux → does
Prep → from | to | on
PropN → Houston | TWA

58
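The book's Earley code figures also do not survive in this transcript, so here is a stand-in: a compact Earley recognizer sketch in Python over a fragment of the grammar above. The grammar encoding, the gamma dummy state, and the POS set are assumptions of this sketch, not the textbook's implementation:

GRAMMAR = {                        # a fragment of the grammar above
    "S":   [["NP", "VP"], ["VP"]],
    "NP":  [["Det", "Nom"]],
    "Nom": [["N"]],
    "VP":  [["V"], ["V", "NP"]],
}
LEXICON = {"book": {"V", "N"}, "that": {"Det"}, "flight": {"N"}}
POS = {"Det", "N", "V"}            # pre-terminal categories handled by the Scanner

def earley_recognize(words):
    n = len(words)
    # A state is (lhs, rhs, dot, start); chart[i] holds states whose dot is at i.
    chart = [[] for _ in range(n + 1)]
    def add(i, state):
        if state not in chart[i]:
            chart[i].append(state)
    add(0, ("gamma", ("S",), 0, 0))             # dummy start state: gamma -> . S
    for i in range(n + 1):
        for lhs, rhs, dot, start in chart[i]:   # chart[i] grows as we iterate
            if dot < len(rhs) and rhs[dot] in GRAMMAR:     # PREDICTOR
                for expansion in GRAMMAR[rhs[dot]]:
                    add(i, (rhs[dot], tuple(expansion), 0, i))
            elif dot < len(rhs) and rhs[dot] in POS:       # SCANNER
                if i < n and rhs[dot] in LEXICON.get(words[i], ()):
                    add(i + 1, (rhs[dot], (words[i],), 1, i))  # e.g. V -> book .
            elif dot == len(rhs):                          # COMPLETER
                for l2, r2, d2, s2 in list(chart[start]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        add(i, (l2, r2, d2 + 1, s2))
    return ("gamma", ("S",), 1, 0) in chart[n]  # complete gamma -> S . over 0..n

print(earley_recognize("book that flight".split()))   # True

No state is ever removed, so the final chart keeps the complete history of the parse, exactly as the procedure slide above describes.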

• When the dummy start state is processed, it's passed to the Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0]

• When VP → • V [0,0] is reached, the Scanner is called; it consults the first word of the input, Book, and adds the first state to Chart[1]: V → Book • [0,1]

• Note: when VP → • V NP [0,0] is reached in Chart[0], the Scanner advances the rule, yielding VP → V • NP [0,1], but does not put V → Book • [0,1] in again

59

Example

60

Chart[1]

V → book • [0,1] Scanner
VP → V • [0,1] Completer
VP → V • NP [0,1] Completer
S → VP • [0,1] Completer
NP → • Det Nom [1,1] Predictor
NP → • PropN [1,1] Predictor

V → book • is passed to the Completer, which finds 2 states in Chart[0] whose left corner is V and adds them to Chart[1], moving the dots to the right

61

• When VP → V • is itself processed by the Completer, S → VP • is added to Chart[1], since VP is a left corner of S

• The last 2 rules in Chart[1] are added by the Predictor when VP → V • NP is processed

• And so on…

62

Example

63

Example

64

What is it?

• What kind of parser did we just describe? (trick question)
– An Earley parser… yes
– Not a parser: a recognizer

• The presence of an S state with the right attributes in the right place indicates a successful recognition

• But no parse tree… no parser

65

How do we retrieve the parses at the end?

• Augment the Completer to add a pointer to the prior states it advances, as a field in the current state (a minimal sketch follows below)
– I.e., what state did we advance here?
– Read the pointers back from the final state

• Do we NEED the pointers?

66
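Here is a minimal sketch of that augmentation (the five-field state layout is an assumption of this sketch): each time the Completer advances a state, the completed child is appended to the advanced state's backpointer field, and trees are read back from the final state:

# A state is (lhs, rhs, dot, start, children); children holds completed sub-states.
def completer_with_pointers(chart, i, completed):
    lhs, rhs, dot, start, kids = completed
    for (l2, r2, d2, s2, k2) in list(chart[start]):
        if d2 < len(r2) and r2[d2] == lhs:
            # advance the waiting state, remembering the completed child
            chart[i].append((l2, r2, d2 + 1, s2, k2 + (completed,)))

def tree(state):
    """Read a parse tree back from a final state via its backpointers."""
    lhs, rhs, dot, start, kids = state
    return (lhs, [tree(k) for k in kids]) if kids else (lhs, list(rhs))

# Tiny demo with hand-built states for "that flight":
det = ("Det", ("that",), 1, 0, ())
n   = ("N", ("flight",), 1, 1, ())
nom = ("Nom", ("N",), 1, 1, (n,))
np  = ("NP", ("Det", "Nom"), 2, 0, (det, nom))
print(tree(np))   # ('NP', [('Det', ['that']), ('Nom', [('N', ['flight'])])])

And to the slide's closing question: strictly, no; the completed constituents are all in the chart anyway, so the trees could be reconstructed by searching it, but the pointers make retrieval cheap and direct.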

67

Useful Properties

• Error handling
• Alternative control strategies

68

Error Handling

• What happens when we look at the contents of the last table column and don't find an S → α • rule?
– Is it a total loss? No
– The chart contains every constituent and combination of constituents possible for the input, given the grammar
• Also useful for partial parsing or shallow parsing, used in information extraction

69

Alternative Control Strategies

• Change Earley's top-down strategy to bottom-up, or
• Change to a best-first strategy based on the probabilities of constituents
– Compute and store probabilities of constituents in the chart as you parse
– Then, instead of expanding states in fixed order, allow probabilities to control the order of expansion

70

Summing Up
• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers
• Solutions:
– Combine top-down predictions with bottom-up look-ahead
– Use dynamic programming
– Example: the Earley algorithm

• Next time: Read Ch. 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 17: CKY and Earley Algorithms Chapter 13

17

Dynamic Programming

bull We need a method that fills a table with partial results thatndash Does not do (avoidable) repeated workndash Does not fall prey to left-recursionndash Solves an exponential problem in polynomial time (sort of)

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 18: CKY and Earley Algorithms Chapter 13

18

Dynamic Programming

bull Create table of solutions to sub-problems (eg subtrees) as parse proceeds

bull Look up subtrees for each constituent rather than re-parsing

bull Since all parses implicitly stored all available for later disambiguation

bull Examples Cocke-Younger-Kasami (CYK) (1960) Graham-Harrison-Ruzzo (GHR) (1980) and Earley (1970) algorithms

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 19: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 19

Dynamic Programming

DP search methods fill tables with partial results and thereby Avoid doing avoidable repeated work Solve exponential problems in polynomial time

(well no not really) Efficiently store ambiguous structures with

shared sub-parts Wersquoll cover two approaches that roughly

correspond to top-down and bottom-up approaches CKY Earley

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 20: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 20

CKY Parsing

First wersquoll limit our grammar to epsilon-free binary rules (more later)

Consider the rule A BC If there is an A somewhere in the

input then there must be a B followed by a C in the input

If the A spans from i to j in the input then there must be some k st iltkltj Ie The B splits from the C someplace

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

• Uses dynamic programming to do parallel top-down search in (worst case) O(N³) time

• A first left-to-right pass fills out a chart with N+1 state sets (N = the number of words in the input)
– Think of chart entries as sitting between words in the input string, keeping track of states of the parse at these positions
– For each word position, the chart contains the set of states representing all partial parse trees generated to date; e.g., chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

• Fills a table in a single sweep over the input words
– Table is length N+1; N is the number of words
– Table entries represent:
  • Completed constituents and their locations
  • In-progress constituents
  • Predicted constituents

40

States

• The table entries are called states and are represented with dotted rules.

S → • VP    A VP is predicted

NP → Det • Nominal    An NP is in progress

VP → V NP •    A VP has been found

41

States/Locations

• It would be nice to know where these things are in the input, so… [x,y] tells us where the state begins (x) and where the dot lies (y) with respect to the input.

S → • VP [0,0]    A VP is predicted at the start of the sentence

NP → Det • Nominal [1,2]    An NP is in progress; the Det goes from 1 to 2

VP → V NP • [0,3]    A VP has been found starting at 0 and ending at 3

42
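A state, then, is just a dotted rule plus these two positions. A minimal sketch of such a representation in Python follows; the field names are assumptions for illustration, not the book's code.

```python
from dataclasses import dataclass

# A chart state: a grammar rule, how far the dot has advanced into its
# right-hand side, and the input span [start, dot_pos] covered so far.
@dataclass(frozen=True)
class State:
    lhs: str          # e.g. "NP"
    rhs: tuple        # e.g. ("Det", "Nominal")
    dot: int          # index into rhs; symbols before it have been found
    start: int        # where this constituent begins in the input
    dot_pos: int      # input position of the dot

    def complete(self) -> bool:
        return self.dot == len(self.rhs)

    def next_symbol(self):
        return None if self.complete() else self.rhs[self.dot]

# NP -> Det . Nominal [1,2]: the Det spans 1..2, a Nominal is expected next
s = State("NP", ("Det", "Nominal"), 1, 1, 2)
print(s.next_symbol(), s.complete())   # Nominal False
```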

S → • VP [0,0]
– First 0 means the S constituent begins at the start of the input
– Second 0 means the dot is there too
– So this is a top-down prediction

NP → Det • Nom [1,2]
– the NP begins at position 1
– the dot is at position 2
– so Det has been successfully parsed
– Nom predicted next

0 Book 1 that 2 flight 3

43

VP → V NP • [0,3]
– Successful VP parse of the entire input

44

Successful Parse

• Final answer found by looking at the last entry in the chart
• If an entry resembles S → α • [0,N], then the input was parsed successfully
• But note that the chart will also contain a record of all possible parses of the input string given the grammar -- not just the successful one(s)

45

Earley

• As with most dynamic programming approaches, the answer is found by looking in the table in the right place.

• In this case, there should be an S state in the final column (chart[n]) that spans the whole input, from 0 to n, and is complete.

• If that's the case, you're done:
– S → α • [0,n]

46

Earley

• So sweep through the table from 0 to n…
– New predicted states are created by starting top-down from S. In each case, use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol. A new prediction is made for every rule that has that non-terminal as its left-hand side.
– New incomplete states are created by advancing existing states as new constituents are discovered.
– New complete states are created in the same way.

47

Earley

• More specifically…
1. Predict all the states you can upfront
2. Read a word
   1. Extend states based on matches
   2. Add new predictions
   3. Go to 2
3. When you are out of words, look in the last chart entry (chart[N]) to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

• Move through each set of states in order, applying one of three operators to each state:
– predictor: add predictions to the chart
– scanner: read input and add the corresponding state to the chart
– completer: move the dot to the right when a new constituent is found

• Results (new states) are added to the current or next set of states in the chart.

• No backtracking and no states removed: keep a complete history of the parse.

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51
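Neither code figure survives in this transcript, so here is a compact Earley recognizer sketch in Python that follows the three operators described on the next slides. The grammar encoding, the state-tuple layout, and the part-of-speech handling are assumptions for illustration, not the textbook's code.

```python
# Earley recognizer sketch (assumes no epsilon-productions).
# Assumed encodings, for illustration only:
#   rules:   LHS -> list of RHS tuples, e.g. {"VP": [("V", "NP"), ("V",)]}
#   lexicon: word -> set of parts of speech, e.g. {"book": {"N", "V"}}
# A state is (lhs, rhs, dot, origin); chart[i] holds states whose dot is at i.

def earley_recognize(words, rules, lexicon, start="S"):
    n = len(words)
    pos_tags = set().union(*lexicon.values())       # pre-terminal categories
    chart = [[] for _ in range(n + 1)]

    def enqueue(state, i):
        if state not in chart[i]:
            chart[i].append(state)

    enqueue(("GAMMA", (start,), 0, 0), 0)           # dummy start state

    for i in range(n + 1):
        j = 0
        while j < len(chart[i]):                    # chart[i] grows as we work
            lhs, rhs, dot, origin = chart[i][j]
            j += 1
            if dot < len(rhs) and rhs[dot] in rules and rhs[dot] not in pos_tags:
                # Predictor: one new state per expansion of the expected non-terminal
                for expansion in rules[rhs[dot]]:
                    enqueue((rhs[dot], expansion, 0, i), i)
            elif dot < len(rhs):
                # Scanner: if the next word has the expected part of speech,
                # add the scanned state to the NEXT chart entry
                if i < n and rhs[dot] in lexicon.get(words[i], set()):
                    enqueue((rhs[dot], (words[i],), 1, i), i + 1)
            else:
                # Completer: advance every state that was waiting for this lhs
                for (l2, r2, d2, s2) in list(chart[origin]):
                    if d2 < len(r2) and r2[d2] == lhs:
                        enqueue((l2, r2, d2 + 1, s2), i)
    return ("GAMMA", (start,), 1, 0) in chart[n]

# Tiny demo grammar: S -> VP, VP -> V NP, NP -> Det Nom, Nom -> N
RULES = {"S": [("VP",)], "VP": [("V", "NP")], "NP": [("Det", "Nom")], "Nom": [("N",)]}
LEXICON = {"book": {"V"}, "that": {"Det"}, "flight": {"N"}}
print(earley_recognize("book that flight".split(), RULES, LEXICON))   # True
```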

Predictor

• Intuition: new states represent top-down expectations.

• Applied when non-part-of-speech non-terminals are to the right of a dot:
S → • VP [0,0]

• Adds new states to the current chart
– One new state for each expansion of the non-terminal in the grammar:
VP → • V [0,0]
VP → • V NP [0,0]

52

Scanner

• New states for a predicted part of speech
• Applicable when a part of speech is to the right of a dot:
VP → • V NP [0,0]    'Book…'

• Looks at the current word in the input
• If it matches, adds state(s) to the next chart:
VP → V • NP [0,1]

53

Completer

• Intuition: the parser has discovered a constituent, so it must find and advance all the states that were waiting for this.

• Applied when the dot has reached the right end of a rule:
NP → Det Nom • [1,3]

• Find all states with a dot at 1 and expecting an NP:
VP → V • NP [0,1]

• Adds new (completed) state(s) to the current chart:
VP → V NP • [0,3]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])
• Seed the chart with top-down predictions for S from the grammar:

γ → • S [0,0] Dummy start state

S → • NP VP [0,0] Predictor

S → • Aux NP VP [0,0] Predictor

S → • VP [0,0] Predictor

NP → • Det Nom [0,0] Predictor

NP → • PropN [0,0] Predictor

VP → • V [0,0] Predictor

VP → • V NP [0,0] Predictor

57

CFG for Fragment of English

S → NP VP
S → Aux NP VP
S → VP
NP → Det Nom
NP → PropN
Nom → N
Nom → Nom N
Nom → Nom PP
VP → V NP
VP → V
PP → Prep NP

Det → that | this | a
N → book | flight | meal | money
V → book | include | prefer
Aux → does
Prep → from | to | on
PropN → Houston | TWA

58
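For concreteness, this fragment can be written directly in the data format assumed by the recognizer sketch above; the RULES/LEXICON encoding is an assumption for illustration.

```python
# The grammar fragment above, in the encoding assumed by the earlier
# earley_recognize sketch: syntactic rules keyed by LHS, words in a lexicon.
RULES = {
    "S":   [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP":  [("Det", "Nom"), ("PropN",)],
    "Nom": [("N",), ("Nom", "N"), ("Nom", "PP")],
    "VP":  [("V", "NP"), ("V",)],
    "PP":  [("Prep", "NP")],
}
LEXICON = {
    "that": {"Det"}, "this": {"Det"}, "a": {"Det"},
    "book": {"N", "V"}, "flight": {"N"}, "meal": {"N"}, "money": {"N"},
    "include": {"V"}, "prefer": {"V"},
    "does": {"Aux"},
    "from": {"Prep"}, "to": {"Prep"}, "on": {"Prep"},
    "Houston": {"PropN"}, "TWA": {"PropN"},
}
# e.g. earley_recognize("book that flight".split(), RULES, LEXICON) -> True
```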

• When the dummy start state is processed, it's passed to the Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to the bottom of Chart[0].

• When VP → • V [0,0] is reached, the Scanner is called, which consults the first word of the input, Book, and adds the first state to Chart[1]: V → Book • [0,1].

• Note: when VP → • V NP [0,0] is reached in Chart[0], the Scanner expands the rule, yielding VP → V • NP [0,1], but does not put V → Book • [0,1] in again.

59

Example

60

Chart[1]

V → book • [0,1] Scanner

VP → V • [0,1] Completer

VP → V • NP [0,1] Completer

S → VP • [0,1] Completer

NP → • Det Nom [1,1] Predictor

NP → • PropN [1,1] Predictor

V → book • is passed to the Completer, which finds 2 states in Chart[0] whose left corner is V and adds them to Chart[1], moving the dots to the right.

61

• When VP → V • is itself processed by the Completer, S → VP • is added to Chart[1], since VP is a left corner of S.

• The last 2 rules in Chart[1] are added by the Predictor when VP → V • NP is processed.

• And so on…

62

Example

63

Example

64

What is it?

• What kind of parser did we just describe? (trick question)
– Earley parser… yes
– Not a parser – a recognizer

• The presence of an S state with the right attributes in the right place indicates a successful recognition.

• But no parse tree… no parser.

65

How do we retrieve the parses at the end?

• Augment the Completer to add a pointer to the prior states it advances, as a field in the current state
– I.e., what state did we advance here?
– Read the pointers back from the final state

• Do we NEED the pointers?

66
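A minimal sketch of that augmentation, reusing the state idea from the earlier sketches: each advance records the completed child state that licensed it, and trees are read off the final state. The names and structure here are illustrative, not the book's code.

```python
# Backpointer sketch: states carry the tuple of completed child states that
# advanced them. The Completer extends that tuple; trees are read back at the end.

class BPState:
    def __init__(self, lhs, rhs, dot, start, children=()):
        self.lhs, self.rhs, self.dot, self.start = lhs, rhs, dot, start
        self.children = children        # completed BPStates, one per advanced symbol

    def advanced_by(self, child):
        """What the Completer builds: this state with its dot moved over `child`."""
        return BPState(self.lhs, self.rhs, self.dot + 1, self.start,
                       self.children + (child,))

def tree(state):
    """Read the pointers back from a complete state into a bracketed tree."""
    if not state.children:              # a scanned word, e.g. V -> book .
        return (state.lhs,) + state.rhs
    return (state.lhs,) + tuple(tree(c) for c in state.children)

# VP -> V . NP advanced over a completed V and then a completed NP:
v = BPState("V", ("book",), 1, 0)
nom = BPState("Nom", ("N",), 1, 2, (BPState("N", ("flight",), 1, 2),))
np = BPState("NP", ("Det", "Nom"), 2, 1, (BPState("Det", ("that",), 1, 1), nom))
vp = BPState("VP", ("V", "NP"), 0, 0).advanced_by(v).advanced_by(np)
print(tree(vp))
# ('VP', ('V', 'book'), ('NP', ('Det', 'that'), ('Nom', ('N', 'flight'))))
```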

67

Useful Properties

• Error handling
• Alternative control strategies

68

Error Handling

• What happens when we look at the contents of the last table column and don't find an S → α • state?
– Is it a total loss? No
– The chart contains every constituent and combination of constituents possible for the input, given the grammar

• Also useful for partial parsing or shallow parsing, used in information extraction

69

Alternative Control Strategies

• Change Earley's top-down strategy to bottom-up, or
• Change to a best-first strategy based on the probabilities of constituents
– Compute and store probabilities of constituents in the chart as you parse
– Then, instead of expanding states in fixed order, allow probabilities to control the order of expansion

70
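A best-first control strategy amounts to replacing the fixed left-to-right queue with a priority queue (agenda) ordered by constituent probability. Below is a minimal sketch of that idea; the scoring is an illustrative placeholder and the sketch is not tied to any particular chart implementation.

```python
import heapq

# Best-first agenda sketch: states are popped in order of decreasing
# probability instead of strict left-to-right chart order.

def best_first_expand(initial_states, expand):
    """Pop the most probable state first; `expand` yields (prob, state) successors."""
    agenda = [(-p, i, s) for i, (p, s) in enumerate(initial_states)]
    heapq.heapify(agenda)
    seen, order, counter = set(), [], len(agenda)
    while agenda:
        neg_p, _, state = heapq.heappop(agenda)   # max-heap via negated probability
        if state in seen:
            continue
        seen.add(state)
        order.append(state)
        for p, succ in expand(state):
            counter += 1                          # unique tiebreaker for the heap
            heapq.heappush(agenda, (-p, counter, succ))
    return order

# Toy run: "NP" is more probable than "VP", so it is expanded first.
print(best_first_expand([(0.9, "NP"), (0.4, "VP")], lambda s: []))
# ['NP', 'VP']
```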

Summing Up
• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers
• Solutions:
– Combine top-down predictions with bottom-up look-ahead
– Use dynamic programming
– Example: the Earley algorithm

• Next time: Read Ch. 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earley's Algorithm
  • Earley Parsing
  • States
  • States/Locations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 21: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 21

Problem

What if your grammar isnrsquot binary As in the case of the TreeBank grammar

Convert it to binaryhellip any arbitrary CFG can be rewritten into Chomsky-Normal Form automatically

What does this mean The resulting grammar accepts (and rejects)

the same set of strings as the original grammar

But the resulting derivations (trees) are different

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 22: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 22

Problem

More specifically we want our rules to be of the formA B C

Or

A w

That is rules can expand to either 2 non-terminals or to a single terminal

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 23: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 23

Binarization Intuition

Eliminate chains of unit productions Introduce new intermediate non-

terminals into the grammar that distribute rules with length gt 2 over several rules Sohellip S A B C turns into S X C andX A BWhere X is a symbol that doesnrsquot occur

anywhere else in the the grammar

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 24: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 24

Sample L1 Grammar

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So letrsquos build a table so that an A spanning from i to j in the input is placed in cell [ij] in the table

So a non-terminal spanning an entire string will sit in cell [0 n] Hopefully an S

If we build the table bottom-up wersquoll know that the parts of the A must go from i to k and from k to j for some k

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A B C we should look for a B in [ik] and a C in [kj]

In other words if we think there might be an A spanning ij in the inputhellip AND

A B C is a rule in the grammar THEN

There must be a B in [ik] and a C in [kj] for some iltkltj

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So to fill the table loop over the cell[ij] values in some systematic way What constraint should we put on that

systematic search

For each cell loop over the appropriate k values to search for things to add

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time from left to right bottom to top This assures us that whenever wersquore

filling a cell the parts needed to fill it are already in the table (to the left and below)

Itrsquos somewhat natural in that it processes the input a left to right a word at a time Known as online

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

Page 25: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 25

CNF Conversion

Page 26: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 26

CKY

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] in the table.

So a non-terminal spanning an entire string will sit in cell [0,n]. Hopefully an S!

If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j, for some k.

Page 27: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 27

CKY

Meaning that for a rule like A → B C we should look for a B in [i,k] and a C in [k,j].

In other words, if we think there might be an A spanning i..j in the input… AND

A → B C is a rule in the grammar, THEN

There must be a B in [i,k] and a C in [k,j], for some i < k < j.

Page 28: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 28

CKY

So, to fill the table, loop over the cell [i,j] values in some systematic way. What constraint should we put on that systematic search?

For each cell, loop over the appropriate k values to search for things to add.

Page 29: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 29

CKY Algorithm
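The algorithm figure for this slide did not survive transcription, so here is a minimal sketch of the fill loop it depicts, under the assumption that the CNF grammar arrives as binary rules (A, B, C) and lexical rules (A, word); the names are my own, not the figure's.

    def cky_recognize(words, binary_rules, lexical_rules):
        # table[i][j] holds the set of non-terminals spanning words[i:j]
        n = len(words)
        table = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
        for j in range(1, n + 1):                  # columns, left to right
            for a, w in lexical_rules:             # fill the diagonal: A -> word
                if w == words[j - 1]:
                    table[j - 1][j].add(a)
            for i in range(j - 2, -1, -1):         # rows, bottom to top
                for k in range(i + 1, j):          # all split points
                    for a, b, c in binary_rules:   # A -> B C
                        if b in table[i][k] and c in table[k][j]:
                            table[i][j].add(a)
        return table

An S in table[0][n] then signals successful recognition, matching the cell [0,n] remark on the earlier CKY slides.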

Page 30: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 30

Note

We arranged the loops to fill the table a column at a time, from left to right, bottom to top. This assures us that whenever we're filling a cell, the parts needed to fill it are already in the table (to the left and below).

It's somewhat natural in that it processes the input left to right, a word at a time. Known as online.


Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 31: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 31

Example

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 32: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 32

Example

Filling column 5

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 33: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 33

Example

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 34: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 34

Example

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 35: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 35

Example

042223 Speech and Language Processing - Jurafsky and Martin 36

Example

042223 Speech and Language Processing - Jurafsky and Martin 37

CKY Notes

Since itrsquos bottom up CKY populates the table with a lot of phantom constituents Segments that by themselves are

constituents but cannot really occur in the context in which they are being suggested

To avoid this we can switch to a top-down control strategy

Or we can add some kind of filtering that blocks constituents where they can not happen in a final analysis

38

Earleyrsquos Algorithm

bull Uses dynamic programming to do parallel top-down search in (worst case) O(N3) time

bull First L2R pass fills out a chart with N+1 states (N the number of words in the input)ndash Think of chart entries as sitting between words in the input

string keeping track of states of the parse at these positionsndash For each word position chart contains set of states

representing all partial parse trees generated to date Eg chart[0] contains all partial parse trees generated at the beginning of the sentence

39

Earley Parsing

bull Fills a table in a single sweep over the input wordsndash Table is length N+1 N is number of wordsndash Table entries represent

bull Completed constituents and their locations

bull In-progress constituents

bull Predicted constituents

40

States

bull The table-entries are called states and are represented with dotted-rules

S -gt VP A VP is predicted

NP -gt Det Nominal An NP is in progress

VP -gt V NP A VP has been found

41

StatesLocations

bull It would be nice to know where these things are in the input sohellip[xy] tells us where the state begins (x) and where the dot lies (y) wrt the input

S -gt VP [00] A VP is predicted at the start of the sentence

NP -gt Det Nominal [12] An NP is in progress the Det goes from 1 to 2

VP -gt V NP [03] A VP has been found starting at 0 and ending at 3

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Up
• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers
• Solutions:
– Combine top-down predictions with bottom-up look-ahead
– Use dynamic programming
– Example: the Earley algorithm

• Next time: Read Ch. 14

Page 42: CKY and Earley Algorithms Chapter 13

42

S --gt bull VP [00] ndash First 0 means S constituent begins at the start of the inputndash Second 0 means the dot is there toondash So this is a top-down prediction

NP --gt Det bull Nom [12]ndash the NP begins at position 1ndash the dot is at position 2ndash so Det has been successfully parsedndash Nom predicted next

0 Book 1 that 2 flight 3

43

VP --gt V NP bull [03]ndash Successful VP parse of entire input

44

Successful Parse

bull Final answer found by looking at last entry in chartbull If entry resembles S --gt bull [0N] then input parsed

successfullybull But note that chart will also contain a record of all

possible parses of input string given the grammar -- not just the successful one(s)

45

Earley

bull As with most dynamic programming approaches the answer is found by looking in the table in the right place

bull In this case there should be an S state in the final column that spans from 0 to n+1 and is complete

bull If thatrsquos the case yoursquore donendash S ndash α [0n+1]

46

Earley

bull So sweep through the table from 0 to n+1hellipndash New predicted states are created by starting top-down from

S In each case use rules in the grammar to expand a state when what is being looked for is a non-terminal symbol A new prediction is made for every rule that has that non-terminal as its left-hand side

ndash New incomplete states are created by advancing existing states as new constituents are discovered

ndash New complete states are created in the same way

47

Earley

bull More specificallyhellip1 Predict all the states you can upfront

2 Read a word1 Extend states based on matches

2 Add new predictions

3 Go to 2

3 When you are out of words look in the chart at N+1 to see if you have a winner

48

Parsing Procedure for the Earley Algorithm

bull Move through each set of states in order applying one of three operators to each statendash predictor add predictions to the chartndash scanner read input and add corresponding state to chartndash completer move dot to right when new constituent found

bull Results (new states) added to current or next set of states in chart

bull No backtracking and no states removed keep complete history of parse

042223 Speech and Language Processing - Jurafsky and Martin 49

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 50

Earley Code

51

Predictor

bull Intuition new states represent top-down expectations

bull Applied when non part-of-speech non-terminals are to the right of a dotS --gt bull VP [00]

bull Adds new states to current chartndash One new state for each expansion of the non-terminal in the

grammarVP --gt bull V [00]

VP --gt bull V NP [00]

52

Scanner

• New states for predicted part of speech
• Applicable when part of speech is to the right of a dot:
VP --> • V NP [0,0] 'Book…'

• Looks at current word in input
• If match, adds state(s) to next chart:
VP --> V • NP [0,1]

53

Completer

• Intuition: parser has discovered a constituent, so must find and advance all states that were waiting for this

• Applied when dot has reached right end of rule:
NP --> Det Nom • [1,3]

• Find all states w/dot at 1 and expecting an NP:
VP --> V • NP [0,1]

• Adds new (completed) state(s) to current chart:
VP --> V NP • [0,3]

54

Core Earley Code

55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])
• Seed chart with top-down predictions for S from grammar:

γ --> • S [0,0] Dummy start state
S --> • NP VP [0,0] Predictor
S --> • Aux NP VP [0,0] Predictor
S --> • VP [0,0] Predictor
NP --> • Det Nom [0,0] Predictor
NP --> • PropN [0,0] Predictor
VP --> • V [0,0] Predictor
VP --> • V NP [0,0] Predictor

57

CFG for Fragment of English

S --> NP VP
S --> Aux NP VP
S --> VP
NP --> Det Nom
NP --> PropN
Nom --> N
Nom --> Nom N
Nom --> Nom PP
VP --> V
VP --> V NP
PP --> Prep NP

Det --> that | this | a
N --> book | flight | meal | money
V --> book | include | prefer
Aux --> does
Prep --> from | to | on
PropN --> Houston | TWA
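For reference, the same fragment written as plain Python dicts, in the format the recognizer sketch above expects (an illustrative encoding, not from the textbook):

GRAMMAR = {
    "S": [("NP", "VP"), ("Aux", "NP", "VP"), ("VP",)],
    "NP": [("Det", "Nom"), ("PropN",)],
    "Nom": [("N",), ("Nom", "N"), ("Nom", "PP")],
    "VP": [("V",), ("V", "NP")],
    "PP": [("Prep", "NP")],
}
LEXICON = {
    "that": {"Det"}, "this": {"Det"}, "a": {"Det"},
    "book": {"N", "V"},  # "book" is both a noun and a verb
    "flight": {"N"}, "meal": {"N"}, "money": {"N"},
    "include": {"V"}, "prefer": {"V"}, "does": {"Aux"},
    "from": {"Prep"}, "to": {"Prep"}, "on": {"Prep"},
    "houston": {"PropN"}, "twa": {"PropN"},
}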

58

• When dummy start state is processed, it's passed to Predictor, which produces states representing every possible expansion of S, and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

• When VP --> • V [0,0] is reached, Scanner is called, which consults first word of input, Book, and adds first state to Chart[1]: V --> Book • [0,1]

• Note: When VP --> • V NP [0,0] is reached in Chart[0], Scanner expands the rule, yielding VP --> V • NP [0,1], but does not put V --> Book • [0,1] in again

59

Example

60

Chart[1]

V --> book • [0,1] Scanner
VP --> V • [0,1] Completer
VP --> V • NP [0,1] Completer
S --> VP • [0,1] Completer
NP --> • Det Nom [1,1] Predictor
NP --> • PropN [1,1] Predictor

V --> book • passed to Completer, which finds 2 states in Chart[0] whose left corner is V, and adds them to Chart[1], moving dots to right

61

• When VP --> V • is itself processed by the Completer, S --> VP • is added to Chart[1], since VP is a left corner of S

• Last 2 rules in Chart[1] are added by Predictor when VP --> V • NP is processed

• And so on…

62

Example

63

Example

64

What is it?

• What kind of parser did we just describe? (trick question)
– Earley parser… yes
– Not a parser – a recognizer

• The presence of an S state with the right attributes in the right place indicates a successful recognition

• But no parse tree… no parser

65

How do we retrieve the parses at the end?

• Augment the Completer to add a ptr to the prior states it advances, as a field in the current state
– I.e., what state did we advance here?
– Read the ptrs back from the final state

• Do we NEED the pointers? (see the sketch below)
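A sketch of that augmentation, assuming the State encoding from the recognizer above (the field names here are hypothetical):

from collections import namedtuple

# Same dotted-rule state, plus a tuple of backpointers to the completed
# child states that licensed each dot advance.
BPState = namedtuple("BPState", "lhs rhs dot start end pointers")

def advance(waiting, completed, i):
    # Completer step: move the dot and record which child licensed it
    return BPState(waiting.lhs, waiting.rhs, waiting.dot + 1,
                   waiting.start, i, waiting.pointers + (completed,))

def tree(state):
    # Read the parse off a complete state by following the pointers
    if not state.pointers:  # a scanned word: rhs holds the word itself
        return (state.lhs, state.rhs[0])
    return (state.lhs,) + tuple(tree(child) for child in state.pointers)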

66

67

Useful Properties

• Error handling
• Alternative control strategies

68

Error Handling

• What happens when we look at the contents of the last table column and don't find an S --> α • rule?
– Is it a total loss? No.
– Chart contains every constituent and combination of constituents possible for the input, given the grammar

• Also useful for partial parsing or shallow parsing, used in information extraction (see the sketch below)
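A small illustration of why the chart is still useful after a failed parse, assuming the recognizer sketch above: every complete state is a found constituent, the raw material for partial/shallow parsing.

def found_constituents(chart):
    # Yield (label, start, end) for every completed constituent in the chart
    for end, column in enumerate(chart):
        for st in column:
            if st.dot == len(st.rhs) and st.lhs != "gamma":
                yield (st.lhs, st.start, end)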

69

Alternative Control Strategies

• Change Earley top-down strategy to bottom-up, or
• Change to best-first strategy based on the probabilities of constituents
– Compute and store probabilities of constituents in the chart as you parse
– Then, instead of expanding states in fixed order, allow probabilities to control order of expansion (see the sketch below)
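A toy illustration of best-first control using a priority queue (the probabilities are invented for the example):

import heapq

# Agenda ordered by probability: push constituents with negated scores so
# the most probable pops first, instead of Earley's fixed left-to-right sweep.
agenda = []
for prob, constituent in [(0.09, "NP [0,2]"), (0.36, "V [0,1]"), (0.20, "Nom [1,2]")]:
    heapq.heappush(agenda, (-prob, constituent))

while agenda:
    neg_prob, constituent = heapq.heappop(agenda)
    print(f"expand {constituent} (p={-neg_prob})")
# expands V [0,1] first, then Nom [1,2], then NP [0,2]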

70

Summing Up

• Ambiguity, left-recursion, and repeated re-parsing of subtrees present major problems for parsers
• Solutions:
– Combine top-down predictions with bottom-up look-ahead
– Use dynamic programming
– Example: the Earley algorithm

• Next time: Read Ch 14


bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 52: CKY and Earley Algorithms Chapter 13

52

Scanner

bull New states for predicted part of speechbull Applicable when part of speech is to the right of a dot

VP --gt bull V NP [00] lsquoBookhelliprsquo

bull Looks at current word in inputbull If match adds state(s) to next chart

VP --gt V bull NP [01]

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 53: CKY and Earley Algorithms Chapter 13

53

Completer

bull Intuition parser has discovered a constituent so must find and advance all states that were waiting for this

bull Applied when dot has reached right end of ruleNP --gt Det Nom bull [13]

bull Find all states wdot at 1 and expecting an NPVP --gt V bull NP [01]

bull Adds new (completed) state(s) to current chartVP --gt V NP bull [03]

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 54: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 54

Core Earley Code

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 55: CKY and Earley Algorithms Chapter 13

042223 Speech and Language Processing - Jurafsky and Martin 55

Earley Code

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 56: CKY and Earley Algorithms Chapter 13

56

0 Book 1 that 2 flight 3 (Chart [0])bull Seed chart with top-down predictions for S from

grammar

[00] Dummy start state

S NP VP [00] Predictor

S Aux NP VP [00] Predictor

S VP [00] Predictor

NP Det Nom [00] Predictor

NP PropN [00] Predictor

VP V [00] Predictor

VP V NP [00] Predictor

S

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 57: CKY and Earley Algorithms Chapter 13

57

CFG for Fragment of English

PropN Houston | TWA

Prep from | to | on

NP Det Nom

S VP

S Aux NP VP

S NP VP

Nom N NomNom N

Det that | this | a

N book | flight | meal | money

V book | include | prefer

Aux does

VP V NP

VP V

NP PropN

Nom Nom PP

PP Prep NP

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 58: CKY and Earley Algorithms Chapter 13

58

bull When dummy start state is processed itrsquos passed to Predictor which produces states representing every possible expansion of S and adds these and every expansion of the left corners of these trees to bottom of Chart[0]

bull When VP --gt bull V [00] is reached Scanner called which consults first word of input Book and adds first state to Chart[1] V --gt Book bull [01]

bull Note When VP --gt bull V NP [00] is reached in Chart[0] Scanner expands the rule yielding VP --gt V NP [01] but does not put V --gt Book bull [01] in again

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 59: CKY and Earley Algorithms Chapter 13

59

Example

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 60: CKY and Earley Algorithms Chapter 13

60

Chart[1]

V book [01] Scanner

VP V [01] Completer

VP V NP [01] Completer

S VP [01] Completer

NP Det Nom [11] Predictor

NP PropN [11] Predictor

V--gt book passed to Completer which finds 2 states in Chart[00] whose left corner is V and adds them to Chart[01] moving dots to right

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 61: CKY and Earley Algorithms Chapter 13

61

bull When VP V is itself processed by the Completer S VP is added to Chart[1] since VP is a left corner of S

bull Last 2 rules in Chart[1] are added by Predictor when VP V NP is processed

bull And so onhellip

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 62: CKY and Earley Algorithms Chapter 13

62

Example

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 63: CKY and Earley Algorithms Chapter 13

63

Example

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 64: CKY and Earley Algorithms Chapter 13

64

What is it

bull What kind of parser did we just describe (trick question)ndash Earley parserhellip yesndash Not a parser ndash a recognizer

bull The presence of an S state with the right attributes in the right place indicates a successful recognition

bull But no parse treehellip no parser

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 65: CKY and Earley Algorithms Chapter 13

65

How do we retrieve the parses at the end

bull Augment the Completer to add ptr to prior states it advances as a field in the current statendash Ie what state did we advance herendash Read the ptrs back from the final state

bull Do we NEED the pointers

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14

  • CKY and Earley Algorithms Chapter 13
  • Review
  • Left Recursion
  • Left-Recursion
  • Rule Ordering
  • Slide 6
  • Slide 7
  • Slide 8
  • Structural ambiguity
  • Slide 11
  • Avoiding Repeated Work
  • Slide 13
  • Slide 14
  • Slide 15
  • Slide 16
  • Dynamic Programming
  • Slide 18
  • Earleyrsquos Algorithm
  • Earley Parsing
  • States
  • StatesLocations
  • 0 Book 1 that 2 flight 3
  • Slide 43
  • Successful Parse
  • Earley
  • Slide 46
  • Slide 47
  • Parsing Procedure for the Earley Algorithm
  • Predictor
  • Scanner
  • Completer
  • 0 Book 1 that 2 flight 3 (Chart [0])
  • CFG for Fragment of English
  • Slide 58
  • Slide 59
  • Chart[1]
  • Slide 61
  • Slide 62
  • Slide 63
  • What is it
  • How do we retrieve the parses at the end
  • Slide 66
  • Useful Properties
  • Error Handling
  • Alternative Control Strategies
  • Summing Up
Page 66: CKY and Earley Algorithms Chapter 13

66

67

Useful Properties

bull Error handlingbull Alternative control strategies

68

Error Handling

bull What happens when we look at the contents of the last table column and dont find a S --gt rulendash Is it a total loss Nondash Chart contains every constituent and combination of

constituents possible for the input given the grammar

bull Also useful for partial parsing or shallow parsing used in information extraction

69

Alternative Control Strategies

bull Change Earley top-down strategy to bottom-up or bull Change to best-first strategy based on the

probabilities of constituents ndash Compute and store probabilities of constituents in the chart

as you parsendash Then instead of expanding states in fixed order allow

probabilities to control order of expansion

70

Summing Upbull Ambiguity left-recursion and repeated re-parsing of

subtrees present major problems for parsersbull Solutions

ndash Combine top-down predictions with bottom-up look-aheadndash Use dynamic programmingndash Example the Earley algorithm

bull Next time Read Ch 14
