Computational Grammar - PAN Localization

69
Computational Grammar Sarmad Hussain, Tafseer Ahmed, Nayyara Karamat, Shanza Nayyer Center for Research in Urdu Language Processing National University of Computer and Emerging Sciences Lahore, Pakistan www.crulp.org [email protected]

Transcript of Computational Grammar - PAN Localization

Page 1: Computational Grammar - PAN Localization

Computational Grammar

Sarmad Hussain, Tafseer Ahmed, Nayyara Karamat, Shanza NayyerCenter for Research in Urdu Language ProcessingNational University of Computer and Emerging SciencesLahore, [email protected]

Page 2: Computational Grammar - PAN Localization

www.PANL10n.net 2

How to develop a computational grammar?1. Define a basic POS set2. Identify simple constituents: phrases and

sentences 3. Develop phrasal and sentence level context-free

rules4. Add additonal information

i. Agreementii. Grammatical relationsiii. Sub-categorization frames

5. Repeat with more complex structures until analysis is stable

Page 3: Computational Grammar - PAN Localization

www.PANL10n.net 3

Part-of-Speech (POS)/Word Class

Open ClassesNounsVerbsAdjectivesAdverbs

Closed ClassesPrepositionsDeterminersPronounsConjunctionsAuxiliary VerbsParticlesNumeralsCase Markers

Page 4: Computational Grammar - PAN Localization

www.PANL10n.net 4

How many POS tags?

Penn Tree Bank = 45Brown Corpus = 87C5 tagset = 61C7 tagset = 146

Requirement dependent on the applicationDefine/refine tags as per requirement

Page 5: Computational Grammar - PAN Localization

www.PANL10n.net 5

Constituency

Words normally group together into phrasesPhrases group together to give larger phrases and sentencesHow to identify similar phrases or constituents

Similar syntactic environmentMovement

Page 6: Computational Grammar - PAN Localization

www.PANL10n.net 6

Similar Syntactic Environment

He sitsThe boy sitsThe naughty boy sitsThe three young men sitThe sitThree sitYoung sit

Page 7: Computational Grammar - PAN Localization

www.PANL10n.net 7

Movement

I am meeting with the nice old manWith the nice old man, I am meetingWith, I am meeting, the nice old manWith the, I am meeting, nice old manWith the nice, I am meeting, old man

Page 8: Computational Grammar - PAN Localization

www.PANL10n.net 8

Rule Definition using CFG

Context-Free GrammarA set of non-terminal symbols, NA set of terminal symbols, ΣA set of productions, P, such that each production is of the form A α, where A belongs to N and αis a string from (Σ U N)*A start symbol S

S AA aA | a

a, aa, aaa, aaaa, …

Page 9: Computational Grammar - PAN Localization

Grammar Development

Page 10: Computational Grammar - PAN Localization

www.PANL10n.net 10

Noun Phrase

[John] went[She] came[The big brown loyal dog] is missing[Some of the books] have got ruined by the rainThe child tried [ten ice-cream flavors]I saw [the naughty girls planning mischief in the corner].[The man who took the secret packet from me], vanished

Page 11: Computational Grammar - PAN Localization

www.PANL10n.net 11

Noun Phrase

[The book on the table] is good.Head = bookPre-Nominal Phrase = ThePost Nominal Phrase = on the table

[The man who took the secret packet] vanishedHead = manPre-Nominal Phrase = ThePost Nominal Phrase = who took the secret packet

Page 12: Computational Grammar - PAN Localization

www.PANL10n.net 12

NP – Grammar Rules

NP NPpronoun

NP NPnoun

NPnoun (PreNomP) n (PostNomP)

Page 13: Computational Grammar - PAN Localization

www.PANL10n.net 13

Pre-Nominal Phrase

A good book The nice girl Few boysFirst three hoursThe three old mice

Page 14: Computational Grammar - PAN Localization

www.PANL10n.net 14

PreNomP – Grammar Rules

PreNomP (DetP) (NumberP) (AdjP) (NounModP)

DetPa, the , an

NumberPsome, few , three, third, first three

AdjPnice, good, white, cloudy, warm

NounModPcricket( in “cricket team”), grammar (in “grammar book”)

Page 15: Computational Grammar - PAN Localization

www.PANL10n.net 15

PreNomP - Adjectives

NPnoun (PreNomP) n (PostNomP)PreNomP (DetP) (NumberP) (AdjP) (NounModP)

AdjP adj

n: bookadj: good

good

AdjP

adj

n

NPnoun

PreNomPbook

Page 16: Computational Grammar - PAN Localization

www.PANL10n.net 16

PreNomP - Determiner

NPnoun (PreNomP) n (PostNomP)PreNomP (DetP) (NumberP) (AdjP) (NounModP) AdjP adj

DetP det

n: bookadj: good

det: aa

DetP

det

book

NPnoun

PreNomP n

adjgood

AdjP

Page 17: Computational Grammar - PAN Localization

www.PANL10n.net 17

Over-Generation

NPnoun (PreNomP) n (PostNomP)PreNomP (DetP) (NumberP) (AdjP) (NounModP) AdjP adj

DetP det

n: bookadj: gooddet: a

n: books a

DetP

det

book

NPnoun

PreNomP n

adjgood

AdjP

a

DetP

det

books

NPnoun

PreNomP n

adjgood

AdjP

Page 18: Computational Grammar - PAN Localization

www.PANL10n.net 18

Agreement Problem

A bookA books

The bookThe books

Page 19: Computational Grammar - PAN Localization

www.PANL10n.net 19

Solution

CFG is not sufficientNeed to augment it Many frameworks availableWe look at Lexical Functional Grammar

Page 20: Computational Grammar - PAN Localization

www.PANL10n.net 20

Lexical Functional Grammar

Defines two different syntactic structuresC(Constituent) Structure represents linear and hierarchical organization of words into phrasesF(Functional) Structure represents abstract functional syntactic organization of the sentence, representing syntactic predicate-argument structure and functional relations like subject and object

Page 21: Computational Grammar - PAN Localization

www.PANL10n.net 21

Page 22: Computational Grammar - PAN Localization

www.PANL10n.net 22

Grammatical Functions

SUBJect, OBJect, OBJ2, He(SUBJ) made a cake(OBJ)

He(SUBJ) made me(OBJ) a cake(OBJ2)

COMPlement, XCOMPlementCOMP is a closed function that contain an internal SUBJ phrase

David complained that Chris yawned. (COMP)XCOMP is an open function that does not contain an internal SUBJ phrase

David seemed to yawn. (XCOMP)

Page 23: Computational Grammar - PAN Localization

www.PANL10n.net 23

Grammatical Functions

ADJunct, XADJunctADJ is optional

He works in the morning. (ADJ)XADJ is an adjunct having a verb without subject (subject outside the clause)

Stretching his arms, David yawned. (XADJ).

OBLiqueRequired arguments, normally attached with verbs

David gave the book to Chris. (OBL)

Page 24: Computational Grammar - PAN Localization

www.PANL10n.net 24

Grammatical Functions

Governable Grammatical FunctionsThese are subcategorized (required) by predicate

SUBJ, OBJ, OBJ2, COMP, XCOMP, OBL

Modifying Grammatical FunctionsThese are not subcategorized (required) by predicate

ADJ, XADJ

Page 25: Computational Grammar - PAN Localization

www.PANL10n.net 25

Well-formedness

CompletenessF-Structure contains all the governable grammatical functions that its predicate governs: eat<SUBJ,OBJ>They eat an appleThey eat

CoherenceF-Structure disallows extra governable grammatical functions that its predicate does not governsrun<SUBJ>

They runThey run stadium

ConsistencyAn Attribute in F-Structure can have at most one value

A boy runsA boys runs

Page 26: Computational Grammar - PAN Localization

www.PANL10n.net 26

Sample F-Structures

Good book

Good books⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎦

⎤⎢⎣

⎡ good''PREDADJUNCT{NOM,ACC}CASE

SGNUMbook''PRED

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎦

⎤⎢⎣

⎡ good''PREDADJUNCT{NOM,ACC}CASE

PLNUMbook''PRED

Page 27: Computational Grammar - PAN Localization

www.PANL10n.net 27

PreNomP - Adjectives

NPnoun (PreNomP) n (PostNomP)PreNomP (DetP) (NumberP) (AdjP) (NounModP)

AdjP adj

n: bookadj: good

good

AdjP

adj

n

NPnoun

PreNomPbook

Page 28: Computational Grammar - PAN Localization

www.PANL10n.net 28

Lexicon

good: adj, ↑ PRED = ‘good’

book: noun, ↑PRED = ‘book’, ↑ NUM = SG, ↑CASE = {NOM,ACC}

books: noun, ↑PRED = ‘book’, ↑ NUM = PL, ↑CASE = {NOM,ACC}

Page 29: Computational Grammar - PAN Localization

www.PANL10n.net 29

NP – Grammar Rules

Simple Syntax:

NP NPnounNPnoun (PreNomP) n (PostNomP)

With Functional Annotation:

NP NPnoun:↑=↓; NPnoun (PreNomP:↑=↓;) n:↑=↓; (PostNomP:↑=↓;)PreNomP (AdjP:↑ ADJUNCT = ↓;) (NumberP) (AdjP) (NounModP) AdjP adj:↑ = ↓;

Page 30: Computational Grammar - PAN Localization

www.PANL10n.net 30

Unification

Unification is a (partial) operation on feature structures that combines two feature structures such that the new feature structure contains all the information of the original two, and nothing more. If there is no smallest feature structure that is subsumed by both then we say that the unification is undefined.

Page 31: Computational Grammar - PAN Localization

www.PANL10n.net 31

Unification Examples

⎥⎥⎥⎥

⎢⎢⎢⎢

=3PERS

SGNUM

⎥⎦

⎤⎢⎣

⎡⎥⎦

⎤⎢⎣

⎡ SGNUMandPLSG,NUM ⎥⎦

⎤⎢⎣

⎡= SGNUM

⎥⎦

⎤⎢⎣

⎡⎥⎦

⎤⎢⎣

⎡ 3PERSandSGNUM

⎥⎦

⎤⎢⎣

⎡⎥⎦

⎤⎢⎣

⎡ PLNUMandSGNUM UNDEFINED=

Page 32: Computational Grammar - PAN Localization

www.PANL10n.net 32

Sample F-Structures

Good book

Good books⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎦

⎤⎢⎣

⎡ good''PREDADJUNCT{NOM,ACC}CASE

SGNUMbook''PRED

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎦

⎤⎢⎣

⎡ good''PREDADJUNCT{NOM,ACC}CASE

PLNUMbook''PRED

Page 33: Computational Grammar - PAN Localization

www.PANL10n.net 33

Over-Generation

NPnoun (PreNomP) n (PostNomP)PreNomP (DetP) (NumberP) (AdjP) (NounModP) AdjP adj

DetP det

n: bookadj: gooddet: a

n: books a

DetP

det

book

NPnoun

PreNomP n

adjgood

AdjP

a

DetP

det

books

NPnoun

PreNomP n

adjgood

AdjP

Page 34: Computational Grammar - PAN Localization

www.PANL10n.net 34

NP – Lexicon and Grammar (2)

a: det, ↑ DEFINITE = NEG, ↑ NUM = SG.the: det, ↑ DEFINITE = POS, ↑ NUM = {SG,PL}.

PreNomP (DetP: ↑ SPEC = ↓, ↑NUM =c ↓NUM;) (AdjP:↑ ADJUNCT = ↓;)

DetP det:↑=↓; .

Page 35: Computational Grammar - PAN Localization

www.PANL10n.net 35

Agreement

A book

The books ⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥

⎢⎢⎢⎢

SGNUMNEGDEFINITESPEC

{NOM,ACC}CASESGNUM

book''PRED

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥

⎢⎢⎢⎢

PL}{SG,NUMPOSDEFINITESPEC

ACC}{NOM,CASEPLNUM

book''PRED

Page 36: Computational Grammar - PAN Localization

www.PANL10n.net 36

Agreement

A books

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥

⎢⎢⎢⎢

SG*NUMNEGDEFINITESPEC

{NOM,ACC}CASEPL*NUM

book''PRED

Page 37: Computational Grammar - PAN Localization

www.PANL10n.net 37

Sentence

John drives the carJohn sleeps

S NP VPVP v (NP)

Page 38: Computational Grammar - PAN Localization

www.PANL10n.net 38

Sentence with F-Descriptions

S NP: ↑ SUBJ= ↓; VP:↑=↓; .VP v:↑ =↓; (NP: ↑ OBJ=↓;) .

sleeps: v, ↑ PRED = ‘sleep’, …drives: v ↑ PRED = ‘drive’ , …

Page 39: Computational Grammar - PAN Localization

www.PANL10n.net 39

Sentence – Sample F-Structure

John drives the car.

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥

⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢

........PL}{SG,NUM

POSDEFINITESPEC

{NOM,ACC}CASESGNUM'car'PRED

OBJ

{NOM,ACC}CASESGNUM

John''PREDSUBJ

drive''PRED

Page 40: Computational Grammar - PAN Localization

www.PANL10n.net 40

Problem

John drivesJohn sleeps the car

Page 41: Computational Grammar - PAN Localization

www.PANL10n.net 41

Solution: Sub-Categorization Frame

sleeps: v, ↑ PRED = ‘sleep<SUBJ>’, …drives: v ↑ PRED = ‘drive<SUBJ,OBJ>’ , …

likes:v, ↑ PRED = ‘like<SUBJ,OBJ>, …likes:v, ↑ PRED = ‘like<SUBJ,XCOMP>, …

John likes the car.John likes to drive the car.

Page 42: Computational Grammar - PAN Localization

www.PANL10n.net 42

Sentence – Sample F-Structure

John drives the car.

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥

⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢

........PL}{SG,NUM

POSDEFINITESPEC

{NOM,ACC}CASESGNUM'car'PRED

OBJ

{NOM,ACC}CASESGNUM

John''PREDSUBJ

drive''PRED

Page 43: Computational Grammar - PAN Localization

www.PANL10n.net 43

Pronouns and Cases

NP NPpronounNPpronoun pronoun

He likes her John likes Mary

Her likes he Mary likes JohnShe Likes him

Page 44: Computational Grammar - PAN Localization

www.PANL10n.net 44

Pronouns : Lexicon

he: pronoun, ↑ PRED = ‘pro’, ↑ PERS = 3, ↑ NUM = SG, ↑ GEND = M, ↑ CASE = NOM

she: pronoun, ↑ PRED = ‘pro’, ↑ PERS = 3, ↑ NUM = SG, ↑GEND = F, ↑ CASE = NOM

I: pronoun, ↑ PRED = ‘pro’, ↑ PERS = 3, ↑ NUM = SG, ↑ GEND = {M,F}, ↑ CASE = NOM

him: pronoun, ↑ PRED = ‘pro’, ↑ PERS = 3, ↑ NUM = SG, ↑GEND = M, ↑ CASE = ACC

her: pronoun, ↑ PRED = ‘pro’, ↑ PERS = 3, ↑ NUM = SG, ↑GEND = F, ↑ CASE = ACC

Page 45: Computational Grammar - PAN Localization

www.PANL10n.net 45

Pronoun Case

him/her/itthem

he/ she/itthey

3rd person

youyou2nd person

meus

I we

1st person

Objective(ACC/DAT)

Subjective (NOM)Singular (SG)

Personal Pronouns (PERS)

Page 46: Computational Grammar - PAN Localization

www.PANL10n.net 46

Sentence – Revised Grammar Rules

S -> NP: ↑ SUBJ= ↓, ↓ CASE =c NOM; VP:↑=↓;

VP -> v:↑ =↓; (NP: ↑ OBJ=↓, ↓ CASE =c ACC;)

Page 47: Computational Grammar - PAN Localization

www.PANL10n.net 47

Other Phrases

Verbal Phrase with tense and aspectPrepositional PhraseAdverbial PhraseConjunctions…

Page 48: Computational Grammar - PAN Localization

www.PANL10n.net 48

Urdu Grammar – Some Remarks

Agreements within Noun Phrase

Case MarkingFree Order Language

Page 49: Computational Grammar - PAN Localization

www.PANL10n.net 49

Agreements within NP: Gender and Numberö achcha)ا larka)

achchi) ا larki)achche) ا larke)

öا * (achchi larka)ا * (achcha larki)öا * (achche larka)

Page 50: Computational Grammar - PAN Localization

www.PANL10n.net 50

NP - Urdu Lexicon and Grammar

↑ ,verb :ا PRED = ‘ ↑ ,’ا NUM = SG, ↑GEND = M.ا : verb, ↑ PRED = ‘ ↑ ,’ا NUM = SG, ↑

GEND = F.

ö : verb, ↑ PRED = ‘ ö ’, ↑ NUM = SG, ↑GEND = M.

: verb, ↑ PRED = ‘ ’, ↑ NUM = SG, ↑GEND = F.

Page 51: Computational Grammar - PAN Localization

www.PANL10n.net 51

NP - Urdu Lexicon and Grammar (2)

NPnoun -> (AdjP: ↑ ADJUNCT =↓, ↑ NUM= ↓NUM, ↑GEND = ↓ GEND,…;) n:↑=↓; .

ö ا (achcha larka)

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢

MGENDSGNUM

'achcha'PREDADJUNCT

MGENDSGNUM

lerka''PRED

Page 52: Computational Grammar - PAN Localization

www.PANL10n.net 52

Case MarkingEight (Morphological) Cases of Sanskrit

Addressing/CallingVocative relationGenitive comparisonAblative positionLocative CauseInstrumental Indirect ObjectDative Direct ObjectAccusativeSubjectNominative

RepresentsCase

Page 53: Computational Grammar - PAN Localization

www.PANL10n.net 53

Case Phrase and Case Markers

]ö [ ]۔ ] مآ[larka] [aam] khata hay

]ð ] اŠ] [ [ ] ] õ رى[را] وق [

[shikari ne] [shair ko] [[africa ke] jungle mein] [bandooq se] mara.] Š[ ]دى۔ ] وق[ ] رى

[larke ne] [shikari ko] [bandooq] di.] Š[ ]دى۔ ] وق[ ]ا

[larke ne] [usse] [bandooq] di.

Page 54: Computational Grammar - PAN Localization

www.PANL10n.net 54

Case Markers

Urdu has case markers to represent casesŠ:cm, ↑ CASE = ERG

: cm, ↑ CASE = {ACC,DAT} : cm, ↑ CASE = INST : cm, ↑ CASE = LOC

Some pronouns have case through morphology

↑ ,pronoun:ا PRED = ‘pro’, ↑ PERS = 3, NUM = SG, GEND ={M,F}, ↑ CASE = {ACC,DAT}

Page 55: Computational Grammar - PAN Localization

www.PANL10n.net 55

Case Phrase

KP NPnoun : ↑=↓, ↑CASE = NOM; .KP NPnoun : ↑=↓; cm : ↑=↓; .

Š رى (shikari ne) ð (jungle mein)

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

ERGCASEMGEND

SGNUMshikari''PRED

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

LOC_MENCASEMGEND

SGNUMjungle''PRED

Page 56: Computational Grammar - PAN Localization

www.PANL10n.net 56

Word Order

۔ د] ] [Š رى[[shikari ne] [shair ko] dekha. ۔د] Š رى][ [[shair ko] [shikari ne] daikha. ۔] [ د]Š رى[[shikari ne] daikha [shair ko].

] Š ][۔د] رى[shair ne] [shikari ko] dekha. ۔د] Š [ ] رى[[shikari ko] [shair ne] daikha.

Page 57: Computational Grammar - PAN Localization

www.PANL10n.net 57

Word Order and Shuffle Operator

S KP:↑=SUBJ=↓, ↓ CASE =c NOM;KP:↑=OBJ=↓, ↓ CASE =c ACC;VerbalP: ↑=↓, …;

S [KP:↑=SUBJ=↓, ↓ CASE =c NOM;% KP:↑=OBJ=↓, ↓ CASE =c ACC;% VerbalP: ↑=↓, …; ]

Page 58: Computational Grammar - PAN Localization

www.PANL10n.net 58

F-Structure of “He walks”

⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢⎢

⎥⎥⎥⎥⎥⎥⎥⎥⎥

⎢⎢⎢⎢⎢⎢⎢⎢⎢

><

NominativeCaseTrueAnimatedHeForm

ThirdPersonMasculineGenderSingularNumberPronounPred

Subject

FalsePerfFalseProg

PresentTenseSubjectwalkPred

Page 59: Computational Grammar - PAN Localization

www.PANL10n.net 59

Parsing Algorithms

Parsing as searchChart Parsing

Page 60: Computational Grammar - PAN Localization

www.PANL10n.net 60

Parsing as Search

Top Down AlgorithmOnly builds trees that are rooted in SWastes time building trees that don't match the input

Bottom Up AlgorithmOnly builds trees that match the inputWastes time building trees that never lead to S

Page 61: Computational Grammar - PAN Localization

www.PANL10n.net 61

Problems in Search Algorithm

Left RecursionNP NP PPVP VP PPS S and S

Left Recursion can be removed from productions of the CFG, but the transformed grammar gives different parse structures

Page 62: Computational Grammar - PAN Localization

www.PANL10n.net 62

Problems in Search AlgorithmRepeated ParsingAmbiguous Grammar results in repeated parses of same structures.

The phrases “I”, “an elephant”, “in my pajamas” are parsed twice in two different parses

Page 63: Computational Grammar - PAN Localization

www.PANL10n.net 63

Chart Parsing

Chart parsing is an example of dynamic programming.Dynamic Programming solves problem by finding solutions of sub-problems, and then combining them to form complete solution.In Chart parsing, Sub-trees are built, that are used by more than one parse trees.Chart parsing avoids Left Recursion and Repeated parsing problems.

Page 64: Computational Grammar - PAN Localization

www.PANL10n.net 64

Chart Parsing Algorithm

A chart consists of N + 1 states (where N is the number of words in the input). The algorithm goes through the states from left to right and processes each state in order applying one of the following operators to the state:

PredictorCompleterScanner

Page 65: Computational Grammar - PAN Localization

www.PANL10n.net 65

Predictor

Predictor creates new states from top-down application of rules.

S -> .VP, [0,0] applying Predictor on above results in adding the following new states on the chart.VP -> .verb, [0,0]VP -> .verb NP, [0,0]

Page 66: Computational Grammar - PAN Localization

www.PANL10n.net 66

Completer

The completer is applied to a state when its dot has reached the right end of a rule. (whole phrase is parsed.)The completer takes old states and combines them into new states.

S -> . NP VP [0,0]NP -> det noun. [0,2]Applying Cpmpleter on above results in adding the following new state on the chart.S -> NP . VP [0,2]

Page 67: Computational Grammar - PAN Localization

www.PANL10n.net 67

Scanner

The dot advances and new state is formed if there is POS Category at right of the dot, and this category is also predicted by a state (by current word of input sentence).

VP -> . verb NP [0, 1] scanner checks that current word is a verb and adds new state. VP -> verb . NP [0, 2]

Page 68: Computational Grammar - PAN Localization

Thank You

Page 69: Computational Grammar - PAN Localization

www.PANL10n.net 69

References and Further Readings

Jurafsky, Daniel. (2000), Speech and language Processing, Prentice Hall, New Jersey.Butt, Miraiam. (1999), A Grammar Writer's Cookbook. CSLI Publications, Stanford, CA.http://cslu.cse.ogi.edu/HLTsurvey/ch3node5.htmlhttp://www.coli.uni-saarland.de/~kris/nlp-with-prolog/html/node83.htmlhttp://emsah.uq.edu.au/linguistics/Working%20Papers/ananda_ling/LFG_Summary.htmhttp://www.stanford.edu/group/nasslli/courses/as-cr-da/apbook-nasslli.pdfhttp://www.sfu.ca/person/dearmond/222/222.x-bar.htm