Dependency Parsing: Parsing Algorithms. Peng Huang ([email protected])


  • Dependency Parsing: Parsing Algorithms ([email protected])

    asc irm-saa 08

  • Outline
    Introduction
      Phrase Structure Grammar
      Dependency Grammar
      Comparison and Conversion
    Dependency Parsing
      Formal definition
    Parsing Algorithms
      Introduction
      Dynamic programming
      Constraint satisfaction
      Deterministic search


  • Introduction
    Syntactic parsing: finding the correct syntactic structure of a sentence in a given formalism/grammar

    Two formalisms:
      Dependency Grammar (DG)
      Phrase Structure Grammar (PSG)


  • Phrase Structure Grammar (PSG)
    Breaks a sentence into constituents (phrases), which are then broken into smaller constituents
    Describes phrase structure and clause structure, e.g. NP, PP, VP, etc.
    Structures are often recursive
    Example: the clever tall blue-eyed old man


  • Tree Structure


  • Dependency Grammar
    Syntactic structure consists of lexical items, linked by binary asymmetric relations called dependencies
    Interested in grammatical relations between individual words (governing and dependent words)
    Does not propose a recursive structure, but rather a network of relations
    These relations can also have labels


  • Comparison
    Dependency structures explicitly represent:
      head-dependent relations (directed arcs)
      functional categories (arc labels)
      possibly some structural categories (parts of speech)

    Phrase structures explicitly represent:
      phrases (non-terminal nodes)
      structural categories (non-terminal labels)
      possibly some functional categories (grammatical functions)


  • Conversion: PSG to DG
    The head of a phrase governs/dominates all its siblings

    Heads are calculated using heuristics

    Dependency relations are established between the head of each phrase as the parent and its siblings as children

    The tree thus formed is the unlabeled dependency tree
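The conversion steps above can be sketched in code. This is a minimal illustration, assuming a toy tree format (nested tuples with a marked head child) and supplying the heads directly instead of finding them with heuristics; all names are illustrative.

```python
# A sketch of the PSG-to-DG conversion: the head word of a phrase
# becomes the parent of the head words of its sibling subtrees.
# Tree format (illustrative): (label, index_of_head_child, children),
# where a leaf is simply a word string.

def head_word(tree):
    """Return the lexical head of a constituent."""
    if isinstance(tree, str):                    # leaf: a word
        return tree
    label, head_idx, children = tree
    return head_word(children[head_idx])

def to_dependencies(tree, arcs=None):
    """Collect unlabeled (head, dependent) arcs from a headed PSG tree."""
    if arcs is None:
        arcs = []
    if isinstance(tree, str):
        return arcs
    label, head_idx, children = tree
    h = head_word(children[head_idx])
    for i, child in enumerate(children):
        if i != head_idx:
            arcs.append((h, head_word(child)))   # head governs each sibling
        to_dependencies(child, arcs)
    return arcs

# (S (NP the man) (VP saw (NP a dog))) with the verb heading S and VP
tree = ("S", 1, [("NP", 1, ["the", "man"]),
                 ("VP", 0, ["saw", ("NP", 1, ["a", "dog"])])])
print(to_dependencies(tree))
```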


  • PSG to DG


  • Conversion: DG to PSG
    Each head together with its dependents (and their dependents) forms a constituent of the sentence

    It is difficult to assign the structural categories (NP, VP, S, PP, etc.) to these derived constituents

    Every projective dependency grammar has a strongly equivalent context-free grammar, but not vice versa [Gaifman 1965]


  • Outline
    Introduction
      Phrase Structure Grammar
      Dependency Grammar
      Comparison and Conversion
    Dependency Parsing
      Formal definition
    Parsing Algorithms
      Introduction
      Dynamic programming
      Constraint satisfaction
      Deterministic search


  • Dependency Tree
    Formal definition
    For an input word sequence w1 ... wn, the dependency graph is D = (W, E), where
      W is the set of nodes, i.e. the word tokens in the input sequence
      E is the set of unlabeled tree edges (wi, wj), with wi, wj ∈ W
      (wi, wj) indicates an edge from wi (parent) to wj (child)
    The task of mapping an input string to a dependency graph satisfying certain conditions is called dependency parsing


  • Well-formednessA dependency graph is well-formed iff

    Single head: Each word has only one head

    Acyclic: The graph should be acyclic

    Connected: The graph should be a single tree with all the words in the sentence

    Projective: If word A depends on word B, then all words between A and B are also subordinate to B (i.e. dominated by B)
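The four conditions can be checked mechanically. A minimal sketch, assuming words are numbered 1..n with an artificial root node 0 and the graph given as a dependent-to-head map; the function and variable names are illustrative:

```python
# Well-formedness checks for an unlabeled dependency graph.
# head[d] = h for every word d in 1..n; node 0 is the artificial root.

def is_well_formed(n, head):
    # Single head: the dict representation allows at most one head per
    # word, so we only check that every word has been assigned one.
    if set(head) != set(range(1, n + 1)):
        return False
    # Acyclic + connected (single tree): from every word, repeatedly
    # following heads must reach the root without revisiting a node.
    for d in range(1, n + 1):
        seen = set()
        while d != 0:
            if d in seen:                  # cycle detected
                return False
            seen.add(d)
            d = head[d]
    # Projective: every word strictly between a head h and its
    # dependent d must also be dominated by h.
    def dominated_by(w, h):
        while w != 0:
            if w == h:
                return True
            w = head[w]
        return h == 0                      # the root dominates everything
    for d, h in head.items():
        lo, hi = min(d, h), max(d, h)
        for w in range(lo + 1, hi):
            if not dominated_by(w, h):
                return False
    return True
```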


  • Non-projective dependency tree
    Example: Ram saw a dog yesterday which was a Yorkshire Terrier
    Non-projectivity shows up as crossing lines in the tree diagram
    English has very few non-projective cases


  • Outline
    Introduction
      Phrase Structure Grammar
      Dependency Grammar
      Comparison and Conversion
    Dependency Parsing
      Formal definition
    Parsing Algorithms
      Introduction
      Dynamic programming
      Constraint satisfaction
      Deterministic search


  • Dependency Parsing (P. Mannem, May 28, 2008)
    Dependency-based parsers can be broadly categorized into:
      Grammar-driven approaches: parsing done using grammars
      Data-driven approaches: parsing by training on annotated/un-annotated data



  • Parsing Methods
    Three main traditions:
      Dynamic programming: CYK, Eisner, McDonald
      Constraint satisfaction: Maruyama, Foth et al., Duchier
      Deterministic search: Covington, Yamada and Matsumoto, Nivre


  • Dynamic Programming
    Basic idea: treat dependencies as constituents
    Use, e.g., a CYK parser (with minor modifications)


  • Eisner 1996
    Two novel aspects: a modified parsing algorithm and probabilistic dependency parsing
    Time requirement: O(n^3)
    Modification: instead of storing subtrees, store spans
    Span: a substring such that no interior word links to any word outside the span
    Idea: in a span, only the boundary words are active, i.e. still need a head or a child
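The span-based dynamic program can be sketched as follows. This is a minimal illustration that computes only the score of the best projective tree; the backpointers needed to recover the tree itself are omitted for brevity. The arc-score matrix, and the large negative value used to forbid arcs into the root, are assumptions of the sketch.

```python
# A compact sketch of Eisner's O(n^3) span-based dynamic program.
# score[h][d] is the score of an arc from head h to dependent d;
# node 0 is the artificial root.

def eisner_best_score(score):
    n = len(score)                      # nodes 0..n-1, 0 = ROOT
    NEG = float("-inf")
    # complete[s][t][d] / incomplete[s][t][d]: best score of span s..t;
    # d = 1 means the head is the left endpoint, d = 0 the right one.
    C = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    I = [[[NEG, NEG] for _ in range(n)] for _ in range(n)]
    for i in range(n):
        C[i][i][0] = C[i][i][1] = 0.0   # single-word spans
    for k in range(1, n):
        for s in range(n - k):
            t = s + k
            # attach one endpoint to the other: both halves complete
            best = max(C[s][r][1] + C[r + 1][t][0] for r in range(s, t))
            I[s][t][0] = best + score[t][s]
            I[s][t][1] = best + score[s][t]
            # extend an incomplete span with a complete one
            C[s][t][0] = max(C[s][r][0] + I[r][t][0] for r in range(s, t))
            C[s][t][1] = max(I[s][r][1] + C[r][t][1] for r in range(s + 1, t + 1))
    return C[0][n - 1][1]               # ROOT heads the whole sentence
```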


  • Example
    Sentence: _ROOT_ Red figures on the screen indicated falling stocks

  • Example
    Spans: "indicated falling stocks", "Red figures"

  • Assembly of correct parse
    Start by combining adjacent words into minimal spans: "Red figures", "figures on", "on the"

  • Assembly of correct parse
    Combine spans which overlap in one word; this word must be governed by a word in the left or right span
    "on the" + "the screen" → "on the screen"

  • Assembly of correct parse
    Combine spans which overlap in one word; this word must be governed by a word in the left or right span
    "figures on" + "on the screen" → "figures on the screen"

  • Assembly of correct parse
    Combine spans which overlap in one word; this word must be governed by a word in the left or right span
    "Red figures on the screen" would be an invalid span

  • McDonald's Maximum Spanning Trees
    Score of a dependency tree = sum of the scores of its dependencies

    Scores are independent of the other dependencies

    If scores are available, parsing can be formulated as a maximum spanning tree problem

    Uses online learning to determine the weight vector w
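Since the tree score is just the sum of the arc scores, the maximum spanning tree formulation can be illustrated by brute force on a tiny sentence. This is a sketch under illustrative scores; real parsers use the Chu-Liu/Edmonds algorithm rather than enumeration, and learn the scores online.

```python
# Brute-force illustration of parsing as maximum spanning tree search:
# enumerate every head assignment, keep those that form a tree rooted
# at node 0, and return the highest-scoring one.
from itertools import product

def best_tree(score):
    """score[h][d]: score of arc h -> d; node 0 is the root."""
    n = len(score) - 1
    best, best_heads = float("-inf"), None
    # candidate head for word d is any node other than d itself
    for heads in product(*[[h for h in range(n + 1) if h != d]
                           for d in range(1, n + 1)]):
        head = dict(zip(range(1, n + 1), heads))
        # tree check: following heads from every word must reach the root
        def reaches_root(d, seen=()):
            return d == 0 or (d not in seen
                              and reaches_root(head[d], seen + (d,)))
        if not all(reaches_root(d) for d in head):
            continue
        total = sum(score[h][d] for d, h in head.items())
        if total > best:
            best, best_heads = total, head
    return best, best_heads
```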


  • McDonald's Algorithm
    Score the entire dependency tree
    Online learning


  • Parsing Methods
    Three main traditions:
      Dynamic programming: CYK, Eisner, McDonald
      Constraint satisfaction: Maruyama, Foth et al., Duchier
      Deterministic search: Covington, Yamada and Matsumoto, Nivre


  • Constraint Satisfaction
    Uses Constraint Dependency Grammar
    The grammar consists of a set of boolean constraints, i.e. logical formulas that describe well-formed trees
    A constraint is a logical formula with variables that range over a set of predefined values
    Parsing is defined as a constraint satisfaction problem
    Constraint satisfaction removes values that contradict constraints


  • Parsing Methods
    Three main traditions:
      Dynamic programming: CYK, Eisner, McDonald
      Constraint satisfaction: Maruyama, Foth et al., Duchier
      Deterministic search: Covington, Yamada and Matsumoto, Nivre


  • Deterministic Parsing
    Basic idea: derive a single syntactic representation (dependency graph) through a deterministic sequence of elementary parsing actions
    Sometimes combined with backtracking or repair


  • Shift-Reduce Type Algorithms
    Data structures:
      a stack [. . . , wi]S of partially processed tokens
      a queue [wj, . . .]Q of remaining input tokens
    Parsing actions built from atomic actions:
      adding arcs (wi → wj, wi ← wj)
      stack and queue operations
    Left-to-right parsing


  • Yamada's Algorithm
    Three parsing actions:
      Shift: [. . .]S [wi, . . .]Q ⇒ [. . . , wi]S [. . .]Q
      Left:  [. . . , wi, wj]S [. . .]Q ⇒ [. . . , wi]S [. . .]Q, adding the arc wi → wj
      Right: [. . . , wi, wj]S [. . .]Q ⇒ [. . . , wj]S [. . .]Q, adding the arc wi ← wj
    Multiple passes over the input give time complexity O(n^2)


  • Yamada and Matsumoto
    Parsing in several rounds: deterministic bottom-up, O(n^2)
    Looks at pairs of words
    3 actions: shift, left, right

    Shift: shifts focus to the next word pair


  • Yamada and Matsumoto
    Right: decides that the left word depends on the right word
    Left: decides that the right word depends on the left one
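The three actions can be sketched as a single pass driven by a given action sequence; a trained classifier would normally choose the actions. The window bookkeeping follows the notes at the end of this transcript, and the function and variable names are illustrative.

```python
# A sketch of one Yamada-and-Matsumoto-style pass: slide a two-word
# focus window over the remaining word list and apply shift/left/right.

def yamada_pass(words, actions):
    words = list(words)
    arcs, i = [], 0                          # i indexes the left target word
    for act in actions:
        left, right = words[i], words[i + 1]     # current target pair
        if act == "shift":                       # move the focus right
            i += 1
        elif act == "right":                     # left word child of right word
            arcs.append((right, left))
            del words[i]                         # reduce the pair to its head
            i = max(i - 1, 0)
        elif act == "left":                      # right word child of left word
            arcs.append((left, right))
            del words[i + 1]
            i = max(i - 1, 0)
    return arcs, words
```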


  • Nivre's Algorithm
    Four parsing actions:
      Shift:     [. . .]S [wi, . . .]Q ⇒ [. . . , wi]S [. . .]Q
      Reduce:    [. . . , wi]S [. . .]Q ⇒ [. . .]S [. . .]Q   (only if some arc wk → wi already exists)
      Left-Arc:  [. . . , wi]S [wj, . . .]Q ⇒ [. . .]S [wj, . . .]Q, adding the arc wj → wi   (only if no arc wk → wi exists)
      Right-Arc: [. . . , wi]S [wj, . . .]Q ⇒ [. . . , wi, wj]S [. . .]Q, adding the arc wi → wj   (only if no arc wk → wj exists)


  • Nivre's Algorithm

    Characteristics:
      arc-eager processing of right-dependents
      a single pass over the input gives worst-case time complexity O(2n)


  • Example
    Sentence: _ROOT_ Red figures on the screen indicated falling stocks
    Starting from S = [_ROOT_] and Q = the full sentence, the parse is derived by the action sequence:
    Shift, Left-arc, Shift, Right-arc, Shift, Left-arc, Right-arc, Reduce, Reduce, Left-arc, Right-arc, Shift, Left-arc, Right-arc, Reduce, Reduce
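The arc-eager transitions can be sketched by replaying the action sequence from this example. This is a minimal illustration: the preconditions on Reduce and Left-arc are omitted because the replayed sequence is already valid, and all names are illustrative.

```python
# A minimal arc-eager transition system in the style of Nivre's
# algorithm, replaying a fixed action sequence.

def arc_eager(words, actions):
    """Replay a transition sequence; return the set of (head, dep) arcs."""
    stack = ["_ROOT_"]          # S: partially processed tokens
    queue = list(words)         # Q: remaining input tokens
    arcs = set()
    for act in actions:
        if act == "shift":                 # push the next input token
            stack.append(queue.pop(0))
        elif act == "reduce":              # pop a token that has a head
            stack.pop()
        elif act == "left-arc":            # next input token governs stack top
            dep = stack.pop()
            arcs.add((queue[0], dep))
        elif act == "right-arc":           # stack top governs next input token
            head = stack[-1]
            stack.append(queue.pop(0))
            arcs.add((head, stack[-1]))
    return arcs

words = "Red figures on the screen indicated falling stocks".split()
actions = ["shift", "left-arc", "shift", "right-arc", "shift", "left-arc",
           "right-arc", "reduce", "reduce", "left-arc", "right-arc",
           "shift", "left-arc", "right-arc", "reduce", "reduce"]
print(sorted(arc_eager(words, actions)))
```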

  • The End

    Q&A


    Notes on the parsing actions:

    1. A Shift action adds no dependency construction between the target words wi and wi+1, but simply moves the point of focus to the right, making wi+1 and wi+2 the new target words.

    2. A Right action constructs a dependency relation between the target words, adding the left node wi as a child of the right node wi+1 and reducing the target words to wi+1, making wi-1 and wi+1 the new target words.

    3. A Left action constructs a dependency relation between the target words, adding the right node wi+1 as a child of the left node wi and reducing the target words to wi, making wi-1 and wi the new target words.