Inducing Structure for Perception

Date posted: 30-Jan-2016

  • Inducing Structure for Perception (a.k.a. Slav's split&merge Hammer)

    Slav Petrov. Advisors: Dan Klein, Jitendra Malik. Collaborators: L. Barrett, R. Thibaux, A. Faria, A. Pauls, P. Liang, A. Berg.

  • The Main Idea

    Complex underlying process. Observation: "He was right." True structure; manually specified structure; MLE structure.

  • The Main Idea

    Complex underlying process. Observation: "He was right." Manually specified structure; automatically refined structure (EM).

  • Why Structure?

    the the the food cat dog ate and

    t e c a e h t g f a o d o o d n h e t d a

  • Structure is important

  • Syntactic Ambiguity: Last night I shot an elephant in my pajamas.

  • Visual Ambiguity: Old or young?

  • Three Peaks?

  • No, One Mountain!

  • Three Domains

  • Timeline

  • Syntax: Language Modeling; Split & Merge Learning; Syntactic Machine Translation; Coarse-to-Fine Inference; Non-parametric Bayesian Learning; Generative vs. Conditional Learning

  • Learning Accurate, Compact, and Interpretable Tree Annotation

    Slav Petrov, Leon Barrett, Romain Thibaux, Dan Klein

  • Motivation (Syntax)

    Task: "He was right."

    Why? Information Extraction; Syntactic Machine Translation

  • Treebank Parsing

  • Non-Independence: Independence assumptions are often too strong.

    NP expansions (% of NPs) by position:

    Expansion   All NPs   Subject   Object
    NP PP       11.3      9.3       23.0
    DT NN       9.3       8.8       6.7
    PRP         5.5       20.5      3.7
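    To make the non-independence point concrete, here is a minimal sketch that compares how often each NP expansion occurs in subject or object position with how often it occurs for NPs overall. The numbers are the ones shown on this slide; the dictionary layout and the function name `context_ratio` are assumptions made for illustration.

```python
# NP expansion frequencies from the slide (fractions of NPs).
NP_EXPANSIONS = {
    "NP PP": {"all": 0.113, "subject": 0.093, "object": 0.230},
    "DT NN": {"all": 0.093, "subject": 0.088, "object": 0.067},
    "PRP":   {"all": 0.055, "subject": 0.205, "object": 0.037},
}


def context_ratio(expansion, context):
    """How many times more (or less) likely an expansion is in a given
    context than for NPs overall. A ratio of 1.0 would mean independence."""
    row = NP_EXPANSIONS[expansion]
    return row[context] / row["all"]


if __name__ == "__main__":
    for expansion in NP_EXPANSIONS:
        print(
            expansion,
            round(context_ratio(expansion, "subject"), 2),
            round(context_ratio(expansion, "object"), 2),
        )
```

    Ratios far from 1.0 (pronouns in subject position, for instance) are what a plain treebank grammar with a single NP symbol cannot capture.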

  • The Game of Designing a Grammar

    Annotation refines base treebank symbols to improve statistical fit of the grammar:
    - Parent annotation [Johnson 98]
    - Head lexicalization [Collins 99, Charniak 00]
    - Automatic clustering?
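    As an illustration of the first kind of manual annotation, the sketch below applies parent annotation to a small tree. The tuple-based tree encoding, the `^` separator, and the choice to annotate part-of-speech tags as well are assumptions for this example, not details taken from the slides.

```python
def parent_annotate(tree, parent=None):
    """Return a copy of `tree` in which every nonterminal label is
    suffixed with the label of its parent (the root is left unchanged).

    A tree is either a word (str) or a tuple (label, child, child, ...).
    """
    if isinstance(tree, str):  # a word: nothing to annotate
        return tree
    label, *children = tree
    new_label = label if parent is None else f"{label}^{parent}"
    # Children see the *original* label of this node as their parent.
    return (new_label, *(parent_annotate(child, label) for child in children))


tree = (
    "S",
    ("NP", ("PRP", "He")),
    ("VP", ("VBD", "was"), ("ADJP", ("JJ", "right"))),
)
annotated = parent_annotate(tree)
```

    After annotation, an NP under S and an NP under VP become distinct symbols (`NP^S` and `NP^VP`), so the grammar can give them different expansion probabilities.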

  • Learning Latent Annotations

    EM algorithm:
    - Brackets are known
    - Base categories are known
    - Only induce subcategories

    Just like Forward-Backward for HMMs.

  • Inside/Outside Scores

    Inside and outside scores are computed for each subcategory Ax.
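    The formulas on this slide did not survive in the transcript. As a rough sketch of the idea, the code below computes inside scores bottom-up over a tree whose brackets and base categories are fixed, summing only over the latent subcategories. It assumes a binarized tree and the standard inside recursion; the data layout (`binary`, `lexical`) and the toy probabilities are hypothetical.

```python
def inside(tree, binary, lexical):
    """Inside scores for each subcategory of the root of `tree`.

    tree:    (tag, word) for a preterminal, or (label, left, right).
    binary:  binary[(A, B, C)][x][y][z] = P(A_x -> B_y C_z)
    lexical: lexical[(A, word)][x]      = P(A_x -> word)
    """
    if isinstance(tree[1], str):  # preterminal: emission probabilities
        return list(lexical[(tree[0], tree[1])])
    label, left, right = tree
    in_left = inside(left, binary, lexical)
    in_right = inside(right, binary, lexical)
    rule = binary[(label, left[0], right[0])]
    # Brackets are known, so there is no sum over split points,
    # only over the subcategories y and z of the two children.
    return [
        sum(
            rule[x][y][z] * in_left[y] * in_right[z]
            for y in range(len(in_left))
            for z in range(len(in_right))
        )
        for x in range(len(rule))
    ]


# Toy example: S has one subcategory, NP and VP have two each.
lexical = {("NP", "He"): [0.5, 0.1], ("VP", "slept"): [0.2, 0.4]}
binary = {("S", "NP", "VP"): [[[0.6, 0.1], [0.1, 0.2]]]}
scores = inside(("S", ("NP", "He"), ("VP", "slept")), binary, lexical)
```

    Outside scores follow the same pattern top-down, and the two together give the posterior rule counts used in the E-step.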

  • Learning Latent Annotations (Details)

    E-Step:

    M-Step:

  • Overview
    - Hierarchical Training
    - Adaptive Splitting
    - Parameter Smoothing

  • Refinement of the DT tag

  • Hierarchical refinement of the DT tag
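    Hierarchical refinement can be pictured as repeatedly splitting every subcategory in two and retraining. The sketch below shows only the split step, under stated assumptions: each child starts as a copy of its parent's word distribution with a small random perturbation to break symmetry. The `noise` value, the function name, and the toy DT distribution are hypothetical.

```python
import random


def split_emissions(emissions, noise=0.01, rng=None):
    """Split every subcategory in two.

    emissions: list with one dict {word: probability} per subcategory.
    Each child copies its parent's distribution, perturbed slightly and
    renormalized, so that EM can later pull the two copies apart.
    """
    rng = rng or random.Random(0)
    children = []
    for dist in emissions:
        for _ in range(2):
            noisy = {
                word: p * (1.0 + rng.uniform(-noise, noise))
                for word, p in dist.items()
            }
            total = sum(noisy.values())
            children.append({word: p / total for word, p in noisy.items()})
    return children


# Hypothetical starting point: a DT tag with a single subcategory.
dt = [{"the": 0.5, "a": 0.3, "this": 0.2}]
for _ in range(3):  # 1 -> 2 -> 4 -> 8 subcategories
    dt = split_emissions(dt)
    # An EM pass over the treebank would be run here after every split.
```

    Rule probabilities would be split in the same way; only the emission table is shown to keep the example short.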

  • Hierarchical Estimation Results

    Model                   F1
    Baseline                87.3
    Hierarchical Training   88.4

    Chart: parsing accuracy (F1) vs. total number of grammar symbols

    Symbols   Flat Training   Hierarchical Training
    98        63.8            63.8
    198       76.1            76.1
    393       83.2            83.6
    785       86.8            87.2
    1569      87.3            88.4

    Symbols   50% Merging   50% Merging and Smoothing
    146       76.1          75.1
    216       83.7          84.3
    321       87.2          87.4
    479       88.4          89.1
    716       89.1          90.0
    1043      89.5          90.7

  • Refinement of the , tag: Splitting all categories the same amount is wasteful.

  • The DT tag revisited

  • Adaptive Splitting

    Want to split complex categories more.

    Idea: split everything, roll back splits which were least useful.

  • Adaptive Splitting

    Evaluate loss in likelihood from removing each split = (data likelihood with split reversed) / (data likelihood with split)

    No loss in accuracy when 50% of the splits are reversed.

  • Adaptive Splitting (Details)

    True data likelihood:

    Approximate likelihood with split at n reversed:

    Approximate loss in likelihood:
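    The three formulas on this slide are missing from the transcript. The sketch below shows one way such an approximation can be computed, and is an assumption rather than a reconstruction of the slide: at each node where the split category occurs, the two subcategories' contributions to the tree likelihood are replaced by a merged contribution, using relative frequencies `p1` and `p2` to combine inside scores and a plain sum for outside scores. All function and parameter names are hypothetical.

```python
import math


def merged_likelihood(p_tree, in1, in2, out1, out2, p1, p2):
    """Approximate likelihood of one tree if the split is undone at one node.

    p_tree:     likelihood of the tree with the split in place
    in1, in2:   inside scores of the two subcategories at the node
    out1, out2: outside scores of the two subcategories at the node
    p1, p2:     relative frequencies of the two subcategories
    """
    in_merged = p1 * in1 + p2 * in2
    out_merged = out1 + out2
    return p_tree - (in1 * out1 + in2 * out2) + in_merged * out_merged


def merge_log_loss(occurrences):
    """Sum of log(P_merged / P_split) over all nodes where the split occurs.
    Values near zero mean the split contributes little to the likelihood."""
    return sum(
        math.log(merged_likelihood(**occ) / occ["p_tree"]) for occ in occurrences
    )


def splits_to_reverse(log_losses, fraction=0.5):
    """Pick the given fraction of splits whose removal costs the least."""
    ranked = sorted(log_losses, key=log_losses.get, reverse=True)
    return ranked[: int(len(ranked) * fraction)]


# Two subcategories that behave identically: undoing the split costs nothing.
same = dict(p_tree=0.04, in1=0.2, in2=0.2, out1=0.1, out2=0.1, p1=0.5, p2=0.5)
# Two subcategories that behave differently: undoing the split loses likelihood.
diff = dict(p_tree=0.04, in1=0.4, in2=0.0, out1=0.1, out2=0.0, p1=0.5, p2=0.5)
```

    With `fraction=0.5`, `splits_to_reverse` mirrors the slide's setting in which half of the splits are rolled back after each round.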

  • Adaptive Splitting Results

    Model              F1
    Previous           88.4
    With 50% Merging   89.5

  • Number of Phrasal Subcategories

    Category   Subcategories
    NP         37
    VP         32
    PP         28
    ADVP       22
    S          21
    ADJP       19
    SBAR       15
    QP         9
    WHNP       5
    PRN        4
    NX         4
    SINV       3
    PRT        2
    WHPP       2
    SQ         2
    CONJP      2
    FRAG       2
    NAC        2
    UCP        2
    WHADVP     2
    INTJ       1
    SBARQ      1
    RRC        1
    WHADJP     1
    X          1
    ROOT       1
    LST        1

  • Number of Phrasal Subcategories (highlighted: NP, VP, and PP, with 37, 32, and 28 subcategories)

  • Number of Phrasal Subcategories (highlighted: X and NAC, with 1 and 2 subcategories)

  • Number of Lexical Subcategories (highlighted: TO, the comma tag, and POS)

    Tag     Subcategories
    NNP     62
    JJ      58
    NNS     57
    NN      56
    VBN     49
    RB      47
    VBG     40
    VB      37
    VBD     36
    CD      32
    IN      27
    VBZ     25
    VBP     19
    DT      17
    NNPS    11
    CC      7
    JJR     5
    JJS     5
    :       5
    PRP     4
    PRP$    4
    MD      3
    RBR     3
    WP      2
    POS     2
    PDT     2
    WRB     2
    -LRB-   2
    .       2
    EX      2
    WP$     2
    WDT     2
    -RRB-   2
    ''      1
    FW      1
    RBS     1
    TO      1
    $       1
    UH      1
    ,       1
    ``      1
    SYM     1
    RP      1
    LS      1
    #       1