Posterior Regularization for Structured Latent Variable Models
Li Zhonghua, I2R SMT Reading Group
Outline
• Motivation and Introduction
• Posterior Regularization
• Application
• Implementation
• Some Related Frameworks
Motivation and Introduction
Prior Knowledge
We possess a wealth of prior knowledge about most NLP tasks. For example, in word alignment we know that most words translate one-to-one and that the two alignment directions should agree.
Motivation and Introduction
Leveraging prior knowledge: possible approaches and their limitations.
Motivation and Introduction -- Limited Approaches
Bayesian approach: encode prior knowledge with a prior on the parameters.
Limitation: our prior knowledge is not about parameters! Parameters are difficult to interpret, so it is hard to obtain the desired effect through them.
Augmenting the model: encode prior knowledge with additional variables and dependencies.
Limitation: this may make exact inference intractable.
Posterior Regularization
• A declarative language for specifying prior knowledge
-- constraint features and bounds on their expectations (formalized below)
• Methods for learning with knowledge in this language
-- an EM-style learning algorithm
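A sketch of the formalization, following the JMLR 2010 paper (Ganchev et al.): constraint features \phi and bounds \mathbf{b} define a set Q of allowed posterior distributions, and learning maximizes the likelihood penalized by the KL distance of the model posterior from Q:

    Q = \{\, q(Y) : \mathbf{E}_q[\phi(X, Y)] \le \mathbf{b} \,\}

    J_Q(\theta) = \mathcal{L}(\theta) - \min_{q \in Q} \mathrm{KL}\big( q(Y) \,\|\, p_\theta(Y \mid X) \big)

where \mathcal{L}(\theta) is the original objective defined on the next slide.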
Posterior Regularization -- Original Objective
Original objective: the marginal log-likelihood

    \mathcal{L}(\theta) = \log p_\theta(X) = \log \sum_Y p_\theta(X, Y)

(optionally plus a parameter prior \log p(\theta)).
Posterior Regularization -- EM-Style Learning Algorithm
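The procedure from the paper alternates a KL projection onto Q (a modified E-step) with the usual M-step:

    \text{E}'\text{-step:}\quad q^{t+1} = \arg\min_{q \in Q} \mathrm{KL}\big( q(Y) \,\|\, p_{\theta^t}(Y \mid X) \big)

    \text{M-step:}\quad \theta^{t+1} = \arg\max_{\theta} \mathbf{E}_{q^{t+1}}\big[ \log p_\theta(X, Y) \big]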
Posterior Regularization -- Computing the Posterior Regularizer
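The projection is solved in the dual. For constraints \mathbf{E}_q[\phi(X, Y)] \le \mathbf{b} the dual problem is

    \max_{\lambda \ge 0} \; -\mathbf{b}^\top \lambda - \log \sum_Y p_\theta(Y \mid X) \exp\{ -\lambda^\top \phi(X, Y) \}

and the optimal projection is q^*(Y) \propto p_\theta(Y \mid X) \exp\{ -\lambda^{*\top} \phi(X, Y) \}, so q keeps the model's factorization whenever the features decompose along it. Below is a minimal Python sketch of this projection for the toy case where Y can be enumerated; the function name and the plain gradient-ascent optimizer are illustrative choices, not the PR toolkit's actual API.

import numpy as np

def kl_project(p, phi, b, lr=0.5, iters=200):
    """Project posterior p(y) onto Q = {q : E_q[phi] <= b}.

    p   : (n,)  model posterior over n enumerable outcomes y
    phi : (n,k) value of each of the k constraint features per outcome
    b   : (k,)  bounds on the feature expectations
    Maximizes the dual  -b.lambda - log sum_y p(y) exp(-lambda.phi(y))
    by projected gradient ascent over lambda >= 0.
    """
    lam = np.zeros(b.shape[0])
    for _ in range(iters):
        logits = np.log(p) - phi @ lam
        q = np.exp(logits - logits.max())
        q /= q.sum()                 # q(y) proportional to p(y) exp(-lam.phi(y))
        grad = q @ phi - b           # dual gradient: E_q[phi] - b
        lam = np.maximum(0.0, lam + lr * grad)   # keep lambda >= 0
    return q, lam

# Toy usage: cap the expected count of the feature firing on outcome 0 at 0.5.
p = np.array([0.7, 0.2, 0.1])            # model posterior
phi = np.array([[1.0], [0.0], [0.0]])    # one feature, firing on outcome 0
q, lam = kl_project(p, phi, np.array([0.5]))
print(q)  # mass on outcome 0 drops to ~0.5; the rest rescales proportionally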
Application
Statistical Word Alignment: IBM Model 1 and the HMM alignment model
Bijectivity constraints (B-HMM): one feature for each source word m that counts how many times m is aligned to a target word in the alignment y; bounding its expectation by 1 encourages one-to-one alignments.
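In symbols (paraphrasing the constraints of the CL 2010 paper):

    \phi_m(x, y) = \sum_j \mathbf{1}[y_j = m], \qquad Q = \{\, q : \mathbf{E}_q[\phi_m(x, y)] \le 1 \ \text{for all } m \,\}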
Symmetry constraints (S-HMM): define a feature for each target-source position pair (i, j). The feature has expectation zero exactly when the word pair (i, j) is aligned with equal probability in both translation directions.
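In symbols (again paraphrasing the CL 2010 paper), with q a joint distribution over the forward and backward alignments:

    \phi_{ij}(x, y) = \mathbf{1}[\text{forward link } (i, j)] - \mathbf{1}[\text{backward link } (i, j)], \qquad \mathbf{E}_q[\phi_{ij}(x, y)] = 0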
Results from Graça, Ganchev, and Taskar, "Learning Tractable Word Alignment Models with Complex Constraints" (Computational Linguistics, 2010):
• Six language pairs.
• Both types of constraints improve over the HMM in terms of both precision and recall.
• They improve over the HMM by 10% to 15%.
• S-HMM performs slightly better than B-HMM, beating it in 10 out of 12 cases.
• The constrained models improve over IBM Model 4 in 9 out of 12 cases.
Implementation
• PR toolkit: http://code.google.com/p/pr-toolkit/
Some Related Frameworks
(The JMLR 2010 paper relates PR to constraint-driven learning, generalized expectation criteria, and Bayesian measurements.)
More info: http://sideinfo.wikkii.com (many of these slides are adapted from there).
Thanks!