MT for Languages with Limited Resources
![Page 1: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/1.jpg)
MT for Languages with Limited Resources
11-731 Machine Translation
April 20, 2011
Based on Joint Work with: Lori Levin, Jaime Carbonell, Stephan Vogel, Shuly Wintner, Danny Shacham, Katharina Probst, Erik Peterson, Christian Monson, Roberto Aranovich and Ariadna Font-Llitjos
![Page 2: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/2.jpg)
April 20, 2011 11-731 Machine Translation 2
Why Machine Translation for Minority and Indigenous Languages?
• Commercial MT is economically feasible for only a handful of major languages with large resources (corpora, human developers)
• Is there hope for MT for languages with very limited resources?
• Benefits include:
  – Better government access to indigenous communities (epidemics, crop failures, etc.)
  – Better participation of indigenous communities in information-rich activities (health care, education, government) without giving up their languages
  – Language preservation
  – Civilian and military applications (disaster relief)
![Page 3: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/3.jpg)
MT for Minority and Indigenous Languages: Challenges
• Minimal amount of parallel text
• Possibly competing standards for orthography/spelling
• Often relatively few trained linguists
• Access to native informants possible
• Need to minimize development time and cost
![Page 4: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/4.jpg)
MT for Low Resource Languages
• Possible Approaches:
  – Phrase-based SMT, with whatever small amounts of parallel data are available
  – Build a rule-based system: requires bilingual experts and resources
  – Hybrid approaches, such as the AVENUE Project (Stat-XFER) approach:
    • Incorporate acquired manual resources within a general statistical framework
    • Augment with targeted elicitation and resource acquisition from bilingual non-experts
![Page 5: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/5.jpg)
CMU Statistical Transfer (Stat-XFER) MT Approach
• Integrate the major strengths of rule-based and statistical MT within a common framework:
  – Linguistically rich formalism that can express complex and abstract compositional transfer rules
  – Rules can be written by human experts and also acquired automatically from data
  – Easy integration of morphological analyzers and generators
  – Word and syntactic-phrase correspondences can be automatically acquired from parallel text
  – Search-based decoding from statistical MT adapted to find the best translation within the search space: multi-feature scoring, beam-search, parameter optimization, etc.
  – Framework suitable for both resource-rich and resource-poor language scenarios
![Page 6: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/6.jpg)
Stat-XFER Main Principles
• Framework: Statistical search-based approach with syntactic translation transfer rules that can be acquired from data but also developed and extended by experts
• Automatic Word and Phrase translation lexicon acquisition from parallel data
• Transfer-rule Learning: apply ML-based methods to automatically acquire syntactic transfer rules for translation between the two languages
• Elicitation: use bilingual native informants to produce a small high-quality word-aligned bilingual corpus of translated phrases and sentences
• Rule Refinement: refine the acquired rules via a process of interaction with bilingual informants
• XFER + Decoder:
  – XFER engine produces a lattice of possible transferred structures at all levels
  – Decoder searches and selects the best scoring combination
![Page 7: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/7.jpg)
Stat-XFER Framework
[Architecture diagram; box labels: Source Input, Preprocessing, Morphology, Transfer Engine, Transfer Rules, Bilingual Lexicon, Translation Lattice, Second-Stage Decoder, Language Model, Weighted Features, Target Output]
![Page 8: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/8.jpg)
Transfer Engine
English Language Model
Transfer Rules
{NP1,3}
NP1::NP1 [NP1 "H" ADJ] -> [ADJ NP1]
(
(X3::Y1)
(X1::Y2)
((X1 def) = +)
((X1 status) =c absolute)
((X1 num) = (X3 num))
((X1 gen) = (X3 gen))
(X0 = X1)
)
Translation Lexicon
N::N |: ["$WR"] -> ["BULL"]
((X1::Y1) ((X0 NUM) = s) ((Y0 lex) = "BULL"))
N::N |: ["$WRH"] -> ["LINE"]
((X1::Y1) ((X0 NUM) = s) ((Y0 lex) = "LINE"))
Hebrew Input
בשורה הבאה
Decoder
English Output
in the next line
Translation Output Lattice
(0 1 "IN" @PREP)
(1 1 "THE" @DET)
(2 2 "LINE" @N)
(1 2 "THE LINE" @NP)
(0 2 "IN LINE" @PP)
(0 4 "IN THE NEXT LINE" @PP)
Preprocessing
Morphology
![Page 9: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/9.jpg)
Transfer Rule Formalism
Type information
Part-of-speech/constituent information
Alignments
x-side constraints
y-side constraints
xy-constraints, e.g. ((Y1 AGR) = (X1 AGR))

;SL: the old man, TL: ha-ish ha-zaqen
NP::NP [DET ADJ N] -> [DET N DET ADJ]
(
(X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)
((X1 AGR) = *3-SING)
((X1 DEF) = *DEF)
((X3 AGR) = *3-SING)
((X3 COUNT) = +)
((Y1 DEF) = *DEF)
((Y3 DEF) = *DEF)
((Y2 AGR) = *3-SING)
((Y2 GENDER) = (Y4 GENDER))
)
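To make the role of the alignments concrete, here is a minimal Python sketch that applies only the alignment part of the NP rule above to "the old man". The dictionary encoding, the function name `apply_rule`, and the three-word romanized lexicon are illustrative assumptions and not the Stat-XFER rule format; the feature constraints are ignored here.

```python
from typing import Dict, List

# Hypothetical toy encoding of the NP rule from the slide (not Stat-XFER's own format).
# Alignments are 1-based (source_index, target_index) pairs, as in X1::Y1.
RULE = {
    "source": ["DET", "ADJ", "N"],
    "target": ["DET", "N", "DET", "ADJ"],
    "alignments": [(1, 1), (1, 3), (2, 4), (3, 2)],
}

# Toy romanized lexicon, for illustration only.
LEXICON: Dict[str, str] = {"the": "ha", "old": "zaqen", "man": "ish"}


def apply_rule(rule: dict, words: List[str], lexicon: Dict[str, str]) -> List[str]:
    """Fill each target slot with the translation of its aligned source word."""
    if len(words) != len(rule["source"]):
        raise ValueError("input does not match the rule's source side")
    target: List[str] = [""] * len(rule["target"])
    for x, y in rule["alignments"]:
        # X1 is aligned to both Y1 and Y3, so the determiner is copied twice.
        target[y - 1] = lexicon[words[x - 1]]
    return target


print(" ".join(apply_rule(RULE, ["the", "old", "man"], LEXICON)))
```

The one-to-many alignment of X1 is what yields the doubled determiner in "ha ish ha zaqen".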
![Page 10: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/10.jpg)
Transfer Rule Formalism (II)
Value constraints
Agreement constraints

;SL: the old man, TL: ha-ish ha-zaqen
NP::NP [DET ADJ N] -> [DET N DET ADJ]
(
(X1::Y1) (X1::Y3) (X2::Y4) (X3::Y2)
((X1 AGR) = *3-SING)
((X1 DEF) = *DEF)
((X3 AGR) = *3-SING)
((X3 COUNT) = +)
((Y1 DEF) = *DEF)
((Y3 DEF) = *DEF)
((Y2 AGR) = *3-SING)
((Y2 GENDER) = (Y4 GENDER))
)
![Page 11: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/11.jpg)
Translation Lexicon: Hebrew-to-English Examples (Semi-manually-developed)
PRO::PRO |: ["ANI"] -> ["I"]((X1::Y1)((X0 per) = 1)((X0 num) = s)((X0 case) = nom))
PRO::PRO |: ["ATH"] -> ["you"]((X1::Y1)((X0 per) = 2)((X0 num) = s)((X0 gen) = m)((X0 case) = nom))
N::N |: ["$&H"] -> ["HOUR"]((X1::Y1)((X0 NUM) = s)((Y0 NUM) = s)((Y0 lex) = "HOUR"))
N::N |: ["$&H"] -> ["hours"]((X1::Y1)((Y0 NUM) = p)((X0 NUM) = p)((Y0 lex) = "HOUR"))
![Page 12: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/12.jpg)
Translation Lexicon: French-to-English Examples
(Automatically-acquired)
DET::DET |: ["le"] -> ["the"] ((X1::Y1))
Prep::Prep |: ["dans"] -> ["in"] ((X1::Y1))
N::N |: ["principes"] -> ["principles"] ((X1::Y1))
N::N |: ["respect"] -> ["accordance"] ((X1::Y1))
NP::NP |: ["le respect"] -> ["accordance"] ()
PP::PP |: ["dans le respect"] -> ["in accordance"] ()
PP::PP |: ["des principes"] -> ["with the principles"] ()
![Page 13: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/13.jpg)
Hebrew-English Transfer Grammar Example Rules
(Manually-developed)
{NP1,2}
;;SL: $MLH ADWMH
;;TL: A RED DRESS
NP1::NP1 [NP1 ADJ] -> [ADJ NP1]
(
(X2::Y1)
(X1::Y2)
((X1 def) = -)
((X1 status) =c absolute)
((X1 num) = (X2 num))
((X1 gen) = (X2 gen))
(X0 = X1)
)
{NP1,3}
;;SL: H $MLWT H ADWMWT
;;TL: THE RED DRESSES
NP1::NP1 [NP1 "H" ADJ] -> [ADJ NP1]
(
(X3::Y1)
(X1::Y2)
((X1 def) = +)
((X1 status) =c absolute)
((X1 num) = (X3 num))
((X1 gen) = (X3 gen))
(X0 = X1)
)
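The constraints in rule {NP1,2} can be read as a gate on when the rule may fire. The Python sketch below checks them over flat feature dictionaries. It is a simplification under stated assumptions: the slides do not define the operators, so the reading of `=` as "unify, succeeding when the feature is unset" and of `=c` as "the value must already be present" is assumed here, and all function names are hypothetical.

```python
from typing import Dict

Features = Dict[str, str]


def unify_value(fs: Features, feature: str, value: str) -> bool:
    """'=' (assumed semantics): succeed if the feature is unset or equal; set it when unset."""
    current = fs.get(feature)
    if current is None:
        fs[feature] = value
        return True
    return current == value


def require_value(fs: Features, feature: str, value: str) -> bool:
    """'=c' (assumed semantics): the feature must already be present with this value."""
    return fs.get(feature) == value


def agree(a: Features, b: Features, feature: str) -> bool:
    """Agreement (assumed semantics): fail only when both sides carry conflicting values."""
    va, vb = a.get(feature), b.get(feature)
    return va is None or vb is None or va == vb


def np1_adj_rule_applies(np1: Features, adj: Features) -> bool:
    """Check the x-side constraints of rule {NP1,2} for an NP1 followed by an ADJ."""
    np1 = dict(np1)  # work on a copy so the caller's features are untouched
    return (
        unify_value(np1, "def", "-")
        and require_value(np1, "status", "absolute")
        and agree(np1, adj, "num")
        and agree(np1, adj, "gen")
    )


# Toy analyses for $MLH ADWMH; the feature values are illustrative.
dress = {"def": "-", "status": "absolute", "num": "s", "gen": "f"}
red_fem = {"num": "s", "gen": "f"}
red_masc = {"num": "s", "gen": "m"}
print(np1_adj_rule_applies(dress, red_fem))
print(np1_adj_rule_applies(dress, red_masc))
```

A full unification engine would also propagate the merged features upward, which is what `(X0 = X1)` expresses; that step is left out of this sketch.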
![Page 14: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/14.jpg)
French-English Transfer Grammar Example Rules
(Automatically-acquired)
{PP,24691}
;;SL: des principes
;;TL: with the principles
PP::PP ["des" N] -> ["with the" N]
(
(X1::Y1)
)
{PP,312}
;;SL: dans le respect des principes
;;TL: in accordance with the principles
PP::PP [Prep NP] -> [Prep NP]
(
(X1::Y1)
(X2::Y2)
)
![Page 15: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/15.jpg)
The Transfer Engine
• Input: source-language input sentence, or source-language confusion network
• Output: lattice representing collection of translation fragments at all levels supported by transfer rules
• Basic Algorithm: “bottom-up” integrated “parsing-transfer-generation” chart-parser guided by the synchronous transfer rules
  – Start with translations of individual words and phrases from translation lexicon
  – Create translations of larger constituents by applying applicable transfer rules to previously created lattice entries
  – Beam-search controls the exponential combinatorics of the search-space, using multiple scoring features
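The bottom-up procedure can be sketched in a few lines of Python. This is a toy illustration under simplifying assumptions, not the Stat-XFER engine: rules are binary with an optional child swap, there are no feature constraints, and there is no scoring or beam pruning. The names `Arc`, `LEXICON`, `RULES` and `transfer` are hypothetical.

```python
from typing import List, NamedTuple, Set


class Arc(NamedTuple):
    start: int  # index of the first source token covered
    end: int    # index one past the last source token covered
    cat: str    # constituent label
    text: str   # target-language string


# Toy resources; the transliterations follow the slides' Hebrew examples.
LEXICON = {"$MLH": [("N", "DRESS")], "ADWMH": [("ADJ", "RED")]}
# (parent, left child, right child, swap children on the target side?)
RULES = [("NP", "N", "ADJ", True)]


def transfer(tokens: List[str]) -> Set[Arc]:
    chart: Set[Arc] = set()
    # Step 1: seed the chart with word-level translations from the lexicon.
    for i, tok in enumerate(tokens):
        for cat, text in LEXICON.get(tok, []):
            chart.add(Arc(i, i + 1, cat, text))
    # Step 2: combine adjacent arcs with matching rules until nothing new appears.
    added = True
    while added:
        added = False
        for left in list(chart):
            for right in list(chart):
                if left.end != right.start:
                    continue
                for parent, lcat, rcat, swap in RULES:
                    if (left.cat, right.cat) != (lcat, rcat):
                        continue
                    text = f"{right.text} {left.text}" if swap else f"{left.text} {right.text}"
                    arc = Arc(left.start, right.end, parent, text)
                    if arc not in chart:
                        chart.add(arc)
                        added = True
    return chart


for arc in sorted(transfer(["$MLH", "ADWMH"])):
    print(arc)
```

The returned set keeps the word-level arcs alongside the larger NP arc, which mirrors the idea of a lattice holding translation fragments at all levels.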
![Page 16: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/16.jpg)
The Transfer Engine
• Some Unique Features:
  – Works with either learned or manually-developed transfer grammars
  – Handles rules with or without unification constraints
  – Supports interfacing with servers for morphological analysis and generation
  – Can handle ambiguous source-word analyses and/or SL segmentations represented in the form of lattice structures
![Page 17: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/17.jpg)
Hebrew Example (From [Lavie et al., 2004])
• Input word: B$WRH

0 1 2 3 4
|--------B$WRH--------|
|-----B-----|$WR|--H--|
|--B--|-H--|--$WRH---|
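One way to see what the segmentation lattice encodes is to enumerate its paths. The Python sketch below does this for B$WRH. The arc boundaries are read off the diagram, which is an assumption about the original layout, and the function name `paths` is hypothetical.

```python
from typing import List, Tuple

# Arcs as (start node, end node, surface form), read off the lattice diagram.
ARCS: List[Tuple[int, int, str]] = [
    (0, 4, "B$WRH"),
    (0, 2, "B"), (2, 3, "$WR"), (3, 4, "H"),
    (0, 1, "B"), (1, 2, "H"), (2, 4, "$WRH"),
]


def paths(arcs: List[Tuple[int, int, str]], start: int, goal: int) -> List[List[str]]:
    """Return every sequence of arcs that leads from start to goal."""
    if start == goal:
        return [[]]
    result: List[List[str]] = []
    for s, e, form in arcs:
        if s == start:
            for rest in paths(arcs, e, goal):
                result.append([form] + rest)
    return result


for p in paths(ARCS, 0, 4):
    print(" + ".join(p))
```

Because arcs from different rows can be chained wherever they share a node, the enumeration yields more segmentations than the three rows drawn in the diagram. That compactness is the reason for passing a lattice, rather than a list of full analyses, to the transfer engine.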
![Page 18: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/18.jpg)
Hebrew Example (From [Lavie et al., 2004])
Y0: ((SPANSTART 0) (SPANEND 4) (LEX B$WRH) (POS N) (GEN F) (NUM S) (STATUS ABSOLUTE))
Y1: ((SPANSTART 0) (SPANEND 2) (LEX B) (POS PREP))
Y2: ((SPANSTART 1) (SPANEND 3) (LEX $WR) (POS N) (GEN M) (NUM S) (STATUS ABSOLUTE))
Y3: ((SPANSTART 3) (SPANEND 4) (LEX $LH) (POS POSS))
Y4: ((SPANSTART 0) (SPANEND 1) (LEX B) (POS PREP))
Y5: ((SPANSTART 1) (SPANEND 2) (LEX H) (POS DET))
Y6: ((SPANSTART 2) (SPANEND 4) (LEX $WRH) (POS N) (GEN F) (NUM S) (STATUS ABSOLUTE))
Y7: ((SPANSTART 0) (SPANEND 4) (LEX B$WRH) (POS LEX))
![Page 19: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/19.jpg)
XFER Output Lattice
(28 28 "AND" -5.6988 "W" "(CONJ,0 'AND')")
(29 29 "SINCE" -8.20817 "MAZ " "(ADVP,0 (ADV,5 'SINCE')) ")
(29 29 "SINCE THEN" -12.0165 "MAZ " "(ADVP,0 (ADV,6 'SINCE THEN')) ")
(29 29 "EVER SINCE" -12.5564 "MAZ " "(ADVP,0 (ADV,4 'EVER SINCE')) ")
(30 30 "WORKED" -10.9913 "&BD " "(VERB,0 (V,11 'WORKED')) ")
(30 30 "FUNCTIONED" -16.0023 "&BD " "(VERB,0 (V,10 'FUNCTIONED')) ")
(30 30 "WORSHIPPED" -17.3393 "&BD " "(VERB,0 (V,12 'WORSHIPPED')) ")
(30 30 "SERVED" -11.5161 "&BD " "(VERB,0 (V,14 'SERVED')) ")
(30 30 "SLAVE" -13.9523 "&BD " "(NP0,0 (N,34 'SLAVE')) ")
(30 30 "BONDSMAN" -18.0325 "&BD " "(NP0,0 (N,36 'BONDSMAN')) ")
(30 30 "A SLAVE" -16.8671 "&BD " "(NP,1 (LITERAL 'A') (NP2,0 (NP1,0 (NP0,0 (N,34 'SLAVE')) ) ) ) ")
(30 30 "A BONDSMAN" -21.0649 "&BD " "(NP,1 (LITERAL 'A') (NP2,0 (NP1,0 (NP0,0 (N,36 'BONDSMAN')) ) ) ) ")
![Page 20: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/20.jpg)
The Lattice Decoder
• Stack Decoder, similar to standard Statistical MT decoders
• Searches for best-scoring path of non-overlapping lattice arcs
• No reordering during decoding
• Scoring based on log-linear combination of scoring features, with weights trained using Minimum Error Rate Training (MERT)
• Scoring components:
  – Statistical Language Model
  – Bi-directional MLE phrase and rule scores
  – Lexical Probabilities
  – Fragmentation: how many arcs to cover the entire translation?
  – Length Penalty: how far from expected target length?
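The search for a best-scoring path of non-overlapping arcs under a log-linear score can be sketched as follows. This Python example swaps the stack decoder for a simple left-to-right dynamic program, which suffices for a monotone toy lattice. The feature names, weights and feature values are made up for illustration, and counting one unit of fragmentation per arc is an assumption rather than the system's definition.

```python
import math
from typing import Dict, List, NamedTuple, Tuple


class Arc(NamedTuple):
    start: int                  # first source position covered
    end: int                    # one past the last source position covered
    text: str                   # target-language fragment
    features: Dict[str, float]  # log-domain feature values (illustrative)


def arc_score(arc: Arc, weights: Dict[str, float]) -> float:
    """Log-linear score: weighted sum of the arc's features plus a per-arc fragmentation unit."""
    feats = dict(arc.features, frag=1.0)
    return sum(weights[name] * value for name, value in feats.items())


def decode(arcs: List[Arc], length: int, weights: Dict[str, float]) -> Tuple[float, List[str]]:
    """Best-scoring sequence of adjacent arcs covering positions 0..length, with no reordering."""
    best: List[Tuple[float, List[str]]] = [(-math.inf, [])] * (length + 1)
    best[0] = (0.0, [])
    for pos in range(length):
        score, texts = best[pos]
        if score == -math.inf:
            continue  # no partial translation reaches this position
        for arc in arcs:
            if arc.start == pos:
                candidate = score + arc_score(arc, weights)
                if candidate > best[arc.end][0]:
                    best[arc.end] = (candidate, texts + [arc.text])
    return best[length]


# Hypothetical weights; in the real system they would be tuned with MERT.
WEIGHTS = {"lm": 1.0, "tm": 0.5, "frag": -1.0}
ARCS = [
    Arc(0, 1, "IN", {"lm": -1.0, "tm": -0.5}),
    Arc(1, 2, "THE", {"lm": -1.0, "tm": -0.5}),
    Arc(2, 3, "LINE", {"lm": -2.0, "tm": -1.0}),
    Arc(1, 3, "THE LINE", {"lm": -2.5, "tm": -1.0}),
    Arc(0, 3, "IN THE LINE", {"lm": -3.0, "tm": -2.0}),
]
print(decode(ARCS, 3, WEIGHTS))
```

With these toy numbers the single arc spanning the whole input wins, because each additional arc pays the fragmentation penalty.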
![Page 21: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/21.jpg)
XFER Lattice Decoder
0 0 ON THE FOURTH DAY THE LION ATE THE RABBIT TO A MORNING MEAL
Overall: -8.18323, Prob: -94.382, Rules: 0, Frag: 0.153846, Length: 0, Words: 13,13
235 < 0 8 -19.7602: B H IWM RBI&I (PP,0 (PREP,3 'ON')(NP,2 (LITERAL 'THE') (NP2,0 (NP1,1 (ADJ,2 (QUANT,0 'FOURTH'))(NP1,0 (NP0,1 (N,6 'DAY')))))))>
918 < 8 14 -46.2973: H ARIH AKL AT H $PN (S,2 (NP,2 (LITERAL 'THE') (NP2,0 (NP1,0 (NP0,1 (N,17 'LION')))))(VERB,0 (V,0 'ATE'))(NP,100 (NP,2 (LITERAL 'THE') (NP2,0 (NP1,0 (NP0,1 (N,24 'RABBIT')))))))>
584 < 14 17 -30.6607: L ARWXH BWQR (PP,0 (PREP,6 'TO')(NP,1 (LITERAL 'A') (NP2,0 (NP1,0 (NNP,3 (NP0,0 (N,32 'MORNING'))(NP0,0 (N,27 'MEAL')))))))>
![Page 22: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/22.jpg)
Stat-XFER MT Systems
• General Stat-XFER framework under development for past nine years
• Systems so far:
  – Chinese-to-English
  – French-to-English
  – Hebrew-to-English
  – Urdu-to-English
  – German-to-English
  – Hindi-to-English
  – Dutch-to-English
  – Turkish-to-English
  – Mapudungun-to-Spanish
  – Arabic-to-English
  – Brazilian Portuguese-to-English
  – English-to-Arabic
  – Hebrew-to-Arabic
![Page 23: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/23.jpg)
Learning Transfer-Rules for Languages with Limited Resources
• Rationale:
  – Large bilingual corpora not available
  – Bilingual native informant(s) can translate and align a small pre-designed elicitation corpus, using elicitation tool
  – Elicitation corpus designed to be typologically comprehensive and compositional
  – Transfer-rule engine and rule learning approach support acquisition of generalized transfer-rules from the data
![Page 24: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/24.jpg)
English-Chinese Example
![Page 25: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/25.jpg)
English-Hindi Example
![Page 26: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/26.jpg)
Spanish-Mapudungun Example
![Page 27: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/27.jpg)
English-Arabic Example
![Page 28: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/28.jpg)
The Typological Elicitation Corpus
• Translated, aligned by bilingual informant
• Corpus consists of linguistically diverse constructions
• Based on elicitation and documentation work of field linguists (e.g. Comrie 1977, Bouquiaux 1992)
• Organized compositionally: elicit simple structures first, then use them as building blocks
• Goal: minimize size, maximize linguistic coverage
![Page 29: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/29.jpg)
The Structural Elicitation Corpus
• Designed to cover the most common phrase structures of English, in order to learn how these structures map onto their equivalents in other languages
• Constructed using the constituent parse trees from the Penn TreeBank
  – Extracted and frequency ranked all rules in parse trees
  – Selected top ~200 rules, filtered idiosyncratic cases
  – Revised lexical choices within examples
• Goal: minimize size, maximize linguistic coverage of structures
![Page 30: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/30.jpg)
The Structural Elicitation Corpus
Examples:
srcsent: in the forest
tgtsent: B H I&R
aligned: ((1,1),(2,2),(3,3))
context: C-Structure:(<PP> (PREP in-1) (<NP> (DET the-2) (N forest-3)))

srcsent: steps
tgtsent: MDRGWT
aligned: ((1,1))
context: C-Structure:(<NP> (N steps-1))

srcsent: the boy ate the apple
tgtsent: H ILD AKL AT H TPWX
aligned: ((1,1),(2,2),(3,3),(4,5),(5,6))
context: C-Structure:(<S> (<NP> (DET the-1) (N boy-2)) (<VP> (V ate-3) (<NP> (DET the-4)(N apple-5))))

srcsent: the first year
tgtsent: H $NH H RA$WNH
aligned: ((1,1 3),(2,4),(3,2))
context: C-Structure:(<NP> (DET the-1) (<ADJP> (ADJ first-2)) (N year-3))
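Records like these are straightforward to load for rule learning. The Python sketch below parses one record and expands its alignment into word pairs. It assumes one `key: value` field per line and reads an entry such as `(1,1 3)` as source word 1 aligned to target words 1 and 3; the function names are hypothetical and not part of the elicitation tool.

```python
import re
from typing import Dict, List, Tuple

RECORD = """srcsent: the first year
tgtsent: H $NH H RA$WNH
aligned: ((1,1 3),(2,4),(3,2))"""


def parse_record(text: str) -> Dict[str, str]:
    """Split 'key: value' lines into a dictionary (assumes one field per line)."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


def parse_alignment(spec: str) -> List[Tuple[int, int]]:
    """Expand '((1,1 3),(2,4))' into 1-based (source, target) index pairs."""
    pairs: List[Tuple[int, int]] = []
    for src, tgts in re.findall(r"\((\d+),([\d ]+)\)", spec):
        for tgt in tgts.split():
            pairs.append((int(src), int(tgt)))
    return pairs


rec = parse_record(RECORD)
src_words = rec["srcsent"].split()
tgt_words = rec["tgtsent"].split()
for s, t in parse_alignment(rec["aligned"]):
    print(src_words[s - 1], "->", tgt_words[t - 1])
```

The one-to-many pair for "the" reflects the repeated definite marker H on the Hebrew side.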
![Page 31: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/31.jpg)
A Limited Data Scenario for Hindi-to-English
• Conducted during a DARPA “Surprise Language Exercise” (SLE) in June 2003
• Put together a scenario with “miserly” data resources:
  – Elicited Data corpus: 17589 phrases
  – Cleaned portion (top 12%) of LDC dictionary: ~2725 Hindi words (23612 translation pairs)
  – Manually acquired resources during the SLE:
    • 500 manual bigram translations
    • 72 manually written phrase transfer rules
    • 105 manually written postposition rules
    • 48 manually written time expression rules
• No additional parallel text!!
![Page 32: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/32.jpg)
Examples of Learned Rules (Hindi-to-English)
{NP,14244}
;;Score:0.0429
NP::NP [N] -> [DET N]
(
(X1::Y2)
)
{NP,14434}
;;Score:0.0040
NP::NP [ADJ CONJ ADJ N] ->
[ADJ CONJ ADJ N]
(
(X1::Y1) (X2::Y2)
(X3::Y3) (X4::Y4)
)
{PP,4894}
;;Score:0.0470
PP::PP [NP POSTP] -> [PREP NP]
(
(X2::Y1)
(X1::Y2)
)
![Page 33: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/33.jpg)
Manual Transfer Rules: Hindi Example
;; PASSIVE OF SIMPLE PAST (NO AUX) WITH LIGHT VERB
;; passive of 43 (7b)
{VP,28}
VP::VP : [V V V] -> [Aux V]
(
(X1::Y2)
((x1 form) = root)
((x2 type) =c light)
((x2 form) = part)
((x2 aspect) = perf)
((x3 lexwx) = 'jAnA')
((x3 form) = part)
((x3 aspect) = perf)
(x0 = x1)
((y1 lex) = be)
((y1 tense) = past)
((y1 agr num) = (x3 agr num))
((y1 agr pers) = (x3 agr pers))
((y2 form) = part)
)
![Page 34: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/34.jpg)
Manual Transfer Rules: Example
; NP1 ke NP2 -> NP2 of NP1
; Ex: jIvana ke eka aXyAya
; life of (one) chapter
; ==> a chapter of life
;
{NP,12}
NP::NP : [PP NP1] -> [NP1 PP]
(
(X1::Y2)
(X2::Y1)
; ((x2 lexwx) = 'kA')
)

{NP,13}
NP::NP : [NP1] -> [NP1]
(
(X1::Y1)
)

{PP,12}
PP::PP : [NP Postp] -> [Prep NP]
(
(X1::Y2)
(X2::Y1)
)
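[Tree diagrams, given here in bracketed form]
Source tree: (NP (PP (NP (N1 (N jIvana))) (P ke)) (NP1 (Adj eka) (N aXyAya)))
Target tree: (NP (NP1 (Adj one) (N chapter)) (PP (P of) (NP (N1 (N life)))))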
![Page 35: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/35.jpg)
Manual Grammar Development
• Covers mostly NPs, PPs and VPs (verb complexes)
• ~70 grammar rules, covering basic and recursive NPs and PPs, verb complexes of main tenses in Hindi (developed in two weeks)
![Page 36: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/36.jpg)
Testing Conditions
• Tested on section of JHU provided data: 258 sentences with four reference translations
  – SMT system (stand-alone)
  – EBMT system (stand-alone)
  – XFER system (naïve decoding)
  – XFER system with “strong” decoder
    • No grammar rules (baseline)
    • Manually developed grammar rules
    • Automatically learned grammar rules
  – XFER+SMT with strong decoder (MEMT)
![Page 37: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/37.jpg)
Results on JHU Test Set
| System | BLEU | M-BLEU | NIST |
|---|---|---|---|
| EBMT | 0.058 | 0.165 | 4.22 |
| SMT | 0.093 | 0.191 | 4.64 |
| XFER (naïve), manual grammar | 0.055 | 0.177 | 4.46 |
| XFER (strong), no grammar | 0.109 | 0.224 | 5.29 |
| XFER (strong), learned grammar | 0.116 | 0.231 | 5.37 |
| XFER (strong), manual grammar | 0.135 | 0.243 | 5.59 |
| XFER+SMT | 0.136 | 0.243 | 5.65 |
![Page 38: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/38.jpg)
Effect of Reordering in the Decoder
[Plot: NIST vs. Reordering. Y-axis: NIST score (4.8 to 5.7); x-axis: reordering window (0 to 4). Series: no grammar, learned grammar, manual grammar, MEMT: XFER + SMT]
![Page 39: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/39.jpg)
Observations and Lessons (I)
• XFER with strong decoder outperformed SMT even without any grammar rules in the miserly data scenario
  – SMT trained on elicited phrases that are very short
  – SMT has insufficient data to train more discriminative translation probabilities
  – XFER takes advantage of morphology
    • Token coverage without morphology: 0.6989
    • Token coverage with morphology: 0.7892
• Manual grammar was somewhat better than automatically learned grammar
  – Learned rules were very simple
  – Large room for improvement on learning rules
![Page 40: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/40.jpg)
Observations and Lessons (II)
• MEMT (XFER and SMT) based on strong decoder produced best results in the miserly scenario.
• Reordering within the decoder provided very significant score improvements
  – Much room for more sophisticated grammar rules
  – Strong decoder can carry some of the reordering “burden”
![Page 41: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/41.jpg)
Modern Hebrew
• Native language of about 3-4 Million in Israel
• Semitic language, closely related to Arabic and with similar linguistic properties
  – Root+Pattern word formation system
  – Rich verb and noun morphology
  – Particles attach as prefixes to the following word: definite article (H), prepositions (B,K,L,M), coordinating conjunction (W), relativizers ($,K$)…
• Unique alphabet and Writing System
  – 22 letters represent (mostly) consonants
  – Vowels represented (mostly) by diacritics
  – Modern texts omit the diacritic vowels, thus additional level of ambiguity: “bare” word → word
  – Example: MHGR → mehager, m+hagar, m+h+ger
![Page 42: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/42.jpg)
Modern Hebrew Spelling
• Two main spelling variants
  – “KTIV XASER” (deficient): spelling with the vowel diacritics, and consonant words when the diacritics are removed
  – “KTIV MALEH” (full): words with I/O/U vowels are written with long vowels which include a letter
• KTIV MALEH is predominant, but not strictly adhered to even in newspapers and official publications → inconsistent spelling
• Example:
  – niqud (spelling): NIQWD, NQWD, NQD
  – When written as NQD, could also be niqed, naqed, nuqad
![Page 43: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/43.jpg)
Challenges for Hebrew MT
• Paucity of existing language resources for Hebrew
  – No publicly available broad coverage morphological analyzer
  – No publicly available bilingual lexicons or dictionaries
  – No POS-tagged corpus or parse tree-bank corpus for Hebrew
  – No large Hebrew/English parallel corpus
• Scenario well suited for CMU transfer-based MT framework for languages with limited resources
![Page 44: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/44.jpg)
Morphological Analyzer
• We use a publicly available morphological analyzer distributed by the Technion’s Knowledge Center, adapted for our system
• Coverage is reasonable (for nouns, verbs and adjectives)
• Produces all analyses or a disambiguated analysis for each word
• Output format includes lexeme (base form), POS, morphological features
• Output was adapted to our representation needs (POS and feature mappings)
![Page 45: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/45.jpg)
Morphology Example
• Input word: B$WRH
0 1 2 3 4
|--------B$WRH--------|
|-----B-----|$WR|--H--|
|--B--|-H--|--$WRH---|
![Page 46: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/46.jpg)
Morphology Example
Y0: ((SPANSTART 0) (SPANEND 4) (LEX B$WRH) (POS N) (GEN F) (NUM S) (STATUS ABSOLUTE))
Y1: ((SPANSTART 0) (SPANEND 2) (LEX B) (POS PREP))
Y2: ((SPANSTART 1) (SPANEND 3) (LEX $WR) (POS N) (GEN M) (NUM S) (STATUS ABSOLUTE))
Y3: ((SPANSTART 3) (SPANEND 4) (LEX $LH) (POS POSS))
Y4: ((SPANSTART 0) (SPANEND 1) (LEX B) (POS PREP))
Y5: ((SPANSTART 1) (SPANEND 2) (LEX H) (POS DET))
Y6: ((SPANSTART 2) (SPANEND 4) (LEX $WRH) (POS N) (GEN F) (NUM S) (STATUS ABSOLUTE))
Y7: ((SPANSTART 0) (SPANEND 4) (LEX B$WRH) (POS LEX))
![Page 47: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/47.jpg)
Translation Lexicon
• Constructed our own Hebrew-to-English lexicon, based primarily on existing “Dahan” H-to-E and E-to-H dictionary made available to us, augmented by other public sources
• Coverage is not great but not bad as a start
  – Dahan H-to-E is about 15K translation pairs
  – Dahan E-to-H is about 7K translation pairs
• Base forms, POS information on both sides
• Converted Dahan into our representation, added entries for missing closed-class entries (pronouns, prepositions, etc.)
• Had to deal with spelling conventions
• Recently augmented with ~50K translation pairs extracted from Wikipedia (mostly proper names and named entities)
![Page 48: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/48.jpg)
Manual Transfer Grammar (human-developed)
• Initially developed by Alon in a couple of days, extended and revised by Nurit over time
• Current grammar has 36 rules:
  – 21 NP rules
  – one PP rule
  – 6 verb complexes and VP rules
  – 8 higher-phrase and sentence-level rules
• Captures the most common (mostly local) structural differences between Hebrew and English
![Page 49: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/49.jpg)
Transfer Grammar Example Rules
{NP1,2}
;;SL: $MLH ADWMH
;;TL: A RED DRESS
NP1::NP1 [NP1 ADJ] -> [ADJ NP1]
(
(X2::Y1)
(X1::Y2)
((X1 def) = -)
((X1 status) =c absolute)
((X1 num) = (X2 num))
((X1 gen) = (X2 gen))
(X0 = X1)
)
{NP1,3}
;;SL: H $MLWT H ADWMWT
;;TL: THE RED DRESSES
NP1::NP1 [NP1 "H" ADJ] -> [ADJ NP1]
(
(X3::Y1)
(X1::Y2)
((X1 def) = +)
((X1 status) =c absolute)
((X1 num) = (X3 num))
((X1 gen) = (X3 gen))
(X0 = X1)
)
![Page 50: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/50.jpg)
Hebrew-to-English MT Prototype
• Initial prototype developed within a two-month intensive effort
• Accomplished:
  – Adapted available morphological analyzer
  – Constructed a preliminary translation lexicon
  – Translated and aligned Elicitation Corpus
  – Learned XFER rules
  – Developed (small) manual XFER grammar
  – System debugging and development
  – Evaluated performance on unseen test data using automatic evaluation metrics
![Page 51: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/51.jpg)
Example Translation
• Input:
  – לאחר דיונים רבים החליטה הממשלה לערוך משאל עם בנושא הנסיגה
  – After debates many decided the government to hold referendum in issue the withdrawal
• Output:
  – AFTER MANY DEBATES THE GOVERNMENT DECIDED TO HOLD A REFERENDUM ON THE ISSUE OF THE WITHDRAWAL
![Page 52: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/52.jpg)
Noun Phrases – Construct State
החלטת הנשיא הראשון
HXL@T [HNSIA HRA$WN]
decision.3SF-CS the-president.3SM the-first.3SM
THE DECISION OF THE FIRST PRESIDENT

החלטת הנשיא הראשונה
[HXL@T HNSIA] HRA$WNH
decision.3SF-CS the-president.3SM the-first.3SF
THE FIRST DECISION OF THE PRESIDENT
![Page 53: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/53.jpg)
Noun Phrases - Possessives
הנשיא הכריז שהמשימה הראשונה שלו תהיה למצוא פתרון לסכסוך באזורנו

HNSIA HKRIZ $HM$IMH HRA$WNH $LW THIH LMCWA PTRWN LSKSWK BAZWRNW
the-president announced that-the-task.3SF the-first.3SF of-him will.3SF to-find solution to-the-conflict in-region-POSS.1P

Without transfer grammar: THE PRESIDENT ANNOUNCED THAT THE TASK THE BEST OF HIM WILL BE TO FIND SOLUTION TO THE CONFLICT IN REGION OUR

With transfer grammar: THE PRESIDENT ANNOUNCED THAT HIS FIRST TASK WILL BE TO FIND A SOLUTION TO THE CONFLICT IN OUR REGION
![Page 54: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/54.jpg)
Subject-Verb Inversion
ATMWL HWDI&H HMM$LH
yesterday announced.3SF the-government.3SF
$T&RKNH BXIRWT BXWD$ HBA
that-will-be-held.3PF elections.3PF in-the-month the-next
אתמול הודיעה הממשלה שתערכנה בחירות בחודש הבא
Without transfer grammar: YESTERDAY ANNOUNCED THE GOVERNMENT THAT WILL RESPECT OF THE FREEDOM OF THE MONTH THE NEXT
With transfer grammar: YESTERDAY THE GOVERNMENT ANNOUNCED THAT ELECTIONS WILL ASSUME IN THE NEXT MONTH
![Page 55: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/55.jpg)
Subject-Verb Inversion
LPNI KMH $BW&WT HWDI&H HNHLT HMLWN
before several weeks announced.3SF management.3SF.CS the-hotel
$HMLWN ISGR BSWF H$NH
that-the-hotel.3SM will-be-closed.3SM at-end.3SM.CS the-year
לפני כמה שבועות הודיעה הנהלת המלון שהמלון יסגר בסוף השנה
Without transfer grammar: IN FRONT OF A FEW WEEKS ANNOUNCED ADMINISTRATION THE HOTEL THAT THE HOTEL WILL CLOSE AT THE END THIS YEAR
With transfer grammar: SEVERAL WEEKS AGO THE MANAGEMENT OF THE HOTEL ANNOUNCED THAT THE HOTEL WILL CLOSE AT THE END OF THE YEAR
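The reordering in the two inversion examples can be pictured as a swap of the verb and the subject NP that follows it. The Python sketch below shows only that swap over already-translated constituents. The constituent labels and the function name `move_subject_before_verb` are assumptions made for illustration; the actual XFER grammar rule is not shown on the slides.

```python
# Toy reordering: Hebrew [ADVP V NP-subject ...] becomes English [ADVP NP-subject V ...].

def move_subject_before_verb(constituents):
    """Swap the first V that is directly followed by an NP with that NP."""
    out = list(constituents)
    for i in range(len(out) - 1):
        if out[i][0] == "V" and out[i + 1][0] == "NP":
            out[i], out[i + 1] = out[i + 1], out[i]
            break
    return out

clause = [
    ("ADVP", "SEVERAL WEEKS AGO"),
    ("V", "ANNOUNCED"),
    ("NP", "THE MANAGEMENT OF THE HOTEL"),
    ("SBAR", "THAT THE HOTEL WILL CLOSE AT THE END OF THE YEAR"),
]
print(" ".join(text for _, text in move_subject_before_verb(clause)))
```

A clause with no verb-subject sequence passes through unchanged.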
![Page 56: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/56.jpg)
Evaluation Results
• Test set of 62 sentences from Haaretz newspaper, 2 reference translations

| System | BLEU | NIST | Precision | Recall | METEOR |
|---|---|---|---|---|---|
| No grammar | 0.0616 | 3.4109 | 0.4090 | 0.4427 | 0.3298 |
| Learned grammar | 0.0774 | 3.5451 | 0.4189 | 0.4488 | 0.3478 |
| Manual grammar | 0.1026 | 3.7789 | 0.4334 | 0.4474 | 0.3617 |
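To read the table as relative gains over the no-grammar baseline, the scores can be plugged into a few lines of Python. The numbers are copied from the table; the dictionary keys and the helper `relative_gain` are names chosen here for illustration.

```python
# Relative improvement of each grammar condition over the no-grammar baseline.
scores = {
    "no_grammar": {"BLEU": 0.0616, "NIST": 3.4109, "METEOR": 0.3298},
    "learned":    {"BLEU": 0.0774, "NIST": 3.5451, "METEOR": 0.3478},
    "manual":     {"BLEU": 0.1026, "NIST": 3.7789, "METEOR": 0.3617},
}

def relative_gain(metric, system, baseline="no_grammar"):
    """Return (system - baseline) / baseline for one metric."""
    base = scores[baseline][metric]
    return (scores[system][metric] - base) / base

for system in ("learned", "manual"):
    for metric in ("BLEU", "NIST", "METEOR"):
        print(f"{system:8s} {metric:7s} {relative_gain(metric, system):+.1%}")
```

On BLEU this comes to roughly +25.6% for the learned grammar and +66.6% for the manual grammar. With only 62 test sentences, these differences should be read with caution.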
![Page 57: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/57.jpg)
Current and Future Work
• Issues specific to the Hebrew-to-English system:
– Coverage: further improvements in the translation lexicon and morphological analyzer
– Manual grammar development
– Acquiring/training of word-to-word translation probabilities
– Acquiring/training of a Hebrew language model at a post-morphology level that can help with disambiguation
• General issues related to the XFER framework:
– Discriminative language modeling for MT
– Effective models for assigning scores to transfer rules
– Improved grammar learning
– Merging/integration of manual and acquired grammars
![Page 58: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/58.jpg)
Conclusions
• Test case for the CMU XFER framework for rapid MT prototyping
• Preliminary system was a two-month, three-person effort; we were quite happy with the outcome
• Core concept of XFER + Decoding is very powerful and promising for low-resource MT
• We experienced the main bottlenecks of knowledge acquisition for MT: morphology, translation lexicons, grammar...
![Page 59: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/59.jpg)
Mapudungun-to-Spanish Example
Mapudungun
pelafiñ Maria
Spanish
No vi a María
English
I didn’t see Maria
![Page 60: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/60.jpg)
Mapudungun-to-Spanish Example
Mapudungun
pelafiñ Maria
pe -la -fi -ñ Maria
see -neg -3.obj -1.subj.indicative Maria
Spanish
No vi a María
No vi a María
neg see.1.subj.past.indicative acc Maria
English
I didn’t see Maria
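The segmentation pe -la -fi -ñ and the features each suffix contributes can be sketched in Python. The two small lexicons below contain only the morphemes from this example, with the feature values given in the parse walkthrough. The greedy matching strategy and the function name `analyze` are assumptions for illustration, not a description of the actual Mapudungun analyzer.

```python
# Toy segmenter: strip a known stem, then peel off known suffixes left to right,
# collecting the features each morpheme contributes.
STEMS = {"pe": {"lemma": "see"}}
SUFFIXES = {
    "la": {"negation": "+"},
    "fi": {"object_person": 3},
    "ñ": {"person": 1, "number": "sg", "mood": "ind"},
}

def analyze(word):
    """Return (morphemes, features), or None if the word cannot be fully segmented."""
    for stem, stem_feats in STEMS.items():
        if not word.startswith(stem):
            continue
        rest, morphs, feats = word[len(stem):], [stem], dict(stem_feats)
        while rest:
            for suffix, suffix_feats in SUFFIXES.items():
                if rest.startswith(suffix):
                    morphs.append(suffix)
                    feats.update(suffix_feats)
                    rest = rest[len(suffix):]
                    break
            else:
                break  # no known suffix matches the remaining material
        if not rest:
            return morphs, feats
    return None

print(analyze("pelafiñ"))
```

The returned feature dictionary is the flat equivalent of passing all features up through the suffix groups. Note that no tense feature is set.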
![Page 61: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/61.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. The stem pe is analyzed as V.]
![Page 62: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/62.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. The suffix la is analyzed as VSuff with negation = +.]
![Page 63: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/63.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. VSuff (la) projects to VSuffG. Pass all features up.]
![Page 64: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/64.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. The suffix fi is analyzed as VSuff with object person = 3.]
![Page 65: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/65.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. VSuffG (la) and VSuff (fi) combine into a new VSuffG. Pass all features up from both children.]
![Page 66: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/66.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. The suffix ñ is analyzed as VSuff with person = 1, number = sg, mood = ind.]
![Page 67: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/67.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. VSuffG (la, fi) and VSuff (ñ) combine into a new VSuffG. Pass all features up from both children.]
![Page 68: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/68.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. V (pe) and VSuffG (la, fi, ñ) combine into V. Pass all features up from both children. Check that: 1) negation = +, 2) tense is undefined.]
![Page 69: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/69.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. Maria is analyzed as N with person = 3, number = sg, human = +; N projects to NP, next to the completed V.]
![Page 70: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/70.jpg)
[Figure: parse tree for pe-la-fi-ñ Maria. V and NP combine into VP, and VP projects to S. Pass features up from V. Check that NP is human = +.]
![Page 71: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/71.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree for pe-la-fi-ñ Maria alongside the Spanish tree under construction. The source S and VP nodes are transferred to S and VP on the Spanish side.]
![Page 72: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/72.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree under construction. The Spanish VP expands to V "a" NP. Pass all features to Spanish side.]
![Page 73: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/73.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree S, VP, V "a" NP. Pass all features down.]
![Page 74: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/74.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree S, VP, V "a" NP. Pass object features down.]
![Page 75: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/75.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree S, VP, V "a" NP. Accusative marker on objects is introduced because human = +.]
![Page 76: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/76.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree S, VP, V "a" NP, shown with the transfer rule that introduces "a":]
VP::VP [VBar NP] -> [VBar "a" NP]
(
  (X1::Y1)
  (X2::Y3)
  ((X2 type) = (*NOT* personal))
  ((X2 human) =c +)
  (X0 = X1)
  ((X0 object) = X2)
  (Y0 = X0)
  ((Y0 object) = (X0 object))
  (Y1 = Y0)
  (Y3 = (Y0 object))
  ((Y1 objmarker person) = (Y3 person))
  ((Y1 objmarker number) = (Y3 number))
  ((Y1 objmarker gender) = (Y3 gender))
)
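A rough Python paraphrase of this rule may help in reading the constraint notation. It treats feature structures as plain dictionaries and copies them instead of unifying them, so it is a simplification, not the XFER engine's semantics. The function name `apply_vp_rule` and the `"LIT"` label for the inserted literal are assumptions.

```python
# Dictionary-based paraphrase of VP::VP [VBar NP] -> [VBar "a" NP].

def apply_vp_rule(vbar, np):
    """Return the target-side constituents, or None if a constraint fails."""
    # ((X2 type) = (*NOT* personal)): the object must not be a personal pronoun.
    if np.get("type") == "personal":
        return None
    # ((X2 human) =c +): the feature must already be present with value "+".
    if np.get("human") != "+":
        return None
    # (X0 = X1), ((X0 object) = X2): the VP takes the verb's features plus its object.
    x0 = dict(vbar, object=np)
    # (Y0 = X0), (Y1 = Y0), (Y3 = (Y0 object)): carry the features to the target side.
    y1 = dict(x0)
    y3 = dict(x0["object"])
    # Object-marker agreement on the target verb, for the features that are defined.
    y1["objmarker"] = {k: y3[k] for k in ("person", "number", "gender") if k in y3}
    return [("VBar", y1), ("LIT", "a"), ("NP", y3)]

verb = {"lemma": "pe", "negation": "+", "person": 1, "number": "sg", "mood": "ind"}
maria = {"lemma": "Maria", "person": 3, "number": "sg", "human": "+"}
print([label for label, _ in apply_vp_rule(verb, maria)])
```

An object without human = + makes the function return `None`, mirroring the `=c` constraint that blocks the rule.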
![Page 77: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/77.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree. The Spanish V expands to "no" V. Pass person, number, and mood features to Spanish Verb. Assign tense = past.]
![Page 78: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/78.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree with "no" V "a" NP. "no" is introduced because negation = +.]
![Page 79: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/79.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree with "no" V "a" NP. The Spanish verb is filled in with the lemma ver.]
![Page 80: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/80.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree with "no" V "a" NP. The lemma ver is inflected as vi with person = 1, number = sg, mood = indicative, tense = past.]
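The verb-side steps of the walkthrough (check negation and tense, insert "no", assign past tense, inflect ver) can be sketched as one Python function. The one-entry `PARADIGM` table is a hypothetical stand-in for a Spanish morphological generator, and `transfer_negated_verb` is a name invented for this sketch.

```python
# Hypothetical mini paradigm; a real system would call a morphological generator.
PARADIGM = {
    ("ver", 1, "sg", "indicative", "past"): "vi",
}

def transfer_negated_verb(lemma, feats):
    """Return the Spanish words for a negated source verb with undefined tense."""
    # Precondition from the walkthrough: negation = + and tense is undefined.
    if feats.get("negation") != "+" or "tense" in feats:
        return None
    # Person, number, and mood are passed over; tense is assigned as past.
    key = (lemma, feats["person"], feats["number"], feats["mood"], "past")
    form = PARADIGM.get(key)
    if form is None:
        return None  # form not covered by the toy paradigm
    return ["no", form]

source = {"negation": "+", "object_person": 3,
          "person": 1, "number": "sg", "mood": "indicative"}
print(" ".join(transfer_negated_verb("ver", source) + ["a", "María"]))
```

Adding the accusative marker and the object noun, as in the last line, gives the full Spanish string of the example.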
![Page 81: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/81.jpg)
Transfer to Spanish: Top-Down
[Figure: Mapudungun parse tree alongside the Spanish tree with "no" vi "a" NP. The source N Maria is transferred to the Spanish N María. Pass features over to Spanish side.]
![Page 82: MT for Languages with Limited Resources](https://reader033.fdocuments.net/reader033/viewer/2022051621/56814514550346895db1d70f/html5/thumbnails/82.jpg)
I didn’t see Maria
[Figure: completed Mapudungun and Spanish trees for pe-la-fi-ñ Maria. The Spanish leaves read "no" vi "a" María.]