Syntax and Semantics of Translation
Aarne Ranta
WoLLIC 2014, Valparaiso, 1-4 September
Machine Translation: Green, Yellow, and Red
Aarne Ranta
WoLLIC-2014, Valparaiso, 4 September 2014
CLT
Versions also given at
CLT, Gothenburg, April 2014
NLCS/NLSR, Vienna Summer of Logic, July 2014
CNL, Galway, August 2014
Executive summary
We want to have machine translation that
● delivers publication quality in areas where reasonable effort is invested
● degrades gracefully to browsing quality in other areas
● shows a clear distinction between these
We do this by using grammars and type-theoretical interlinguas implemented in GF, Grammatical Framework
Joint work with
Krasimir Angelov, Björn Bringert, Grégoire Détrez, Ramona Enache, Erik de Graaf, Thomas Hallgren, Qiao Haiyan, Prasanth Kolachina, Inari Listenmaa, Peter Ljunglöf, K.V.S. Prasad, Scharolta Siencnik, Shafqat Virk
50+ GF Resource Grammar Library contributors
GF translation app in greyscale
GF translation app in full colour
translation by meaning
- correct
- idiomatic
translation by syntax
- grammatical
- often strange
- often wrong
translation by chunks
- probably ungrammatical
- probably wrong
word to word transfer
syntactic transfer
semantic interlingua
The Vauquois triangle
What is it good for?
get an idea
get the grammar right
publish the content
Who is doing it?
Google, Bing, Apertium
GF the last 15 months
GF in MOLTO
What should we work on?
chunks for robustness and speed
syntax for grammaticality
semantics for full quality and speed
All!
We want a system that
● can reach perfect quality
● has robustness as back-up
● tells the user which is which
We “combine GF, Apertium, and Google”
But we do it all in GF!
The idea is to understand real problems that one would like to solve, and to do it with the standards of the highest quality research. This combines the best features of “applied research” and “basic research.” I’ve always found it productive to look at the details of real problems. Real problems often reveal issues that you wouldn’t think of otherwise.
William A. Woods, ACL Lifetime Achievement Award: “The Right Tools: Reflections on Computation and Language”, Computational Linguistics 36(4), 2010.
Interlude: SMT
How SMT works
SMT = Statistical Machine Translation
“Lexicon”: word alignments
“Syntax”: n-grams
Word order: distortion model
Word alignments
wine   vino 0.7
red    rojo 0.4, roja 0.2, rojos 0.2, rojas 0.1, tinto 0.001
black  tintos 0.0002
n-grams (n = 2)
libro  rojo 0.01, roja 0.0001
casa   roja 0.01, rojos 0.00001
vino   rojo 0.001, roja 0.00001, tinto 0.2
Decoding
Selecting the best translation from f to e
$\hat{e} = \operatorname{argmax}_{e} \; p(f \mid e)\, p(e)$
Shannon’s noisy channel model (1948)
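The decoding rule can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not how a production SMT decoder works: the scores are copied from the word-alignment and n-gram slides and used as stand-ins for p(f|e) and p(e), the distortion step is hard-coded as "noun before adjective", and the names `ALIGN`, `BIGRAM`, and `decode` are invented for this example.

```python
from itertools import product

# Toy scores from the word-alignment slide, used as a stand-in for p(f|e).
ALIGN = {
    "wine": {"vino": 0.7},
    "red": {"rojo": 0.4, "roja": 0.2, "rojos": 0.2, "rojas": 0.1, "tinto": 0.001},
}

# Toy bigram scores from the n-gram slide, used as a stand-in for p(e).
BIGRAM = {
    ("vino", "rojo"): 0.001,
    ("vino", "roja"): 0.00001,
    ("vino", "tinto"): 0.2,
}


def decode(adjective, noun):
    """Return (best target phrase, score) for an English adjective-noun pair."""
    best, best_score = None, 0.0
    for (adj_t, p_adj), (noun_t, p_noun) in product(
        ALIGN[adjective].items(), ALIGN[noun].items()
    ):
        # Hard-coded distortion (assumption): the noun precedes the adjective.
        candidate = (noun_t, adj_t)
        # Unseen bigrams get score 0 here; a real system would smooth them.
        score = p_adj * p_noun * BIGRAM.get(candidate, 0.0)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


best, score = decode("red", "wine")
print(" ".join(best), score)
```

With these particular toy numbers the word-based score for `vino rojo` (0.4 × 0.7 × 0.001) comes out higher than the one for `vino tinto` (0.001 × 0.7 × 0.2); the figures are illustrative only.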
Decoding in action: word alignments
red   rojo roja rojos rojas tinto
wine  vino
Decoding in action: distortion
wine red
red   rojo roja rojos rojas tinto
Decoding in action: n-grams
wine  vino
red   rojo roja rojos rojas tinto
Modern version: phrase alignment
red wine   vino tinto 0.99, vino rojo 0.01
Problems with SMT
When things are far apart (n > 3)
Sparse data: a language has 10^6 “words”
Fundamentally random and uncontrolled
Hard to fix bugs
How to do it in GF?
a brief summary
translator
chunk grammar
resource grammar
application grammar
How much work is needed?
translator
chunk grammar
resource grammar
application grammars
resource grammar
● morphology
● syntax
● generic lexicon
precise linguistic knowledge
manual work can’t be escaped
chunk grammar
words
suitable word sequences
● local agreement
● local reordering
easily derived from resource grammar
easily varied
minimize hand-hacking
application grammars
domain semantics, domain idioms
● need domain expertise
use resource grammar as library
● minimize hand-hacking
the work never ends
● we can only cover some domains
translator
PGF run-time system
● parsing
● linearization
● disambiguation
generic for all grammars
portable to different user interfaces
● web
● mobile
Disambiguation?
Grammatical: give priority to green over yellow, yellow over red
Statistical: use a distribution model for grammatical constructs (incl. word senses)
Interactive: for the last mile in the green zone
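The grammatical priority rule can be pictured as a simple fallback chain. The sketch below is only an illustration: the three tiny dictionaries stand in for the application grammar, the resource grammar, and the chunk grammar, their entries are taken from examples elsewhere in these slides, and all names (`GREEN`, `YELLOW`, `RED`, `translate`) are hypothetical.

```python
# Stand-ins for the three levels; a real system would parse with grammars.
GREEN = {"my name is John": "jag heter John"}               # meaning
YELLOW = {"this yellow house": "det här gula huset"}        # syntax
RED = {"this": "den här", "yellow": "gul", "house": "hus"}  # chunks


def by_chunks(sentence):
    """Word-by-word fallback: always succeeds, copies unknown words."""
    return " ".join(RED.get(word, word) for word in sentence.split())


# Priority order: green over yellow, yellow over red.
TRANSLATORS = [("green", GREEN.get), ("yellow", YELLOW.get), ("red", by_chunks)]


def translate(sentence):
    """Return (colour, translation) from the first level that succeeds."""
    for colour, attempt in TRANSLATORS:
        result = attempt(sentence)
        if result is not None:
            return colour, result


for s in ["my name is John", "this yellow house", "yellow house"]:
    print(s, "->", translate(s))
```

Returning the colour together with the translation is what lets a user interface show which confidence level a given output belongs to.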
Advantages of GF
Expressivity: easy to express complex rules
● agreement
● word order
● discontinuity
Abstractions: easy to manage complex code
Interlinguality: easy to add new languages
Resources: basic and bigger
Norwegian Danish Afrikaans
Maltese
Romanian Catalan
Polish Estonian
Russian
Latvian Thai Japanese Urdu Punjabi Sindhi
Greek Nepali Persian
English Swedish German Dutch
French Italian Spanish
Bulgarian Finnish
Chinese Hindi
How to do it?
some more details
Translation model: multi-source multi-target compiler
Translation model: multi-source multi-target compiler-decompiler
Abstract Syntax
Hindi
Chinese
Finnish
Swedish
English
Spanish
German
French
Bulgarian Italian
Word alignment: compiler
1 + 2 * 3
00000011 00000100 00000101 01101000 01100000
Abstract syntax
Add : Exp -> Exp -> Exp
Mul : Exp -> Exp -> Exp
E1, E2, E3 : Exp
Add E1 (Mul E2 E3)
Concrete syntax
abstract   Java        JVM
Add x y    x “+” y     x y “01100000”
Mul x y    x “*” y     x y “01101000”
E1         “1”         “00000011”
E2         “2”         “00000100”
E3         “3”         “00000101”
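The table above can be read as two sets of linearization rules over one abstract syntax. A minimal Python sketch, assuming trees are encoded as nested tuples and each concrete syntax as a dictionary of string-building functions (the encoding and the names `TREE`, `JAVA`, `JVM`, and `linearize` are choices made for this example, not GF itself):

```python
# Abstract syntax tree Add E1 (Mul E2 E3) as nested tuples: (function, *args).
TREE = ("Add", ("E1",), ("Mul", ("E2",), ("E3",)))

# One linearization rule per abstract function, per concrete syntax.
JAVA = {
    "Add": lambda x, y: f"{x} + {y}",
    "Mul": lambda x, y: f"{x} * {y}",
    "E1": lambda: "1",
    "E2": lambda: "2",
    "E3": lambda: "3",
}
JVM = {
    "Add": lambda x, y: f"{x} {y} 01100000",
    "Mul": lambda x, y: f"{x} {y} 01101000",
    "E1": lambda: "00000011",
    "E2": lambda: "00000100",
    "E3": lambda: "00000101",
}


def linearize(tree, concrete):
    """Linearize subtrees first, then apply the rule for the root function."""
    fun, *args = tree
    return concrete[fun](*(linearize(arg, concrete) for arg in args))


print(linearize(TREE, JAVA))
print(linearize(TREE, JVM))
```

The sketch ignores operator precedence, so a tree such as `Mul (Add E1 E2) E3` would need parentheses in the infix rules; parsing, the inverse direction, is not shown.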
Compiling natural language
Abstract syntax
  Pred : NP -> V2 -> NP -> S
  Mod : AP -> CN -> CN
  Love : V2
Concrete syntax:   English    Latin
  Pred s v o       s v o      s o v
  Mod a n          a n        n a
  Love             “love”     “amare”
Word alignment
the clever woman loves the handsome man
femina sapiens virum formosum amat
Pred (Def (Mod Clever Woman)) Love (Def (Mod Handsome Man))
Linearization types
      English                  Latin
CN    {s : Number => Str}      {s : Number => Case => Str ; g : Gender}
AP    {s : Str}                {s : Gender => Number => Case => Str}

Mod ap cn
      English:  {s = \\n => ap.s ++ cn.s ! n}
      Latin:    {s = \\n,c => cn.s ! n ! c ++ ap.s ! cn.g ! n ! c ; g = cn.g}
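To make the records and tables concrete, here is a rough Python rendering of the two `Mod` rules. It assumes that a GF table can be modelled as a dictionary keyed by parameter values and a record as a dictionary with fields `s` and `g`; only a handful of singular forms are listed, and the function names `mod_eng` and `mod_lat` are invented for the example.

```python
# English: CN = {s : Number => Str}, AP = {s : Str}
man = {"s": {"Sg": "man", "Pl": "men"}}
handsome = {"s": "handsome"}

# Latin: CN = {s : Number => Case => Str ; g : Gender}
vir = {"s": {("Sg", "Nom"): "vir", ("Sg", "Acc"): "virum"}, "g": "Masc"}
femina = {"s": {("Sg", "Nom"): "femina", ("Sg", "Acc"): "feminam"}, "g": "Fem"}

# Latin: AP = {s : Gender => Number => Case => Str}
formosus = {
    "s": {
        ("Masc", "Sg", "Nom"): "formosus",
        ("Masc", "Sg", "Acc"): "formosum",
        ("Fem", "Sg", "Nom"): "formosa",
        ("Fem", "Sg", "Acc"): "formosam",
    }
}


def mod_eng(ap, cn):
    # s = \\n => ap.s ++ cn.s ! n   (adjective first, no agreement)
    return {"s": {n: f"{ap['s']} {form}" for n, form in cn["s"].items()}}


def mod_lat(ap, cn):
    # s = \\n,c => cn.s ! n ! c ++ ap.s ! cn.g ! n ! c ; g = cn.g
    # The adjective form is selected by the noun's inherent gender.
    return {
        "s": {
            (n, c): f"{form} {ap['s'][(cn['g'], n, c)]}"
            for (n, c), form in cn["s"].items()
        },
        "g": cn["g"],
    }


print(mod_eng(handsome, man)["s"]["Sg"])
print(mod_lat(formosus, vir)["s"][("Sg", "Acc")])
print(mod_lat(formosus, femina)["s"][("Sg", "Nom")])
```

The point of the design is that word order and agreement live entirely in the concrete syntax, while the abstract function `Mod` stays the same for both languages.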
Abstract syntax trees
my name is John
HasName I (Name “John”)
Pred (Det (Poss i_NP) name_N) (NameNP “John”)
[DetChunk (Poss i_NP), NChunk name_N, copulaChunk, NPChunk (NameNP “John”)]
Building the yellow part
Building a basic resource grammar
Programming skills
Theoretical knowledge of language
3-6 months work
3000-5000 lines of GF code
- not easy to automate
+ only done once per language
Building a large lexicon
Monolingual (morphology + valencies)
● extraction from open sources (SALDO etc)
● extraction from text (extract)
● smart paradigms
Multilingual (mapping from abstract syntax)
● extraction from open sources (WordNet, Wiktionary)
● extraction from parallel corpora (GIZA++)
Manual quality control at some point needed
Improving the resources
Multiwords: non-compositional translation
● red wine - vino tinto
Constructions: multiwords with arguments
● x’s name is y - x se llama y
Extraction from free resources (Konstruktikon)
Extraction from SMT phrase tables
● example-based grammar writing
It’s important to look at the details. Try to understand what would be necessary to solve the whole problem. At this point, don’t settle for approximations.
Woods, ibid.
Building the red part
1. Write a grammar that builds sentences from sequences of chunks
   cat Chunk
   fun SChunks : [Chunk] -> S
2. Introduce chunks to cover phrases
   fun NP_nom_Chunk : NP -> Chunk
   fun NP_acc_Chunk : NP -> Chunk
   fun AP_sg_masc_Chunk : AP -> Chunk
   fun AP_pl_fem_Chunk : AP -> Chunk
Do this for all categories and feature combinations you want to cover.
Include both long and short phrases
● long phrases have better quality
● short phrases add to robustness
Give long phrases priority by probability settings.
Long chunks are better:
[this yellow house] - [det här gula huset]
[this] [yellow house] - [den här] [gult hus]
[this] [yellow] [house] - [den här] [gul] [hus]
Limiting case: whole sentences as chunks.
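The preference for long chunks can be imitated with a greedy longest-match segmenter. This is a deliberate simplification: the slides give long phrases priority through probability settings in the grammar, whereas the sketch below just tries the longest span first. The chunk table reuses the Swedish examples above; `CHUNKS` and `translate_chunks` are hypothetical names.

```python
# Chunk table built from the examples above (English -> Swedish).
CHUNKS = {
    ("this", "yellow", "house"): "det här gula huset",
    ("yellow", "house"): "gult hus",
    ("this",): "den här",
    ("yellow",): "gul",
    ("house",): "hus",
}


def translate_chunks(words, table):
    """Cover the input left to right, always taking the longest known chunk."""
    output, i = [], 0
    while i < len(words):
        for j in range(len(words), i, -1):  # longest candidate span first
            chunk = tuple(words[i:j])
            if chunk in table:
                output.append(table[chunk])
                i = j
                break
        else:
            output.append(words[i])  # unknown word: copy it through
            i += 1
    return output


words = "this yellow house".split()
print(translate_chunks(words, CHUNKS))

# Without the whole-phrase chunk the result degrades as in the second example.
shorter = {k: v for k, v in CHUNKS.items() if len(k) < 3}
print(translate_chunks(words, shorter))
```

Removing the two-word chunk as well yields the word-by-word output of the third example, which shows how quality falls off as chunks get shorter.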
Accurate feature distinctions are good, especially between closely related language pairs.
        god    bon      buono
good    gott   bonne    buona
        goda   bons     buoni
               bonnes   buone
Apertium does this for every language pair.
Resource grammar chunks of course come with reordering and internal agreement
  Prep   Det+Fem+Sg   N+Fem+Sg   A+Fem+Sg
  dans   la           maison     bleue

  im                      blauen       Haus
  Prep-Det+Neutr+Sg+Dat   A+Weak+Dat   N+Neutr+Sg
Recall: chunks are just a by-product of the real grammar.
Their size span is
single words <---> entire sentences
A wide-coverage chunking grammar can be built in a couple of hours by using the RGL.
If you have a practical job to do, and it’s important to get it done quickly as well as possible, and you can only do that by partially solving the problem, then by all means do that. That’s practical engineering, and I do that with my Engineer’s hat on. But that’s not going to advance the science
Woods, ibid.
Building the green part
Define semantically based abstract syntax
  fun HasName : Person -> Name -> Fact
Define concrete syntax by mapping to resource grammar structures
  lin HasName p n = mkCl (possNP p name_N) n        my name is John
  lin HasName p n = mkCl p heta_V2 n                jag heter John
  lin HasName p n = mkCl p (reflV chiamare_V) n     (io) mi chiamo John
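The effect of these three rules can be mimicked in Python, with plain string templates standing in for the resource grammar calls. This is an assumption-heavy sketch: it covers only the person `I`, hard-codes the verb forms that the resource grammar would normally compute, and the names `PERSONS`, `LIN_HAS_NAME`, and `linearize` are invented here.

```python
# Abstract syntax tree: HasName I (Name "John")
TREE = ("HasName", "I", "John")

# Per-language forms of the person argument (only "I" is covered).
PERSONS = {
    "I": {
        "Eng": {"poss": "my"},
        "Swe": {"nom": "jag"},
        "Ita": {"refl": "mi", "chiamare": "chiamo"},
    }
}

# One semantic function, three structurally different linearizations.
LIN_HAS_NAME = {
    "Eng": lambda p, n: f"{p['poss']} name is {n}",       # possessive + copula
    "Swe": lambda p, n: f"{p['nom']} heter {n}",          # transitive verb
    "Ita": lambda p, n: f"{p['refl']} {p['chiamare']} {n}",  # reflexive verb
}


def linearize(tree, lang):
    """Linearize a HasName tree in the given language."""
    fun, person, name = tree
    if fun != "HasName":
        raise ValueError(f"unsupported function: {fun}")
    return LIN_HAS_NAME[lang](PERSONS[person][lang], name)


for lang in ("Eng", "Swe", "Ita"):
    print(lang, linearize(TREE, lang))
```

Because all three outputs come from the same tree, translation between any pair of languages reduces to parsing into the tree and linearizing out of it.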
Resource grammars give crucial help
● application grammarians need not know linguistics
● a substantial grammar can be built in a few days
● adding new languages is a matter of a few hours
MOLTO’s goal was to make this possible.
Automatic extraction of application grammars?
● abstract syntax from ontologies
● concrete syntax from examples
  ○ including phrase tables
As always, full green quality needs expert verification
● formal methods help (REMU project)
These grammars are a source of
● “non-compositional” translations
● compile-time transfer
● idiomatic language
● translating meaning, not syntax
Constructions are the generalized form of this idea, originally domain-specific.
Building the translation system
GF source
probability model
PGF binary
GF compiler
![Page 76: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/76.jpg)
PGF binary
PGF runtime system
![Page 77: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/77.jpg)
PGF binary
PGF runtime system
user interface
![Page 78: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/78.jpg)
PGF binary
PGF runtime system
user interface
another PGF binary
![Page 79: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/79.jpg)
PGF binary
PGF runtime system
user interface
another PGF binary
app
![Page 80: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/80.jpg)
PGF binary
PGF runtime system
user interface
another PGF binary
another app
![Page 81: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/81.jpg)
PGF binary
PGF runtime system
custom user interface
generic user interface
PGF runtime system
generic grammar
app
White: free, open-source. Green: a business idea
![Page 82: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/82.jpg)
User interfaces
command-line
shell
web server
web applications
mobile applications
![Page 83: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/83.jpg)
Demos
![Page 84: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/84.jpg)
To test it yourself
Android app
http://www.grammaticalframework.org/demos/app.html
Web app
http://www.grammaticalframework.org/demos/translation.html
![Page 85: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/85.jpg)
Agenda for future work
![Page 86: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/86.jpg)
Improve the lexicon
Split senses
Improve disambiguation
Introduce constructions
Design and perform evaluation
![Page 87: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/87.jpg)
Current dictionary coverage
![Page 88: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/88.jpg)
Splitting senses
time
![Page 89: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/89.jpg)
Splitting senses
time_N
time_V
![Page 90: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/90.jpg)
Splitting senses
tiempo time_N vez
![Page 91: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/91.jpg)
Splitting senses
time_1_N tiempo
time_2_N vez
![Page 92: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/92.jpg)
Splitting senses
time_1_N tiempo Zeit
time_2_N vez Mal
![Page 93: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/93.jpg)
Splitting senses
weather_N Wetter
time_1_N tiempo Zeit
time_2_N vez Mal
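The sense-splitting idea on these slides can be sketched as a small lookup, again in Python rather than GF. The sense identifiers `time_1_N`, `time_2_N` and `weather_N` and the Spanish and German words for the two senses of "time" come from the slides; the English column, the language codes, and the assumption that Spanish `tiempo` also covers `weather_N` are added here for illustration. Once senses are split, translation goes from a word to its abstract senses and from each sense to exactly one target word.

```python
# Hypothetical miniature lexicon: each abstract sense has one word per language.
LEXICON = {
    "time_1_N": {"Eng": "time", "Spa": "tiempo", "Ger": "Zeit"},
    "time_2_N": {"Eng": "time", "Spa": "vez", "Ger": "Mal"},
    # Assumption: Spanish "tiempo" is shared with the weather sense.
    "weather_N": {"Eng": "weather", "Spa": "tiempo", "Ger": "Wetter"},
}


def senses(word, lang):
    """Return the abstract senses that `word` can express in `lang`."""
    return sorted(s for s, forms in LEXICON.items() if forms.get(lang) == word)


def translate(word, src, tgt):
    """Map each sense of `word` in `src` to its word in `tgt`."""
    return {s: LEXICON[s][tgt] for s in senses(word, src)}


if __name__ == "__main__":
    print(translate("time", "Eng", "Spa"))
    print(translate("tiempo", "Spa", "Ger"))
```

The lookup only exposes the ambiguity: English "time" yields two Spanish candidates and Spanish "tiempo" yields two German ones. Choosing between them is the separate disambiguation task listed on the agenda.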
![Page 94: Syntax and Semantics WoLLIC 2014, Valparaiso, 1-4 ...aarne/Translation-Wollic.pdfSyntax and Semantics of Translation Aarne Ranta WoLLIC 2014, Valparaiso, 1-4 September](https://reader033.fdocuments.net/reader033/viewer/2022052821/6072c8b663eb1d65e371c9a9/html5/thumbnails/94.jpg)
See also: 4th GF Summer School
19-31 July 2015 in Marsalforn, Malta