Page 1:

Machine Translation

Jan Odijk

Utrecht, March 7, 2011

1

Page 2:

Overview

• Lexicons
• Statistical MT
• MT: What is (perhaps) possible
• Conclusions

2

Page 3:

Lexicons

• “What is not difficult at all: large dictionaries with many difficult words and technical terms” (Steven Krauwer, previous lecture)

• I disagree

3

Page 4:

Lexicons

• True if you know the words and terms in advance

• But new words and terms (usually with different translations) are created all the time in science, technology and industry

• So you must have techniques to find (identify, extract) such new words/terms and their translations as automatically as possible

– To tune the lexicons to specific domains
– To continuously extend them

4

Page 5:

Lexicons

• Many terms are multiword expressions
– With some internal variation
– Not always contiguous
– This requires special treatment in the lexicon and in the grammar
• House* of representatives (Chambre* des représentants)
• Patatas* fritas* (French fries*)
• Chômeur* (Unemployed person*)

5

Page 6:

Lexicons

• Modern formal grammars depend heavily on lexical properties
• They have very general rule schemata, which are filled in by properties of lexical items
– E.g. a word of category X and its complements form an XPhrase
– E.g. mass nouns can occur without an article in the singular
– Count nouns can occur with “een” in the singular

6

Page 7:

Lexicons

• Properties of lexical items
– E.g. which complements a verb takes
• E.g. a direct object noun phrase, also an indirect object, predicate, prepositional complement, etc.
• E.g. an infinitival complement, with or without “te”, with or without “om”, with or without a subject, etc.
– With which preposition it can be combined
• Kijken naar, zorgen voor, houden van
– Nouns: mass or count?

– Nouns: mass or count?

7

Page 8:

Lexicons

• Traditional dictionaries do not contain such information (or very rarely)

• And what is available is not represented in a formal manner

• So computers cannot use this information directly

8

Page 9:

Lexicons

• It is very difficult to assign such properties correctly in a systematic manner
– It requires very good knowledge of syntax
– Often the phenomena are not understood well enough
– Words often have multiple options with different meanings and translations
– Try it yourself for lopen; innemen
– Count/mass: vis; wijn; bestek; meubilair

9

Page 10:

Lexicons

• It is very difficult to assign such properties correctly in a systematic manner (cont.)
– Lexicographers are not trained to assign such properties
– It must be done for many words
– Consistency within one person is hard to achieve
– Consistency among multiple people is even harder

10

Page 11:

Lexicon: Semantics

• Selection restrictions with a type system to approximate the modeling of world knowledge
– Requires sophisticated syntactic analysis
• Boek (book): info (legible)
• Uur (hour): time unit, duration
• Vergadering (meeting): event, duration
• Lezen (read): subject = human; object = info (legible)
• A durational adjunct must be a duration phrase

11

Page 12:

Lexicon: Semantics

• Selection restrictions
– Pak (1) (suit): cloths
– Pak (2) (package): entity
– Dragen (1) (wear): subj = animate; object = cloths
– Dragen (2) (carry): subj = animate; object = entity
– Schoen (shoe): cloths
– Entity > cloths (entity subsumes cloths)
– Identity preferred over subsumption
– A homogeneous object preferred over a heterogeneous one

12
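The preference ordering on this slide (identity matches beat subsumption matches, with entity subsuming cloths) can be sketched in Python. The names (`SUPERTYPE`, `DRAGEN`, `rank_readings`) and the two-sense lexicon entry are illustrative assumptions, not part of any actual MT system:

```python
# A minimal sketch of the selection-restriction preferences on this slide.

# Type hierarchy: each type maps to its supertype (None = top).
SUPERTYPE = {"cloths": "entity", "entity": None}

def subsumes(general, specific):
    """True if `general` equals `specific` or is an ancestor of it."""
    while specific is not None:
        if specific == general:
            return True
        specific = SUPERTYPE[specific]
    return False

# Hypothetical verb senses with the object type each one selects for.
DRAGEN = [("wear", "cloths"), ("carry", "entity")]

def rank_readings(object_type):
    """Return verb glosses ordered by preference for the given object type."""
    scored = []
    for gloss, selected in DRAGEN:
        if selected == object_type:
            scored.append((0, gloss))        # identity: preferred
        elif subsumes(selected, object_type):
            scored.append((1, gloss))        # subsumption: allowed, dispreferred
    return [gloss for _, gloss in sorted(scored)]
```

For an object of type cloths (pak as suit), both senses apply but the identity match (“wear”) is ranked first; for a plain entity (pak as package), only “carry” survives.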

Page 13:

Lexicon: Semantics

• Selection restrictions
– Hij draagt een bruin pak
• He wears a brown suit (1: cloths = cloths)
• He carries a brown package (1: entity = entity)
• He carries a brown suit (2: entity > cloths)
• *He wears a brown package (cloths ¬> entity)
– Hij draagt een bruin pak en zwarte schoenen
• He wears a brown suit and black shoes (1: homogeneous and cloths = cloths)
• He carries a brown suit and black shoes (2: homogeneous but entity > cloths)
• He carries a brown package and black shoes (2: inhomogeneous but entity = entity)
• *He wears a brown package and black shoes (cloths ¬> entity)

13

Page 14:

Statistical MT

• Statistical MT
• Derives an MT system automatically
– From statistics taken from
• Aligned parallel corpora (→ translation model)
• Monolingual target-language corpora (→ language model)
• Worked on since the early 1990s

14

Page 15:

Statistical MT

• Plus:
– No or very limited grammar development
– Includes language and world knowledge automatically (but implicitly)
– Based on actually occurring data
– Currently many experimental and commercial systems
• Minus:
– Requires large aligned parallel corpora
– Unclear how much linguistics will be needed anyway
– Probably restricted to very limited domains only

15

Page 16:

Statistical MT

• Google Translate (statistical MT)
• Hij draagt een pak. √ He wears a suit.
• Hij draagt schoenen. √ He wears shoes.
• Hij draagt bruine schoenen en een pak.
• √ He wears a suit and brown shoes. (!!)
• Hij draagt het pakket. √ He carries the package.
• Hij heeft een pak aan. *He has a suit.
• Voert uw bedrijf sloten uit? (= Does your company export locks?)
– *Does your company locks out?

16

Page 17:

Hybrid MT

• Euromatrix, esp. “the Euromatrix”, and
– Successor project EuromatrixPlus
– …
– Efficient inclusion of linguistic knowledge into statistical machine translation
– The development and testing of hybrid architectures for the integration of rule-based and statistical approaches

17

Page 18:

Hybrid MT

• META-NET 2010-2013 (EU funding)
– Building a community with a shared vision and strategic research agenda
– Building META-SHARE, an open resource exchange facility
– Building bridges to neighbouring technology fields
• Bringing more Semantics into Translation
• Optimising the Division of Labour in Hybrid MT
• Exploiting the Context for Translation
• Empirical Base for Machine Translation

18

Page 19:

Hybrid MT

• PACO-MT 2008-2011
• Investigates a hybrid approach to MT
– Rule-based and statistical
– Uses an existing parser for source-language analysis
– Uses statistical n-gram language models for generation
– Uses a statistical approach to transfer

19

Page 20:

MT: What is (perhaps) possible

• Cross-Language Information Retrieval
• Low Quality MT for Gist extraction
• MT and Speech Technology
• Controlled Language
• Limited Domain
• Interaction with author
• Combinations of the above
• Computer-aided translation

20

Page 21:

MT: What is (perhaps) possible

• Cross-Language Information Retrieval (CLIR)
– Input query: in own language
– Input query translated into target languages
– Search in target-language documents
– Results in target language
• Translation of individual words only
• Growing need (growing multilingual Web)
• No perfect translation required

21

Page 22:

MT: What is (perhaps) possible

22

Page 23:

MT: What is (perhaps) possible

• Low-quality MT for gist extraction
• Low quality but still useful
• If interesting, a high-quality human translation can be requested (has to be paid for)

23

Page 24:

MT: What is (perhaps) possible

24

Page 25:

MT: What is (perhaps) possible

25

Page 26:

MT: What is (perhaps) possible

• CLIR
– Fills a growing need in the market
– Is technically feasible
– Creates a need for translation of the found documents
• Solved partially by low-quality MT
• Potentially creates a need for more human translation
• Stimulates (funds) research into more sophisticated MT

26

Page 27:

MT: What is (perhaps) possible

• Combine MT (statistical or rule-based) with OCR technology
– Take a picture of a text with your phone
– The text is OCR-ed
– The text is translated
– (usually a short and simple text)
• Linguatec Shoot & Translate
• Word Lens

27

Page 28:

MT: What is (perhaps) possible

• Combine MT (statistical or rule-based) with speech technology
– Complicates the problem on the one hand, but
– Speech technology (ASR) is currently limited to very narrow domains (which makes MT simpler)
– Many useful applications for speech technology are currently on the market
• Directory assistance
• Tourist information
• Tourist communication
• Call centers
• Navigation
• Hotel reservations
– Some will profit from built-in automatic translation

28

Page 29:

MT: What is (perhaps) possible

• Large EC FP6 project TC-STAR (2004-)
– (http://www.tc-star.org/)
– Research into improved speech technology (ASR and TTS)
– Research into statistical MT
– Research into combining both (speech-to-speech translation)
– In a few selected limited domains

29

Page 30:

MT: What is (perhaps) possible

• Commercial Speech2Speech Translation
• Jibbigo
– http://www.jibbigo.com
– Speech-to-speech translation (iPhone, Android)
– http://www.phonedog.com/2009/10/30/iphone-app-jibbigo-speech-translator
• Talk to Me (Android phones)

30

Page 31:

MT: What is (perhaps) possible

• Controlled Language
– An authoring system limits the vocabulary and syntax of document authors
– Often desirable in companies to get consistent documentation (e.g. aircraft maintenance manuals)
• AECMA Simplified English
• GIFAS Rationalized French
– Makes MT easier (the language is well-defined)

31

Page 32:

MT: What is (perhaps) possible

• Limited Domain
– Translation of
• Weather reports (TAUM-Meteo, Canada)
• Avalanche warnings (Switzerland)
– Fast adaptation to domain-/company-specific vocabulary and terminology

32

Page 33:

MT: What is (perhaps) possible

• Interaction with the author
– No fully automatic translation
– The document author resolves
• Ambiguities unresolved by the system
• In a dialogue between the author and the system, in the source language
• Approach taken in the Rosetta project (Philips)
• Will only work if
– The number of unresolved ambiguities is low
– The questions to resolve an ambiguity are clear

33

Page 34:

MT: What is (perhaps) possible

• Hij droeg een bruin pak
– Wat bedoelt u met “pak”? (What do you mean by “pak”?)
• (1) kostuum (suit)
• (2) pakket (package)
• Hij droeg een bruin pak
– Wat bedoelt u met “dragen (droeg)”? (What do you mean by “dragen (droeg)”?)
• (1) aan of op hebben (kleding) (to have on: clothing)
• (2) bij zich hebben (bijv. in de hand) (to have with one, e.g. in the hand)

34

Page 35:

MT: What is (perhaps) possible

• Combinations of the above

35

Page 36:

MT: What is (perhaps) possible

• Computer-aided translation
– For end users
– For professional translators / the localization industry
• Limited functionality
– Specific terminology
• Bootstrap the translation automatically
– Human revision and correction (post-editing)
• Only if
– The MT quality is such that it reduces effort
– The system is fully integrated in the workflow system

36

Page 37:

Conclusions

• MT is really very difficult!
• Even making a lexicon for an MT system is very difficult (and a lot of work)
• Statistical MT yields practical, relatively quick-to-produce systems (but low quality)
– Provided you have huge amounts of data
• The focus of research is on hybrid systems (mixed statistics-based / knowledge-based) (PACO-MT, META-NET, …)

37

Page 38:

Conclusions

• Several constrained versions do yield usable technology with state-of-the-art MT

• In some cases this even potentially creates additional needs for MT and human translation

38

Page 40:

Do not go beyond this slide

40

Page 41:

MT Evaluation

• Evaluation depends on the purpose of MT and how it is used
– application, domain, controlled language
• Many aspects can be evaluated
– functionality, efficiency, usability, reliability, maintainability, portability
– translation quality
– embedding in the work flow
• post-editing options/tools

41

Page 42:

MT Evaluation

• Focus here:
– does the system yield good translations according to human judgement?
– in the context of developing a system
• Again, many aspects:
– fidelity (how close), correctness, adequacy, informativeness, intelligibility, fluency
– and many ways to measure these aspects

42

Page 43:

MT Evaluation

• Test suite
– Reference =
• a list of (carefully selected) sentences
• with their translations (ordered by score)
– translations judged correct by a human (usually the developer)
– upon every update of the system, the output of the new system is compared to the reference
• if different: the system has to be adapted, or the reference has to be adapted
• Advantages
– focus on specific translation problems is possible
– excellent for regression testing
– manual judgement needed only once for each new output
• other comparisons are automatic
• Disadvantages
– not really independent
– particularly suited for pure rule-based systems
– human judgement needed if the output differs from the reference

43

Page 44:

MT Evaluation

• Comparison against
– a translation corpus
– independently created by human translators
– possibly multiple equivalently correct translations of a sentence
• Advantages
– truly independent
– also suited for data-driven systems
• Disadvantage
– requires human judgement (every time there is a system update)
• high effort by highly skilled people, high costs, requires a lot of time
– human judgement is not easy (unless there is a perfect match)
• Useful
– for a one-time evaluation of a stable system
– not for evaluation during development

44

Page 45:

MT Evaluation

• Edit distance (Word Accuracy)
– metric to determine the closeness of translations automatically
– the least number of edit operations to turn the translated sentence into the reference sentence
– Alshawi et al. 1998

45

Page 46:

MT Evaluation

• WA = 1 − ((d + s + i) / max(r, c))
• d = number of deletions
• s = number of substitutions
• i = number of insertions
• r = reference sentence length
• c = candidate sentence length
• easy to calculate using the Levenshtein distance algorithm (dynamic programming)
• various extensions have been proposed

46
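The formula above can be sketched with the standard word-level Levenshtein dynamic program; this is a minimal illustration assuming whitespace tokenization, and `word_accuracy` is a hypothetical helper name:

```python
# Word Accuracy: WA = 1 - (d + s + i) / max(r, c),
# with the edit counts obtained via word-level Levenshtein distance.

def word_accuracy(candidate, reference):
    c_words, r_words = candidate.split(), reference.split()
    n, m = len(c_words), len(r_words)
    # dist[i][j] = edits to turn the first i candidate words
    # into the first j reference words
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i                          # i deletions
    for j in range(m + 1):
        dist[0][j] = j                          # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = 0 if c_words[i - 1] == r_words[j - 1] else 1
            dist[i][j] = min(dist[i - 1][j] + 1,        # deletion
                             dist[i][j - 1] + 1,        # insertion
                             dist[i - 1][j - 1] + cost) # substitution/match
    return 1 - dist[n][m] / max(n, m)
```

A perfect match scores 1.0; one substitution in a four-word sentence scores 1 − 1/4 = 0.75.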

Page 47:

MT Evaluation

• Advantages
– fully automatic, given a reference set
• Disadvantages
– penalizes candidates if a synonym is used
– penalizes swaps of words and blocks of words too much

47

Page 48:

MT Evaluation

• BLEU (a method to automate MT evaluation)
– the closer a machine translation is to a professional human translation, the better it is
– BiLingual Evaluation Understudy
• Required:
– a corpus of good-quality human reference translations
– a “closeness” metric

48

Page 49:

MT Evaluation

• Two candidate translations from a Chinese source
– C1: It is a guide to action which ensures that the military always obeys the commands of the party
– C2: It is to insure the troops forever hearing the activity guidebook that party direct
• Intuitively: C1 is better than C2

49

Page 50:

MT Evaluation

• Three reference translations
– R1: It is a guide to action that ensures that the military will forever heed Party commands
– R2: It is the guiding principle which guarantees the military forces always being under the command of the Party
– R3: It is the practical guide for the army always to heed the directions of the party

50

Page 51:

MT Evaluation

• Basic idea:
– a good candidate translation shares many words and phrases with the reference translations
– comparing n-gram matches can be used to rank candidate translations
• n-gram: a sequence of n word occurrences
– in BLEU, n = 1, 2, 3, 4
– 1-grams give a measure of adequacy
– longer n-grams give a measure of fluency

51

Page 52:

MT Evaluation

• For unigrams:
– count the number of matching unigrams
• in all references
– divide by the total number of unigrams (in the candidate sentence)

52

Page 53:

MT Evaluation

• Problem
– C1: the the the the the the the (= 7/7 = 1)
– R1: the cat is on the mat
• Solution:
– clip the matching count (7) by the maximum reference count (2) → CountClip = 2
– modified unigram precision = 2/7 = 0.29

53
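The clipping step on this slide can be sketched as follows, assuming whitespace tokenization and case-sensitive matching (the function name is illustrative):

```python
from collections import Counter

# Modified (clipped) n-gram precision: each candidate n-gram match
# is clipped by its maximum count in any single reference.

def modified_precision(candidate, references, n=1):
    def ngrams(words):
        return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))
    cand = ngrams(candidate.split())
    # maximum count of each n-gram over all references
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.split()).items():
            max_ref[gram] = max(max_ref[gram], count)
    clipped = sum(min(count, max_ref[gram]) for gram, count in cand.items())
    return clipped / sum(cand.values())
```

On the slide's example, the seven occurrences of “the” are clipped to the two occurrences in the reference, giving 2/7 ≈ 0.29.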

Page 54:

MT Evaluation

• Example (unigrams)
– C1: It is a guide to action which ensures that the military always obeys the commands of the party (17/18 = 0.94)
– R1: It is a guide to action that ensures that the military will forever heed Party commands
– R2: It is the guiding principle which guarantees the military forces always being under the command of the Party
– R3: It is the practical guide for the army always to heed the directions of the party

54

Page 55:

MT Evaluation

• Example (unigrams)
– C2: It is to insure the troops forever hearing the activity guidebook that party direct (8/14 = 0.57)
– R1: It is a guide to action that ensures that the military will forever heed Party commands
– R2: It is the guiding principle which guarantees the military forces always being under the command of the Party
– R3: It is the practical guide for the army always to heed the directions of the party

55

Page 56:

MT Evaluation

• Example (bigrams)
– C1: It is a guide to action which ensures that the military always obeys the commands of the party (10/17 = 0.59)
– R1: It is a guide to action that ensures that the military will forever heed Party commands
– R2: It is the guiding principle which guarantees the military forces always being under the command of the Party
– R3: It is the practical guide for the army always to heed the directions of the party

56

Page 57:

MT Evaluation

• Example (bigrams)
– C2: It is to insure the troops forever hearing the activity guidebook that party direct (1/13 = 0.08)
– R1: It is a guide to action that ensures that the military will forever heed Party commands
– R2: It is the guiding principle which guarantees the military forces always being under the command of the Party
– R3: It is the practical guide for the army always to heed the directions of the party

57

Page 58:

MT Evaluation

• Extend to a full multi-sentence corpus
• compute n-gram matches sentence by sentence
• sum the clipped n-gram counts for all candidates
• divide by the number of n-grams in the test corpus
• pn = (∑C ∈ {Candidates} ∑n-gram ∈ C CountClip(n-gram)) / (∑C′ ∈ {Candidates} ∑n-gram′ ∈ C′ Count(n-gram′))

58
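The corpus-level formula can be sketched as follows: clipped counts are summed over all candidate sentences before dividing, rather than averaging per-sentence scores (whitespace tokenization assumed; names are illustrative):

```python
from collections import Counter

# Corpus-level modified n-gram precision p_n:
# sum of clipped matches over all candidates, divided by the
# total number of candidate n-grams in the corpus.

def ngram_counts(words, n):
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def corpus_precision(candidates, reference_lists, n=1):
    clipped_total, total = 0, 0
    for cand, refs in zip(candidates, reference_lists):
        cand_counts = ngram_counts(cand.split(), n)
        max_ref = Counter()
        for ref in refs:
            for gram, count in ngram_counts(ref.split(), n).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped_total += sum(min(c, max_ref[g]) for g, c in cand_counts.items())
        total += sum(cand_counts.values())
    return clipped_total / total
```

Note the difference from averaging: a sentence with 2/7 and a sentence with 2/2 combine to (2+2)/(7+2) = 4/9, not to the mean of 2/7 and 1.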

Page 59:

MT Evaluation

• Combining n-gram precision scores
• a weighted linear average works reasonably
– ∑n=1..N wn pn
• but: n-gram precision decays exponentially with n (so take the log to compensate for this)
– exp(∑n=1..N wn log pn)

• weights in BLEU: wn = 1/N

59

Page 60:

MT Evaluation

• BLEU is a precision measure
– #(C ∩ R) / #C
• Recall is difficult to define because of multiple reference translations
– e.g. #(C ∩ Rs) / #Rs
• where Rs = ∪i Ri
– will not work

60

Page 61:

MT Evaluation

• C1: I always invariably perpetually do
• C2: I always do
• R1: I always do
• R2: I invariably do
• R3: I perpetually do
• Recall of C1 over R1-3 is better than that of C2
• but C2 is a better translation

61

Page 62:

MT Evaluation

• But without recall:
– C1: of the
– compared with R1-3 as before
– modified unigram precision = 2/2
– modified bigram precision = 1/1
– which is the wrong result

62

Page 63:

MT Evaluation

• Length
– n-gram precision penalizes translations longer than the reference
– but not translations shorter than the reference
– add a Brevity Penalty (BP)

63

Page 64:

MT Evaluation

• bi = best-match length = the reference sentence length closest to candidate sentence i's length (e.g. reference lengths r: 12, 15, 17 and candidate length c: 12 → bi = 12)
• r = test corpus effective reference length = ∑i bi
• c = total length of the candidate translation corpus

64

Page 65:

MT Evaluation

• BP
– computed over the corpus
– not sentence by sentence and then averaged
– BP = 1 if c > r
– BP = e^(1−r/c) if c ≤ r
• BLEU = BP · exp(∑n=1..N wn log pn)

65
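Putting the pieces from the last few slides together (clipped corpus-level precisions, uniform weights wn = 1/N, brevity penalty over the whole corpus), a minimal BLEU sketch, assuming whitespace tokenization and case-sensitive matching:

```python
import math
from collections import Counter

# BLEU = BP * exp(sum_n w_n log p_n), with w_n = 1/N and the
# brevity penalty computed over the corpus, not per sentence.

def _ngrams(words, n):
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def bleu(candidates, reference_lists, max_n=4):
    clipped = [0] * max_n
    totals = [0] * max_n
    c_len, r_len = 0, 0
    for cand, refs in zip(candidates, reference_lists):
        c_words = cand.split()
        ref_words = [r.split() for r in refs]
        c_len += len(c_words)
        # best-match reference length: closest to this candidate's length
        r_len += min((len(r) for r in ref_words),
                     key=lambda L: (abs(L - len(c_words)), L))
        for n in range(1, max_n + 1):
            cand_counts = _ngrams(c_words, n)
            max_ref = Counter()
            for r in ref_words:
                for gram, count in _ngrams(r, n).items():
                    max_ref[gram] = max(max_ref[gram], count)
            clipped[n - 1] += sum(min(c, max_ref[g]) for g, c in cand_counts.items())
            totals[n - 1] += sum(cand_counts.values())
    precisions = [clipped[i] / totals[i] for i in range(max_n)]
    if min(precisions) == 0:
        return 0.0          # log(0) undefined: score collapses to 0
    bp = 1.0 if c_len > r_len else math.exp(1 - r_len / c_len)
    return bp * math.exp(sum(math.log(p) for p in precisions) / max_n)
```

A candidate identical to its reference scores 1.0, and any n-gram precision of 0 drives the geometric mean to 0, which is one reason BLEU is reported over a corpus rather than per sentence.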

Page 66:

MT Evaluation

• BLEU:
– claim: BLEU closely matches human judgement
• when averaged over a test corpus
• not necessarily on individual sentences
• shown extensively in Papineni et al. 2001
– multiple reference translations are desirable
• to cancel out the translation styles of individual translators
• (e.g. East Asian economy vs. economy of East Asia)

66

Page 67:

MT Evaluation

• Variants on BLEU
– NIST
• http://www.nist.gov/speech/tests/mt/doc/ngram-study.pdf
• different weights
• different BP
– ROUGE (Lin and Hovy 2003)
• for text summarization
• Recall-Oriented Understudy for Gisting Evaluation

67

Page 68:

MT Evaluation

• Main advantage of BLEU
– automatic evaluation
• good for use during development
• particularly useful for data-based systems
• Disadvantage
– defined for a whole test corpus
– not for individual sentences
– just measures the difference with the reference

68