Johanna MONTI, Anabela BARREIRO, Annibale ELIA, Federica MARANO, Antonella NAPOLI

Taking on new challenges in multi-word unit processing for machine translation

Outline

Taking on new challenges in Multi-word Unit processing for machine translation – FreeBMT 2011

Multi-word units in the Lexicon-Grammar: definition

Multi-word units in the Lexicon-Grammar

Multi-word units in the Lexicon-Grammar: part of a continuum

Multi-word units in the Lexicon-Grammar: lemmatization

Multi-word units in the Lexicon-Grammar: lemmatization criteria

The corpus-linguistic approach

Multi-word units in Machine Translation

Multi-word units in Machine Translation: main problems

Multi-word units in Machine Translation: different approaches

Lexical ambiguities handled by different systems

• Corpus: non-specialized texts, approx. 300 sentences (10,000 words); multi-word units extracted from the Web

WebCorp LSE, Web as a Corpus

• MT systems: Google Translate, OpenLogos

Typical ambiguities: examples

Typical ambiguities: examples

Integration of Semantico-Syntactic knowledge

Integration of Semantico-Syntactic knowledge

Integration of Semantico-Syntactic knowledge: mix up

Semantic table (SEMTAB) rule     Italian transfer

MIX UP(VT) IN                    MESCOLARE IN
MIX UP(VT) N IN                  MESCOLARE N IN
MIX UP(VT) N WITH                CONFONDERE N CON
MIX UP(VT) N(HUMAN) IN           CONFONDERE N IN
MIX UP(VT) N(INGREDIENT)         MESCOLARE N
MIX UP(VT) N(MEDICINE)           PREPARARE N
MIX UP(VT) WITH                  CONFONDERE CON
MIX UP(VT) N(HUMAN,INFO) WITH    CONFONDERE N CON

SemTab rules comment lines for the verb mix up
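
The SemTab comment lines for mix up can be read as a small lookup: the semantic type of the direct object and the following preposition select the Italian transfer. The Python sketch below is only an illustration of that reading, not the OpenLogos implementation. The names SemTabRule, MIX_UP_RULES, matches and select_transfer are hypothetical, and the tie-breaking policy (a rule with a semantic restriction on N wins over a rule that accepts any noun) is an assumption, since the slides do not state how competing rules are ranked.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class SemTabRule:
    has_object: bool               # does the pattern contain a direct object N?
    object_types: FrozenSet[str]   # semantic restriction on N; empty means any noun
    preposition: Optional[str]     # English preposition in the pattern, if any
    transfer: str                  # Italian transfer, copied from the table


def rule(has_object: bool, types: List[str],
         preposition: Optional[str], transfer: str) -> SemTabRule:
    return SemTabRule(has_object, frozenset(types), preposition, transfer)


# The eight comment lines for MIX UP(VT), in the order given on the slide.
MIX_UP_RULES = [
    rule(False, [], "IN", "MESCOLARE IN"),
    rule(True, [], "IN", "MESCOLARE N IN"),
    rule(True, [], "WITH", "CONFONDERE N CON"),
    rule(True, ["HUMAN"], "IN", "CONFONDERE N IN"),
    rule(True, ["INGREDIENT"], None, "MESCOLARE N"),
    rule(True, ["MEDICINE"], None, "PREPARARE N"),
    rule(False, [], "WITH", "CONFONDERE CON"),
    rule(True, ["HUMAN", "INFO"], "WITH", "CONFONDERE N CON"),
]


def matches(r: SemTabRule, object_type: Optional[str],
            preposition: Optional[str]) -> bool:
    """True if the rule pattern fits the observed object type and preposition."""
    if r.has_object != (object_type is not None):
        return False
    if r.preposition != preposition:
        return False
    return not r.object_types or object_type in r.object_types


def select_transfer(object_type: Optional[str],
                    preposition: Optional[str]) -> Optional[str]:
    """Return the Italian transfer for 'mix up', or None if no rule applies.

    object_type is None when there is no direct object; any label outside the
    typed rules (for example "OTHER") stands for an untyped noun.
    Assumption: semantically restricted rules outrank unrestricted ones.
    """
    candidates = [r for r in MIX_UP_RULES if matches(r, object_type, preposition)]
    if not candidates:
        return None
    best = max(candidates, key=lambda r: bool(r.object_types))
    return best.transfer


if __name__ == "__main__":
    print(select_transfer("HUMAN", "IN"))       # CONFONDERE N IN
    print(select_transfer("INGREDIENT", "IN"))  # MESCOLARE N IN
    print(select_transfer("MEDICINE", None))    # PREPARARE N
```

Under these assumptions, a human object followed by "in" is rendered with confondere, while any other object followed by "in" falls back to the general mescolare rule, which mirrors how the table separates the "confuse" and "blend" readings of the verb.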

Qualitative MT Evaluation metrics

Qualitative MT Evaluation metrics

Qualitative MT Evaluation metrics

Qualitative MT Evaluation metrics

Qualitative MT Evaluation metrics: the «ideal» evaluation tool

Conclusions

Johanna MONTI, Anabela BARREIRO, Annibale ELIA, Federica MARANO, Antonella NAPOLI

Thank you for your attention!