Statistical Machine Translation Part III – Phrase-based SMT / Decoding

Statistical Machine Translation Part III – Phrase-based SMT / Decoding Alex Fraser Institute for Natural Language Processing University of Stuttgart 2008.07.23 EMA Summer School

Transcript of Statistical Machine Translation Part III – Phrase-based SMT / Decoding

Page 1: Statistical Machine Translation Part III – Phrase-based SMT / Decoding

Statistical Machine Translation
Part III – Phrase-based SMT / Decoding

Alex Fraser
Institute for Natural Language Processing
University of Stuttgart

2008.07.23 EMA Summer School

Page 2

Outline

• Phrase-based translation
• Log-linear model
• Tuning log-linear model
• Decoding

Page 3

Slide from Koehn 2008

Page 4

Slide from Koehn 2008

Page 5

Language Model

• Usually a trigram language model is used for p(e)
• P(the man went home) = p(the | START) p(man | START the) p(went | the man) p(home | man went)
• Language models work well for comparing the grammaticality of strings of the same length
  – However, when comparing short strings with long strings, they favor short strings, because every additional word multiplies in another probability below 1
  – For this reason, a very important component of the language model is the length bonus
    • This is a constant > 1 that is multiplied in once for each English word in the hypothesis
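The length bias and the length bonus described on this slide can be illustrated with a small sketch. It is not taken from the lecture: the function names, the `floor` value for unseen trigrams, the toy probabilities and the bonus value 6.0 are all assumptions made for illustration, and no end-of-sentence probability is modeled.

```python
import math

START = "START"  # sentence-start symbol, written as on the slide


def trigram_logprob(words, probs, floor=1e-6):
    """Return log p(e): the sum of log p(word | up to two previous tokens)."""
    history = [START]
    total = 0.0
    for word in words:
        context = tuple(history[-2:])
        # Unseen events get a small floor probability (an assumption,
        # standing in for real smoothing).
        total += math.log(probs.get((context, word), floor))
        history.append(word)
    return total


def lm_score(words, probs, length_bonus):
    """Return log(p(e) * length_bonus ** len(e))."""
    return trigram_logprob(words, probs) + len(words) * math.log(length_bonus)


# Toy probabilities, invented for illustration only.
TOY_PROBS = {
    ((START,), "the"): 0.5,
    ((START, "the"), "man"): 0.2,
    (("the", "man"), "went"): 0.1,
    (("man", "went"), "home"): 0.4,
}

short = ["the", "man"]
long_ = ["the", "man", "went", "home"]

# Without a bonus, the shorter string gets the higher score.
print(trigram_logprob(short, TOY_PROBS) > trigram_logprob(long_, TOY_PROBS))
# With a large enough bonus (6.0 for these toy numbers), the longer string wins.
print(lm_score(long_, TOY_PROBS, 6.0) > lm_score(short, TOY_PROBS, 6.0))
```

Working in log space turns the per-word multiplication by the bonus into adding `log(length_bonus)` once per word, which is how the bonus would fit in as one more weighted term of a log-linear model.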

Page 6

Modified from Koehn 2008

Page 7

Slide from Koehn 2008

Page 8

Slide from Koehn 2008

Page 9

Slide from Koehn 2008

Page 10

Slide from Koehn 2008

Page 11

Slide from Koehn 2008

Page 12

Slide from Koehn 2008

Page 13

Slide from Koehn 2008

Page 14

Slide from Koehn 2008

Page 15

Slide from Koehn 2008

Page 16

Slide from Koehn 2008

Page 17

Slide from Koehn 2008

Page 18

Outline

• Phrase-based translation
• Log-linear model
• Tuning log-linear model
• Decoding

Page 19

Slide from Koehn 2008

Page 20

Slide from Koehn 2008

Page 21

Slide from Koehn 2008

Page 22

Slide from Koehn 2008

Page 23

Slide from Koehn 2008

Page 24

Slide from Koehn 2008

Page 25

Slide from Koehn 2008

Page 26

Slide from Koehn 2008

Page 27

Outline

• Phrase-based translation model
• Log-linear model
• Tuning log-linear model automatically
• Decoding

Page 28

Outline

• Phrase-based translation model
• Log-linear model
• Tuning log-linear model automatically
• Decoding
  – Basic phrase-based decoding
  – Dealing with complexity
    • Recombination
    • Pruning
    • Future cost estimation
  – Decoding output

Page 29

Slide from Koehn 2008

Page 30

Slide from Koehn 2008

Page 31

Slide from Koehn 2008

Page 32

Slide from Koehn 2008

Page 33

Slide from Koehn 2008

Page 34

Slide from Koehn 2008

Page 35

Slide from Koehn 2008

Page 36

Slide from Koehn 2008

Page 37

Slide from Koehn 2008

Page 38

Slide from Koehn 2008

Page 39

Slide from Koehn 2008

Page 40

Slide from Koehn 2008

Page 41

Slide from Koehn 2008

Page 42

Slide from Koehn 2008

Page 43

Slide from Koehn 2008

Page 44

Slide from Koehn 2008

Page 45

Slide from Koehn 2008

Page 46

Slide from Koehn 2008

Page 47

Slide from Koehn 2008

Page 48

Slide from Koehn 2008

Page 49

Slide from Koehn 2008

Page 50

Slide from Koehn 2008

Page 51

Slide from Koehn 2008

Page 52

Slide from Koehn 2008

Page 53

Slide from Koehn 2008

Page 54

Slide from Koehn 2008

Page 55

Slide from Koehn 2008

Page 56

Slide from Koehn 2008

Page 57

Slide from Koehn 2008

Page 58

Slide from Koehn 2008

Page 59

Slide from Koehn 2008

Page 60

Assignment 2

• Build a state-of-the-art phrase-based SMT system!
  – German to English or French to English
  – Using a small amount of data
  – This is a "learning by doing" exercise
• See my home page again

Page 61

Thank you!