Two Aspects of Text Representations for NLP and MT: Morphology and Deep...

Two Aspects of Text Representations for NLP and MT: Morphology and Deep Learning. Hinrich Schütze, Center for Information and Language Processing, LMU Munich, 2015-09-10.

Transcript of Two Aspects of Text Representations for NLP and MT: Morphology and Deep...

Page 1: Two Aspects of Text Representations for NLP and MT: Morphology and Deep Learning (ufal.mff.cuni.cz/mtm15/files/13-text-representations-for...)

Morphology Embeddings Lexica vs embeddings Embeddings for what? Deep learning

Two Aspects of Text Representations for NLP and MT: Morphology and Deep Learning

Hinrich Schütze

Center for Information and Language Processing, LMU Munich

2015-09-10

Schütze, LMU Munich: Text Representations for NLP and MT 1 / 64

Text Representations for NLP and MT

How should the input to NLP and MT systems be represented?

Statistical NLP/MT

The representation should make generalization easy.

Rule-based NLP/MT

The representation should make it easy to formulate accurate, broad-coverage rules/constraints.

Topic of this talk: two aspects of “good” representation

morphology

deep learning embeddings

Overview

1 Morphology

2 Deep learning embeddings

3 Morphological lexica vs embeddings

4 For units of which granularities should we use embeddings?

5 Using deep learning (in general) in MT

Disclaimer

I am not an MT researcher!

Why worry about morphology in MT

Much of statistical NLP: estimate and use pθ(y | x), y ∈ Y, x ∈ X

X = representation for language, including words

Y = some event / fact / observation we care about

Sparseness: Estimating pθ is hard because language (event space X) is very sparse.

Morphological analysis can reduce sparseness.

Morphological analysis improves estimates of pθ.

English is morphologically poor, so simple heuristics are often sufficient.

Also true for many other languages.

So this part of the talk only applies to pairs of languages of which at least one is morphologically rich.
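The sparseness argument can be made concrete with a small count. The sketch below is only an illustration: the token list and the lemma table are hypothetical toy data, and a real system would obtain lemmas from a morphological analyzer rather than a hand-written dictionary. It compares how many distinct units (types) and how many units seen only once appear when the event space X is made of surface forms versus lemmas.

```python
from collections import Counter

# Hypothetical toy corpus: each inflected form occurs exactly once.
TOKENS = [
    "flaniere", "flanierst", "flaniert", "flanieren", "flanierte",
    "gehe", "gehst", "geht",
]

# Hypothetical lemma table standing in for a morphological analyzer.
LEMMAS = {
    "flaniere": "flanieren",
    "flanierst": "flanieren",
    "flaniert": "flanieren",
    "flanieren": "flanieren",
    "flanierte": "flanieren",
    "gehe": "gehen",
    "gehst": "gehen",
    "geht": "gehen",
}


def type_stats(units):
    """Return (number of types, number of types seen exactly once)."""
    counts = Counter(units)
    singletons = sum(1 for c in counts.values() if c == 1)
    return len(counts), singletons


surface_types, surface_singletons = type_stats(TOKENS)
lemma_types, lemma_singletons = type_stats([LEMMAS[t] for t in TOKENS])

print("surface forms:", surface_types, "types,", surface_singletons, "seen once")
print("lemmas:       ", lemma_types, "types,", lemma_singletons, "seen once")
```

With surface forms every event is observed once; after mapping to lemmas the same eight tokens support two events with several observations each, which is the sense in which morphological analysis can make estimation easier.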

Why worry about morphology in MT

For symbolic / rule-based approaches, there is a very similar argument for why you need morphology if you are dealing with a morphologically rich language.

Anecdotal example

Inflected form “flanierst” is not translated.

The lemma “flanieren” is correctly translated as “to stroll”.
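One way to read the anecdote is as a lookup that fails on the inflected form but would succeed on the lemma. The following is a minimal sketch of lemma back-off under that reading. The lexicon, the lemma table, and the function name `translate` are all hypothetical; this is not a description of how any particular MT system works.

```python
# Hypothetical toy lexicon: only the lemma has an entry, as in the anecdote.
LEXICON = {"flanieren": "to stroll"}

# Hypothetical lemma table standing in for a morphological analyzer.
LEMMA_OF = {"flanierst": "flanieren"}


def translate(word):
    """Look up the surface form first, then back off to its lemma.

    Returns None when neither the form nor its lemma is in the lexicon.
    """
    if word in LEXICON:
        return LEXICON[word]
    lemma = LEMMA_OF.get(word)
    if lemma is not None and lemma in LEXICON:
        return LEXICON[lemma]
    return None


for w in ["flanieren", "flanierst", "unbekannt"]:
    print(w, "->", translate(w))
```

The back-off recovers the lexical meaning but drops the inflection (person and number), so a full system would still need the morphological features to generate the right target form.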

Why worry about morphology now?

Recent progress: new technology for high-accuracy, high-performance morphological analysis

Resources (linguistically annotated corpora) are becoming available for an increasing number of languages.

http://cistern.cis.lmu.de/marmot

MarMoT model for German (freely available)

Sei            sein           number=sg|person=3|tense=pres|mood
diese          dieser         case=nom|number=sg|gender=fem
uberschritten  uberschreiten
,              ,
wurden         werden         number=pl|person=3|tense=past|mood
die            der            case=nom|number=pl|gender=*
‘              ‘
Signale        signal         case=nom|number=pl|gender=neut
nicht          nicht
hart           hart           degree=pos
gestellt       stellen
”              ”
.              .
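The rows on these slides pair a word form with a lemma and an optional string of attribute=value pairs joined by "|". The sketch below parses one such row into a Python structure. The three-column, whitespace-separated layout is an assumption read off the slides and may not match the file format MarMoT actually writes; `parse_features` and `parse_row` are hypothetical helper names.

```python
def parse_features(feature_string):
    """Split 'case=nom|number=sg' into {'case': 'nom', 'number': 'sg'}.

    An attribute without '=' (for example one cut off at the slide margin,
    or a bare POS tag) is kept with the value None.
    """
    features = {}
    for pair in feature_string.split("|"):
        if "=" in pair:
            key, value = pair.split("=", 1)
            features[key] = value
        elif pair:
            features[pair] = None
    return features


def parse_row(row):
    """Parse 'form lemma [features]' into (form, lemma, feature dict)."""
    fields = row.split()
    form, lemma = fields[0], fields[1]
    features = parse_features(fields[2]) if len(fields) > 2 else {}
    return form, lemma, features


print(parse_row("diese dieser case=nom|number=sg|gender=fem"))
print(parse_row("nicht nicht"))
```

Keeping the features as a dictionary makes it easy to select only the attributes a downstream component needs, for instance case and number but not gender.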

MarMoT model for Czech (freely available)

Nazor       nazor       num=s|gen=m|cas=a
experta     expert      num=s|gen=m|cas=a
Informace   informace   num=p|gen=f|cas=n
zverejnene  zverejneny  num=p|gen=f|deg=p|cas=n
v           v           cas=l
Profitu     profit      num=s|gen=m|cas=l
o           o           cas=l
moznostech  moznost     num=p|gen=f|cas=l
vyuzitı     vyuzitı     num=s|gen=n|cas=n
poradcu     poradce     num=p|gen=m|cas=g

MarMoT model for Hungarian (freely available)

A           a          SubPOS=f
gazdasag    gazdasag   SubPOS=c|Num=s|Cas=n|NumP=none|PerP=none|
ilyen       ilyen      SubPOS=d|Per=3|Num=s|Cas=n|NumP=none|PerP
merteku     merteku    SubPOS=f|Deg=p|Num=s|Cas=n|NumP=none|PerP
fejlodeset  fejlodes   SubPOS=c|Num=s|Cas=a|NumP=s|PerP=3|NumPd=
tobb        tobb       SubPOS=c|Num=s|Cas=n|Form=l|NumP=none|Per
folyamat    folyamat   SubPOS=c|Num=s|Cas=n|NumP=none|PerP=none|
gerjeszti   gerjeszti  SubPOS=f|Deg=p|Num=s|Cas=n|NumP=none|PerP

MarMoT model for Spanish (freely available)

que         que         type=r|num=n|gen=c
se          se          type=r|num=n|gen=c|per=3
llamaba     llamar      type=m|num=s|mood=i|ten=i|per=3
la          el          type=a|num=s|gen=f
voz         voz         type=c|num=s|gen=f
de          de          type=p|form=s
la          el          type=a|num=s|gen=f
conciencia  conciencia  type=c|num=s|gen=f

MarMoT model for Latin (freely available)

Cum            cum        INFL=n
autem          autem      INFL=n
perambulasset  perambulo  PERS=3|NUMB=s|TENS=l|MOOD=s|VOIC=a
partes         pars       NUMB=p|GEND=f|CASE=a
illas          ille       NUMB=p|GEND=f|CASE=a
et             et         INFL=n
exhortatus     exhorto    NUMB=s|TENS=r|MOOD=p|VOIC=p|GEND=m|CA
eos            is         PERS=3|NUMB=p|GEND=m|CASE=a

MarMoT model for English (freely available)

The         the        DT
agreements  agreement  NNS
bring       bring      VBP
to          to         IN
a           a          DT
total       total      NN
of          of         IN
nine        nine       CD
the         the        DT
number      number     NN
of          of         IN
planes      plane      NNS
the         the        DT
travel      travel     NN
company     company    NN
has         have       VBZ
sold        sell       VBN

What do I need to train a model for a new language?

A morphologically annotated corpus

usually 10,000 to 100,000 tokens if annotation is high quality

more in some cases and if the annotation is not high quality

Given this resource, training a MarMoT model is efficient and simple.

Results (Müller, Cotterell, Fraser, Schütze, 2015)

l = lemmatization
t/l = tagging and lemmatization

       cs            de            es            hu            la
       ALL    OOV    ALL    OOV    ALL    OOV    ALL    OOV    ALL    OOV
l      98.42  93.46  98.10  93.02  98.78  94.86  98.08  94.26  95.36  80.94
t/l    89.90  78.34  82.84  62.10  96.41  87.47  93.40  84.15  82.57  54.63
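To see how much harder out-of-vocabulary (OOV) words are than the average word, the numbers from the results table can be compared directly. The sketch below copies the figures from the slide and computes the ALL minus OOV gap per language. Treating the figures as accuracy percentages is an assumption, since the slide does not name the metric; `oov_gap` is a hypothetical helper.

```python
# (ALL, OOV) figures copied from the results slide.
RESULTS = {
    "l": {
        "cs": (98.42, 93.46), "de": (98.10, 93.02), "es": (98.78, 94.86),
        "hu": (98.08, 94.26), "la": (95.36, 80.94),
    },
    "t/l": {
        "cs": (89.90, 78.34), "de": (82.84, 62.10), "es": (96.41, 87.47),
        "hu": (93.40, 84.15), "la": (82.57, 54.63),
    },
}


def oov_gap(task):
    """Return {language: ALL - OOV} for one task, rounded to two decimals."""
    return {
        lang: round(all_score - oov_score, 2)
        for lang, (all_score, oov_score) in RESULTS[task].items()
    }


for task in RESULTS:
    print(task, oov_gap(task))
```

In every language the gap is larger for joint tagging and lemmatization than for lemmatization alone.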

Lemmatization vs. Morphological features

Lemmatization ready for prime time

Morphological features: you may need more than 100,000 tokens in some languages

Summary

Recent progress in morphological analysis: due to new technology and new resources

Morphological analysis reduces sparseness.

⇒ better machine translation

What are embeddings?

Page 62: Two Aspects of Text Representations for NLP and MT: Morphology and Deep Learningufal.mff.cuni.cz/mtm15/files/13-text-representations-for... · Morphology Embeddings Lexicavsembeddings

Morphology Embeddings Lexica vs embeddings Embeddings for what? Deep learning

Why use embeddings for MT?

Discrete space ⇒ Continuous space

Much of statistical NLP: estimate and use $p_\theta(y \mid x)$, $x \in X$

X = representation for language, including words

If X is discrete:

It is often hard to deal with rare/unseen events x.

In language modeling: throw away information, guess (e.g., Kneser-Ney)

If X is continuous:

You may be able to handle rare/unseen events well ...

... if the continuous space X is smooth in some sense.

State of the art in language modeling: continuous space
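The contrast can be sketched in a few lines, under toy assumptions. A relative-frequency estimate over discrete word identities has nothing to say about an unseen x, whereas an estimate defined over a continuous space can borrow from nearby training points. The words, the 2-dimensional vectors, the labels and the kernel-weighted estimator below are all invented for illustration; this is neither Kneser-Ney nor the method of any particular continuous-space language model.

```python
import math
from collections import Counter, defaultdict

# Toy training data: (x, y) pairs. Words and labels are invented.
train = [("cat", "ANIMAL"), ("dog", "ANIMAL"), ("car", "VEHICLE"), ("bus", "VEHICLE")]

# Hypothetical 2-d embeddings. "puppy" never occurs in `train` but has a vector.
emb = {
    "cat": (0.9, 0.1), "dog": (1.0, 0.2), "puppy": (0.95, 0.25),
    "car": (0.1, 0.9), "bus": (0.2, 1.0),
}

def discrete_estimate(x, y):
    """Relative-frequency estimate of p(y | x); returns None if x is unseen."""
    counts = defaultdict(Counter)
    for xi, yi in train:
        counts[xi][yi] += 1
    total = sum(counts[x].values())
    return counts[x][y] / total if total else None

def continuous_estimate(x, y, bandwidth=0.3):
    """Kernel-weighted estimate of p(y | x): training points close to x
    in the embedding space contribute more weight."""
    weights = Counter()
    for xi, yi in train:
        d2 = sum((a - b) ** 2 for a, b in zip(emb[x], emb[xi]))
        weights[yi] += math.exp(-d2 / (2 * bandwidth ** 2))
    return weights[y] / sum(weights.values())

print(discrete_estimate("puppy", "ANIMAL"))    # unseen x: no estimate available
print(continuous_estimate("puppy", "ANIMAL"))  # close to "cat" and "dog"
```

The continuous estimate is only useful if nearby points in the space really behave alike, which is the smoothness condition stated on the slide.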

Best level for embeddings?

Standard approach: compute embeddings for word forms

However, in most cases, the lemma is the nexus at which the form-meaning pairing is located.

Exceptions: air/s, blind/s, custom/s, manner/s, spectacle/s, wood/s

However, this can be seen as just one instance of the general phenomenon of noncompositionality in language.

E.g., hot dog, red herring, kick the bucket

If we pick a single level for embeddings, then the lemma level is a good one.

Best level for embeddings?

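Standard approach now: word forms

Wuerden die Signale nicht hart gestellt

$\vec{v}_{\text{wuerden}}\ \vec{v}_{\text{die}}\ \vec{v}_{\text{Signale}}\ \vec{v}_{\text{nicht}}\ \vec{v}_{\text{hart}}\ \vec{v}_{\text{gestellt}}$

Better approach: lemmata

Wuerden die Signale nicht hart gestellt

$\vec{v}_{\text{werden}}\ \vec{v}_{\text{der}}\ \vec{v}_{\text{signal}}\ \vec{v}_{\text{nicht}}\ \vec{v}_{\text{hart}}\ \vec{v}_{\text{stellen}}$

Or perhaps: lemmata + morph vectors

Wuerden die Signale nicht

$\vec{v}_{\text{werden}}\ \vec{v}_{\mu_1}\ \vec{v}_{\mu_5} \dots\ \vec{v}_{\text{der}}\ \vec{v}_{\mu_8}\ \vec{v}_{\mu_1} \dots\ \vec{v}_{\text{signal}}\ \vec{v}_{\mu_6}\ \vec{v}_{\mu_2} \dots\ \vec{v}_{\text{nicht}}\ \vec{v}_{\mu_3}\ \vec{v}_{\mu_4} \dots$

Or perhaps: lemmata + morph features

Wuerden die Signale nicht hart

$\vec{v}_{\text{werden}}$ 010010 $\vec{v}_{\text{der}}$ 100010 $\vec{v}_{\text{signal}}$ 111000 $\vec{v}_{\text{nicht}}$ 001100 $\vec{v}_{\text{hart}}$ 000001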
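A minimal sketch of the last variant, lemmata plus morphological features. The slide does not say how the two parts are combined, so concatenation is an assumption here. The 6-bit feature strings are copied from the slide; the 3-dimensional lemma embeddings and the helper name `token_representation` are invented for illustration.

```python
# Hypothetical 3-d lemma embeddings (values invented for illustration).
lemma_emb = {
    "werden": [0.2, -0.1, 0.7],
    "der": [0.0, 0.3, -0.4],
    "signal": [0.5, 0.5, 0.1],
}

def token_representation(lemma, morph_bits):
    """Concatenate the lemma embedding with the binary morphological features.
    Concatenation is an assumed way of combining the two parts."""
    return lemma_emb[lemma] + [float(bit) for bit in morph_bits]

# "Wuerden die Signale" as (lemma, morphological feature string) pairs;
# the bit strings are the ones shown on the slide.
sentence = [("werden", "010010"), ("der", "100010"), ("signal", "111000")]
vectors = [token_representation(lemma, bits) for lemma, bits in sentence]

for (lemma, _), vec in zip(sentence, vectors):
    print(lemma, vec)
```

With this layout, all inflected forms of a lemma share the first three dimensions and differ only in the feature part.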

Summary

Use embeddings for lemmata, not for word forms

Outline

1 Morphology

2 Deep learning embeddings

3 Morphological lexica vs embeddings

4 For units of which granularities should we use embeddings?

5 Using deep learning (in general) in MT

This section is based on work by Thomas Müller: “Robust Morphological Tagging with Word Representations” (NAACL 2015)

Task: Morphological tagging

Disambiguate part-of-speech and morphology

Example:

Ein            ART    case=nom|number=sg|gender=neut
Klettergebiet  NN     case=nom|number=sg|gender=neut
macht          VVFIN  number=sg|person=3|tense=pres|mood=ind
Geschichte     NN     case=acc|number=sg|gender=fem

Part-of-speech disambiguation: ART, NN, VVFIN

Morphological disambiguation: case=nom, number=sg, tense=pres, mood=ind, etc.
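The tag format in the example can be read mechanically. The sketch below assumes one token per line in the form `form POS feat=val|feat=val`, as in the example above; the function name `parse_tagged_token` is hypothetical and not part of any tagger's API.

```python
def parse_tagged_token(line):
    """Split 'form POS feat=val|feat=val' into (form, pos, {feat: val})."""
    form, pos, morph = line.split()
    feats = dict(pair.split("=", 1) for pair in morph.split("|"))
    return form, pos, feats

# The four tokens from the slide's example sentence.
example = [
    "Ein ART case=nom|number=sg|gender=neut",
    "Klettergebiet NN case=nom|number=sg|gender=neut",
    "macht VVFIN number=sg|person=3|tense=pres|mood=ind",
    "Geschichte NN case=acc|number=sg|gender=fem",
]

parsed = [parse_tagged_token(line) for line in example]
for form, pos, feats in parsed:
    print(form, pos, feats)
```

A morphological tagger has to predict both the part-of-speech tag and the full feature dictionary for every token.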

Problem setting: Domain adaptation

Representation for morphological tagging

Formalize problem as sequence classification (using higher-order CRF: MarMoT)

Standard features for morphological tagging: suffix, shape, . . .

Additional representation for each token:

NONE (word index)

UNSU: unsupervised learning: SVD and Brown clusters

DEEP: deep learning embeddings

LING: finite state morphology (manually created linguistic resource)

Which representation works best for morphological tagging: NONE, LING, UNSU or DEEP?
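To make the "standard features" concrete, here is a rough sketch of suffix and shape features for a single token. The definitions are generic ones commonly used in taggers and are an assumption; they are not claimed to reproduce MarMoT's actual feature templates. The function names are hypothetical.

```python
def word_shape(token):
    """Map characters to X (uppercase), x (lowercase), d (digit), keep other
    characters as they are, and collapse runs of the same symbol."""
    shape = []
    for ch in token:
        if ch.isupper():
            sym = "X"
        elif ch.islower():
            sym = "x"
        elif ch.isdigit():
            sym = "d"
        else:
            sym = ch
        if not shape or shape[-1] != sym:
            shape.append(sym)
    return "".join(shape)

def suffixes(token, max_len=3):
    """All suffixes of the token up to max_len characters, shortest first."""
    return [token[-n:] for n in range(1, min(max_len, len(token)) + 1)]

def token_features(token):
    """Sparse feature set for one token: its shape and its lowercased suffixes."""
    feats = {"shape=" + word_shape(token)}
    feats.update("suffix=" + s for s in suffixes(token.lower()))
    return feats

print(sorted(token_features("Klettergebiet")))
print(sorted(token_features("macht")))
```

The additional representations (UNSU, DEEP, LING) would be added to such a feature set for each token.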

Morphological tagging: Results

      SVMTool  Morfette  MarMoT
      NONE     NONE      NONE   UNSU1  UNSU2  DEEP   LING
cs    75.28    76.04     78.01  78.44  78.51  78.42  78.88
hu    88.44    89.18     89.77  90.52  90.41  90.88  91.24

Summary

Embeddings and morphological resources provide complementary information.

Use both!


Embeddings for what?

morphemes

word forms

lemmata

phrases

sentences

paragraphs

documents

Embeddings for what?

Most common use of embeddings: Embeddings for words (= word forms)

My earlier argument: Lemmata are the right level of embedding representation, not word forms.

What about embedding representations for larger units: phrases and sentences?

Recent deep learning work on MT uses vector representations for sentences.

Example: paraphrase identification

Given: two sentences

Task: Are they paraphrases, yes or no?

Wenpeng Yin and Hinrich Schütze. MultiGranCNN: An architecture for general matching of text chunks on multiple levels of granularity. ACL 2015.

Task-specificity: Experimental results

method                              acc    F1
ARC-I (Hu et al., 2014)             61.4   60.3
ARC-II (Hu et al., 2014)            64.9   63.5
Bi-CNN-MI (Yin and Schütze, 2015)   87.9   87.1
8MT (Madnani et al., 2012)          92.3   92.1
(Bach et al., 2014)                 93.4   93.3
MultiGranCNN+8MT (freeze)           94.9   94.7

How to represent sentences: Capacity

MultiGranCNN determines for each meaning element: is it also present in the other sentence?

At all levels of granularity: single word, short n-gram, long n-gram, sentence.

Representation of the sentence in MultiGranCNN: large set of vectors, each representing a (smaller or larger) part of the sentence.

Alternative:

Use a single vector to represent the sentence

Then compare these two vectors

Does it make sense to go through this bottleneck?

Does it make sense to go through this bottleneck for paragraphs?

Does it make sense to go through this bottleneck for books?

Argument 1 against representing sentences as vectors: Vectors have limited storage capacity.
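
The contrast between "one vector per sentence" and "a set of vectors at several granularities" can be sketched in a few lines of Python. This is only a toy illustration, not MultiGranCNN itself (which learns its representations with convolutional layers): the word vectors in `TOY` and the helper names are hypothetical, and n-gram vectors are simply averaged.

```python
import math

# Hypothetical 2-dimensional word vectors, made up for this illustration.
TOY = {
    "the": [0.1, 0.1],
    "cat": [1.0, 0.0],
    "dog": [0.9, 0.1],
    "sleeps": [0.0, 1.0],
    "barks": [0.2, 0.9],
}

def mean(vectors):
    """Component-wise average of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def multigranular_units(tokens, max_n=3):
    """One vector per n-gram (n = 1..max_n): a set of vectors, not a single one."""
    units = {}
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            span = tuple(tokens[i:i + n])
            units[span] = mean([TOY[t] for t in span])
    return units

def best_matches(units_a, units_b):
    """For each unit of sentence A, the most similar unit of sentence B."""
    return {
        span: max(units_b, key=lambda other: cosine(vec, units_b[other]))
        for span, vec in units_a.items()
    }

a = "the cat sleeps".split()
b = "the dog barks".split()
units_a = multigranular_units(a)
units_b = multigranular_units(b)

# Bottleneck: each sentence is squeezed into one vector before comparison.
single_a = mean([TOY[t] for t in a])
single_b = mean([TOY[t] for t in b])
print("single-vector similarity:", round(cosine(single_a, single_b), 3))

# Set of vectors: every unit keeps its own vector and finds its own match.
matches = best_matches(units_a, units_b)
print("best match for ('cat',):", matches[("cat",)])
```

The single-vector comparison yields one number for the whole pair, while the multi-granular version keeps one decision per unit. The number of units grows with sentence length, whereas the single vector stays the same size.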

How to represent sentences: Context

(1) “They continued their advance.”

(2) “Houthi forces continued their advance.”

(3) “Stocks continued their advance.”

In context, it will be clear that “they” refers either to soldiers or to stocks.

Argument 2 against representing sentences as vectors: The same sentence should have different representations in different contexts.
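
A minimal Python sketch of this argument, under loud assumptions: the vectors in `TOY_VECTORS` are invented, `encode` is a plain average of word vectors, and `encode_in_context` is a hypothetical stand-in for any encoder that also looks at the preceding sentence. It only shows that a context-free encoder cannot separate the two readings of sentence (1).

```python
# Hypothetical 2-dimensional word vectors: dimension 0 ~ "military", dimension 1 ~ "finance".
TOY_VECTORS = {
    "they": [0.0, 0.0],
    "continued": [0.5, 0.5],
    "their": [0.0, 0.0],
    "advance": [0.5, 0.5],
    "houthi": [1.0, 0.0],
    "forces": [1.0, 0.0],
    "stocks": [0.0, 1.0],
}

def encode(sentence):
    """Context-free encoder: average of the known word vectors."""
    words = [w.strip(".").lower() for w in sentence.split()]
    vecs = [TOY_VECTORS[w] for w in words if w in TOY_VECTORS]
    if not vecs:
        return [0.0, 0.0]
    return [sum(v[i] for v in vecs) / len(vecs) for i in range(2)]

def encode_in_context(sentence, previous, weight=0.5):
    """Context-aware variant: mix in the vector of the preceding sentence."""
    s, p = encode(sentence), encode(previous)
    return [(1 - weight) * a + weight * b for a, b in zip(s, p)]

sentence = "They continued their advance."
military = "Houthi forces entered the city."
finance = "Stocks opened higher."

print(encode(sentence))                      # identical whatever came before
print(encode_in_context(sentence, military))
print(encode_in_context(sentence, finance))
```

With the context-free encoder the sentence always receives the same vector; only the context-aware variant yields different representations for the military and the financial reading.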

How to represent sentences: Intent

Why did you not pick up the dry cleaning? – It’s impossible to find parking! (10 minutes ago, it was impossible to find parking at my dry cleaner’s.)

You’re looking for an apartment. Why are you not considering neighborhood X? – It’s impossible to find parking! (It is probably possible to find parking in neighborhood X, but it’s difficult, expensive, time-consuming.)

Why are you late? – It’s impossible to find parking! (It actually was not impossible to find parking, it just took a while.)

Argument 3 against representing sentences as vectors: Intended meaning depends on communicative task / goal.

Representing a sentence as a vector: Problems

Capacity

Representation is context-dependent.

Representation is task/goal/intent-dependent.

Embeddings for what?

morphemes

word forms

lemmata

phrases

sentences

paragraphs

documents

Outline

1 Morphology

2 Deep learning embeddings

3 Morphological lexica vs embeddings

4 For units of which granularities should we use embeddings?

5 Using deep learning (in general) in MT

Deep learning

Will deep-learning-based MT replace current approaches to MT?

Yann LeCun, Yoshua Bengio, Geoffrey Hinton: Deep learning. 2015. Nature, 521, 436–444.

Comments on “Deep learning” by LeCun, Bengio & Hinton

On embeddings

N-grams treat each word as an atomic unit, so they cannot generalize across semantically related sequences of words, whereas neural language models can because they associate each word with a vector of real valued features . . .

(thumbs up)
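
The quoted point can be illustrated with a toy Python sketch. The corpus, the vectors in `EMB` and both scoring functions are hypothetical; `embedding_score` is a simple similarity-weighted count, not a neural language model, and is only meant to show why real-valued word vectors allow generalization to an unseen bigram.

```python
import math
from collections import Counter

# Tiny made-up corpus: "on wednesday" never occurs.
corpus = ["we meet on tuesday", "we meet on tuesday", "they meet on friday"]
bigrams = Counter()
for sentence in corpus:
    tokens = sentence.split()
    bigrams.update(zip(tokens, tokens[1:]))

# Hypothetical word vectors: weekdays are close together, "banana" is far away.
EMB = {
    "tuesday": [0.9, 0.1],
    "wednesday": [0.85, 0.15],
    "friday": [0.8, 0.2],
    "banana": [0.0, 1.0],
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

def ngram_score(prev, word):
    """Atomic units: an unseen bigram simply has count 0."""
    return bigrams[(prev, word)]

def embedding_score(prev, word):
    """Counts of seen continuations, weighted by similarity to the candidate word."""
    return sum(
        count * cosine(EMB[word], EMB[seen])
        for (p, seen), count in bigrams.items()
        if p == prev and seen in EMB
    )

print(ngram_score("on", "wednesday"))       # 0: no generalization
print(embedding_score("on", "wednesday"))   # high: close to "tuesday" and "friday"
print(embedding_score("on", "banana"))      # low: far from every seen continuation
```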

Comments on “Deep learning” by LeCun, Bengio & Hinton

On the “deepness” of deep learning

Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned.

(thumbs up)
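
As a small worked example of "composing simple but non-linear modules", the Python sketch below stacks two layers to compute XOR, a function that no single linear module can represent. The weights are set by hand for illustration (nothing is learned here), and the function names are hypothetical.

```python
def relu(x):
    return max(0.0, x)

def identity(x):
    return x

def layer(weights, biases, inputs, activation):
    """One simple module: affine map followed by an element-wise activation."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def xor_net(x1, x2):
    # Level 1: two ReLU units computing relu(x1 + x2) and relu(x1 + x2 - 1).
    hidden = layer([[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], [x1, x2], relu)
    # Level 2: a linear read-out of the hidden representation.
    (out,) = layer([[1.0, -2.0]], [0.0], hidden, identity)
    return out

for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, xor_net(a, b))
```

The hidden layer re-represents the input so that the second, purely linear module can separate the classes; this is the sense in which each level transforms the representation of the level below.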

Deep network, increasingly abstract representations

Honglak Lee, Roger Grosse, Rajesh Ranganath, Andrew Y. Ng. 2009. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations. ICML 2009.

Comments on “Deep learning” by LeCun, Bengio & Hinton

On convolutional neural networks (CNNs / ConvNets)

. . . four key ideas . . . local connections, shared weights, pooling and the use of many layers. . . . ConvNets have been applied with great success . . .

(thumbs up)
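
The four ideas named in the quote can be made concrete with a one-dimensional toy in plain Python. This is a sketch under assumptions: the kernels are fixed by hand rather than learned, and `conv1d`, `max_pool` and `convnet` are illustrative helper names, not functions of any library.

```python
def conv1d(signal, kernel):
    """Local connections + shared weights: the same small kernel slides over the input."""
    k = len(kernel)
    return [
        sum(kernel[j] * signal[i + j] for j in range(k))
        for i in range(len(signal) - k + 1)
    ]

def max_pool(values, size):
    """Pooling: keep the strongest response in each non-overlapping window."""
    return [max(values[i:i + size]) for i in range(0, len(values), size)]

def relu(x):
    return max(0, x)

def convnet(signal, kernels, pool_size=2):
    """Many layers: repeat convolution, non-linearity and pooling."""
    out = signal
    for kernel in kernels:
        out = max_pool([relu(v) for v in conv1d(out, kernel)], pool_size)
    return out

print(conv1d([1, 2, 3, 4], [1, 1]))                      # each output sees 2 neighbours
print(max_pool([3, 5, 7], 2))
print(convnet([1, 2, 3, 4, 5, 6], [[-1, 1], [1, 1]]))    # two stacked layers
```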

Comments on “Deep learning” by LeCun, Bengio & Hinton

Domain expertise no longer needed?

. . . constructing a pattern-recognition or machine-learning system required careful engineering and considerable domain expertise to design a feature extractor that transformed the raw data (such as the pixel values of an image) into a suitable internal representation . . . deep learning . . . requires very little engineering by hand . . .

(shock)

Comments on “Deep learning” by LeCun, Bengio & Hinton

On unsupervised learning

Although we have not focused on it in this Review, we expect unsupervised learning to become far more important in the longer term. Human and animal learning is largely unsupervised: we discover the structure of the world by observing it . . .

(shock)

Comments on “Deep learning” by LeCun, Bengio & Hinton

On recurrent neural networks (RNNs)

For tasks that involve sequential inputs, such as speech and language, it is often better to use RNNs . . . RNNs process an input sequence one element at a time, maintaining in their hidden units a ‘state vector’ that implicitly contains information about the history of all the past elements of the sequence.

(skepticism – my earlier argument against sentence representation)
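
The connection to the earlier capacity argument can be seen in a minimal Elman-style recurrence written in plain Python. The weights below are arbitrary hypothetical constants (nothing is trained); the sketch only shows that the state vector has the same size whether it summarizes two inputs or five hundred.

```python
import math

HIDDEN = 3
# Hypothetical fixed weights, chosen arbitrarily for the illustration.
W_IN = [0.5, -0.3, 0.8]
W_REC = [
    [0.1, 0.0, 0.2],
    [0.0, 0.1, -0.1],
    [0.3, 0.0, 0.1],
]

def rnn_step(state, x):
    """New state from the previous state and one (scalar) input element."""
    return [
        math.tanh(W_IN[i] * x + sum(W_REC[i][j] * state[j] for j in range(HIDDEN)))
        for i in range(HIDDEN)
    ]

def encode(sequence):
    """Process the sequence one element at a time, keeping only the state vector."""
    state = [0.0] * HIDDEN
    for x in sequence:
        state = rnn_step(state, x)
    return state

short_state = encode([1.0, 2.0])
long_state = encode([0.1] * 500)
print(len(short_state), len(long_state))   # same size for both histories
```

Whatever the RNN remembers about the history has to fit into these `HIDDEN` numbers, which is the bottleneck questioned above.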

Representing sentences as vectors (1)

Sutskever, Vinyals, Le (2015)

Representing sentences as vectors (2)

Cho, van Merriënboer, Gulcehre, Bahdanau, Bougares, Schwenk, Bengio (2015)

Comments on “Deep learning” by LeCun, Bengio & Hinton

On symbolic representation / symbolic computation

This rather naive way of performing machine translation has quickly become competitive with the state-of-the-art, and this raises serious doubts about whether understanding a sentence requires anything like the internal symbolic expressions that are manipulated by using inference rules.

(skepticism)

No symbolic representations?

Compare two types of inference

Memory inference: Inference for frequently observed events, based on retrieval from memory

Statistical inference: Inference for never observed events, based on true generalization

Example for memory inference

“In Rome, I got a job teaching English as a foreign . . . ”

What is the next word? “language”

Example for statistical inference

“My favorite spice for lighting up shrimp is . . . ”

What is the next word? “mace”, “garlic”, “chili”, “paprika”

Probably not: “bay leaf”, “saffron”; “vanilla”, “allspice”

Is statistical inference the right tool for “memories”?

Kneser-Ney’s success partly due to memorization
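The contrast between the two types of inference can be sketched in a few lines of Python. This is a toy illustration only, not Kneser-Ney smoothing and not anything taken from the talk: the function names `train` and `predict_next` are made up for this sketch, and the unigram back-off is merely a crude stand-in for real generalization.

```python
from collections import Counter, defaultdict

def train(sentences):
    """Collect bigram continuations and unigram counts from tokenized sentences."""
    memory = defaultdict(Counter)  # previous word -> counts of the words that followed it
    unigrams = Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        for prev, nxt in zip(tokens, tokens[1:]):
            memory[prev][nxt] += 1
    return memory, unigrams

def predict_next(prev, memory, unigrams):
    """Return (word, mode): retrieve from memory if the context was seen, else back off."""
    if prev in memory:
        # Memory inference: the context was observed, so look up its continuation.
        return memory[prev].most_common(1)[0][0], "memory"
    # Unseen context: fall back to the most frequent word overall (assumed back-off).
    return unigrams.most_common(1)[0][0], "backoff"

# Hypothetical one-sentence corpus built from the slide's example.
corpus = ["in rome i got a job teaching english as a foreign language".split()]
memory, unigrams = train(corpus)
print(predict_next("foreign", memory, unigrams))  # ('language', 'memory')
print(predict_next("shrimp", memory, unigrams)[1])  # backoff
```

The observed context "foreign" is answered by lookup, while the unseen context "shrimp" forces the model to generalize, and that is the case where a plain count table has nothing useful to say.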


Only continuous representations, no symbolic representations?


nettement supérieur

much higher / better

Il lui est nettement supérieur techniquement.


The advantage of a continuous space model

A continuous space model can better learn when to use “higher” vs. “better”.


$425

$425


5 janvier 1970

Jan 1, 1970


Disadvantage of a continuous space model for entities

In translation, it is not a good idea to smooth an entity like Putin, an amount like $425, a date like January 5, 1970.
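One common way to keep such items symbolic is to swap them for placeholders before translation and restore exact forms afterwards. The sketch below illustrates that idea under stated assumptions; it is not the method proposed in the talk. The names `protect`, `restore`, and the `__SYMn__` placeholder format are hypothetical, the amount pattern covers only simple dollar figures, and the date pattern covers only the form "5 janvier 1970" (not, for example, "1er janvier").

```python
import re

# French month names mapped to English (used for rule-based date conversion).
FR_MONTHS = {
    "janvier": "January", "février": "February", "mars": "March",
    "avril": "April", "mai": "May", "juin": "June",
    "juillet": "July", "août": "August", "septembre": "September",
    "octobre": "October", "novembre": "November", "décembre": "December",
}

# Assumed patterns: "5 janvier 1970" and dollar amounts such as $425 or $1,250.50.
DATE = re.compile(r"\b(\d{1,2}) (%s) (\d{4})\b" % "|".join(FR_MONTHS))
AMOUNT = re.compile(r"\$\d+(?:,\d{3})*(?:\.\d+)?")

def protect(text):
    """Swap dates and amounts for placeholders; map each placeholder to its exact target form."""
    table = {}

    def stash(target_form):
        key = f"__SYM{len(table)}__"
        table[key] = target_form
        return key

    def date_repl(match):
        day, month, year = match.groups()
        # The date is converted by rule, so the day can never drift to a "similar" one.
        return stash(f"{FR_MONTHS[month]} {int(day)}, {year}")

    text = DATE.sub(date_repl, text)
    # Amounts are copied through unchanged.
    text = AMOUNT.sub(lambda m: stash(m.group(0)), text)
    return text, table

def restore(text, table):
    """Put the exact target forms back in place of the placeholders."""
    for key, target_form in table.items():
        text = text.replace(key, target_form)
    return text

masked, table = protect("Le 5 janvier 1970, il a payé $425.")
print(masked)  # Le __SYM0__, il a payé __SYM1__.
print(table)   # {'__SYM0__': 'January 5, 1970', '__SYM1__': '$425'}
```

In a real pipeline, a translation system (continuous or otherwise) would translate the masked sentence between `protect` and `restore`, so the model only ever sees the placeholders and cannot turn January 5 into January 1.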


Summary

Use lemmata for MT

Use embeddings for MT

Use linguistic morphological resources for MT

Don’t represent sentences as vectors for MT

Deep learning will not replace other MT work . . .

. . . but will be a powerful component of MT systems.
