Haitham Elmarakeby. Speech recognition

Page 1

Sequence to Sequence Learning

Haitham Elmarakeby

Page 2

Sequence to Sequence

Speech recognition

http://nlp.stanford.edu/courses/lsa352/

Page 3

Sequence to Sequence

Machine translation

Welcome to the deep learning class

مرحبا بكم في درس التعلم العميق

Page 4

Sequence to Sequence

Question answering

Page 5

Statistical Machine Translation

Knight and Koehn 2003

Page 7

Statistical Machine Translation

Components: translation model, language model, decoding

Page 8

Statistical Machine Translation

Translation model

Learn P(f | e), the probability of the foreign sentence f given the English sentence e

Knight and Koehn 2003
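
For context, the translation model is usually combined with the language model through Bayes' rule (the noisy-channel formulation). The following is a sketch of that decomposition, where f is the foreign input sentence and e a candidate English output; the hat notation for the best translation is an assumption of this note and does not appear on the slide.

```latex
\hat{e} = \arg\max_{e} P(e \mid f)
        = \arg\max_{e} \frac{P(f \mid e)\, P(e)}{P(f)}
        = \arg\max_{e} P(f \mid e)\, P(e)
```

P(f) can be dropped because it does not depend on e. The remaining two factors are the translation model and the language model, and the search over e is the decoding step.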

Page 9

Statistical Machine Translation

Translation model

Input is segmented into phrases
Each phrase is translated into English
Phrases are reordered

Koehn 2004

Page 10

Statistical Machine Translation

Language Model

Goal of the language model: detect good English, P(e)
Standard technique: trigram model

Knight and Koehn 2003
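
A trigram model scores a sentence as a product of probabilities of each word given the two words before it. The sketch below is a minimal, unsmoothed version on a toy corpus; the function names, the padding symbols `<s>` and `</s>`, and the corpus are all illustrative assumptions, and a practical model would add smoothing.

```python
from collections import Counter

def train_trigram(sentences):
    """Count trigrams and their two-word contexts over tokenized sentences."""
    tri, bi = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>", "<s>"] + tokens + ["</s>"]
        for i in range(2, len(padded)):
            tri[tuple(padded[i - 2:i + 1])] += 1
            bi[tuple(padded[i - 2:i])] += 1
    return tri, bi

def sentence_prob(tokens, tri, bi):
    """P(e) as a product of maximum-likelihood trigram estimates (no smoothing)."""
    padded = ["<s>", "<s>"] + tokens + ["</s>"]
    p = 1.0
    for i in range(2, len(padded)):
        context = tuple(padded[i - 2:i])
        if bi[context] == 0:
            return 0.0  # unseen context: an unsmoothed model assigns zero
        p *= tri[tuple(padded[i - 2:i + 1])] / bi[context]
    return p

# Toy corpus (assumption, for illustration only)
corpus = [["the", "cat", "sat"], ["the", "cat", "ran"]]
tri, bi = train_trigram(corpus)
good = sentence_prob(["the", "cat", "sat"], tri, bi)  # fluent word order
bad = sentence_prob(["cat", "the", "sat"], tri, bi)   # scrambled word order
```

On this corpus the fluent sentence gets probability 0.5 and the scrambled one gets 0, which is the sense in which the language model "detects good English".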

Page 11

Statistical Machine Translation

Decoding

Goal of the decoding algorithm: put models to work, perform the actual translation

Koehn 2004

Page 17

Statistical Machine Translation

Decoding

Goal of the decoding algorithm: put models to work, perform the actual translation

Prune out weakest hypotheses:
by absolute threshold (keep 100 best)
by relative cutoff

Future cost estimation: compute expected cost of untranslated words
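
The two pruning rules can be sketched as follows. This is a minimal illustration, not the decoder from Koehn 2004: the function name, the `(score, text)` hypothesis format, and the relative cutoff value are assumptions, and each score is assumed to already combine the probability so far with the future cost estimate.

```python
def prune(hypotheses, max_size=100, relative_cutoff=0.001):
    """Keep the strongest hypotheses from one decoding stack.

    hypotheses: list of (score, text) pairs, higher score is better.
    max_size: absolute threshold (the slide's "keep 100 best").
    relative_cutoff: drop anything scoring below best * relative_cutoff
                     (the value 0.001 is an illustrative assumption).
    """
    if not hypotheses:
        return []
    ranked = sorted(hypotheses, key=lambda h: h[0], reverse=True)
    best_score = ranked[0][0]
    kept = [h for h in ranked if h[0] >= best_score * relative_cutoff]
    return kept[:max_size]

stack = [(0.5, "a"), (0.0001, "c"), (0.4, "b")]
by_size = prune(stack, max_size=1, relative_cutoff=0.0)    # absolute threshold only
by_ratio = prune(stack, max_size=100, relative_cutoff=0.01)  # relative cutoff only
```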

Page 18

Sutskever et al., 2014

Sequence to Sequence Learning with Neural Networks

Page 19

Neural Machine Translation

Model

Input sequence: A B C

Output sequence: W X Y Z

Page 20

Neural Machine Translation

Model

Sutskever et al. 2014
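
As a sketch of what the model estimates, the conditional probability of an output sequence given an input sequence can be written as below. Here x_1, ..., x_T is the input (A B C in the example), y_1, ..., y_{T'} is the output (W X Y Z), and v is the fixed-length vector produced by the encoder; the symbol names follow the usual presentation of Sutskever et al. 2014 and are not taken from the slide itself.

```latex
p(y_1, \ldots, y_{T'} \mid x_1, \ldots, x_T)
  = \prod_{t=1}^{T'} p(y_t \mid v, y_1, \ldots, y_{t-1})
```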

Page 21

Neural Machine Translation

Model: encoder

Cho: From Sequence Modeling to Translation

Page 26

Neural Machine Translation

Model: decoder

Cho: From Sequence Modeling to Translation
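
The decoder generates the output one token at a time, feeding each chosen token back in until it emits an end-of-sequence symbol. The sketch below shows that loop with greedy selection, which is a simplification (Sutskever et al. 2014 use beam search). The `step` interface, the `<EOS>` symbol, and `toy_step` are hypothetical stand-ins for a trained decoder.

```python
def greedy_decode(step, context, max_len=20, eos="<EOS>"):
    """Generate tokens one at a time, always taking the most probable next token.

    step(context, prefix) -> dict mapping each candidate token to a probability.
    context stands for the encoder's summary of the source sentence.
    """
    output = []
    for _ in range(max_len):
        probs = step(context, output)
        token = max(probs, key=probs.get)
        if token == eos:
            break
        output.append(token)
    return output

def toy_step(context, prefix):
    """Hypothetical decoder that prefers W, X, Y, Z and then <EOS>."""
    target = ["W", "X", "Y", "Z", "<EOS>"]
    preferred = target[min(len(prefix), len(target) - 1)]
    return {tok: (0.9 if tok == preferred else 0.025) for tok in target}

result = greedy_decode(toy_step, context=["A", "B", "C"])
```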

Page 29

Neural Machine Translation

RNN

Page 30

Neural Machine Translation

RNN

Vanishing gradient

Cho: From Sequence Modeling to Translation
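
The vanishing gradient can be illustrated with a one-unit RNN. This is a minimal sketch under strong assumptions (scalar state, no input, recurrence h_t = tanh(w * h_{t-1})); the function name and default values are illustrative. By the chain rule the gradient of the final state with respect to the initial state is a product of one factor per time step, and when each factor is below 1 in magnitude the product shrinks exponentially.

```python
import math

def gradient_through_time(w, steps, h0=0.5):
    """Magnitude of d h_T / d h_0 for the scalar RNN h_t = tanh(w * h_{t-1})."""
    h, grad = h0, 1.0
    for _ in range(steps):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # derivative of tanh(w * h_prev) w.r.t. h_prev
    return abs(grad)

short = gradient_through_time(0.5, steps=5)
long = gradient_through_time(0.5, steps=20)
```

With w = 0.5 every factor is at most 0.5, so after 20 steps the gradient is below 0.5^20, about one millionth.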

Page 31

Neural Machine Translation

LSTM

Graves 2013

Page 32

Neural Machine Translation

LSTM

Problem: exploding gradient

Page 33

Neural Machine Translation

LSTM

Problem: exploding gradient
Solution: scaling gradient
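
Gradient scaling can be sketched as rescaling the gradient vector whenever its norm exceeds a threshold, so that its direction is kept but its length is capped. The function name and the flat-list representation are assumptions; the default threshold of 5 is the value reported in Sutskever et al. 2014 and should be treated as a tunable hyperparameter.

```python
import math

def scale_gradient(grad, threshold=5.0):
    """Rescale grad so its L2 norm is at most threshold; direction is unchanged."""
    norm = math.sqrt(sum(g * g for g in grad))
    if norm > threshold:
        return [g * threshold / norm for g in grad]
    return list(grad)

exploded = scale_gradient([30.0, 40.0])  # norm 50, rescaled to norm 5
small = scale_gradient([0.3, 0.4])       # norm 0.5, left unchanged
```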

Page 34

Sequence to Sequence

Reversing the Source Sentences

Welcome to the deep learning class
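
Reversing the source sentence only changes the order in which the encoder reads the input words; the target sentence is left as it is. A minimal sketch, assuming whitespace tokenization and an illustrative function name:

```python
def reverse_source(sentence):
    """Reverse the word order of a source sentence before encoding."""
    return " ".join(reversed(sentence.split()))

reversed_input = reverse_source("Welcome to the deep learning class")
# reversed_input is "class learning deep the to Welcome"
```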

Page 36

Sequence to Sequence

Results

BLEU score (Bilingual Evaluation Understudy)

Candidate: the the the the the the the

Reference 1: the cat is on the mat
Reference 2: there is a cat on the mat

Standard unigram precision: P = m/w = 7/7 = 1, where m is the number of candidate words that appear in any reference and w is the total number of candidate words

Papineni et al. 2002

Page 37

Sequence to Sequence

Results

BLEU score (Bilingual Evaluation Understudy)

Candidate: the the the the the the the

Reference 1: the cat is on the mat
Reference 2: there is a cat on the mat

Modified unigram precision: P = 2/7, because the count of "the" is clipped to 2, its maximum count in any single reference (Reference 1)

Papineni et al. 2002
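
The difference between the two precision values can be checked with a short sketch. It covers only the unigram precision component, not the full BLEU score (which also combines higher-order n-grams and a brevity penalty); the function names are illustrative.

```python
from collections import Counter

def unigram_precision(candidate, references):
    """Fraction of candidate words that appear in any reference (no clipping)."""
    matched = sum(1 for w in candidate if any(w in ref for ref in references))
    return matched / len(candidate)

def modified_unigram_precision(candidate, references):
    """Clip each word's count to its maximum count in any single reference."""
    clipped = 0
    for word, count in Counter(candidate).items():
        max_ref_count = max(ref.count(word) for ref in references)
        clipped += min(count, max_ref_count)
    return clipped / len(candidate)

candidate = "the the the the the the the".split()
references = ["the cat is on the mat".split(),
              "there is a cat on the mat".split()]

plain = unigram_precision(candidate, references)              # 7/7
modified = modified_unigram_precision(candidate, references)  # 2/7
```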

Page 38

Sequence to Sequence

Results

Sutskever et al. 2014

Page 40

Sequence to Sequence

Model Analysis

Sutskever et al. 2014

Page 41

Sequence to Sequence

Long sentences

Sutskever et al. 2014

Page 42

Sequence to Sequence

Long sentences

Cho et al. 2014

Page 43

Bahdanau et al., 2014

Neural Machine Translation by Jointly Learning to Align and Translate

Page 44

Sequence to Sequence

Long sentences

The fixed-length representation of the source sentence may be the cause

Page 45

Jointly Learning to Align and Translate

Attention mechanism
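
Instead of a single fixed-length vector, the attention mechanism builds a separate context vector for each output step as a weighted average of the encoder states, with weights obtained by a softmax over alignment scores. The sketch below shows only that weighting step. The alignment scores are passed in directly, whereas in Bahdanau et al. 2014 they come from a small learned network; the function names and the plain-list representation are assumptions.

```python
import math

def softmax(scores):
    """Turn alignment scores into weights that are positive and sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(scores, annotations):
    """Context vector as the attention-weighted sum of encoder annotations."""
    weights = softmax(scores)
    dim = len(annotations[0])
    context = [sum(w * h[k] for w, h in zip(weights, annotations))
               for k in range(dim)]
    return weights, context

# Two encoder states with equal scores: the context is their average.
weights, context = attend([0.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```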

Page 52

Jointly Learning to Align and Translate

Long sentences

Cho et al. 2014

Page 53

Vinyals et al., 2015

Grammar as a Foreign Language

Page 54

Grammar as a Foreign Language

Parsing tree

Page 58

Grammar as a Foreign Language

Parsing tree

John has a dog .

Page 59

Grammar as a Foreign Language

Converting tree to sequence
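
A parse tree can be turned into a flat token sequence by a depth-first traversal that emits an opening bracket when entering a constituent and a labeled closing bracket when leaving it. The sketch below follows the linearization style of Vinyals et al. 2015 for the sentence "John has a dog ."; the nested-tuple tree format, the function name, and the particular parse are assumptions made for illustration.

```python
def linearize(node):
    """Depth-first linearization of a parse tree into bracket tokens.

    A node is either a POS-tag string (leaf) or a (label, children) pair.
    """
    if isinstance(node, str):
        return [node]
    label, children = node
    tokens = ["(" + label]
    for child in children:
        tokens.extend(linearize(child))
    tokens.append(")" + label)
    return tokens

# Assumed parse of "John has a dog ." with words replaced by their POS tags
tree = ("S", [("NP", ["NNP"]),
              ("VP", ["VBZ", ("NP", ["DT", "NN"])]),
              "."])
sequence = " ".join(linearize(tree))
# sequence is "(S (NP NNP )NP (VP VBZ (NP DT NN )NP )VP . )S"
```

Once the tree is a sequence, parsing can be treated like translation: the sentence is the source and the bracket sequence is the target.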

Page 61

Grammar as a Foreign Language

Model

Page 62

Grammar as a Foreign Language

Results