ML with TensorFlow (hunkim.github.io/ml/lec12.pdf)


Lecture 12 RNN

Sung Kim <hunkim+mr@gmail.com>
http://hunkim.github.io/ml/

Sequence data

• We don't understand a word in isolation

• We understand it based on the previous words plus the current word (time series)

• A plain NN/CNN cannot model this


http://colah.github.io/posts/2015-08-Understanding-LSTMs/


Slides: Fei-Fei Li, Andrej Karpathy & Justin Johnson, Lecture 10, 8 Feb 2016

Recurrent Neural Network

[Diagram: input x feeds into a recurrent RNN cell, which produces output y]

We usually want to predict a vector y at some time steps.


Recurrent Neural Network

We can process a sequence of vectors x by applying a recurrence formula at every time step:

h_t = f_W(h_{t-1}, x_t)

where h_t is the new state, h_{t-1} is the old state, x_t is the input vector at time step t, and f_W is some function with parameters W.


Notice: the same function and the same set of parameters are used at every time step.

(Vanilla) Recurrent Neural Network

The state consists of a single "hidden" vector h:

h_t = tanh(W_hh h_{t-1} + W_xh x_t)
y_t = W_hy h_t
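The vanilla RNN update h_t = tanh(W_hh h_{t-1} + W_xh x_t), with an output y_t = W_hy h_t, can be sketched in NumPy. This is a minimal illustration, not code from the lecture; the sizes and random weights are arbitrary placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)

hidden_size, input_size = 3, 4  # illustrative sizes, not from the lecture
W_hh = rng.standard_normal((hidden_size, hidden_size)) * 0.1
W_xh = rng.standard_normal((hidden_size, input_size)) * 0.1
W_hy = rng.standard_normal((input_size, hidden_size)) * 0.1

def rnn_step(h_prev, x_t):
    """One vanilla-RNN step: the same f_W and the same weights at every time step."""
    h_t = np.tanh(W_hh @ h_prev + W_xh @ x_t)  # new state from old state + input
    y_t = W_hy @ h_t                           # output read off the hidden state
    return h_t, y_t

h = np.zeros(hidden_size)
for x_t in rng.standard_normal((5, input_size)):  # a length-5 input sequence
    h, y = rnn_step(h, x_t)

print(h.shape, y.shape)  # (3,) (4,)
```

Note that only h carries information between steps; that single hidden vector is the "state" the slide refers to.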


Character-level language model example

Vocabulary: [h, e, l, o]

Example training sequence: "hello"
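The slide's example can be made concrete: one-hot encode the vocabulary [h, e, l, o], then unroll a vanilla RNN over the input "hell" and score the next character at each step. This is a sketch with random, untrained weights (a trained model would learn to predict the targets "ello"); all sizes and weight names are illustrative:

```python
import numpy as np

vocab = ['h', 'e', 'l', 'o']
char_to_ix = {c: i for i, c in enumerate(vocab)}

def one_hot(c):
    v = np.zeros(len(vocab))
    v[char_to_ix[c]] = 1.0
    return v

rng = np.random.default_rng(1)
H, V = 8, len(vocab)                      # hidden size (arbitrary), vocab size
W_xh = rng.standard_normal((H, V)) * 0.1
W_hh = rng.standard_normal((H, H)) * 0.1
W_hy = rng.standard_normal((V, H)) * 0.1

h = np.zeros(H)
for c in "hell":                          # inputs; training targets would be "ello"
    h = np.tanh(W_xh @ one_hot(c) + W_hh @ h)
    scores = W_hy @ h                     # unnormalized next-character scores
    probs = np.exp(scores) / np.exp(scores).sum()  # softmax over the vocabulary
    print(c, '->', vocab[int(np.argmax(probs))])   # untrained, so arbitrary
```

Training would adjust W_xh, W_hh, W_hy so that the softmax at each step puts high probability on the correct next character.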

RNN applications

• Language Modeling

• Speech Recognition

• Machine Translation

• Conversation Modeling/Question Answering

• Image/Video Captioning

• Image/Music/Dance Generation

http://jiwonkim.org/awesome-rnn/

https://github.com/TensorFlowKR/awesome_tensorflow_implementations


Recurrent Networks offer a lot of flexibility:

Vanilla Neural Networks: one-to-one (fixed-size input to fixed-size output)

e.g. Image Captioning: image -> sequence of words

e.g. Sentiment Classification: sequence of words -> sentiment

e.g. Machine Translation: seq of words -> seq of words

e.g. Video classification on frame level

Multi-Layer RNN

Training RNNs is challenging

• Several advanced models:
  - Long Short-Term Memory (LSTM)
  - GRU by Cho et al. 2014
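The LSTM replaces the single tanh cell with gated updates to a separate cell state, which is what makes long sequences trainable. A minimal NumPy sketch of one LSTM step using the standard gate equations (biases omitted; sizes and weights are illustrative, not from the lecture):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(2)
H, D = 4, 3                               # hidden and input sizes (arbitrary)
# One weight matrix per gate, acting on the concatenated [h_prev, x_t]
Wf, Wi, Wo, Wc = (rng.standard_normal((H, H + D)) * 0.1 for _ in range(4))

def lstm_step(h_prev, c_prev, x_t):
    z = np.concatenate([h_prev, x_t])
    f = sigmoid(Wf @ z)                   # forget gate: what to keep of c_prev
    i = sigmoid(Wi @ z)                   # input gate: what to write
    o = sigmoid(Wo @ z)                   # output gate: what to expose
    c_tilde = np.tanh(Wc @ z)             # candidate cell state
    c_t = f * c_prev + i * c_tilde        # cell state carries long-term memory
    h_t = o * np.tanh(c_t)                # hidden state read out through the gate
    return h_t, c_t

h, c = np.zeros(H), np.zeros(H)
for x_t in rng.standard_normal((6, D)):   # a length-6 input sequence
    h, c = lstm_step(h, c, x_t)
print(h.shape, c.shape)  # (4,) (4,)
```

The additive update of c_t (rather than repeated matrix multiplication, as in the vanilla cell) is what lets gradients flow across many time steps; the GRU simplifies this design to two gates and a single state vector.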

Next

RNN in TensorFlow