Data-Driven Response Generation...Alan Ritter, Colin Cherry, Bill Dolan (EMNLP 2011) “Data-Driven...

Alan Ritter, Ohio State University

Data-Driven Response Generation

1950s ~ 2010 Dialog systems mostly rule-based

Alan Ritter (Ohio State University)

Goal-Directed Dialogue Systems:

ATIS Dataset (Hemphill, 1990)

- 774 flight reservation conversations
- Manually annotated

Chatbots:

Rule-Based: Eliza (Weizenbaum 1966)

Information Retrieval (Isbell et al. 2000)

1990s ~ 2010s Data-Driven Machine Translation

millions of bilingual documents on the web


Findings of WMT 2010 (Callison-Burch et al.)

The Mathematics of Statistical Machine Translation: Parameter Estimation (Brown et al.)

July 2011 Data-Driven Dialogue

500 million conversations per month on Twitter alone (vs. 30 million for French-English translation)

Alan Ritter, Colin Cherry, Bill Dolan (EMNLP 2011) “Data-Driven Response Generation in Social Media”




NLP on Noisy User-Generated Text:

Unsupervised Dialogue Acts (Ritter, Cherry, Dolan, NAACL 2010)

Named Entity Recognition (Ritter et al. EMNLP 2011)

Open-Domain Event Extraction (Ritter et al. KDD 2012)

Minimally-Supervised Event Extraction (Ritter et al. WWW 2015)





MT and Dialogue

… and they lived happily ever after.



But, unlike MT, conversations are not semantically equivalent.

Input: Who wants to come over for dinner tomorrow?

Output: Yum ! I want to be there tomorrow !

(The output is built phrase by phrase: [Yum ! I] [want to] [be there] [tomorrow !])


2015 ~ present Neural MT-based Conversation Models


• I. Serban, A. Sordoni, Y. Bengio, A. Courville and J. Pineau. Building End-To-End Dialogue Systems Using Generative Hierarchical Neural Networks. In Proc of AAAI, 2016.

• Jesse Dodge, Andreea Gane, Xiang Zhang, Antoine Bordes, Sumit Chopra, Alexander Miller, Arthur Szlam, Jason Weston. Evaluating Prerequisite Qualities for Learning End-to-end Dialog Systems, ICLR 2016

• Alessandro Sordoni, Michel Galley, Michael Auli, Chris Brockett, Yangfeng Ji, Meg Mitchell, Jian-Yun Nie, Jianfeng Gao, and Bill Dolan, A Neural Network Approach to Context-Sensitive Generation of Conversational Responses. NAACL 2015

• Lifeng Shang, Zhengdong Lu, Hang Li. Neural Responding Machine for Short Text Conversation. ACL 2015

• O. Vinyals, Q.V. Le. A Neural Conversational Model. ICML Deep Learning Workshop 2015

• Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao and Bill Dolan. A Diversity-Promoting Objective Function for Neural Conversation Models. NAACL 2016


But maximum-likelihood responses can be safe and boring

$$\arg\max_{r_1,\ldots,r_l} P(r_1,\ldots,r_l \mid m_1,\ldots,m_k)$$

Here m_1, ..., m_k is the Input Message and r_1, ..., r_l is the Response.

Some replies work for almost any input:

“I don’t know”
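A minimal sketch of why arg max decoding drifts toward generic replies. The toy message/response pairs and the interpolation weight `lam` are hypothetical; the interpolation with the marginal response distribution is only a stand-in for the parameter sharing of a real neural model, not the model used in the cited work.

```python
from collections import Counter

# Hypothetical toy corpus of (message, response) pairs, for illustration only.
PAIRS = [
    ("how old are you ?", "i 'm 16 ."),
    ("how old are you ?", "i don 't know"),
    ("where are you from ?", "ohio"),
    ("where are you from ?", "i don 't know"),
    ("who wants dinner tomorrow ?", "yum ! i want to be there"),
    ("who wants dinner tomorrow ?", "i don 't know"),
]


def response_probs(message, pairs, lam=0.5):
    """Estimate P(r | m), smoothed toward the marginal P(r) (an assumption)."""
    marginal = Counter(r for _, r in pairs)
    conditional = Counter(r for m, r in pairs if m == message)
    n_all = sum(marginal.values())
    n_msg = sum(conditional.values())
    probs = {}
    for r in marginal:
        p_cond = conditional[r] / n_msg if n_msg else 0.0
        probs[r] = lam * p_cond + (1 - lam) * marginal[r] / n_all
    return probs


def best_response(message, pairs=PAIRS):
    """Return arg max_r P(r | m) under the smoothed estimate."""
    probs = response_probs(message, pairs)
    return max(probs, key=probs.get)


if __name__ == "__main__":
    for m in sorted({m for m, _ in PAIRS}):
        print(m, "->", best_response(m))
```

In this toy setting the reply that co-occurs with every message wins the arg max for each input, even though a more specific reply is equally frequent for that particular message.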


2016 Neural Dialogue with Deep Reinforcement Learning

J. Li, W. Monroe, A. Ritter, M. Galley, J. Gao, D. Jurafsky (EMNLP 2016) “Deep Reinforcement Learning for Dialogue Generation”



Jiwei Li (PhD Stanford 2017)

Problem: Short-sighted conversation decisions.

How old are you ?

i 'm 16 .

16 ?

i don 't know what you 're talking about

you don 't know what you 're saying (Outcome)

Bad Action: the outcome does not emerge until a few turns later

Can Reinforcement Learning Handle This?


Notations: State

How old are you ?

Encoding: how old are you

Notations: Action

How old are you ?

i 'm 16 .

Encoding: how old are you

Decoding: EOS I’m fine . / I’m 16 . EOS


Simulation

A message from training set → Encode → Decode r1 → Encode → Decode r2 → …

States: S1, S2, …, Sn

Compute Accumulated Reward R(S1, S2, …, Sn)
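The simulation loop can be sketched as follows. Everything here is a hypothetical stand-in: `table_policy` replaces the encoder-decoder with a lookup table, and `dull_penalty` is an invented per-turn reward (the slides do not specify the reward function). The point is only to show how each decoded reply becomes the next input and how the reward is summed over S1, …, Sn.

```python
import random

# Hypothetical set of dull replies and a lookup-table "model", for illustration.
DULL = {
    "i don 't know what you 're talking about",
    "you don 't know what you 're saying",
}
REPLIES = {
    "how old are you ?": ["i 'm 16 ."],
    "i 'm 16 .": ["16 ?"],
    "16 ?": ["i don 't know what you 're talking about"],
    "i don 't know what you 're talking about": ["you don 't know what you 're saying"],
    "you don 't know what you 're saying": ["i don 't know what you 're talking about"],
}


def table_policy(message, rng):
    """Stand-in for encode + decode: pick a reply for the given message."""
    return rng.choice(REPLIES.get(message, ["i don 't know what you 're talking about"]))


def simulate(first_message, policy, n_turns, rng):
    """Roll out a dialogue; each decoded reply is fed back in as the next input."""
    states = []
    message = first_message
    for _ in range(n_turns):
        reply = policy(message, rng)
        states.append((message, reply))  # S_i = (input, response)
        message = reply
    return states


def dull_penalty(state):
    """Hypothetical per-turn reward: +1 for a non-dull reply, -1 for a dull one."""
    _, reply = state
    return -1.0 if reply in DULL else 1.0


def accumulated_reward(states, reward_fn, gamma=1.0):
    """R(S1, ..., Sn) as a (optionally discounted) sum of per-turn rewards."""
    return sum(gamma ** i * reward_fn(s) for i, s in enumerate(states))


if __name__ == "__main__":
    rng = random.Random(0)
    rollout = simulate("how old are you ?", table_policy, 5, rng)
    print(accumulated_reward(rollout[:2], dull_penalty))
    print(accumulated_reward(rollout, dull_penalty))
```

Scoring only the first two turns makes the rollout look fine; the cost of the early choice shows up only once the later turns are included in the sum.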

Policy Gradient Methods:

REINFORCE Algorithm (Williams, 1992)

Input Message → Encode → Decode (Turn 1) → Encode → Decode (Turn 2) → … → Encode → Decode (Turn N)

States: S1, S2, …, Sn

Figure label: What we want to learn
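A minimal sketch of the REINFORCE update on a toy problem, assuming a softmax policy over a fixed set of candidate responses with a known scalar reward per response. The reward values, learning rate, and step count are hypothetical; a real dialogue model would replace the logits `theta` with encoder-decoder parameters and the reward with the accumulated reward of a simulated conversation.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def reinforce(rewards, steps=3000, lr=0.1, seed=0):
    """Train a softmax policy with the REINFORCE gradient estimator.

    rewards[a] is the (hypothetical) reward for choosing action a.
    Returns the final action probabilities.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(rewards)  # one logit per candidate response
    actions = range(len(rewards))
    for _ in range(steps):
        probs = softmax(theta)
        action = rng.choices(actions, weights=probs)[0]  # sample from the policy
        r = rewards[action]
        # d/d theta_j of log pi(action) is 1[j == action] - probs[j]
        for j in actions:
            grad_log_pi = (1.0 if j == action else 0.0) - probs[j]
            theta[j] += lr * r * grad_log_pi
    return softmax(theta)


if __name__ == "__main__":
    # Hypothetical rewards: index 0 is a dull reply, index 1 an informative one.
    print(reinforce([0.0, 1.0, 0.2]))
```

The update needs only sampled actions and their rewards, not a differentiable reward, which is why it fits rewards that arrive at the end of a simulated conversation.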

Q: How to Specify a Reward Signal?

A: Turing Test

Adversarial Learning (Goodfellow et al., 2014)

J. Li, W. Monroe, T. Shi, S. Jean, A. Ritter, D. Jurafsky (EMNLP 2017) “Adversarial Learning for Neural Dialogue Generation”


Adversarial Learning for Neural Dialogue

Real-world conversations → sample human response → Discriminator

Response Generator → generate response → Discriminator

Discriminator: Real or Fake?

(Alternate Between Training Generator and Discriminator)

REINFORCE Algorithm (Williams, 1992)
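The alternating scheme can be sketched on a toy problem. This is a deliberately simplified illustration, not the architecture of the cited paper: the generator is a softmax over four fixed candidate replies, the discriminator is a table of logits (one per reply), and `HUMAN_PROBS`, the learning rates, and the initial generator bias are all hypothetical.

```python
import math
import random

RESPONSES = ["i don 't know", "i 'm 16 .", "ohio", "yum !"]
HUMAN_PROBS = [0.05, 0.45, 0.30, 0.20]  # hypothetical human reply distribution


def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train(steps=5000, lr_d=0.05, lr_g=0.05, seed=0):
    """Alternate discriminator and generator updates; return both models."""
    rng = random.Random(seed)
    idx = range(len(RESPONSES))
    g_logits = [2.0, 0.0, 0.0, 0.0]    # generator starts out favouring the dull reply
    d_logits = [0.0] * len(RESPONSES)  # tabular discriminator: logit of P(real | reply)
    for _ in range(steps):
        # Discriminator step: raise P(real) on a human sample, lower it on a generated one.
        real = rng.choices(idx, weights=HUMAN_PROBS)[0]
        fake = rng.choices(idx, weights=softmax(g_logits))[0]
        d_logits[real] += lr_d * (1.0 - sigmoid(d_logits[real]))
        d_logits[fake] -= lr_d * sigmoid(d_logits[fake])
        # Generator step (REINFORCE): the reward is the discriminator's P(real).
        probs = softmax(g_logits)
        sample = rng.choices(idx, weights=probs)[0]
        reward = sigmoid(d_logits[sample])
        baseline = sum(p * sigmoid(d) for p, d in zip(probs, d_logits))
        for j in idx:
            grad_log_pi = (1.0 if j == sample else 0.0) - probs[j]
            g_logits[j] += lr_g * (reward - baseline) * grad_log_pi
    return softmax(g_logits), [sigmoid(d) for d in d_logits]


if __name__ == "__main__":
    gen_probs, disc_probs = train()
    for reply, g, d in zip(RESPONSES, gen_probs, disc_probs):
        print(f"{reply!r}: generator={g:.3f} discriminator P(real)={d:.3f}")
```

Because the discriminator sees the dull reply far more often from the generator than from the human samples, it learns to flag it as fake, and the generator is rewarded for moving probability toward replies that look human.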




Adversarial Learning Improves Response Generation vs. a vanilla generation model

Human Evaluator:

Adversarial Win    Adversarial Lose    Tie
62%                18%                 20%

Machine Evaluator:

Adversarial Success (How often can you fool a machine)

Adversarial Learning      8.0%
Standard Seq2Seq model    4.9%



Future: Integrating dynamic knowledge graphs

Extract Entities, Relations and Events

[Figure: knowledge graph. Entities: Barack Obama, Michelle Obama, David Ige, Hawaii, Honolulu, United States, Princeton. Relations: Born in, President, Mayor, Spouse, Alma Mater, Capital.]

A. Konovalov, B. Strauss, A. Ritter and B. O'Connor (WWW 2017) “Learning to Extract Events from Knowledge Base Revisions”

Takeaways

MT and Dialogue

Open-Domain Dialogue

Learning from Delayed-Reward

Adversarial Learning for Dialogue

Thank You!
