Chat bot made by the chainer

Chat Bot Made by Chainer. Chainer is a Neural Network Framework. PyCon JP 2016, Masaya Ogushi.

Transcript of Chat bot made by the chainer

Page 1: Chat bot made by the chainer

Chat Bot Made by Chainer

Chainer is a Neural Network Framework

PyCon JP 2016, Masaya Ogushi

Page 2: Chat bot made by the chainer

Note

I will not show any mathematical formulas.

If you want to understand the machine learning models, I recommend reading the papers listed on the last page.

Page 3: Chat bot made by the chainer

Agenda

Self Introduction

Dialogue Value

Character of the Bot

System

Future Plan

[Figure: chat bot architecture. Components: Question, Choose the Topic, Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 4: Chat bot made by the chainer

Self Introduction

4

Page 5: Chat bot made by the chainer

Self Introduction

Name:Masaya Ogushi

@SnowGushiGit

PORT. Inc

Web Development Research and Development team

Tech-Circle staff

Machine learning, natural language processing, crawler development, automated infrastructure construction, parallel processing, search functions

Page 6: Chat bot made by the chainer

Self Introduction

We’re Hiring!!

https://www.theport.jp/recruit/information/

Page 7: Chat bot made by the chainer

Self Introduction

Free Consulting for your job search

https://port-recruitment.jp/

Page 8: Chat bot made by the chainer

Dialogue Value

8

Page 9: Chat bot made by the chainer

Agenda

Self Introduction

Dialogue Value

Character of the Bot

System

Future Plan

[Figure: chat bot architecture. Components: Question, Choose the Topic, Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 10: Chat bot made by the chainer

Dialogue Value

Continuity

Interactive

New User Experiences

Page 11: Chat bot made by the chainer

Dialogue Value

Continuity

The bot can use information from earlier in the conversation.

User: I really love playing tennis.

Chat Bot: That sounds great.

User: Yeah, well, I'm looking for a part-time job. Do you know any good ones?

Chat Bot: Well, a friend of mine owns a sports shop and is looking for help.

[The chat bot finds part-time job candidates.]

Page 12: Chat bot made by the chainer

Dialogue Value

Interactive

The bot can react to new information.

Chat Bot: I found some delicious sweets.

User: I am on a diet. Don't say such things while I'm on a diet.

Chat Bot: This food is good for losing weight.

User: Really???

Page 13: Chat bot made by the chainer

Dialogue Value

New User Experiences

Character

13

Example

Page 14: Chat bot made by the chainer

Dialogue Value

Continuity

Interactive

New User Experiences

Need dialogue data

Possible to achieve even without dialogue data

Page 15: Chat bot made by the chainer

Character of the Bot

15

Page 16: Chat bot made by the chainer

Agenda

Self Introduction

Dialogue Value

Character of the Bot

System

Future Plan

[Figure: chat bot architecture. Components: Question, Choose the Topic, Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 17: Chat bot made by the chainer

Which character looks smart?

Character of the Bot

Page 18: Chat bot made by the chainer

Which character would you like to talk with?

Character of the Bot

Page 19: Chat bot made by the chainer

Which one has a big gap between looks and character?

Character of the Bot

I read "Statistical Machine Translation"

I read "Statistical Machine Translation"

Page 20: Chat bot made by the chainer

Character of the Bot

Character is very important

User: Do you know any good part-time jobs?

Chat Bot: I recommend Matsuya.

User (thinking): Uhhh... he only thinks about food.

User: No, something more suited to me.

Chat Bot (thinking): Maybe sweets are better.

Chat Bot: How about a sweets shop?

User: Sounds good. You understand it better than I expected.

[Labels on the slide: Food Shop, Looks Funny, Learning, Allowance, Surprise]

Page 21: Chat bot made by the chainer

Character of the Bot

Improving the new user experience

Lowering the expected value of the answer makes the bot easier to talk to.

Image of the icon

Conversation of the bot

Cognitive science

Expected value of the answer: High / Easy to talk to: Low

Expected value of the answer: Low / Easy to talk to: High

Feel free to talk to me

Feel free to talk to me

Talk it

Page 22: Chat bot made by the chainer

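Character of the Bot

Preparing the sentences for each character is costly.

We have to vary the same sentence slightly for each character.

*The example uses different levels of politeness in Japanese, the nuances of which are hard to translate into English.

あなたは食べたパンの数を把握されていますか? (very polite: "Are you aware of how many loaves of bread you have eaten?")

お前は食ったパンの数を覚えているか? (rough: "You remember how many loaves of bread you've eaten?")

あなたは食べたパンの数を覚えているの? (casual: "Do you remember how many loaves of bread you've eaten?")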

Page 23: Chat bot made by the chainer

Character of the Bot

We need many scenario writers

Page 24: Chat bot made by the chainer

Character of the Bot

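We would like to change the character without changing the content.

あなたは食べたパンの数を把握されていますか? (very polite: "Are you aware of how many loaves of bread you have eaten?")

お前は食ったパンの数を覚えているか? (rough: "You remember how many loaves of bread you've eaten?")

あなたは食べたパンの数を覚えているの? (casual: "Do you remember how many loaves of bread you've eaten?")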

Page 25: Chat bot made by the chainer

Character of the Bot

25

Woman

Fat Man

Steward

Do you remember how many loaves of bread you have eaten?

Add the Character

Page 26: Chat bot made by the chainer

Character of the Bot

NeuralStoryTeller

Adds a character to normal sentences. The example below adds romantic elements.

Page 27: Chat bot made by the chainer

Character of the Bot

This can be applied to a variety of situations if we prepare sentences for each character.

Page 28: Chat bot made by the chainer

Character of the Bot

I'm sorry.

I could not implement this character function.

Page 29: Chat bot made by the chainer

Agenda

Self Introduction

Dialogue Value

Character of the Bot

System

Future Plan

[Figure: chat bot architecture. Components: Question, Choose the Topic, Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 30: Chat bot made by the chainer

System

30

Page 31: Chat bot made by the chainer

Agenda

System Architecture

The dialogue interface is Slack

The conversation data is prepared from Twitter

Pre-training uses Wikipedia data and the Dialogue Breakdown Collection

The topic is chosen using WordNet and the Wikipedia Entity Vector

The dialogue model is made with Chainer

The question-and-answer function uses Elasticsearch

[Figure: chat bot architecture. Components: Question, Choose the Topic (with WordNet and Wikipedia Vector), Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 32: Chat bot made by the chainer

Agenda: Choose the Topic

[Figure: chat bot architecture with "Choose the Topic" highlighted. Components: Question, Choose the Topic (with WordNet and Wikipedia Vector), Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 33: Chat bot made by the chainer

System

Choose the Topic

The content of a conversation changes depending on who you are talking to.

To a boyfriend: "Which is more important to you, me or your job?"

To a younger sister: "Please tell me where you bought your clothes."

To a father: "Please give me money."

Page 34: Chat bot made by the chainer

System

Choose the Topic

WordNet

A dataset in which words are grouped into sets of cognitive synonyms, each expressing a distinct concept.

Example: Scottish Fold, black cat, orange cat → Cat

Page 35: Chat bot made by the chainer

System

There are many concepts: 57238 concepts. It is difficult to prepare the data.

Page 36: Chat bot made by the chainer

System

Grouping the concepts

Page 37: Chat bot made by the chainer

System

How to group the concepts

Map the concepts into a space

Group them by distance

[Figure: points labeled "cat" and "tiger" in the concept space.]

Page 38: Chat bot made by the chainer

System

Mapping the concept space

Facebook

Twitter

Close??

We cannot tell how far apart two words are by comparing the words themselves.

We have to map the words into a space in which distance can be measured.

Page 39: Chat bot made by the chainer

System

Mapping the concept space

Entity Linking

Mapping the keyword to the knowledge space

[Figure: are "Facebook" and "Twitter" close? In the knowledge space, Facebook and Twitter both sit near "SNS".]

Page 40: Chat bot made by the chainer

System

Choose the Topic

Mapping the concepts to the knowledge space

Japanese Wikipedia Entity Vector!!!

Vector representations of words and of Wikipedia (knowledge)

(A Wikipedia article is called an entity)

Page 41: Chat bot made by the chainer

System

Choose the Topic

Each synonym gets its vector from the Wikipedia Entity Vector

cat: [0.2, 0.3, 0.4…] dog: [0.3, 0.4, 0.5…]

Page 42: Chat bot made by the chainer

System

Measure the distance

Page 43: Chat bot made by the chainer

System

Measuring the concept distance

Choose an appropriate distance measure for the mapped space.

If we choose the wrong distance measure, yellow looks close, even though light blue is closer than yellow.

Page 44: Chat bot made by the chainer

System

I used cosine similarity to measure the vector distance

Page 45: Chat bot made by the chainer

System

Choose the Topic

Each synonym gets its vector from the Wikipedia Entity Vector

Cat: [0.2, 0.3, 0.4…] Dog: [0.3, 0.4, 0.5…]

Cosine Similarity
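
As an illustration of the measure named on this slide, here is a minimal sketch of cosine similarity in plain Python. The function name is my own, and the two three-dimensional vectors only reuse the visible part of the slide's example; real entity vectors have many more dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Toy vectors: only the first three components shown on the slide.
cat = [0.2, 0.3, 0.4]
dog = [0.3, 0.4, 0.5]
similarity = cosine_similarity(cat, dog)
print(round(similarity, 3))
```

A value near 1 means the two vectors point in almost the same direction; a value near 0 means they are unrelated.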

Page 46: Chat bot made by the chainer

System

There are still many concepts

Page 47: Chat bot made by the chainer

System

Choose the Topic

Add the words that are unknown to WordNet by taking them from the Wikipedia Entity Vector.

[Figure: the WordNet concept "Cat" contains "Black Cat", "White Cat", …; "calico cat" from the Wikipedia Entity Vector is close in cosine similarity, so it is added to the concept as an unknown word.]
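
The step on this slide could be sketched as follows. This is an assumption-laden illustration, not the talk's actual code: the function names, the toy vectors, and the 0.9 threshold are all hypothetical.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def assign_unknown_word(word_vector, concept_members, vectors, threshold=0.9):
    """Return the concept whose known member is most similar to the new word,
    or None when no member reaches the (hypothetical) threshold."""
    best_concept, best_score = None, threshold
    for concept, members in concept_members.items():
        for member in members:
            score = cosine_similarity(word_vector, vectors[member])
            if score >= best_score:
                best_concept, best_score = concept, score
    return best_concept


# Toy data standing in for WordNet concepts and entity vectors.
vectors = {
    "black cat": [0.9, 0.1, 0.0],
    "white cat": [0.8, 0.2, 0.0],
    "shiba": [0.1, 0.9, 0.1],
}
concept_members = {"cat": ["black cat", "white cat"], "dog": ["shiba"]}

calico_cat = [0.85, 0.15, 0.05]  # a word missing from the toy "WordNet"
print(assign_unknown_word(calico_cat, concept_members, vectors))
```

Words that are not close to any known member are left unassigned rather than forced into a concept.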

Page 48: Chat bot made by the chainer

System

Choose the Topic

Calculate the average vector of each concept

Cat: Black Cat: [0.2, 0.3, 0.4…], White Cat: [0.1, 0.3, 0.…] → average vector

Dog: Shiba: [0.1, 0.3, 0.4…], Tosa: [0.1, 0.2, 0.…] → average vector

Page 49: Chat bot made by the chainer

System

Choose the Topic

If the average vectors of two concepts are close, group the concepts together

Cat: Black Cat: [0.2, 0.3, 0.4…], White Cat: [0.1, 0.3, 0.…] → average vector

Dog: Shiba: [0.1, 0.3, 0.4…], Tosa: [0.1, 0.2, 0.…] → average vector

Grouping the concepts
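
A minimal sketch of the averaging and grouping described on the last two slides. The greedy grouping strategy, the 0.95 threshold, and the toy two-dimensional vectors are assumptions for illustration; the talk does not say which grouping procedure was used.

```python
import math


def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def average_vector(vectors):
    """Component-wise mean of a non-empty list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def group_concepts(centroids, threshold):
    """Greedy grouping: a concept joins the first group whose seed centroid
    is at least `threshold` similar; otherwise it starts a new group."""
    groups = []  # list of (seed centroid, member names)
    for name, centroid in centroids.items():
        for seed, members in groups:
            if cosine_similarity(seed, centroid) >= threshold:
                members.append(name)
                break
        else:
            groups.append((centroid, [name]))
    return [members for _, members in groups]


# Toy member vectors per concept (hypothetical values).
concept_vectors = {
    "cat": [[0.9, 0.1], [0.7, 0.3]],
    "tiger": [[0.8, 0.3], [0.9, 0.2]],
    "bird": [[0.1, 0.9], [0.2, 0.8]],
}
centroids = {name: average_vector(vs) for name, vs in concept_vectors.items()}
groups = group_concepts(centroids, threshold=0.95)
print(groups)
```

Raising the threshold produces more, smaller groups; lowering it merges more concepts.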

Page 50: Chat bot made by the chainer

System

There are still many concepts (20000 concepts)

Page 51: Chat bot made by the chainer

System

Choose the Topic

Choose the concepts that have more than 1000 words. They are easy to match against phrases.

Cat: Black Cat, White Cat, …

Dog: Shiba, Tosa, …

Bird: swan, duck, …

Koala: koala

Choosing the concept

Page 52: Chat bot made by the chainer

System

76 concepts

(Note: I didn't use all of the Wikipedia Entity Vectors)

Page 53: Chat bot made by the chainer

System

Choose the Topic

How the dialogue is chosen

Choose the concept by the word match rate

Input: "Where can I buy cute clothes?"

Boyfriend: cool, nice guy, …

Younger sister: cute, clothes, …

Father: money, gentle, …

Calculate the word match rate
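
One way to read "word match rate" is the fraction of the words in the utterance that appear in a topic's word list. The sketch below assumes that definition, whitespace tokenization, and hypothetical names; Japanese input would first need a morphological tokenizer.

```python
def word_match_rate(utterance_words, topic_vocabulary):
    """Fraction of utterance words found in the topic's vocabulary."""
    if not utterance_words:
        return 0.0
    hits = sum(1 for word in utterance_words if word in topic_vocabulary)
    return hits / len(utterance_words)


def choose_topic(utterance, topic_words):
    """Return the topic with the highest match rate, or None if nothing matches."""
    words = utterance.lower().split()
    best_topic, best_rate = None, 0.0
    for topic, vocabulary in topic_words.items():
        rate = word_match_rate(words, vocabulary)
        if rate > best_rate:
            best_topic, best_rate = topic, rate
    return best_topic


# Word lists taken from the slide's example.
TOPIC_WORDS = {
    "boyfriend": {"cool", "nice", "guy"},
    "younger sister": {"cute", "clothes"},
    "father": {"money", "gentle"},
}

print(choose_topic("Where can I buy cute clothes ?", TOPIC_WORDS))
```

The chosen topic then decides which dialogue model answers the utterance.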

Page 54: Chat bot made by the chainer

Agenda: Understand Contents, Control Dialogue, Generate the Answer

[Figure: chat bot architecture with "Understand Contents", "Control Dialogue" and "Generate the Answer" highlighted. Components: Question, Choose the Topic (with WordNet and Wikipedia Vector), Neural Network 1 and Neural Network 2, Answer Candidates, Question Answer.]

Page 55: Chat bot made by the chainer

System

All parts are made with neural networks (Note: it might be the best way)

Page 56: Chat bot made by the chainer

System

Why a neural network?

I will explain how to apply neural networks to natural language processing

Page 57: Chat bot made by the chainer

Dialogue Value

Value of the Neural Network

Expression

Continuity

Focus

57

Page 58: Chat bot made by the chainer

Dialogue Value

Value of the Neural Network

Expression

Continuity

Focus

58

Page 59: Chat bot made by the chainer

System

Mapping natural language to a vector space using bag of words (prepare a dictionary and count the words that appear in the dictionary)

Expression: Low ← Word, Phrase, Sentence → High

Dictionary: I, show, am, me, your, you, …, when, are

I:     1, 0, 0, 0, 0, 0, …, 0, 0

am:    0, 0, 1, 0, 0, 0, …, 0, 0

Shota: 0, 0, 0, 0, 0, 0, …, 0, 0

Data: it is rare to use all of it, but the dictionary can hold over a million words
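
A minimal sketch of the bag-of-words counting on this slide. The function name is hypothetical, and the dictionary is the visible part of the slide's example with the elided entries left out.

```python
def bag_of_words(tokens, vocabulary):
    """Count how often each dictionary word occurs in the token list.
    Tokens that are not in the dictionary are ignored."""
    index = {word: position for position, word in enumerate(vocabulary)}
    counts = [0] * len(vocabulary)
    for token in tokens:
        position = index.get(token)
        if position is not None:
            counts[position] += 1
    return counts


vocabulary = ["I", "show", "am", "me", "your", "you", "when", "are"]
sentence_vector = bag_of_words(["I", "am", "Shota"], vocabulary)
print(sentence_vector)
```

"Shota" is not in the dictionary, so it contributes nothing, which matches the all-zero row on the slide.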

Page 60: Chat bot made by the chainer

System

It only considers words.

Page 61: Chat bot made by the chainer

System

Deep learning is an efficient method for learning high-quality distributed vector representations that capture a large number of precise syntactic and semantic word relationships.

Expression: Low ← Word, Phrase, Sentence → High

Distributed representations of words in a vector space, learned from data by deep learning:

I:     0.5, 0.0, 1.0, 1.0, 0.3, 0.0

am:    0.5, 0.0, 1.0, 1.0, 0.0, 0.0

Shota: 0.5, 0.0, 1.0, 0.5, 0.3, 0.0

Page 62: Chat bot made by the chainer

Dialogue Value

Value of the Neural Network

Expression

Continuity

Focus

62

Page 63: Chat bot made by the chainer

System

太郎 さん こんにちは ("Taro", "-san", "hello")

The order of words in a phrase matters for continuity. A recurrent neural network can take this continuity into account.

Page 64: Chat bot made by the chainer

Dialogue Value

Value of the Neural Network

Expression

Continuity

Focus

64

Page 65: Chat bot made by the chainer

System

太郎 さん こんにちは ("Taro", "-san", "hello")

Focus matters for picking out the important part of a phrase. An attention model (a neural network) considers which words to focus on.
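
To make "which words to focus on" concrete, here is a minimal sketch of dot-product attention in plain Python. The slides do not state which scoring function the bot uses, so the dot-product score, the function names, and the toy state vectors are all assumptions.

```python
import math


def softmax(scores):
    """Turn arbitrary scores into positive weights that sum to 1."""
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def attend(decoder_state, encoder_states):
    """Score each encoder state against the decoder state, then return
    the attention weights and the weighted sum (context vector)."""
    scores = [sum(d * e for d, e in zip(decoder_state, state))
              for state in encoder_states]
    weights = softmax(scores)
    size = len(decoder_state)
    context = [sum(w * state[i] for w, state in zip(weights, encoder_states))
               for i in range(size)]
    return weights, context


# Hypothetical encoder states for 太郎, さん, こんにちは.
encoder_states = [[1.0, 0.0], [0.0, 0.1], [0.0, 2.0]]
decoder_state = [0.0, 1.0]
weights, context = attend(decoder_state, encoder_states)
print([round(w, 3) for w in weights])
```

With these toy values most of the weight lands on the third word, so the context vector is dominated by it.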

Page 66: Chat bot made by the chainer

System

Value of the Neural Network

Expression

Continuity

Focus

To a boyfriend: "Which is more important to you, me or your job?"

To a younger sister: "Please tell me where you bought your clothes."

To a father: "Please give me money."

Page 67: Chat bot made by the chainer

System

How to implement a neural network

Page 68: Chat bot made by the chainer

System

This is the dialogue model

[Figure: the input 太郎 さん こんにちは ("Taro", "-san", "hello") goes into the Encoder; the Decoder outputs こんにちは ("hello") followed by <EOS>.]

Page 69: Chat bot made by the chainer

System

Mapping the phrases to the neural network's space.

The middle layer expresses this space.

[Figure: each word of 太郎 さん こんにちは enters the network as a one-hot vector, for example 太郎: 1, さん: 0, こんにちは: 0, ….]

Page 70: Chat bot made by the chainer

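System

Learning continuity from phrases

[Figure: recurrent network over 太郎 さん こんにちは. The hidden layer computed at 太郎 is copied forward (copy the past value) and passed through a transform matrix into the hidden layer for さん; the output layer is a one-hot vector (0 0 0 0 1 … 0) for こんにちは.]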
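
The "copy the past value" idea can be sketched as a simple (Elman-style) recurrent step in plain Python. This is an illustration under assumptions: the talk's model is built with Chainer and may use a different recurrent unit, and the weights below are arbitrary toy values.

```python
import math


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def rnn_step(x, h_prev, w_in, w_rec):
    """New hidden state from the current input and the previous hidden state."""
    from_input = matvec(w_in, x)
    from_past = matvec(w_rec, h_prev)  # the copied past value, transformed
    return [math.tanh(a + b) for a, b in zip(from_input, from_past)]


def encode(inputs, w_in, w_rec, hidden_size):
    """Run the recurrent step over the whole sequence, starting from zeros."""
    hidden = [0.0] * hidden_size
    for x in inputs:
        hidden = rnn_step(x, hidden, w_in, w_rec)
    return hidden


# One-hot inputs for 太郎, さん, こんにちは and toy weights (hypothetical).
sentence = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
w_in = [[0.5, -0.3, 0.8], [0.1, 0.7, -0.2]]
w_rec = [[0.4, 0.0], [0.0, 0.4]]

forward = encode(sentence, w_in, w_rec, hidden_size=2)
backward = encode(list(reversed(sentence)), w_in, w_rec, hidden_size=2)
print(forward, backward)
```

Because each step depends on the previous hidden state, feeding the same words in a different order gives a different final state, which is what lets the network capture continuity.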

Page 71: Chat bot made by the chainer

System

The input sentence is reversed.

The first word is the most important.

太郎 さん こんにちは

Page 72: Chat bot made by the chainer

System

The forward information and the reverse information are combined in the Encoder.

太郎 さん こんにちは

Page 73: Chat bot made by the chainer

System

The Decoder generates the phrase from the combined information.

[Figure: 太郎 さん こんにちは goes into the Encoder; the Decoder outputs こんにちは.]

Page 74: Chat bot made by the chainer

System

Continuity is also taken into account when generating the phrase.

[Figure: 太郎 さん こんにちは goes into the Encoder; the Decoder outputs こんにちは and then <EOS>.]

Page 75: Chat bot made by the chainer

System

The value of this model

Expression

Continuity

Focus

Page 76: Chat bot made by the chainer

Agenda: Question and Answer Function

[Figure: chat bot architecture with the "Question Answer" component highlighted. Components: Question, Choose the Topic (with WordNet and Wikipedia Vector), Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 77: Chat bot made by the chainer

System

How to know whether it is a question or not

Page 78: Chat bot made by the chainer

System

The decision is very simple:

Is there a question mark (?)

If you are interested in detecting questions, I recommend reading the paper below.

Li, Baichuan, et al. "Question identification on twitter." Proceedings of the 20th ACM international conference on Information and knowledge management. ACM, 2011.

[Figure: "Where can I buy cute clothes ?" is sent to Question Answer; "Please tell me where I can find cute clothes" is sent to Neural Network 1 (Understand Contents, Control Dialogue, Generate Answer).]
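
The rule on this slide fits in a few lines. The sketch below assumes that both the ASCII and the full-width question mark should count; the function names and the route labels are hypothetical.

```python
QUESTION_MARKS = ("?", "？")  # ASCII and full-width (assumed to both count)


def is_question(text):
    """Rule-based check: a message is a question if it contains a question mark."""
    return any(mark in text for mark in QUESTION_MARKS)


def route(text):
    """Send questions to the question-answer function, everything else
    to the neural dialogue model (labels are placeholders)."""
    return "question_answer" if is_question(text) else "dialogue_model"


print(route("Where can I buy cute clothes ?"))
print(route("Please tell me where I can find cute clothes"))
```

The second message is a request for the same information but has no question mark, so this rule sends it to the dialogue model; that limitation is why the cited paper on question identification is worth reading.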

Page 79: Chat bot made by the chainer

System

Demonstration!!

https://youtu.be/ulICnU2f2Po

Page 80: Chat bot made by the chainer

Agenda

Self Introduction

Dialogue Value

Character of the Bot

System

Future Plan

[Figure: chat bot architecture. Components: Question, Choose the Topic, Neural Network 1 and Neural Network 2 (each with Understand Contents, Control Dialogue, Generate Answer), Answer Candidates, Question Answer.]

Page 81: Chat bot made by the chainer

Future Plan

Page 82: Chat bot made by the chainer

Future Plan

Prepare enough tests

There is not enough test code

Evaluation

F-measure

Apply the latest Chainer

I hear the Trainer function is good

Rule-based and neural network

NeuralStoryTeller

Add the character
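
For reference, the F-measure mentioned in the plan combines precision and recall. The sketch below computes it from raw counts; what exactly would be counted for the bot (for example, correctly detected questions) is not stated in the talk, so the counts here are purely hypothetical.

```python
def f_measure(true_positives, false_positives, false_negatives, beta=1.0):
    """F-beta score from raw counts; beta=1 gives the usual F1."""
    predicted = true_positives + false_positives
    actual = true_positives + false_negatives
    if predicted == 0 or actual == 0:
        return 0.0
    precision = true_positives / predicted
    recall = true_positives / actual
    if precision + recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical counts: 8 correct detections, 2 false alarms, 4 misses.
score = f_measure(true_positives=8, false_positives=2, false_negatives=4)
print(round(score, 3))
```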

Page 83: Chat bot made by the chainer

Conclusion

WordNet is a concept dataset

It is possible to find other data that express concepts

Words are mapped to a vector space using Wikipedia Entity Vectors

We make the vector spaces using our own dataset

Hybrid function (neural network and rule-based)

Please search GitHub for "Chainer Slack Twitter"

Please give me a star

I prepared a Docker container

Please search for "Docker hub Chainer-Slack-Twitter-Dialogue"

Page 84: Chat bot made by the chainer

Conclusion

We’re Hiring!!

https://www.theport.jp/recruit/information/

Page 85: Chat bot made by the chainer

References

• Chainerで学習した対話用のボットをSlackで使用+Twitterから学習データを取得してファインチューニン (Using a dialogue bot trained with Chainer in Slack, plus fine-tuning with training data from Twitter): http://qiita.com/GushiSnow/items/79ca7deeb976f50126d7

• WordNet: http://nlpwww.nict.go.jp/wn-ja/

• 日本語 Wikipedia エンティティベクトル (Japanese Wikipedia Entity Vector): http://www.cl.ecei.tohoku.ac.jp/~m-suzuki/jawiki_vector/

• PAKUTASO: https://www.pakutaso.com/

• Luong, Minh-Thang, Hieu Pham, and Christopher D. Manning. "Effective approaches to attention-based neural machine translation." arXiv preprint arXiv:1508.04025 (2015).

• Rush, Alexander M., Sumit Chopra, and Jason Weston. "A neural attention model for abstractive sentence summarization." arXiv preprint arXiv:1509.00685 (2015).

• Tech Circle #15 Possibility Of BOT: http://www.slideshare.net/takahirokubo7792/tech-circle-15-possibility-of-bot

• Generating Stories about Images: https://medium.com/@samim/generating-stories-about-images-d163ba41e4ed#.h80qhbd54

• 二つの文字列の類似度 (Similarity of two strings): http://d.hatena.ne.jp/ktr_skmt/20111214/1323835913

• Li, Baichuan, et al. "Question identification on twitter." Proceedings of the 20th ACM international conference on Information and knowledge management. ACM, 2011.

• 音源:スカイウォーキング (Music: Skywalking): http://dova-s.jp/bgm/download5052.html

• 音源:get into the rhythm (Music: get into the rhythm): http://dova-s.jp/bgm/download5145.html

• 構文解析 (Syntactic parsing): http://qiita.com/laco0416/items/b75dc8689cf4f08b21f6