Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Niket Tandon (MPI Informatics, Saarbruecken), Gerard de Melo (IIIS, Tsinghua Univ), Abir De (IIT Kharagpur), Gerhard Weikum (MPI Informatics, Saarbruecken)


Page 1: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Niket Tandon (MPI Informatics, Saarbruecken), Gerard de Melo (IIIS, Tsinghua Univ)

Abir De (IIT Kharagpur), Gerhard Weikum (MPI Informatics, Saarbruecken)

Page 2: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Legs, person, shoe, mountain, rope..

Page 3: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Legs, person, shoe, mountain, rope..

Rock climbing
Going up a mountain/hill
Going up an elevation

Daytime, outdoor activity
What happens next?

Page 4: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Legs, person, shoe, mountain, rope..

Activity classes: Rock climbing
Activity groupings: Going up a mountain/hill
Activity hierarchy: Going up an elevation

Additional information: Daytime, outdoor activity
Temporal guidance: What happens next?

Page 5: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Activity synset: {Climb up a mountain, Hike up a hill}

Participants: climber, boy, rope
Location: camp, forest, sea shore
Time: daylight, holiday
Visuals: [images]

Previous activity: Get to village
Parent activity: Go up an elevation
Next activity: Drink water
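The frame on this slide can be pictured as a small record type. The following is a minimal sketch, not the authors' schema: the class name `ActivityFrame` and all field names are assumptions chosen for illustration, and only the example values come from the slide.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class ActivityFrame:
    """Illustrative container for one activity synset (field names are assumed)."""
    synset: frozenset                                  # paraphrases of the same activity
    participants: set = field(default_factory=set)
    locations: set = field(default_factory=set)
    times: set = field(default_factory=set)
    parent_activity: Optional[str] = None              # more general activity
    previous_activity: Optional[str] = None            # typically happens before
    next_activity: Optional[str] = None                # typically happens after

# The example frame from the slide.
climb = ActivityFrame(
    synset=frozenset({"climb up a mountain", "hike up a hill"}),
    participants={"climber", "boy", "rope"},
    locations={"camp", "forest", "sea shore"},
    times={"daylight", "holiday"},
    parent_activity="go up an elevation",
    previous_activity="get to village",
    next_activity="drink water",
)
```

Keeping the attribute values as sets makes overlap computations between two frames a one-line set operation.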

Page 6: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Activity commonsense: Related work

Event mining

Encyclopedic KBs:
- Factual, e.g. bornOn
- Entity oriented, e.g. Person
- Many KBs, e.g. Freebase

Page 7: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Activity commonsense: Related work

Event mining

Encyclopedic KBs:
- Factual, e.g. bornOn
- Entity oriented, e.g. Person
- Many KBs, e.g. Freebase

Commonsense KB

Cyc:
- Manual
- Limited size
- No focus on activities

ConceptNet:
- Crowdsourced
- Limited size
- No semantic activity frames

WebChild:
- No focus on activities

Page 8: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Activity commonsense: Related work

Event mining

Encyclopedic KBs:
- Factual, e.g. bornOn
- Entity oriented, e.g. Person
- Many KBs, e.g. Freebase

Commonsense KB

Cyc:
- Manual
- Limited size
- No focus on activities

ConceptNet:
- Crowdsourced
- Limited size
- No semantic activity frames

WebChild:
- No focus on activities

This talk

Semantic activity CSK KB construction

Page 9: Knowlywood: Mining Activity Knowledge from Hollywood Narratives


Activity commonsense is hard:
- People hardly express the obvious: implicit and scarce
- Spread across multiple modalities: text, image, videos
- Non-factual: hence noisy

Page 10: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Align via subtitles with approximate dialogue similarity

Hollywood narratives are easily available and meet the desiderata

Contain events but not activity knowledge

May contain activities but varying granularity and no visuals. No clear scene boundaries.
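The alignment step on this slide (matching script dialogue to subtitles by approximate similarity) can be sketched roughly as follows. This is an illustration under stated assumptions, not the authors' pipeline: the function names, the 0.6 threshold, the sample lines and the use of Python's `difflib.SequenceMatcher` as the similarity measure are all choices made here.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Approximate string similarity in [0, 1], case-insensitive."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def align(script_lines, subtitles, threshold=0.6):
    """Map each script dialogue line to the time span of its best subtitle.

    script_lines: list of (line_id, dialogue_text)
    subtitles:    list of (start_sec, end_sec, subtitle_text)
    Lines whose best match falls below the (assumed) threshold stay unaligned.
    """
    aligned = {}
    for line_id, dialogue in script_lines:
        best = max(subtitles, key=lambda sub: similarity(dialogue, sub[2]))
        if similarity(dialogue, best[2]) >= threshold:
            aligned[line_id] = (best[0], best[1])
    return aligned

# Hypothetical sample data.
script_lines = [
    ("scene1/line1", "We have to reach the summit before dark."),
    ("scene2/line1", "Pass me the water bottle."),
    ("scene3/line1", "Run!"),
]
subtitles = [
    (12.0, 14.5, "We've got to reach the summit before dark!"),
    (80.0, 82.0, "Pass me the water bottle"),
    (100.0, 101.0, "Thank you."),
]
alignment = align(script_lines, subtitles)
```

Once a dialogue line carries a time span, the surrounding scene description in the script inherits an approximate position in the video, which is what makes visuals available for the activity.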

Page 11: Knowlywood: Mining Activity Knowledge from Hollywood Narratives


Page 12: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

State-of-the-art WSD customized for phrases
Syntactic and semantic role semantics from VerbNet

Input: the man | began to | shoot | a video

Candidate WordNet senses:
- man: man.1, man.2
- shoot: shoot.1, shoot.4
- video: video.1

Candidate VerbNet senses:
- shoot.vn.1: agent.animate, patient.animate (NP VP NP)
- shoot.vn.3: agent.animate, patient.inanimate (NP VP NP)

Page 13: Knowlywood: Mining Activity Knowledge from Hollywood Narratives


Output Frame

Agent: man.1
Action: shoot.4
Patient: video.1

Page 14: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

x_ij = binary decision variable for word i, mapped to WN sense j

Objective terms:
- IMS prior
- WN prior
- Word, VN match score
- Selectional restriction score

Constraints:
- One VN sense per verb
- WN, VN sense consistency
- Selectional restriction constraints
- Binary decision variables
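The objective formula itself did not survive in this transcript, so the following is only a toy illustration of the idea of joint sense selection under the listed constraints. Everything numeric here is hypothetical (the prior scores, the VerbNet match scores, the WN-to-VN sense links and the animacy table), and exhaustive enumeration stands in for an ILP solver.

```python
from itertools import product

# Hypothetical WordNet candidates with prior scores (stand-ins for IMS / WN priors).
wn_candidates = {
    "man":   {"man.1": 0.7, "man.2": 0.3},
    "shoot": {"shoot.1": 0.6, "shoot.4": 0.4},
    "video": {"video.1": 1.0},
}
# Hypothetical VerbNet candidates: match score, compatible WN verb senses,
# and the selectional restriction on the patient.
vn_candidates = {
    "shoot.vn.1": {"score": 0.5, "wn": {"shoot.1"}, "patient": "animate"},
    "shoot.vn.3": {"score": 0.5, "wn": {"shoot.4"}, "patient": "inanimate"},
}
# Hypothetical animacy of the noun senses.
animacy = {"man.1": "animate", "man.2": "animate", "video.1": "inanimate"}

def best_assignment():
    """Enumerate all sense combinations and keep the best feasible one."""
    best, best_score = None, float("-inf")
    for agent, verb, patient, vn in product(
        wn_candidates["man"], wn_candidates["shoot"],
        wn_candidates["video"], vn_candidates,
    ):
        frame = vn_candidates[vn]
        # Constraint: WN and VN verb senses must be consistent.
        if verb not in frame["wn"]:
            continue
        # Constraint: selectional restrictions on agent and patient.
        if animacy[agent] != "animate" or animacy[patient] != frame["patient"]:
            continue
        score = (wn_candidates["man"][agent] + wn_candidates["shoot"][verb]
                 + wn_candidates["video"][patient] + frame["score"])
        if score > best_score:
            best, best_score = (agent, verb, patient, vn), score
    return best

assignment = best_assignment()
```

With these made-up numbers the verb prior alone would favor shoot.1, but the inanimate patient rules out shoot.vn.1, so the joint choice lands on shoot.4. Picking exactly one VerbNet sense per combination mirrors the "one VN sense per verb" constraint.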

Page 15: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Climb up a mountain
Participants: climber, rope
Location: camp, forest
Time: daylight

Hike up a hill
Participants: climber
Location: sea shore
Time: holiday

Go up an elevation
.. ..

Drink water
.. ..

Similarity: .. + Attribute overlap

Hypernymy: WordNet hypernymy of v_i, v_j and o_i, o_j + Attribute hypernymy

Temporal: Generalized Sequence Pattern mining over statistics with gaps: #(asynset1 precedes asynset2) / (#(asynset1) #(asynset2))
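The temporal statistic above can be illustrated with plain pair counting. This sketch is not full Generalized Sequence Pattern mining; it only counts ordered pairs within a gap window and applies the normalization from the slide. The function name, the `max_gap` parameter and the sample sequences are assumptions.

```python
from collections import Counter

def temporal_scores(sequences, max_gap=1):
    """score(a, b) = #(a precedes b) / (#(a) * #(b)).

    'a precedes b' is counted when b follows a in the same sequence with
    at most max_gap activities in between (an assumed reading of "with gaps").
    """
    single = Counter()
    pair = Counter()
    for seq in sequences:
        single.update(seq)
        for i, a in enumerate(seq):
            for b in seq[i + 1 : i + 2 + max_gap]:
                if a != b:
                    pair[(a, b)] += 1
    return {(a, b): n / (single[a] * single[b]) for (a, b), n in pair.items()}

# Hypothetical activity sequences, one list per narrative.
sequences = [
    ["get to village", "climb up a mountain", "drink water"],
    ["climb up a mountain", "take a photo", "drink water"],
    ["drink water", "get to village"],
]
scores = temporal_scores(sequences, max_gap=1)
```

Dividing by the product of the individual counts keeps very frequent activities from dominating the ranking just because they co-occur with everything.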

Page 16: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Probabilistic soft logic: refining Typeof (T), Similar (S) and Prev (P) edges
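The slide does not list the rules that were used, so the snippet below only illustrates the standard machinery of probabilistic soft logic: truth values in [0, 1], the Lukasiewicz relaxation of conjunction, and a rule's distance to satisfaction. The rule and the three truth values are hypothetical.

```python
def luk_and(a: float, b: float) -> float:
    """Lukasiewicz t-norm: soft conjunction of two truth values in [0, 1]."""
    return max(0.0, a + b - 1.0)

def distance_to_satisfaction(body: float, head: float) -> float:
    """For a rule body -> head, the penalty is how far the body exceeds the head."""
    return max(0.0, body - head)

# Hypothetical rule: Similar(a, b) AND Typeof(b, c) -> Typeof(a, c)
similar_ab = 0.9   # assumed soft truth of Similar(a, b)
typeof_bc = 0.8    # assumed soft truth of Typeof(b, c)
typeof_ac = 0.4    # assumed current soft truth of Typeof(a, c)

body = luk_and(similar_ab, typeof_bc)                # about 0.7
penalty = distance_to_satisfaction(body, typeof_ac)  # about 0.3
```

Inference then adjusts the edge truth values to minimize the weighted sum of such penalties over all grounded rules, which is how weak or inconsistent T, S and P edges get refined.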

Page 17: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Climb up a mountain
ParticipatingAgent: climber, rope
Location: camp, forest
Time: daylight

Hike up a hill
ParticipatingAgent: climber
Location: sea shore
Time: holiday

Go up an elevation
.. ..

Drink water
.. ..

Tie the activity synsets

Break cycles

Resultant: DAG
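The slide does not say how cycles are broken. One simple possibility, shown purely as an assumption, is to keep edges greedily in descending order of confidence and skip any edge that would close a cycle. The function name and the edge weights are hypothetical.

```python
def break_cycles(weighted_edges):
    """Return a subset of (u, v, weight) edges that forms a DAG.

    Edges are considered from highest to lowest weight; an edge u -> v is
    skipped when v already reaches u through the edges kept so far.
    """
    kept = {}  # node -> set of successor nodes

    def reaches(src, dst):
        stack, seen = [src], set()
        while stack:
            node = stack.pop()
            if node == dst:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(kept.get(node, ()))
        return False

    result = []
    for u, v, w in sorted(weighted_edges, key=lambda e: -e[2]):
        if u == v or reaches(v, u):
            continue  # adding u -> v would create a cycle
        kept.setdefault(u, set()).add(v)
        result.append((u, v, w))
    return result

# Hypothetical Prev edges with confidence weights; the last one closes a cycle.
edges = [
    ("get to village", "climb up a mountain", 0.9),
    ("climb up a mountain", "drink water", 0.8),
    ("drink water", "get to village", 0.2),
]
dag_edges = break_cycles(edges)
```

The greedy order means that when a cycle exists, the weakest edge on it is the one that gets dropped.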

Page 18: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Recap

• Defined a new problem of automatic acquisition of semantically refined frames.

• Proposed a joint method that needs no labeled data.


Page 19: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Evaluation

Knowlywood Statistics

Scenes: 1,708,782
Activity synsets: 505,788
Accuracy: 0.85 ± 0.01
URL: bit.ly/knowlywood

#Scenes is the aggregated count over movie scripts, TV serials, sitcoms, novels, and kitchen data.

Evaluation: Manually sampled accuracy over the activity frames.

Page 20: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Evaluation: Baselines

- No direct competitor providing activity frames.

KB Baseline: Our (rule-based) semantic frame structure over the crowdsourced commonsense KB ConceptNet.

Methodology Baseline: A rule-based frame detector over our data and other data, using the open IE system ReVerb.

Page 21: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

KB Baseline

You open your wallet hasNextSubEvent take out money

Normalized domain: concept1 ~ verb [article] noun

Organized and canonicalized the relations as follows:

ConceptNet 5’s relations | We map it to
IsA, InheritsFrom | type
Causes, ReceivesAction, RelatedTo, CapableOf, UsedFor | agent
HasPrerequisite, HasFirst/LastSubevent, HasSubevent, MotivatedByGoal | prev/next
SimilarTo, Synonym | similarTo
AtLocation, LocationOfAction, LocatedNear | location
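The relation mapping in this table amounts to a lookup. A minimal sketch follows, assuming a plain dictionary and a hypothetical helper `map_triple`; "HasFirst/LastSubevent" is expanded into the two ConceptNet relation names it abbreviates, and relations not listed on the slide are dropped.

```python
# Mapping from ConceptNet 5 relations to frame slots, as listed on the slide.
CONCEPTNET_TO_FRAME = {
    "IsA": "type",
    "InheritsFrom": "type",
    "Causes": "agent",
    "ReceivesAction": "agent",
    "RelatedTo": "agent",
    "CapableOf": "agent",
    "UsedFor": "agent",
    "HasPrerequisite": "prev/next",
    "HasFirstSubevent": "prev/next",
    "HasLastSubevent": "prev/next",
    "HasSubevent": "prev/next",
    "MotivatedByGoal": "prev/next",
    "SimilarTo": "similarTo",
    "Synonym": "similarTo",
    "AtLocation": "location",
    "LocationOfAction": "location",
    "LocatedNear": "location",
}

def map_triple(subject, relation, obj):
    """Rewrite a ConceptNet triple into a frame slot, or None if unmapped."""
    slot = CONCEPTNET_TO_FRAME.get(relation)
    if slot is None:
        return None
    return (subject, slot, obj)

# Hypothetical input triple for illustration.
mapped = map_triple("open wallet", "HasSubevent", "take out money")
```

The slide keeps prev and next as one merged slot, so the sketch does the same rather than guessing a direction per relation.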

Page 22: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Methodology Baseline

ReVerb, an open IE tool, extracts SVO triples from text.

- S and O are only surface forms.

- V is not categorized into a relation. We use a Bayesian classifier to estimate the label of V.

The estimates come from MovieClips.com, which provides 30K manually tagged popular movie scenes, e.g. action: singing, prop: violin, setting: theater.


Page 24: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

[Bar chart, scale 0 to 1, over Parent, Participant, Prev, Next, Location and Time; series: Knowlywood, ConceptNet based, Reverb based, Reverb clueweb]

# activities

Knowlywood: ~1 M (high accuracy & high coverage)
ConceptNet based: ~5 K (high accuracy & low coverage)
Reverb based: ~0.3 M (low accuracy & high coverage)
Reverb clueweb: ~0.8 M (low accuracy & high coverage)

Page 25: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Visual alignments

~30,000 images from movies and, additionally, >1 million images via Flickr tag matching:

Match verb-noun pairs from Knowlywood, such as ride bicycle

Flickr tags: riding, road, bicycle ..

Knowlywood frame: ride a bicycle
participant: man, boy
location: road

Score: Flickr activity vector (road) DOT Knowlywood vector (man, road)
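The dot product between Flickr tags and Knowlywood attributes can be sketched as follows. Binary term weights and a vocabulary built from the frame's attributes are assumptions made here; the slide does not state the weighting scheme.

```python
def tag_vector(tags, vocabulary):
    """Binary bag-of-terms vector over a fixed vocabulary (assumed weighting)."""
    return [1 if term in tags else 0 for term in vocabulary]

def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(u, v))

# Vocabulary taken from the Knowlywood frame for "ride a bicycle".
vocabulary = ["boy", "man", "road"]

flickr_tags = {"riding", "road", "bicycle"}       # tags of one Flickr image
knowlywood_attributes = {"man", "boy", "road"}    # participants and location

flickr_vec = tag_vector(flickr_tags, vocabulary)
knowlywood_vec = tag_vector(knowlywood_attributes, vocabulary)
score = dot(flickr_vec, knowlywood_vec)
```

An image whose tags share more attributes with the frame scores higher, so the score can rank candidate images for an activity.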

Page 26: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

External use case 1: Semantic indexing

Given: participant, location and time
Predict: the activity
Ground truth: Movieclip’s manually specified activity tag.

At least one hit in the top 10 predictions
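The "at least one hit in the top 10" criterion can be computed as below. This is a small sketch with hypothetical function names and made-up sample predictions; it assumes each test scene has a ranked list of predicted activities and one gold activity tag.

```python
def hit_at_k(ranked_predictions, gold, k=10):
    """True if the gold activity appears among the top-k predictions."""
    return gold in ranked_predictions[:k]

def hit_rate(examples, k=10):
    """Fraction of (ranked_predictions, gold) examples with a hit in the top k."""
    hits = sum(hit_at_k(preds, gold, k) for preds, gold in examples)
    return hits / len(examples)

# Hypothetical ranked predictions and gold tags.
examples = [
    (["play violin", "sing", "dance"], "sing"),
    (["ride bicycle", "walk"], "swim"),
]
rate = hit_rate(examples, k=10)
```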


Page 27: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

External use case 2: Movie Scene Search

Method: A generative model encoding that a query holistically matches a scene if the participants and activity fit well with the query.

Page 28: Knowlywood: Mining Activity Knowledge from Hollywood Narratives

Thank you! Browse at bit.ly/webchild

Conclusion