PRAXICON: The Development of a Grounding Resource

Katerina Pastra
Institute for Language & Speech Processing (ILSP/Athena R.C.)
Bellagio Meeting, 6-7 October 2008


Page 2: Overview

• The notion of a PRAXICON
• Grounding Resources (GR)
• Development within the POETICON project
• The COSMOROE approach
• Trends in Language Resources (LR)
• Outlook

Page 3: What will the PRAXICON be?

a) a “lexicon” with grounded lemmas

word + sensorimotor representation
word sense + visual & motoric repr.

And

b) a “lexicon” with conceptual/pragmatic relations between grounded lemmas

WordX REL WordY

Conceptual structures of different levels (from action-object combinations, action-action combinations to …scripts…?)
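A minimal sketch of how the two parts above (grounded lemmas, and "WordX REL WordY" relations between them) might be held in one structure. All class names, field names, the relation label and the numeric values are hypothetical illustrations, not the PRAXICON design.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass(frozen=True)
class GroundedLemma:
    """A word sense paired with sensorimotor representations (hypothetical fields)."""
    word: str
    sense_id: int
    visual_repr: tuple[float, ...] = ()   # placeholder for a visual representation
    motoric_repr: tuple[float, ...] = ()  # placeholder for a motoric representation


@dataclass
class Praxicon:
    """Toy container: grounded lemmas plus 'WordX REL WordY' triples."""
    lemmas: set[GroundedLemma] = field(default_factory=set)
    relations: set[tuple[GroundedLemma, str, GroundedLemma]] = field(default_factory=set)

    def relate(self, x: GroundedLemma, rel: str, y: GroundedLemma) -> None:
        # Register both lemmas and store the directed relation between them.
        self.lemmas.update((x, y))
        self.relations.add((x, rel, y))

    def related_to(self, x: GroundedLemma) -> list[tuple[str, GroundedLemma]]:
        # All (relation, target) pairs whose source is x, in a stable order.
        pairs = [(rel, y) for (a, rel, y) in self.relations if a == x]
        return sorted(pairs, key=lambda p: (p[0], p[1].word, p[1].sense_id))


# Invented example of an action-object combination.
grasp = GroundedLemma("grasp", 1, motoric_repr=(0.2, 0.8))
cup = GroundedLemma("cup", 1, visual_repr=(0.5, 0.1, 0.9))
px = Praxicon()
px.relate(grasp, "ACTION_OBJECT", cup)
```

Keying entries on word sense rather than word form lets one word carry several groundings, in line with the "word sense + visual & motoric repr." line above.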

Page 4: What will the PRAXICON be?

Page 5: Why do we need a PRAXICON?

- AV processing
- Human-Computer/Robot interaction

Grounding needed:

- To tie words to sensorimotor experiences (disambiguation)

- To untie sensorimotor experiences from physical specificity (intentionality indication)

Page 6: Other Grounding Resources (?)

“Integration” suggestions:

Word + visual object representation
Word + image-region & visual feature-value vector (Bajcsy & Joshi 78)
(Word + 3D model of object) + conceptual structures (Jackendoff 83)

Small-scale implementation:

• In AI applications: words + object/action repr. + link to conceptual structures
• Multimedia Thesauri (e.g. Benitez et al. 2000)
• Multimedia Ontologies (e.g. Zinger 2005 – OntoImage)
• Annotated Corpora (collections of labelled images, e.g. Lin et al. 03)

f-v vectors and/or actual images, wireframe models, 2D drawings, motion trajectories etc.

ontologies, domain models, frames etc.

Page 7: Other Grounding Resources (2)

Issues:
- Never done systematically, i.e. there is no grounding resource of any scale beyond ad hoc system development
- Content (blobs or mini-worlds, not always linked to conceptual structures…)
- Methodology (manual association & visual abstraction)
- Scaling, extending or even basic questions related to the development of such a resource have not been posed

Page 8: POETICON Suggestion

Page 9: PRAXICON development

COSMOROE annotated corpus

POETICON Cognitive Experiments

Page 10: Association

- in everyday interaction
- through cognitive categorization

“trolley”

“walk”

What and How?

Naming Strategies

Page 11: PRAXICON experimentation

“the yellow taxi-boats…”

Get associations,

create new entries/expand etc.

COSMOROE annotated corpus

Page 12: COSMOROE Relations

see: Pastra 2008, Multimedia Systems Journal

Page 13: Type-token equivalence

“… helmet for safety...”

Page 14: Metonymy

“The city, of course, is Athens, and it is here that I will begin my exploration of modern Greece.”

The two referents come from the same domain and have the same array of associations; there is no transfer of qualities from one to the other. The two modalities refer to different entities, but the user intends them to be considered semantically equivalent.

Page 15: Essential Exophora

A pragmatic “anaphora” case

“…[pollution has taken its toll] on this..”

Page 16: COSMOROE Annotated Corpus

Annotation particulars:

• TV travel programmes
• 3 hours EL - 1.5 hours EN
• validated
• 0.88 inter-annotator agreement
• tools used: Transcriber (Barras et al. 1998) and ANVIL (Kipp 2004)
• Annotation scheme: multi-faceted; annotation consists of indicating the time offsets of the different modalities and the relation in which they participate
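As an illustration of the annotation scheme described above (time offsets per modality plus the relation the segments take part in), one annotation could be pictured as the record below. This is a sketch only: the class names, field names, modality labels and timings are invented for the example and do not reproduce the actual COSMOROE, Transcriber or ANVIL formats.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """A time-aligned stretch of one modality (hypothetical layout)."""
    modality: str  # e.g. "speech" or "image"; labels are illustrative
    start: float   # time offset in seconds
    end: float     # time offset in seconds
    label: str


@dataclass(frozen=True)
class RelationAnnotation:
    """Segments from different modalities and the relation they participate in."""
    relation: str
    participants: tuple[Segment, ...]

    def span(self) -> tuple[float, float]:
        # Earliest start and latest end over all participating segments.
        starts = [s.start for s in self.participants]
        ends = [s.end for s in self.participants]
        return (min(starts), max(ends))


# Invented timings; the relation name is taken from the relation slides.
spoken = Segment("speech", 12.0, 12.5, "helmet")
shown = Segment("image", 11.5, 14.0, "helmet on rider")
ann = RelationAnnotation("type-token equivalence", (spoken, shown))
```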

Page 17: COSMOROE annotation

Page 18: Annotation By-Products

- manual speech transcriptions
- optical character transcriptions
- acoustic events gold data
- audiovisual topics gold data
- body movements & gestures gold data
- shots gold data
- object and event identification gold data

→ Important for training and evaluation of corresponding technologies
→ Further refinements for specific purposes can be done with less cost and effort (e.g. fully word-level transcription or annotation of gesture phases etc.)

Page 19: PRAXICON experimentation

Use cases for PRAXICON extension mechanism:

A) Input: action X or/and object Y
Output needed: name the action or object

1. similar visual info in PRAXICON entry = LUCKY

1a) PRAXICON word in entry also present in Utterance = REALLY LUCKY

1b) PRAXICON word in entry not present in Utterance (due to e.g. synonymy, metaphor, antithesis, complementarity, independence etc.) => need COSMOROE rel identification mechanism (trained on CM corpus) → new entry creation

2. no similar visual info in PRAXICON

2a) word in utterance exists in PRAXICON with different visual info (different sense) → new entry creation using CM algorithm

2b) word in utterance not present in PRAXICON => need CM algorithm to decide on new entry to be created and its relation to others
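The case analysis above can be sketched as a small decision function. This is a simplified illustration under stated assumptions: the visual similarity search is assumed to have run already (its result is passed in as `similar_entry_word`), the COSMOROE relation identification mechanism and the CM algorithm are not implemented, and case 2a is triggered by any utterance word that already has a PRAXICON entry. The function, enum and parameter names are hypothetical.

```python
from __future__ import annotations

from enum import Enum


class Outcome(Enum):
    REUSE_ENTRY = "1a"             # similar visual info, word also in utterance
    NEW_ENTRY_VIA_RELATION = "1b"  # similar visual info, word not in utterance
    NEW_SENSE_ENTRY = "2a"         # no similar visual info, utterance word known
    NEW_WORD_ENTRY = "2b"          # no similar visual info, utterance word unknown


def classify_case(
    similar_entry_word: str | None,
    utterance_words: set[str],
    praxicon_words: set[str],
) -> Outcome:
    """Map an observed action/object and its accompanying utterance to a use case.

    similar_entry_word: word of the PRAXICON entry whose visual info resembles
        the input, or None if no such entry was found (assumed precomputed).
    utterance_words: words of the accompanying utterance.
    praxicon_words: words that already have at least one PRAXICON entry.
    """
    if similar_entry_word is not None:
        if similar_entry_word in utterance_words:
            return Outcome.REUSE_ENTRY
        return Outcome.NEW_ENTRY_VIA_RELATION
    if utterance_words & praxicon_words:
        return Outcome.NEW_SENSE_ENTRY
    return Outcome.NEW_WORD_ENTRY
```

For instance, `classify_case("trolley", {"the", "trolley"}, {"trolley", "walk"})` falls under case 1a, while `classify_case(None, {"taxi-boats"}, {"walk"})` falls under case 2b.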

Page 20: Questions

1) What will grounded lemmas be like? 1-word? Multi-word? Word-centric? Sensorimotor repr.-centric?

2) How will they be organised?

3) How specific or general should they be?

4) What kind of relations between entries/lemmas?

Page 21: Questions

What information should be included?
- Morphological info? (inflection, POS)
- Syntactic info? (subcategorisation info)
- Syntactico-semantic info? (thematic roles, selectional restrictions)
- Morpho-semantic info? (derivational links)
- Lexical semantic relations (synonymy, antonymy, meronymy…)
- Conceptual relations (time e.g. temporal inclusion, manner e.g. troponymy, causation…)?
- Will it include facts?

Page 22: Turning to Language Resources

a) Can we tune one to get it grounded?
b) Can we interface one with the PRAXICON at some conceptual level?
c) Can we use one to develop a mechanism for extending the PRAXICON?

Page 23: Trends

• most resources get extended so that they cover more types of conceptual information/relations, going also down to the level of specific instances and facts
• the resources get mapped to each other for greater usability
• there is a constant search for automatic or semi-automatic mechanisms for extending the coverage of the resources, and
• there is a growing development of the resources in different languages, all mapped to each other

Page 24: Outlook

→ AI Quest for primitive concepts or features – feature bundles to describe the world in a universal way

→ categorisation and story-telling to organise and talk about the world

But why not use the sensorimotor experience itself?

Based on LR lessons, implement this new perspective

Page 25: We envision PRAXICON …

(A) To be a sensorimotor-centric resource
Entries beyond lexicalization, e.g. different visual representation of toe and finger entries in the PRAXICON, though single lexicalization in some languages

This implies:
- Different organisation than currently available in LRs
- Greater granularity in concept analysis (meaning decomposition down to the level at which inferences are minimal)

(B) To be a resource that goes beyond the readily lexicalised conceptual level, to the one at which inferences come into play (i.e. the meeting point of symbolic and sensorimotor representations)

(C) To let sensorimotor experience deal with the specifics of everyday interaction, & let language do what it does best: abstraction + interpretation