Spatial Language in Human-Robot Interaction Thora Tenbrink Project I1-OntoSpace.


Spatial Language in Human-Robot Interaction

Thora Tenbrink

Project I1-OntoSpace

Structure

● Project Background
  ● Motivation
  ● General Research Questions
● Overviews
  ● Variation in object localisation (OL) tasks
  ● Variation in discoursal factors
  ● Range of variability in users’ utterances
● Analysis of projective terms
  ● Approach
  ● Default instructions and departures from that standard

Motivation

● Situated human-robot interaction:
  ● human instructs robot using natural language
  ● goals: natural, successful, efficient interaction
● Difficulties:
  ● different perceptual systems of human and robot: necessity of mediation
  ● discourse situation influences linguistic choices
  ● determining factors as yet unknown

Reference systems (cf. Levinson 1996)

● Intrinsic:
  ● Localisation of object by reference to the intrinsic properties of another entity, e.g. speaker’s front: “The ball is in front of me”
● Relative:
  ● Presence of relatum required: “The ball is in front of the table”
  ● Group-based reference if one of several similar objects is referred to: “The leftmost ball”
● Absolute:
  ● the earth’s cardinal directions (north, south)

Perspectives

● Speaker-centered:
  ● Intrinsic: “The ball is in front of me”
  ● Relative: “From my point of view, the ball is to the right of the table”
  ● Absolute: “to the north of me”
● Listener-centered:
  ● Intrinsic: “The ball is in front of you”
  ● Relative: “From your point of view, the ball is to the right of the table”
  ● Absolute: “to the north of you”
● Third-party point of view:
  ● Intrinsic: “The ball is in front of Peter”
  ● Relative: “Viewed from the church’s entrance, there is a bookshop on the right”
  ● Absolute: “north of Bremen”

Interpreting spatial language in context

● 9 kinds of underlying reference systems possible → great potential for misunderstandings
  ● e.g. “to the left”: misinterpreting the perspective yields the direct opposite
● Our approach: Elicitation of data in experimental settings
  ● Analysis of linguistic choices in relation to situational parameters (spatial settings and tasks; robot appearance and behavior)
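The count of nine follows from crossing the three reference systems with the three perspectives listed on the previous slides. The following is a minimal Python sketch of that arithmetic, not part of the project's software: the names `REFERENCE_SYSTEMS`, `PERSPECTIVES`, and `left_direction`, the clockwise-degrees convention, and the face-to-face setup are all assumptions made for illustration. It also shows why misreading the perspective of “to the left” gives the opposite direction when speaker and listener face each other.

```python
from itertools import product

REFERENCE_SYSTEMS = ("intrinsic", "relative", "absolute")
PERSPECTIVES = ("speaker", "listener", "third party")

# Every (reference system, perspective) pairing: 3 x 3 = 9 candidates.
combinations = list(product(REFERENCE_SYSTEMS, PERSPECTIVES))


def left_direction(heading_deg):
    """Bearing of 'left' for an observer facing heading_deg.

    Assumption for illustration: headings are in degrees and increase
    clockwise, so 'left' is 90 degrees counter-clockwise of the heading.
    """
    return (heading_deg - 90) % 360


# Assumed setup: the speaker faces 0 degrees, the listener faces the
# speaker, i.e. 180 degrees.
speaker_left = left_direction(0)
listener_left = left_direction(180)

print(len(combinations))
# Angular difference between the two readings of "to the left".
print((speaker_left - listener_left) % 360)
```

With other relative orientations of speaker and listener the two readings differ by less than 180 degrees, so the "direct opposite" case is the face-to-face one.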

Robots: Pioneer & Aibo

One scenario

[Figure: layout of one scenario, showing the desk, test person, camera, barrel, and Pioneer robot]

Various configurations of similar and different objects

Various points of view

No list of commands

Written and spoken mode

Experimental research questions

● Influence of scenario variables on natural language instructions
  ● What perspective is used?
  ● How are the locations of objects referred to?
    ● Group-based reference / intrinsic reference / distance / counting
    ● Landmarks / figures / rows
  ● Amount / sufficiency of information conveyed in instruction
  ● Users’ hypotheses about robot functionalities
● Effects of dialogue history
  ● Robot output
  ● Success vs. failure

Situational & discoursal variability

● Variation in dialogic / interpersonal factors
  ● robot: kind / output / functionality
  ● mode: written / spoken
  ● experimental instruction: pointing / reference to goal
● Variation in object localisation tasks
  ● number of objects
  ● positioning of objects
  ● robot’s view direction
  ● availability of unambiguous non-projective terms for reference

Reference systems

● Overlapping reference systems: intrinsic regions roughly correspond to group-based regions

● Differing reference systems: intrinsic regions do not overlap with group-based regions

Variation in object localisation tasks I

LANDMARK
  without-landmark       24   63.2%
  with-landmark          14   36.8%
------------------------------------------------------
RELATION-GOAL-
  nearest-to-landmark     7   18.4%
  closer-to-other-obj     3    7.9%
  equal-to-others         4   10.5%
------------------------------------------------------
REFERENCE-SYST
  overlap-intrinsic-g    15   39.5%
  diff-intrinsic-grou    23   60.5%
------------------------------------------------------
PERSPECTIVE
  differ-robot-user      37   97.4%
  overlap-robot-user      1    2.6%
------------------------------------------------------
ROBOTS-VIEW-DI
  towards-objects        17   44.7%
  towards-user           11   28.9%
  both                    6   15.8%
  none                    4   10.5%

Variation in object localisation tasks II

OBJECTS-PRESEN
  one                     1    2.6%
  two                    10   26.3%
  three                  20   52.6%
  four                    7   18.4%
------------------------------------------------------
OBJECTS-INTRIN
  max-one-per-section    20   52.6%
  more-leftright          8   21.1%
  more-frontback         10   26.3%
------------------------------------------------------
PROTOTYPICAL-I
  no-prototypical        30   78.9%
  left                    4   10.5%
  right                   1    2.6%
  front                   3    7.9%

Variation in object localisation tasks III

OBJECTS-BEHIND
  all-visible            35   92.1%
  one-behind              3    7.9%
------------------------------------------------------
GROUP
  one-clear-group        32   84.2%
  no-group                3    7.9%
  group-plus              3    7.9%
------------------------------------------------------
NON-PROJECTIVE
  not-obvious            25   65.8%
  available              13   34.2%
------------------------------------------------------
AVAILABLE-TYPE
  middle                  9   23.7%
  class-name              4   10.5%
------------------------------------------------------
MIDDLE-OF
  same-class              6   15.8%
  superordinated-class    3    7.9%

Range of variability in users’ utterances

● Classification of levels:
  ● not all utterances are instructions: “wie heißt du?” (“what is your name?”)
  ● not all instructions are goal-based: “vorwärts” (“forwards”)
  ● not all goal-based instructions are projective: “fahr zur zweiten Kiste” (“drive to the second box”)
● Linguistic variability on different levels:
  ● Syntax and politeness (in instructions): simple vs. complex imperatives: “du sollst zur Kiste fahren” (“you are to drive to the box”); presence vs. absence of politeness forms: “bitte” (“please”)
  ● contracted or full prepositions (in goal-based instructions): “zur Kiste” vs. “zu der Kiste” (both “to the box”)

Analysis of projective instructions

● Reference to the goal object via a projective spatial expression
  ● “zur Kiste links” (“to the box on the left”), “zur linken Kiste” (“to the left box”), “zur Kiste vor dir” (“to the box in front of you”)
● Variability on three levels:
  ● Modification of projective term
    ○ with unmodified terms: syntactic variation and variation between axes
  ● Complexity
    ○ with single terms: analysis wrt group-based, intrinsic, relative, or indeterminate reference systems
  ● Perspective

Assumptions

● No direct correspondence between reference systems and syntactic variation
  ● “Kiste links” (“box on the left”) could be either intrinsic or group-based
  ● “linke Kiste” (“left box”): sometimes assumed to be only group-based
  ● Which expressions are preferred (in which situation) for which kind of reference system?
● Applicability regions of intrinsic reference system larger than previously assumed: overlap

Computational models: Basis for coding of reference systems

Method & present status

● Qualitative analysis supported by statistical facts
  ● Looking for preferences and tendencies, dependencies and determining factors
● All projective “China” HRI instructions have been coded according to scheme
  ● extraction of relevant instructions via XML
  ● quick schematic coding with statistical results: Systemic Coder
● Correlations between OL factors and instructions
  ● Looking for reasons for statistical results: where there is systematic variation, there must be a reason
    ○ situational (variability in OL tasks), discoursal, or individual
● Each variable in which OLs differ has been examined separately (only “China” files)

Results: All files (1)

MODIFICATION
  unmodified            346   92.0%
  modified               30    8.0%
------------------------------------------------------
PROJECTIVE-TER
  adjective             281   74.7%
  adverb                 57   15.2%
  preposition             8    2.1%
------------------------------------------------------
AXIS
  left-right            315   83.8%
  front-back             31    8.2%
------------------------------------------------------
COMPLEXITY
  single-ref-sys        363   96.5%
  combined-terms         13    3.5%

Results: All files (2)

REFERENCE-SYST
  group-based            58   15.4%
  intrinsic              42   11.2%
  indeterminate         239   63.6%
  relative               24    6.4%
------------------------------------------------------
RELATUM
  implicit              329   87.5%
  explicit               34    9.0%
------------------------------------------------------
PERSPECTIVE
  robot                 361   96.0%
  user                   15    4.0%
------------------------------------------------------
GIVEN-PERSPECT
  perspective-implicit  376  100.0%
  perspective-explicit    0    0.0%
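When reading the two result tables, note that the percentages appear to be computed against the overall total of 376 coded instructions, even for subsystems whose counts sum to less (for example, the three PROJECTIVE-TER counts sum to 346, the number of unmodified terms, and the four REFERENCE-SYST counts sum to 363, the number of single-ref-sys instructions). The Python sketch below reproduces the arithmetic from the counts on the slides; the names `TOTAL`, `counts`, and `percentages` are illustrative and are not taken from the Systemic Coder.

```python
# Overall number of coded projective instructions (346 unmodified + 30 modified).
TOTAL = 376

# Counts copied from the "Results: All files" slides.
counts = {
    "MODIFICATION": {"unmodified": 346, "modified": 30},
    "PROJECTIVE-TER": {"adjective": 281, "adverb": 57, "preposition": 8},
    "REFERENCE-SYST": {
        "group-based": 58,
        "intrinsic": 42,
        "indeterminate": 239,
        "relative": 24,
    },
}


def percentages(system, total=TOTAL):
    """Share of each feature relative to the overall total, in percent."""
    return {feature: round(100 * n / total, 1) for feature, n in system.items()}


for name, system in counts.items():
    print(name, percentages(system), "subtotal:", sum(system.values()))
```

Because the denominators are shared, the percentages within a subsystem such as REFERENCE-SYST do not add up to 100%.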

Default projective instruction

● “Fahre zur rechten Kiste” (“Drive to the right box”)
  ● Features: adjective; no modifications of projective term; left-right axis; indeterminate reference system; implicit relatum; robot’s perspective (implicit)
● Linguistic variation: “Kiste”, “Karton”, “Hindernis”, “Box”, “Kubus”, “Objekt” (“box”, “carton”, “obstacle”, “box”, “cube”, “object”) etc. for reference; abbreviated or full verb or verb omission (“Fahre” or “fahr” or nothing); contracted or full preposition (“zur” or “zu der”); politeness forms (“bitte”; only exceptionally).
● Information structure unmarked: Process (movement) is Given (unchanged throughout experiment), Participant (goal object) is New. Given part can be left out: “zur rechten Kiste” (omission of verb)
● Departure from this default must have a reason
  ● “fahre zu der Kiste, die rechts aussen steht vor dir” (“drive to the box that stands on the far right in front of you”)
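The default instruction can be read as a bundle of feature values, with every other instruction described by how it departs from that bundle. The sketch below is one hypothetical way to represent this in Python; the class `ProjectiveCoding`, its field names, and the function `departures` are assumptions for illustration and do not reproduce the project's actual coding scheme. The example coding of “zur Kiste vor dir” is likewise an illustrative reading, not a coding taken from the data.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ProjectiveCoding:
    """Hypothetical feature bundle; defaults mirror the default instruction."""
    word_class: str = "adjective"
    modification: str = "unmodified"
    axis: str = "left-right"
    reference_system: str = "indeterminate"
    relatum: str = "implicit"
    perspective: str = "robot"
    perspective_given: str = "implicit"


# "Fahre zur rechten Kiste": all features at their default values.
DEFAULT = ProjectiveCoding()


def departures(coding):
    """Return the features in which a coding differs from the default."""
    return {
        f.name: getattr(coding, f.name)
        for f in fields(coding)
        if getattr(coding, f.name) != getattr(DEFAULT, f.name)
    }


# Illustrative coding of "zur Kiste vor dir" ("to the box in front of you").
vor_dir = ProjectiveCoding(
    word_class="preposition",
    axis="front-back",
    reference_system="intrinsic",
    relatum="explicit",
)

print(departures(vor_dir))
```

Listing departures rather than full codings keeps the focus on the question the slide raises: which feature changed, and what in the situation or discourse might explain it.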

Some specific results: Influence of OL variables

● Presence of landmark
  ● 17.6% relative reference systems where goal object is closer to landmark than to any other objects (even though configuration close to standard one); 24.2% adverbs
  ● 12.7% relative ref. sys. with equal distances; 12.7% adverbs
  ● no reference to landmark if other objects closer to goal object; 2.9% adverbs
● Reference systems
  ● without overlap: no indeterminate reference systems. 14.6% of projective terms are modified. 9% prepositions, 12.4% adverbs, 64% adjectives
  ● with overlap: 83.3% of utterances not clearly group-based or intrinsic. 5.9% of utterances are modified. No prepositions, 16% adverbs, 78% adjectives

Robot’s view direction (OL4)

● Expectations where robot does not look at group centroid
  ● users prefer intrinsic reference system
    ○ Result: only slight difference: 5.2 vs. 7.8% intrinsic
  ● users assume a straight line as view direction between robot and group in order to employ a group-based reference system, yielding reference on the left-right axis
    ○ Result (of all group-based or indeterminate instructions):
      box 3: 93.9% left-right; 6.1% front-back
      box 1: 59% left-right; 35.9% front-back!

[Figure: boxes 1, 2, 3 with the Pioneer robot and the barrel]

Explanations

● Front/back axis only possible if users “shift” the robot’s view direction in parallel towards the group centroid
● box 3: “linker Karton” (“left carton”) is true for both intrinsic and group-based reference systems
● box 1: “rechter Karton” (“right carton”) as well as “hinterer Karton” (“rear carton”) is only true for group-based reference
  ● it is not an intrinsic reference system: the box is not behind the robot
  ● two active factors: view direction and overlap of reference system

Non-projective terms in goal-based instructions

● Where possible in a given configuration, users prefer reference by simple use of class name to reference by projective terms.
  ● “Fahre zur Kiste” (“Drive to the box”)
● Where possible in a given configuration, users prefer reference to the “middle object” of the same class to reference via projective terms.
  ● Projective terms used in such a configuration are “second choice” and may differ from standard usage.
  ● Where a superordinate term is needed, users prefer projective terms; but they also use the class name (incorrectly): “Fahre zur mittleren Kiste” (“Drive to the middle box”)

Problems

● Not enough data (analysed so far…)
  ● often differences are not really due to the analysed factor, but can be traced back to something else
  ● some factors were not varied in the China experiment: e.g., there was never more than one object in each intrinsic region
● Confounding of factors
  ● OL3: the only OL where there were only 2 objects (instead of 3), with only one object visible to the robot, and one object behind it, where there was no clear group of objects, and where the robot did not look at either the object(s) or the user
● Only situational factors examined, but discoursal factors equally important
● Users’ preference of non-projective terms in some scenarios leads to untypical usages of projective terms when employed at all

OL3: “Gehe zum linken Karton” (“Go to the left carton”)

● Situational factors:
  ● there is no group of objects
  ● the object is not located in the prototypical “left” region of the robot
  ● robot is not oriented towards object(s)
● Discoursal factors:
  ● previous success with adjectives
  ● not clear whether robot perceives object behind it

Default instructions

● Clear “default” scenarios
  ● Only one object: class name
  ● 3 objects of same class with a clear middle one: “middle”
  ● Groups with clear regions without competing objects: simple projective adjectives
● When exactly do these defaults cease to be useful? Some expected but disproved limits:
  ● no group of objects: adjective
  ● non-prototypical intrinsic direction: no modification
  ● no middle object of same class: middle & class name
  ● presence of landmark: mainly no reference to landmark
  ● robot does not look at group centroid: no preference of intrinsic regions
● But: the more the scenario departs from the “default” configuration, the less the default instructions are used.
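The three clear “default” scenarios can be written down as a small decision rule. The Python sketch below is only an illustration of that ordering under simplifying assumptions: the function `default_reference` and its parameters are hypothetical, “only one object” is approximated as “the goal is the only object of its class”, and the observed tendencies are treated as hard rules, which the final bullet above warns against.

```python
def default_reference(n_same_class, goal_is_middle, clear_regions):
    """Return the expected default way of referring to the goal object.

    n_same_class:   number of objects of the goal's class (goal included)
    goal_is_middle: True if the goal is the clear middle object
    clear_regions:  True if the group has clear regions without competitors
    All three parameters are simplifying assumptions for this sketch.
    """
    if n_same_class == 1:
        return "class name"
    if n_same_class == 3 and goal_is_middle:
        return "middle"
    if clear_regions:
        return "projective adjective"
    # No clear default: the scenario departs from the default configurations.
    return None


print(default_reference(1, False, False))
print(default_reference(3, True, True))
print(default_reference(3, False, True))
print(default_reference(2, False, False))
```

A fuller model would return preferences with weights rather than a single label, since the data show tendencies and not categorical choices.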

Interpreting spatial instructions

● “Generous interpretation” necessary
  ● Higher-ranking condition: Sufficient contrast to competing objects in order to identify goal object
● For humans it is usually clear what is meant
  ● in natural human-human interaction as well as wrt human overhearers in our robot experiments
● We need to specify these human intuitions sufficiently for a robot to reach the same conclusions
  ● Understanding production processes and strategies supports interpretation
  ● More complex scenarios will show more of users’ strategies where default instructions fail
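One way to make “generous interpretation” concrete is to accept a projective term whenever the intended object contrasts sufficiently with its competitors, without requiring that it lies in the prototypical region. The Python sketch below illustrates this for “left” only. It is a hypothetical model, not the project's implementation: the function `interpret_left`, the lateral-offset representation (negative values lie to the robot's left), and the `min_contrast` threshold of 0.2 are all assumptions.

```python
def interpret_left(objects, min_contrast=0.2):
    """Pick the referent of 'left' by contrast with competing objects.

    objects:      dict mapping object name to lateral offset in the robot's
                  frame (assumed convention: negative = left of the robot)
    min_contrast: assumed minimum gap to the next candidate
    Returns the object name, or None if the contrast is insufficient.
    """
    ranked = sorted(objects, key=objects.get)
    if not ranked:
        return None
    if len(ranked) == 1:
        return ranked[0]
    best, runner_up = ranked[0], ranked[1]
    if objects[runner_up] - objects[best] >= min_contrast:
        return best
    return None


# Prototypical case: three boxes in a row in front of the robot.
print(interpret_left({"box1": -1.0, "box2": 0.0, "box3": 1.0}))
# Generous case: both objects are right of the robot, yet one is clearly leftmost.
print(interpret_left({"a": 0.3, "b": 1.5}))
# Insufficient contrast: the two candidates are too close to tell apart.
print(interpret_left({"a": -0.5, "b": -0.45}))
```

The second call corresponds to situations like OL3, where the object is not in the robot's prototypical “left” region but “left” still singles it out.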