Lecture 3: Salience and Relations Reading: Krahmer and Theune (2002), in Van Deemter and Kibble...

Lecture 3: Salience and Relations

Reading: Krahmer and Theune (2002), in Van Deemter and Kibble (Eds.)

“Information Sharing: Reference and Presupposition in Language Generation and Interpretation”, CSLI Publications

Leftovers from yesterday

• D&R’s algorithm embodies the assumption that Content Determination can be done before everything else.

• Alternative account: Lecture 5.

• Some issues:


• Does CD know which properties can be expressed in the language?

• Strong form of the assumption: Realization may take any amount of ‘space’, e.g., ‘The treasure can be found ...’

– ‘… at the peak of the hill’

– ‘… on the hill; the steep one with lots of green grass’

– (even an entire book)


• Properties can be context-dependent and vague (e.g., ‘steep (hill)’).

– In context, the description can ‘nail’ the target

– GRE algorithms can be expanded to do this

– vague descriptions from crisp input

– L now really becomes a list

• These and other extensions: see web page

1. Salience in GRE

• Before talking about ‘proper’ GRE, let’s briefly talk about category choice.

• Let every xi be a referring expression:

....x1....x2.....x3....... ..........x2....x2....x1.. ....x1....x4.....x5.......

• Definite descriptions are one option among many:

Category choice

• Choosing between proper names, pronouns, demonstratives, definite descriptions, etc.

• Theories about category choice are often studied using corpora, via hypothesis testing or learning.

• Salience is a key concept, which takes a different form in different theories (e.g., centering theory)

• Related notions: focus, discourse-old/new, ... (e.g., McCoy & Strube 1999; Henschel, Cheng & Poesio 2000)

• Most research has focussed on possibility of pronominal reference.

• ‘Use pronoun if there is an antecedent in the previous clause, and there is no competing referent’ (Dale and Reiter 1995)

• (K&Th) This undergenerates pronouns

• Example of a more generous account:

Henschel, Cheng & Poesio (2000)

• Choose pronoun if

– antecedent is realized as subject or discourse-old &

– no competing referent is realized as subject or discourse-old &

– no competing referent is ‘amplified’ by appositive or restrictive relative clause

• Otherwise choose definite description
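
The rule above can be sketched as a small decision function. This is only an illustration: the Mention record and its three boolean fields are hypothetical names introduced here, not part of Henschel, Cheng & Poesio's own formulation.

```python
from dataclasses import dataclass

@dataclass
class Mention:
    """Hypothetical record of how a referent was last realized."""
    is_subject: bool = False
    discourse_old: bool = False
    amplified: bool = False  # appositive or restrictive relative clause

def choose_category(antecedent, competitors):
    """Return 'pronoun' or 'definite description' for the next mention."""
    prominent = antecedent.is_subject or antecedent.discourse_old
    rival_prominent = any(c.is_subject or c.discourse_old for c in competitors)
    rival_amplified = any(c.amplified for c in competitors)
    if prominent and not rival_prominent and not rival_amplified:
        return "pronoun"
    return "definite description"

subject = Mention(is_subject=True)
print(choose_category(subject, [Mention()]))                # no prominent rival
print(choose_category(subject, [Mention(amplified=True)]))  # rival is amplified
```

Each of the three conditions is a separate boolean, so a corpus study could switch them on and off individually.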

• We will largely ignore category choice, focussing on generation of definite descriptions.

• So far, we have also ignored salience, arguably at our peril ...

Salience in GRE

• Reiter and Dale (2000) “Building Natural Language Generation Systems”:

Domain = { elements that are salient enough }

• Krahmer and Theune (2002):

1. This disregards different degrees of salience within the Domain

2. This fails to reflect that even the least salient object can be referable

Salience in GRE

1. Suppose D contains many dogs. Still, if my chihuahua is the most salient dog in D then ‘the dog’ refers unambiguously to it.

2. If our chihuahua is the least salient object in D then we might still refer to him (e.g., ‘the small ratty creature that’s trying to hide behind the chair’).

Krahmer and Theune (2002)

• Abandon D&R’s dichotomy.

• Assume: ‘the N’ = ‘the most salient N’.

• Exercise: Get the Incremental Algorithm to say ‘the N’ iff the target is the most salient N.

• Reminder: This is the Incremental Algorithm …

L := Φ
C := Domain
For all P_i ∈ P do:
    If r ∈ [[P_i]] & C ⊈ [[P_i]] then do
        L := L ∪ {P_i}
        C := C ∩ [[P_i]]
        If C = {r} then Return L
Return Failure


• (My version): re-interpret Domain as {x ∈ Domain : Sal(x) ≥ Sal(r)}

L := Φ
C := Domain
For all P_i ∈ P do:
    If r ∈ [[P_i]] & C ⊈ [[P_i]] then do
        L := L ∪ {P_i}
        C := C ∩ [[P_i]]
        If C = {r} then Return L
Return Failure

Example Situation

[Figure: five pieces of furniture with price tags (a £100, b £150, c £100, d £150, e £?), labelled Swedish or Italian and arranged from most salient to least salient.]

SalMax={ac}, SalMid={b}, SalMin={de}

• Type: furniture (abcde), desk (ab), chair (cde)

• Origin: Sweden (ac), Italy (bde)

• Colours: dark (ade), light (bc), brown (a)

• Price: 100 (ac), 150 (bd) , 250 ({})

• Contains: wood ({}), metal (abcde), cotton (d)

Exercise: Describe a; Describe b; Describe d

• a: Domain = {a,c}; description = {desk}

• b: Domain = {a,b,c}; description = {desk, Italy}

• d: Domain = {a,b,c,d,e}; description = {chair, Italy, 150}
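
The exercise can be checked with a minimal sketch of the Incremental Algorithm run over the salience-restricted domain. The numeric salience scores (2, 1, 0) are an assumption standing in for SalMax, SalMid and SalMin, and the preference order simply follows the order in which the properties are listed in the example.

```python
def incremental_algorithm(r, domain, preferred, salience):
    """Return the property names that single out r, or None on failure."""
    # Domain is re-interpreted as the objects at least as salient as r.
    C = {x for x in domain if salience[x] >= salience[r]}
    L = []
    for name, extension in preferred:
        # Add a property only if it is true of r and removes a distractor.
        if r in extension and not C <= extension:
            L.append(name)
            C &= extension
            if C == {r}:
                return L
    return None

# Assumed numeric encoding of SalMax={ac}, SalMid={b}, SalMin={de}.
salience = {"a": 2, "c": 2, "b": 1, "d": 0, "e": 0}
preferred = [
    ("furniture", set("abcde")), ("desk", set("ab")), ("chair", set("cde")),
    ("Sweden", set("ac")), ("Italy", set("bde")),
    ("dark", set("ade")), ("light", set("bc")), ("brown", set("a")),
    ("100", set("ac")), ("150", set("bd")), ("250", set()),
    ("wood", set()), ("metal", set("abcde")), ("cotton", set("d")),
]

for target in "abd":
    print(target, incremental_algorithm(target, set("abcde"), preferred, salience))
```

Only the first line of the function differs from the unrestricted algorithm, which is the point of the re-interpretation: salience is handled entirely by shrinking the initial set of distractors.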

• Krahmer & Theune are noncommittal about how salience is determined

• Compare Praguian/centering account

• Focus on textual salience:

....x1....x2.....x3....... ..........x2....x2....x1.. ....x1....x4.....x5.......

• Salience has a physical component as well (e.g., ‘the door’ = the nearest door)

Pronouns

• K&Th explore how their account may be generalized to generate pronouns:

– ‘it/he/she’ = ‘the object’ (etc.)

– Given their account, this means ‘the most salient object’.

• Predictions look OK, though the account does not seem to allow antecedents beyond the previous clause.

Pronouns

• Example: ‘The white chihuahua1 was chasing the cat2. It1/the cat2 ran fast’.

• K&Th: Perhaps it’s not enough to be slightly more salient than your competitors:

– ‘The white chihuahua1 was chasing the cat2. The chihuahua1/the cat2 ran fast’.

– ‘The white chihuahua1 was eating. It1 was eating a cat’.

• K&Th discuss two other extensions:

– Bridging (e.g., ‘the car …. the motor’)

– Relational properties

Since bridging involves a relation, let us start with relations.

2. Relational properties

• Tuesday’s lecture: Some properties involve a relation with another object, e.g.,

• Origin: Sweden (ac), Italy (bde)

From (a,Sweden)

• Recursion requires reification:

‘ x comes from the country where y lives’

Dale & Haddock (1991)

D&H modelled 2-place relations in GRE

Constraint satisfaction perspective, e.g.,

Constraints: {Orange(a), Orange(b), Table(c), On(a,c)}

Problem: construct sets of atoms that have r as the only value of a designated variable:

{Orange(x), Table(y), On(x,y)}

• D&H accumulate atoms until the target r is identified.

• This can be done in any order (cf., Dale and Reiter 1995)

• D&H choose a ‘greedy’ order: adding atoms that remove maximum number of distractors.
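
The constraint-satisfaction view can be illustrated with a brute-force check of which objects a designated variable may denote. This sketch only tests a given set of atoms; it does not implement D&H's greedy search, and the function name values_of is introduced here for illustration.

```python
from itertools import product

def values_of(var, description, constraints, objects):
    """Objects that `var` can denote in some assignment satisfying every atom."""
    variables = sorted({v for _, *args in description for v in args})
    found = set()
    for combo in product(objects, repeat=len(variables)):
        env = dict(zip(variables, combo))
        # An assignment counts only if every instantiated atom is a known constraint.
        if all((pred, *(env[v] for v in args)) in constraints
               for pred, *args in description):
            found.add(env[var])
    return found

constraints = {("Orange", "a"), ("Orange", "b"), ("Table", "c"), ("On", "a", "c")}
objects = ["a", "b", "c"]

# 'the orange' is ambiguous; 'the orange on the table' picks out a alone.
print(values_of("x", [("Orange", "x")], constraints, objects))
print(values_of("x", [("Orange", "x"), ("Table", "y"), ("On", "x", "y")],
                constraints, objects))
```

A description identifies r exactly when the returned set is {r}.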

Exercise (relations)

• Greediness: you always add an atom that removes the maximum number of distractors.

• Construct an example that shows this approach not to be logically complete.

• Many later accounts, e.g., by Horacek (also Krahmer et al.)

• Krahmer and Theune’s paper contains an alternative model that we will use for expository purposes

– One of the ‘extensions’ in K&Th

– Incremental rather than greedy


• K&Th mix Content Determination with Syntactic Realization and Lexical Choice.

• We will continue to focus on Content Determination.

• We will make some other simplifications:

Simplifications

• Unlike K&Th,

– We forget about salience

– property P instead of <Attribute,Value>

– No indefinite descriptions.

– Nothing about contrastive stress.

(Reminder: NLG is relevant to speech!)


• Preference ordering P contains ordinary properties and relations:

x:chair(x), x:from(x,Italy)

• Properties precede relations.

• In other respects they are treated alike. (Alternative: Mariet Theune’s thesis)

D&R, simplified:

L := Φ
C := D
For all P_i ∈ P do:
    If Useful(P_i, r, P, C) then do
        Update(P_i, L, C)
        If C = {r} then Return L
Return Failure

Changes to incremental algorithm

• This function, Ref, now needs to become recursive.

• Whether a property is Useful may depend on the properties already present in L

Suppose you want to identify x. This makes properties of y irrelevant …. unless L contains a relation between x and y

• This leads to the following changes:

Changes to incremental algorithm

1. Make L an argument of Useful and Ref.

2. Record in L

– the properties that were found useful

– the things of which they were true

3. Useful(P_i, r, P, L) =def Confusables_r(L ∪ {P_i}) ⊂ Confusables_r(L)

REF(r, P, C_r, L):
    For all P_i ∈ P do:
        If Useful(P_i, r, P, L) then do
            Update(P_i, L, C_r)
            If Relation(P_i, r, r') then REF(r', P, C_r', L)
            If C_r = {r} then Return L
    Return Failure

Example

P = < x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:red(x) {h1}, x:brown(x) {h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1}, x:in(x,h2) {d2}, x:in(d2,x) {h2} >

r = d1

Example (steps)

• Step 1: r = d1, P = x:dog(x)

• Step 2: r = d1, P = x:in(x,h1) (Success if h1 can be identified)

• Step 3 (recursion): r = h1, P = x:red(x) (Success)

Example (details)

• Step 1: r = d1

P = dog(x); d1 ∈ [[P]]

Conf_d1(< dog(x)(d1?) >) ⊂ Conf_d1(< >)

(Therefore, P is a useful addition to L)

• Step 2: r = d1

P = in(x,h1); d1 ∈ [[P]]

Conf_d1(< dog(x)(d1?), in(x,h1)(d1?) >) ⊂ Conf_d1(< dog(x)(d1?) >)

• Step 3 (recursion): r = h1

P = red(x); h1 ∈ [[P]]

Conf_h1(< dog(x)(d1?), in(x,h1?)(d1?), red(x)(h1?) >) ⊂ Conf_h1(< dog(x)(d1?), in(x,h1?)(d1?) >)

Example 2

P = <x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1}, x:in(x,h2) {d2}, x:in(d2,x) {h2} >

r = d1

Failure during REF(h1,P,C,L), where

L = < dog(x)(d1?), in(x,h1?)(d1?) >

Example 3

P = < x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1} >

r = d1

Success through mutual identification:

‘The dog in the doghouse’ (D&H)
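
The three examples can be reproduced with a rough sketch of the recursive procedure. Several choices here are assumptions rather than K&Th's formulation: atoms are stored as ground tuples, every object named in L is treated as a variable when computing confusables, a relation already in L is never reused, and failure of a recursive call is propagated without backtracking.

```python
from itertools import product

def confusables(r, L, facts, entities):
    """Objects that could play r's role if every object named in L is a variable."""
    names = sorted({e for atom in L for e in atom[1:]} | {r})
    found = set()
    for combo in product(entities, repeat=len(names)):
        env = dict(zip(names, combo))
        if all((atom[0], *(env[e] for e in atom[1:])) in facts for atom in L):
            found.add(env[r])
    return found

def ref(r, preferred, L, facts, entities):
    """Extend L in place until r is identified; return L, or None on failure."""
    if confusables(r, L, facts, entities) == {r}:
        return L
    for atom in preferred:
        if r not in atom[1:] or atom in L:
            continue  # not about r, or already used (avoids loops)
        before = confusables(r, L, facts, entities)
        # Usefulness test: the other relatum is taken as known at this point.
        after = {x for x in before
                 if (atom[0], *(x if e == r else e for e in atom[1:])) in facts}
        if not after < before:
            continue
        L.append(atom)
        for other in atom[1:]:
            # A relation obliges us to identify its other relatum as well.
            if other != r and ref(other, preferred, L, facts, entities) is None:
                return None
        if confusables(r, L, facts, entities) == {r}:
            return L
    return None

def describe(r, preferred):
    """Run ref on a preference-ordered list of true atoms (properties first)."""
    entities = sorted({e for atom in preferred for e in atom[1:]})
    return ref(r, preferred, [], set(preferred), entities)

dogs = [("dog", "d1"), ("dog", "d2"), ("doghouse", "h1"), ("doghouse", "h2")]
colours = [("red", "h1"), ("brown", "h2")]
example1 = dogs + colours + [("in", "d1", "h1"), ("in", "d2", "h2")]
example2 = dogs + [("in", "d1", "h1"), ("in", "d2", "h2")]
example3 = dogs + [("in", "d1", "h1")]

for P in (example1, example2, example3):
    print(describe("d1", P))
```

Under these assumptions the first run selects dog, in and red (‘the dog in the red doghouse’), the second fails while trying to identify h1, and the third succeeds through mutual identification with dog and in alone.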

Problems with algorithms like this:

• Not very elegant; easy to make errors. (Worse with relations of larger arity.)

• Risk of loops: ‘The orange on the table under the orange on the table, ...’.

• Variant proposals:

– Krahmer et al. (2001): labelled directed graphs

– Gardent (2002): constraint satisfaction

– Etc.

A more general problem

Any preference order will sometimes have strange results.

Exercise: construct example where putting 1-place properties first causes an excessively lengthy description.

Complexity

• Theoretical worst-case complexity of GRE + relations is exponential.

• This algorithm:

– Number of loops is bounded by number of properties (n-ary).

– Whenever a relation is used, another recursive call of Ref may be necessary.

A red thread

• ‘Simple’ GRE produces plausible descriptions at reasonable speed. But,

• when relations are added, fairly awful descriptions are generated slowly.

• This will become worse when other complications are taken into account: more options, more problems (‘embarrassment of riches’)

Combining relations and salience: Bridging

• { trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1) }

• Sal(c1)>Sal(c2), Sal(t1)>Sal(t2):

– ‘The trailer behind the car’

– ‘The trailer’

Bridging (etc.)

• But …what if

{trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1), behind(t2,c2)}

Sal(t2) > Sal(t1), Sal(c2) < Sal(c1)

Can we still say

‘The trailer behind the car’?, ‘The trailer’ ?

The problem:

• Relations involve more than one object

• These objects can have different degrees of salience.

• It is unclear how this should affect the algorithm.

• In fact, this is a very common problem: Different extensions of GRE combine in nontrivial ways.

Combining salience and relations: Paraboni and Van Deemter (2002)

• GRE algorithms tend to be applied to ‘flat’ domains.

• Let’s see what happens in a hierarchically ordered domain.

• Before doing this, let us step back ...

Making references easy

Consider these descriptions:

1. ‘the woman with red hair’ (easy to find)

2. ‘the woman with green eyes’ (difficult to find)

The Incremental Algorithm can deal with this by making Hair-Colour more preferred than Eye-Colour.

Making references easy: the case of hierarchically ordered domains

Now consider these descriptions:

1. ‘... no. 2068 Lincoln Street, Brighton’

2. ‘... no. 2068, Brighton’

Determining the sense is faster with (2);

Determining the reference is faster with (1).

• Hierarchically ordered domains can be used to highlight some interesting issues.

• First issue: Salience can be determined by factors other than discourse structure.

[Figure: tree of a university campus with a Medical complex and an IT complex, each with a building 1 and a building 2. The leaves include the DESCRIPTION, a Copier, and the Copier that is the TARGET.]

Example: To describe TARGET, it’s enough to distinguish it from distractors in building 1

So: Here ‘the copier’ is specific enough

So far, K&Th’s account applies, provided salience is measured adequately:

SAL (tree (parent (d))) = max

SAL (tree (parent (parent (d)))) = max-1

…

Given a starting point d, the focus domain is the smallest subtree that contains d and r.
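
A minimal sketch of these two definitions follows, assuming the hierarchy is given as a child-to-parent map. The node names and the placement of the three copiers are hypothetical, chosen only to resemble the campus example.

```python
def ancestors(node, parent):
    """Chain of containers from parent(node) up to the root."""
    chain = []
    while node in parent:
        node = parent[node]
        chain.append(node)
    return chain

def focus_domain(d, r, parent):
    """Root of the smallest subtree that contains both d and r."""
    above_r = set(ancestors(r, parent))
    for node in ancestors(d, parent):
        if node in above_r:
            return node
    return None

def salience_levels(d, parent):
    """Steps below max for each subtree around d (0 means SAL = max)."""
    return {node: depth for depth, node in enumerate(ancestors(d, parent))}

# Hypothetical campus layout (child -> parent).
parent = {
    "description": "IT/building 1", "copier 1": "IT/building 1",
    "copier 2": "IT/building 2", "copier 3": "Medical/building 1",
    "IT/building 1": "IT complex", "IT/building 2": "IT complex",
    "Medical/building 1": "Medical complex",
    "Medical/building 2": "Medical complex",
    "IT complex": "university campus", "Medical complex": "university campus",
}

print(salience_levels("description", parent))
for copier in ("copier 1", "copier 2", "copier 3"):
    print(copier, "->", focus_domain("description", copier, parent))
```

The further the target lies from the description in the tree, the larger the focus domain and the more distractors the description has to rule out.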

[Figure: university campus tree with three possible targets (TARGET 1, TARGET 2, TARGET 3) and salience labels relative to the DESCRIPTION: building 1 has SAL = max, the IT complex has SAL = max-1, the university campus has SAL = max-2.]

So far, hierarchy does not pose any big problems.

But let’s consider some possible preference orders ….

[Figure: campus tree with building 1, building 2 and the IT complex, showing the DESCRIPTION, a Copier, and the Copier that is the TARGET.]


Exercise: if this is the situation, then which properties will be chosen to identify the TARGET?

1. [Complex preferred over Building]: ‘the copier in the Medical complex’ (-unique). This is not optimally helpful.

2. [Building preferred over Complex]: ‘the copier in building 2’ (-unique). This seems actually infelicitous.

No preference ordering gives accurate results.

Issues

• Issue 1: Salience can be determined by non-textual factors.

Our example: structural ‘distance’ between Description and Target

• Issue 2: Contradicting incrementality, redundancy can be crucial. E.g., ‘the copier in building 2 of the Medical Complex’

Our example: if you can reduce the search space strongly by one extra property then do it! (Experimentally validated.)

Issues

• Issue 3: Mutual identification is not always allowed. E.g., ‘the copier in building 2’.

Our example: D&H’s approach assumes that all referents are highly salient, and all properties/relations are highly transparent.

Ivandré Paraboni’s thesis

• Documents are structured domains

• Generating references to parts of texts or documents. E.g.,

– ‘see figure 3 in section 5’,

– ‘the issues discussed in the Introduction’

• When to generate such references

• How to do it

Back to the issue of complexity

• Salience of objects helps reduce the number of distractors.

• Might properties also be subject to salience (reducing the size of P)?

• What is the role of incrementality in GRE? (Next lecture)

L := Φ
C := Domain
For all P_i ∈ P do:
    If S ⊆ [[P_i]] & C ⊈ [[P_i]] then do
        L := L ∪ {P_i}
        C := C ∩ [[P_i]]
        If C = S then Return L
Return Failure

Next lecture

• Theoretical departure: “What is NLG anyway” (Shieber 1993).

• Another way in which referring expressions can go beyond conjunction of atomic properties: Boolean descriptions (Van Deemter 2002).
