Lecture 3: Salience and Relations
Reading: Krahmer and Theune (2002), in Van Deemter and Kibble (Eds.)
“Information Sharing: Reference and Presupposition in Language Generation and Interpretation”, CSLI Publications
Leftovers from yesterday
• D&R’s algorithm embodies the assumption that Content Determination can be done before everything else.
• Alternative account: Lecture 5.
• Some issues:
• Does CD know which properties can be expressed in the language?
• Strong form of the assumption: Realization may take any amount of ‘space’, e.g., ‘The treasure can be found ...’
– ‘… at the peak of the hill’
– ‘… on the hill; the steep one with lots of green grass’
– (even an entire book)
• Properties can be context-dependent and vague (e.g., ‘steep (hill)’).
– In context, the description can ‘nail’ the target
– GRE algorithms can be expanded to do this
– vague descriptions from crisp input
– L now really becomes a list
• These and other extensions: see web page
1. Salience in GRE
• Before talking about ‘proper’ GRE, let’s briefly talk about category choice.
• Let every xi be a referring expression:
....x1....x2.....x3....... ..........x2....x2....x1.. ....x1....x4.....x5.......
• Definite descriptions are one option among many:
Category choice
• Choosing between proper names, pronouns, demonstratives, definite descriptions, etc.
• Theories about category choice are often studied using corpora, via hypothesis testing or learning.
• Salience is a key concept, which takes a different form in different theories (e.g., centering theory)
• Related notions: focus, discourse-old/new, ... (e.g., McCoy & Strube 1999; Henschel, Cheng & Poesio 2000)
• Most research has focussed on possibility of pronominal reference.
• ‘Use pronoun if there is an antecedent in the previous clause, and there is no competing referent’ (Dale and Reiter 1995)
• (K&Th) This undergenerates pronouns
• Example of a more generous account:
Henschel, Cheng & Poesio (2000)
• Choose pronoun if
– antecedent is realized as subject or discourse-old &
– no competing referent is realized as subject or discourse-old &
– no competing referent is ‘amplified’ by appositive or restrictive relative clause
• Otherwise choose definite description
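The pronoun rule above can be read as a small decision procedure. The following is only a sketch of that reading: the Mention record and its field names are hypothetical, and it assumes that every other mention passed in already counts as a competing referent (e.g., agreement has been checked elsewhere).

```python
from dataclasses import dataclass


@dataclass
class Mention:
    """How a referent was realized in the previous clause (hypothetical record)."""
    referent: str
    is_subject: bool = False
    discourse_old: bool = False
    amplified: bool = False  # appositive or restrictive relative clause


def choose_category(target, previous_clause):
    """Return 'pronoun' or 'definite description' for the target referent."""
    by_referent = {m.referent: m for m in previous_clause}
    antecedent = by_referent.get(target)
    # Condition 1: the antecedent is realized as subject or is discourse-old.
    if antecedent is None or not (antecedent.is_subject or antecedent.discourse_old):
        return "definite description"
    for m in previous_clause:
        if m.referent == target:
            continue
        # Conditions 2 and 3: no competitor is subject, discourse-old or amplified.
        if m.is_subject or m.discourse_old or m.amplified:
            return "definite description"
    return "pronoun"


clause = [Mention("chihuahua", is_subject=True), Mention("cat")]
print(choose_category("chihuahua", clause))  # pronoun
print(choose_category("cat", clause))        # definite description
```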
• We will largely ignore category choice, focussing on generation of definite descriptions.
• So far, we have also ignored salience, arguably at our peril ...
Salience in GRE
• Reiter and Dale (2000) “Building Natural Language Generation Systems”:
Domain = { elements that are salient enough }
• Krahmer and Theune (2002):
1. This disregards different degrees of salience within the Domain
2. This fails to reflect that even the least salient object can be referable
Salience in GRE
1. Suppose D contains many dogs. Still, if my chihuahua is the most salient dog in D then ‘the dog’ refers unambiguously to it.
2. If our chihuahua is the least salient object in D then we might still refer to him (e.g., ‘the small ratty creature that’s trying to hide behind the chair’).
Krahmer and Theune (2002)
• Abandon D&R’s dichotomy.
• Assume: ‘the N’ = ‘the most salient N’.
• Exercise: Get the Incremental Algorithm to say ‘the N’ iff N is the most salient N.
• Reminder: This is the Incremental Algorithm …
L := Φ
C := Domain
For all P ∈ 𝒫 do:
    If r ∈ [[P]] & C ⊄ [[P]] then do
        L := L ∪ {P}
        C := C ∩ [[P]]
        If C = {r} then Return L
Return Failure
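As a reading aid, the Incremental Algorithm can be sketched in a few lines of Python. This is a minimal sketch, not D&R's own code: properties are assumed to be given as (name, extension) pairs in preference order, and the toy domain at the bottom is invented for illustration.

```python
def incremental_algorithm(r, domain, preferred):
    """Return a list of property names that singles out r, or None on failure.

    preferred: list of (name, extension) pairs, most preferred first;
    each extension is the set of objects the property is true of.
    """
    description = []           # L in the pseudocode
    confusables = set(domain)  # C in the pseudocode
    for name, extension in preferred:
        # A property is used if it is true of r and removes some distractor.
        if r in extension and not confusables <= extension:
            description.append(name)
            confusables &= extension
            if confusables == {r}:
                return description
    return None


# Hypothetical toy domain: two dogs and a cat.
domain = {"d1", "d2", "c1"}
preferred = [("dog", {"d1", "d2"}), ("small", {"d1", "c1"})]
print(incremental_algorithm("d1", domain, preferred))  # ['dog', 'small']
print(incremental_algorithm("c1", domain, preferred))  # None
```

Note that the second call fails because no listed property separates c1 from d1; the algorithm never backtracks.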
Krahmer and Theune (2002)
• (My version): re-interpret Domain as
{x ∈ Domain : Sal(x) ≥ Sal(r)}

L := Φ
C := Domain
For all P ∈ 𝒫 do:
    If r ∈ [[P]] & C ⊄ [[P]] then do
        L := L ∪ {P}
        C := C ∩ [[P]]
        If C = {r} then Return L
Return Failure
Example Situation
[Figure: five pieces of furniture, a (£100), b (£150), c (£100), d (£150) and e (£?), labelled Swedish or Italian and arranged from most salient to least salient.]
SalMax={ac}, SalMid={b}, SalMin={de}
• Type: furniture (abcde), desk (ab), chair (cde)
• Origin: Sweden (ac), Italy (bde)
• Colours: dark (ade), light (bc), brown (a)
• Price: 100 (ac), 150 (bd) , 250 ({})
• Contains: wood ({}), metal (abcde), cotton (d)
Exercise: Describe a; Describe b; Describe d
a: Domain = {a,c}; description = {desk}
b: Domain = {a,b,c}; description = {desk, Italy}
d: Domain = {a,b,c,d,e}; description = {chair, Italy, 150}
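The three answers can be checked mechanically. The sketch below restricts the domain to objects at least as salient as the target and then runs the Incremental Algorithm over the property table of the example. Two things are assumptions: the numeric salience values (only their ordering comes from the example) and the flattening of attributes and values into one preference-ordered list.

```python
# Salience levels: only the ordering SalMax > SalMid > SalMin is given.
SALIENCE = {"a": 2, "c": 2, "b": 1, "d": 0, "e": 0}

# Properties in the order Type, Origin, Colours, Price, Contains.
PROPERTIES = [
    ("furniture", set("abcde")), ("desk", set("ab")), ("chair", set("cde")),
    ("Sweden", set("ac")), ("Italy", set("bde")),
    ("dark", set("ade")), ("light", set("bc")), ("brown", set("a")),
    ("100", set("ac")), ("150", set("bd")), ("250", set()),
    ("wood", set()), ("metal", set("abcde")), ("cotton", set("d")),
]


def salient_domain(r):
    """Objects that are at least as salient as the target r."""
    return {x for x in SALIENCE if SALIENCE[x] >= SALIENCE[r]}


def describe(r):
    """Incremental Algorithm over the salience-restricted domain."""
    description = []
    confusables = salient_domain(r)
    for name, extension in PROPERTIES:
        if r in extension and not confusables <= extension:
            description.append(name)
            confusables &= extension
            if confusables == {r}:
                return description
    return None


for target in "abd":
    print(target, sorted(salient_domain(target)), describe(target))
# a ['a', 'c'] ['desk']
# b ['a', 'b', 'c'] ['desk', 'Italy']
# d ['a', 'b', 'c', 'd', 'e'] ['chair', 'Italy', '150']
```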
• Krahmer & Theune are noncommittal about how salience is determined
• Compare Praguian/centering account
• Focus on textual salience:
....x1....x2.....x3....... ..........x2....x2....x1.. ....x1....x4.....x5.......
• Salience has a physical component as well (e.g., ‘the door’ = the nearest door)
Pronouns
• K&Th explore how their account may be generalized to generate pronouns:
– ‘it/he/she’ = ‘the object’ (etc.)
– Given their account, this means ‘the most salient object’.
• Predictions look OK, though it does not seem to allow antecedents beyond previous clause.
• Example: ‘The white chihuahua1 was chasing the cat2. It1/the cat2 ran fast’.
• K&Th: Perhaps it’s not enough to be slightly more salient than your competitors:
– ‘The white chihuahua1 was chasing the cat2. The chihuahua1/the cat2 ran fast’.
– ‘The white chihuahua1 was eating. It1 was eating a cat’.
• K&Th discuss two other extensions:
– Bridging (e.g., ‘the car …. the motor’)
– Relational properties
Since bridging involves a relation, let us start with relations.
2. Relational properties
• Tuesday’s lecture: Some properties involve a relation with another object, e.g.,
• Origin: Sweden (ac), Italy (bde)
From (a,Sweden)
• Recursion requires reification:
‘ x comes from the country where y lives’
Dale & Haddock (1991)
D&H modelled 2-place relations in GRE
Constraint satisfaction perspective, e.g.,
Constraints: {Orange(a), Orange(b), Table(c), On(a,c)}
Problem: construct sets of atoms that have r as the only value of a designated variable:
{Orange(x), Table(y), On(x,y)}
• D&H accumulate atoms until the target r is identified.
• This can be done in any order (cf., Dale and Reiter 1995)
• D&H choose a ‘greedy’ order: adding atoms that remove maximum number of distractors.
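The constraint-satisfaction view can be made concrete by brute force: a set of atoms identifies r if r is the only value the designated variable can take. The sketch below uses the orange/table facts from above; the tuple representation and the function name values_of are illustrative assumptions, and it does not implement D&H's greedy ordering.

```python
from itertools import product

# Facts from the example: two oranges, one table, orange a is on the table.
FACTS = {("Orange", "a"), ("Orange", "b"), ("Table", "c"), ("On", "a", "c")}
ENTITIES = {"a", "b", "c"}


def values_of(variable, atoms):
    """Values the designated variable can take so that all atoms hold.

    Every argument in an atom is treated as a variable (an assumption of
    this sketch); assignments are simply enumerated.
    """
    variables = sorted({term for atom in atoms for term in atom[1:]})
    found = set()
    for combo in product(sorted(ENTITIES), repeat=len(variables)):
        binding = dict(zip(variables, combo))
        ground = [(atom[0], *(binding[t] for t in atom[1:])) for atom in atoms]
        if all(g in FACTS for g in ground):
            found.add(binding[variable])
    return found


print(sorted(values_of("x", [("Orange", "x")])))  # ['a', 'b']
print(sorted(values_of("x", [("Orange", "x"), ("Table", "y"), ("On", "x", "y")])))  # ['a']
```

Accumulating atoms until the result is the singleton {r} is the loop that D&H run, with a greedy choice of which atom to add next.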
Exercise (relations)
• Greediness: you always add an atom that removes the maximum number of distractors.
• Construct an example that shows this approach not to be logically complete.
• Many later accounts, e.g., by Horacek, (also Krahmer et al.)
• Krahmer and Theune’s paper contains an alternative model that we will use for expository purposes
– One of the ‘extensions’ in K&Th
– Incremental rather than greedy
Krahmer and Theune (2002)
• K&Th mix Content Determination with Syntactic Realization and Lexical Choice.
• We will continue to focus on Content Determination.
• We will make some other simplifications:
Simplifications
• Unlike K&Th,
– We forget about salience
– property P instead of <Attribute,Value>
– No indefinite descriptions.
– Nothing about contrastive stress.
(Reminder: NLG is relevant to speech!)
Krahmer and Theune (2002)
• Preference ordering P contains ordinary properties and relations:
x:chair(x), x:from(x,Italy)
• Properties precede relations.
• In other respects they are treated alike. (Alternative: Mariet Theune’s thesis)
D&R, simplified:
L := Φ
C := D
For all P ∈ 𝒫 do:
    If Useful(P, r, 𝒫, C) then do
        Update(P, L, C)
        If C = {r} then Return L
Return Failure
Changes to incremental algorithm
• This function, Ref, now needs to become recursive.
• Whether a property is Useful may depend on the properties already present in L
Suppose you want to identify x. This makes properties of y irrelevant …. unless L contains a relation between x and y
• This leads to the following changes:
1. Make L an argument of Useful and Ref.
2. Record in L
– the properties that were found useful
– the things of which they were true
3. Useful(P, r, 𝒫, L) ⇔def Confusables_r(L ∪ {P}) ⊂ Confusables_r(L)
REF(r, 𝒫, C, L):
For all P ∈ 𝒫 do:
    If Useful(P, r, 𝒫, L) then do
        Update(P, L, C)
        If Relation(P, r, r') then REF(r', 𝒫, C, L)
        If C = {r} then Return L
Return Failure
Example
P = < x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:red(x) {h1}, x:brown(x) {h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1}, x:in(x,h2) {d2}, x:in(d2,x) {h2} >
r = d1
Example (steps)
• Step 1: r = d1, P = x:dog(x)
• Step 2: r = d1, P = x:in(x,h1) (Success if h1 can be identified)
• Step 3 (recursion): r = h1, P = x:red(x) (Success)
Example (details)
• Step 1: r = d1
P = dog(x); d1 ∈ [[P]]
Conf_d1(< dog(x)(d1?) >) ⊂ Conf_d1(< >)
(Therefore, P is a useful addition to L)
• Step 2: r = d1
P = in(x,h1); d1 ∈ [[P]]
Conf_d1(< dog(x)(d1?), in(x,h1)(d1?) >) ⊂ Conf_d1(< dog(x)(d1?) >)
• Step 3 (recursion): r = h1
P = red(x); h1 ∈ [[P]]
Conf_h1(< dog(x)(d1?), in(x,h1?)(d1?), red(x)(h1?) >) ⊂ Conf_h1(< dog(x)(d1?), in(x,h1?)(d1?) >)
Example 2
P = <x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1}, x:in(x,h2) {d2}, x:in(d2,x) {h2} >
r = d1
Failure during REF(h1,P,C,L), where
L = < dog(x)(d1?), in(x,h1?)(d1?) >
Example 3
P = < x:dog(x) {d1,d2}, x:doghouse(x) {h1,h2}, x:in(x,h1) {d1}, x:in(d1,x) {h1} >
r = d1
Success through mutual identification:
‘The dog in the doghouse’ (D&H)
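The three examples can be replayed with a small recursive sketch. It is one possible reading of the simplified algorithm, not K&Th's implementation: facts are tuples, a term starting with "?" is a variable, usefulness of a relation is tested with the other object held fixed (as in Step 2), and the atom is then stored with that object as a variable that must itself be identified. All function names are assumptions.

```python
from itertools import product


def solutions(atoms, facts):
    """Yield each variable binding under which every atom is a fact."""
    entities = sorted({t for f in facts for t in f[1:]})
    variables = sorted({t for a in atoms for t in a[1:] if t.startswith("?")})
    for combo in product(entities, repeat=len(variables)):
        binding = dict(zip(variables, combo))
        ground = [(a[0], *(binding.get(t, t) for t in a[1:])) for a in atoms]
        if all(g in facts for g in ground):
            yield binding


def confusables(r, atoms, facts):
    """Objects that the atoms collected so far cannot tell apart from r."""
    var = "?" + r
    if not any(var in a[1:] for a in atoms):
        return {t for f in facts for t in f[1:]}
    return {b[var] for b in solutions(atoms, facts)}


def ref(r, facts, L=None):
    """Return the list of atoms identifying r, or None on failure."""
    L = [] if L is None else L
    # One-place properties precede relations, as on the slide.
    about_r = sorted((f for f in facts if r in f[1:]), key=lambda f: (len(f), f))
    for fact in about_r:
        fixed = (fact[0], *("?" + e if e == r else e for e in fact[1:]))
        with_vars = (fact[0], *("?" + e for e in fact[1:]))
        if with_vars not in L:
            before = confusables(r, L, facts)
            if confusables(r, L + [fixed], facts) < before:  # Useful
                L.append(with_vars)
                for other in fact[1:]:
                    # A relation triggers a recursive call for the other object.
                    if other != r and ref(other, facts, L) is None:
                        return None
        if confusables(r, L, facts) == {r}:
            return L
    return None


BASE = {("dog", "d1"), ("dog", "d2"), ("doghouse", "h1"), ("doghouse", "h2")}
example1 = BASE | {("red", "h1"), ("brown", "h2"), ("in", "d1", "h1"), ("in", "d2", "h2")}
print(ref("d1", example1))
# [('dog', '?d1'), ('in', '?d1', '?h1'), ('red', '?h1')]
```

Dropping the colour facts reproduces the failure of Example 2, and dropping in(d2,h2) as well reproduces the mutual identification of Example 3. In the latter case the sketch returns only dog and in, leaving the head noun for the doghouse to realization.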
Problems with algorithms like this:
• Not very elegant; easy to make errors. (Worse with relations of larger arity.)
• Risk of loops: ‘The orange on the table under the orange on the table, ...’.
• Variant proposals:
– Krahmer et al. (2001): labelled directed graphs
– Gardent (2002): constraint satisfaction
– Etc.
A more general problem
Any preference order will sometimes have strange results.
Exercise: construct example where putting 1-place properties first causes an excessively lengthy description.
Complexity
• Theoretical worst-case complexity of GRE + relations is exponential.
• This algorithm:
– Number of loops is bounded by number of properties (n-ary).
– Whenever a relation is used, another recursive call of Ref may be necessary.
A red thread
• ‘Simple’ GRE produces plausible descriptions at reasonable speed. But,
• when relations are added, fairly awful descriptions are generated slowly.
• This will become worse when other complications are taken into account: more options, more problems (‘embarrassment of riches’)
Combining relations and salience: Bridging
• { trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1) }
• Sal(c1)>Sal(c2), Sal(t1)>Sal(t2):
– ‘The trailer behind the car’
– ‘The trailer’
Bridging (etc.)
• But …what if
{trailer(t1), trailer(t2), car(c1), car(c2), behind(t1,c1), behind(t2,c2)}
Sal(t2) > Sal(t1), Sal(c2) < Sal(c1)
Can we still say
‘The trailer behind the car’?, ‘The trailer’ ?
The problem:
• Relations involve more than one object
• These objects can have different degrees of salience.
• It is unclear how this should affect the algorithm.
• In fact, this is a very common problem: Different extensions of GRE combine in nontrivial ways.
Combining salience and relations: Paraboni and Van Deemter (2002)
• GRE algorithms tend to be applied to ‘flat’ domains.
• Let’s see what happens in a hierarchically ordered domain.
• Before doing this, let us step back ...
Making references easy
Consider these descriptions:
1. ‘the woman with red hair’ (easy to find)
2. ‘the woman with green eyes’ (difficult to find)
Incremental Algorithm can deal with this by making Hair-Colour more preferred than Eyes-Colour
Making references easy: the case of hierarchically ordered domains
Now consider these descriptions:
1. ‘... no. 2068 Lincoln Street, Brighton’
2. ‘... no. 2068, Brighton’
Determining the sense is faster with (2);
Determining the reference is faster with (1).
• Hierarchically ordered domains can be used to highlight some interesting issues.
• First issue: Salience can be determined by factors other than discourse structure.
[Figure: tree of the university campus. It has two complexes (Medical complex, IT complex), each with a building 1 and a building 2. Leaf labels: DESCRIPTION, Copier = TARGET, and a second Copier.]
Example: To describe TARGET, it’s enough to distinguish it from distractors in building 1
So: Here ‘the copier’ is specific enough
So far, K&Th’s account applies, provided salience is measured adequately:
SAL(tree(parent(d))) = max
SAL(tree(parent(parent(d)))) = max-1
…
Given a starting point d, the focus domain is the smallest subtree that contains d and r.
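The salience scheme and the focus domain can be sketched over a parent map. The layout below is an assumption made for illustration (the description and three targets are placed in specific buildings, and node names such as "IT/building 1" are invented); MAX is an arbitrary number standing for ‘max’.

```python
# Hypothetical campus tree: child -> parent.
PARENT = {
    "Medical complex": "university campus",
    "IT complex": "university campus",
    "Medical/building 1": "Medical complex",
    "Medical/building 2": "Medical complex",
    "IT/building 1": "IT complex",
    "IT/building 2": "IT complex",
    "description": "IT/building 1",
    "target1": "IT/building 1",
    "target2": "IT/building 2",
    "target3": "Medical/building 1",
}
MAX = 10  # arbitrary stand-in for 'max'


def ancestors(node):
    """Nodes above the given node, nearest first."""
    chain = []
    while node in PARENT:
        node = PARENT[node]
        chain.append(node)
    return chain


def salience(d, x):
    """MAX if x lies in tree(parent(d)), MAX-1 in tree(parent(parent(d))), etc."""
    above_x = set(ancestors(x))
    for steps, node in enumerate(ancestors(d)):
        if node in above_x:
            return MAX - steps
    raise ValueError("d and x are not in the same tree")


def focus_domain(d, r):
    """Root of the smallest subtree that contains both d and r."""
    above_r = set(ancestors(r))
    for node in ancestors(d):
        if node in above_r:
            return node
    raise ValueError("d and r are not in the same tree")


for t in ("target1", "target2", "target3"):
    print(t, salience("description", t), focus_domain("description", t))
# target1 10 IT/building 1
# target2 9 IT complex
# target3 8 university campus
```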
[Figure: the same campus tree with salience levels: building 1 has SAL = max, the IT complex SAL = max-1, the university campus SAL = max-2. Leaf labels: DESCRIPTION, TARGET 1, TARGET 2, TARGET 3.]
So far, hierarchy does not pose any big problems.
But let’s consider some possible preference orders ….
[Figure: the same campus tree. Copier = TARGET appears under the Medical complex; DESCRIPTION and a second Copier appear under the IT complex.]
Exercise: if this is the situation, then which properties will be chosen to identify the TARGET?
1. [Complex preferred over Building]: ‘the copier in the Medical complex’ (-unique)
This is not optimally helpful.
2. [Building preferred over Complex]: ‘the copier in building 2’ (-unique)
This seems actually infelicitous.
No preference ordering gives accurate results.
Issues
• Issue 1: Salience can be determined by non-textual factors.
Our example: structural ‘distance’ between Description and Target
• Issue 2: Contradicting incrementality, redundancy can be crucial. E.g., ‘the copier in building 2 of the Medical Complex’
Our example: if you can reduce the search space strongly by one extra property then do it! (Experimentally validated.)
• Issue 3: Mutual identification is not always allowed. E.g., ‘the copier in building 2’.
Our example: D&H’s approach assumes that all referents are highly salient, and all properties/relations are highly transparent.
Ivandré Paraboni’s thesis
• Documents are structured domains
• Generating references to parts of texts or documents. E.g.,
– ‘see figure 3 in section 5’,
– ‘the issues discussed in the Introduction’
• When to generate such references
• How to do it
Back to the issue of complexity
• Salience of objects helps reduce the number of distractors.
• Might properties also be subject to salience (reducing the size of P)?
• What is the role of incrementality in GRE? (Next lecture)
L := Φ
C := Domain
For all P ∈ 𝒫 do:
    If S ⊆ [[P]] & C ⊄ [[P]] then do
        L := L ∪ {P}
        C := C ∩ [[P]]
        If C = S then Return L
Return Failure
Next lecture
• Theoretical departure: “What is NLG anyway” (Shieber 1993).
• Another way in which referring expressions can go beyond conjunction of atomic properties: Boolean descriptions (Van Deemter 2002).