Formal Semantics
Slides by Julia Hockenmaier, Laura McGarrity, Bill McCartney, Chris
Manning, and Dan Klein
Formal Semantics
It comes in two flavors:
• Lexical semantics: the meaning of words
• Compositional semantics: how the meanings of individual units combine to form the meaning of larger units
What is meaning?
• Meaning ≠ Dictionary entries
  – Dictionaries define words using words. Circularity!
Reference
• Referent: the thing/idea in the world that a word refers to
• Reference: the relationship between a word and its referent
Reference
[Figure: “Barack Obama” and “president”]
The president is the commander-in-chief.
= Barack Obama is the commander-in-chief.
Reference
[Figure: “Barack Obama” and “president”]
I want to be the president.
≠ I want to be Barack Obama.
Reference
• Tooth fairy?
• Phoenix?
• Winner of the 2016 presidential election?
What is meaning?
• Meaning ≠ Dictionary entries
• Meaning ≠ Reference
Sense
• Sense: The mental representation of a word or phrase, independent of its referent.
Sense ≠ Mental Image
• A word may have different mental images for different people.
  – E.g., “mother”
• A word may conjure a typical mental image (a prototype), but can signify atypical examples as well.
Sense v. Reference
• A word/phrase may have sense, but no reference:
  – King of the world
  – The camel in CIS 8538
  – The greatest integer
  – The
• A word may have reference, but no sense:
  – Proper names: Dan McCloy, Kristi Krein (who are they?!)
Sense v. Reference
• A word may have the same referent, but more than one sense:
  – The morning star / the evening star (Venus)
• A word may have one sense, but multiple referents:
  – Dog, bird
Some semantic relations between words
• Hyponymy: subclass
  – Poodle < dog
  – Crimson < red
  – Red < color
  – Dance < move
• Hypernymy: superclass
• Synonymy:
  – Couch / sofa
  – Manatee / sea cow
• Antonymy:
  – Dead / alive
  – Married / single
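The hyponymy examples above can be read as links in a small hierarchy. The following is a minimal sketch, assuming a toy child-to-parent table built only from the slide's examples (the name `is_hyponym` is made up for illustration); it shows that the subclass relation chains upward but does not run in reverse.

```python
# Toy hypernym links taken from the slide's examples (child -> parent).
HYPERNYM = {
    "poodle": "dog",
    "crimson": "red",
    "red": "color",
    "dance": "move",
}

def is_hyponym(word, ancestor):
    """Return True if `ancestor` is reachable from `word` via hypernym links."""
    current = HYPERNYM.get(word)
    while current is not None:
        if current == ancestor:
            return True
        current = HYPERNYM.get(current)
    return False

print(is_hyponym("crimson", "color"))  # follows crimson < red < color
print(is_hyponym("color", "crimson"))  # the relation is not symmetric
```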
Lexical Decomposition
• Word sense can be represented with semantic features:
Compositional Semantics
Compositional Semantics
• The study of how meanings of small units combine to form the meaning of larger units
The dog chased the cat ≠ The cat chased the dog.
i.e., the whole does not equal the sum of the parts.
The dog chased the cat = The cat was chased by the dog.
i.e., syntax matters in determining meaning.
Principle of Compositionality
The meaning of a sentence is determined by the meaning of its words in conjunction with the way they are syntactically combined.
Exceptions to Compositionality
• Anomaly: when phrases are well-formed syntactically, but not semantically
  – Colorless green ideas sleep furiously. (Chomsky)
  – That bachelor is pregnant.
Exceptions to Compositionality
• Metaphor: the use of an expression to refer to something that it does not literally denote in order to suggest a similarity
  – Time is money.
  – The walls have ears.
Exceptions to Compositionality
• Idioms: Phrases with fixed meanings not composed of literal meanings of the words
  – Kick the bucket = ‘die’ (*The bucket was kicked by John.)
  – When pigs fly = ‘it will never happen’ (*She suspected pigs might fly tomorrow.)
  – Bite off more than you can chew = ‘to take on too much’ (*He chewed just as much as he bit off.)
Idioms in other languages
Logical Foundations for Compositional Semantics
• We need a language for expressing the meaning of words, phrases, and sentences
• Many possible choices; we will focus on
  – First-order predicate logic (FOPL) with types
  – Lambda calculus
Truth-conditional Semantics
• Linguistic expressions
  – “Bob sings.”
• Logical translations
  – sings(Bob)
  – but could be p_5789023(a_257890)
• Denotation:
  – [[bob]] = some specific person (in some context)
  – [[sings(bob)]] = true, in situations where Bob is singing; false, otherwise
• Types on translations:
  – bob: e(ntity)
  – sings(bob): t(rue or false, a boolean type)
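Denotation can be made concrete with a tiny model of one situation. This is a sketch under stated assumptions: the entity set and the set of singers are invented for illustration, and `denote_name` is a hypothetical helper, not part of any library.

```python
# A tiny hypothetical "situation": which entities exist and which of them sing.
entities = {"bob", "amy"}
singers = {"bob"}

def denote_name(name):
    """[[name]]: type e, an entity in the situation."""
    if name not in entities:
        raise KeyError(f"no referent for {name!r} in this situation")
    return name

def sings(x):
    """[[sings]]: maps an entity (type e) to a truth value (type t)."""
    return x in singers

# [[sings(bob)]] is true in this situation; [[sings(amy)]] is false.
print(sings(denote_name("bob")))
print(sings(denote_name("amy")))
```

Changing `singers` changes the truth value of sings(bob) without changing the logical form, which is the sense in which the translation only fixes truth conditions.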
Truth-conditional Semantics
Some more complicated logical descriptions of language:
  – “All girls like a video game.”
  – ∀x:e . ∃y:e . girl(x) → [video-game(y) ∧ likes(x,y)]
  – “Alice is a former teacher.”
  – (former(teacher))(Alice)
  – “Alice saw the cat before Bob did.”
  – ∃x:e, y:e, z:e, t1:e, t2:e . cat(x) ∧ see(y) ∧ see(z) ∧ agent(y, Alice) ∧ patient(y, x) ∧ agent(z, Bob) ∧ patient(z, x) ∧ time(y, t1) ∧ time(z, t2) ∧ <(t1, t2)
FOPL Syntax Summary
• A set of types T = {t1, … }
• A set of constants C = {c1, …}, each associated with a type from T
• A set of relations R = {r1, …}, where each ri is a subset of C^n for some n.
• A set of variables X = {x1, …}
• Logical symbols: ∧, ∨, ¬, →, ∀, ∃, ., :
Truth-conditional semantics
• Proper names:
  – Refer directly to some entity in the world
  – Bob: bob
• Sentences:
  – Are either t or f
  – Bob sings: sings(bob)
• So what about verbs and VPs?
  – sings must combine with bob to produce sings(bob)
  – The λ-calculus is a notation for functions whose arguments are not yet filled.
  – sings: λx.sings(x)
  – This is a predicate, a function that returns a truth value. In this case, it takes a single entity as an argument, so we can write its type as e → t
• Adjectives?
Lambda calculus
• FOPL + λ (new quantifier) will be our lambda calculus
• Intuitively, λ is just a way of creating a function
  – E.g., girl() is a relation symbol; but λx . girl(x) is a function that takes one argument.
• New inference rule: function application
  (λx . L1(x)) (L2) → L1(L2)
  E.g., (λx . x²) (3) → 3²
  E.g., (λx . sings(x)) (Bob) → sings(Bob)
• Lambda calculus lets us describe the meaning of words individually.
  – Function application (and a few other rules) then lets us combine those meanings to come up with the meaning of larger phrases or sentences.
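Function application maps directly onto anonymous functions in a programming language. The sketch below uses Python's `lambda`; representing logical forms as strings is an assumption made only so the result can be printed.

```python
# (λx . x²)(3) → 3²
square = lambda x: x ** 2
print(square(3))

# (λx . sings(x))(Bob) → sings(Bob)
# The logical form is built as a plain string, purely for display.
sings = lambda x: f"sings({x})"
print(sings("Bob"))
```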
Compositional Semantics with the λ-calculus
• So now we have meanings for the words
• How do we know how to combine the words?
• Associate a combination rule with each grammar rule:
  – S : β(α) → NP : α  VP : β (function application)
  – VP : λx. α(x) ∧ β(x) → VP : α  and : ∅  VP : β (intersection)
• Example:
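One way to see the two combination rules at work is to treat VP meanings as Python predicates. This is a minimal sketch, not the slide's original example: the sets `SINGERS` and `DANCERS` and the helper names `apply_s` and `conjoin_vp` are invented for illustration.

```python
# Hypothetical facts for one situation.
SINGERS = {"bob"}
DANCERS = {"bob", "amy"}

# Word meanings: an NP denotes an entity, a VP denotes a predicate (e -> t).
np_bob = "bob"
np_amy = "amy"
vp_sings = lambda x: x in SINGERS
vp_dances = lambda x: x in DANCERS

def apply_s(np, vp):
    """S : β(α) from NP : α and VP : β (function application)."""
    return vp(np)

def conjoin_vp(vp1, vp2):
    """VP : λx. α(x) ∧ β(x) from VP : α, 'and', VP : β (intersection)."""
    return lambda x: vp1(x) and vp2(x)

sings_and_dances = conjoin_vp(vp_sings, vp_dances)
print(apply_s(np_bob, sings_and_dances))  # "Bob sings and dances"
print(apply_s(np_amy, sings_and_dances))  # "Amy sings and dances"
```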
Composition: Some more examples
• Transitive verbs:
  – likes : λx.λy.likes(y,x)
  – Two-place predicates, type e → (e → t)
  – VP “likes Amy” : λy.likes(y,Amy) is just a one-place predicate
• Quantifiers:
  – What does “everyone” mean?
  – Everyone : λf.∀x.f(x)
  – Some problems:
    • Have to change our NP/VP rule
    • Won’t work for “Amy likes everyone”
  – What about “Everyone likes someone”?
  – Gets tricky quickly!
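Over a finite domain, the curried verb and the quantifier can be executed directly. The sketch below assumes a three-person domain and an invented `LIKES` relation; `someone` is added by analogy with `everyone`. It also shows why “Everyone likes someone” gets tricky: the two scope orders give different truth values on the same facts.

```python
DOMAIN = ["amy", "bob", "cal"]
# Hypothetical facts, stored as (liker, liked) pairs.
LIKES = {("amy", "bob"), ("bob", "cal"), ("cal", "amy")}

# likes : λx.λy.likes(y,x)  (object first, then subject)
likes = lambda x: lambda y: (y, x) in LIKES

# VP "likes Amy" : λy.likes(y,Amy), a one-place predicate
likes_amy = likes("amy")

# Everyone : λf.∀x.f(x); someone is the existential counterpart
everyone = lambda f: all(f(x) for x in DOMAIN)
someone = lambda f: any(f(x) for x in DOMAIN)

print(likes_amy("cal"))
print(everyone(likes_amy))

# Reading 1: ∀y.∃x.likes(y,x)  (each person likes somebody or other)
reading_1 = everyone(lambda y: someone(lambda x: likes(x)(y)))
# Reading 2: ∃x.∀y.likes(y,x)  (one particular person is liked by all)
reading_2 = someone(lambda x: everyone(lambda y: likes(x)(y)))
print(reading_1, reading_2)
```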
Composition: Some more examples
• Indefinites
  – The wrong way:
    • “Bob ate a waffle” : ate(bob,waffle)
    • “Amy ate a waffle” : ate(amy,waffle)
  – Better translation:
    • ∃x.waffle(x) ∧ ate(bob, x)
    • What does the translation of “a” have to be?
    • What about “the”?
    • What about “every”?
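One common answer to the questions above, offered here as an assumption rather than as the slides' own solution, treats determiners as functions from two predicates to a truth value: “a” as λP.λQ.∃x.P(x) ∧ Q(x) and “every” as λP.λQ.∀x.P(x) → Q(x). The domain and facts below are invented, and “the” is left out because it needs a uniqueness condition.

```python
DOMAIN = ["w1", "w2", "toast"]
WAFFLES = {"w1", "w2"}
# Hypothetical facts, stored as (eater, thing eaten) pairs.
ATE = {("bob", "w1"), ("amy", "w1"), ("amy", "w2")}

waffle = lambda x: x in WAFFLES
ate_by = lambda who: (lambda x: (who, x) in ATE)  # λx.ate(who, x)

# a : λP.λQ.∃x.P(x) ∧ Q(x)
a = lambda p: lambda q: any(p(x) and q(x) for x in DOMAIN)
# every : λP.λQ.∀x.P(x) → Q(x)
every = lambda p: lambda q: all((not p(x)) or q(x) for x in DOMAIN)

print(a(waffle)(ate_by("bob")))      # "Bob ate a waffle"
print(every(waffle)(ate_by("bob")))  # "Bob ate every waffle"
print(every(waffle)(ate_by("amy")))  # "Amy ate every waffle"
```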
Denotation
• What do we do with the logical form?
  – It has fewer (no?) ambiguities
  – Can check the truth-value against a database
  – More usefully: can add new facts, expressed in language, to an existing relational database
  – Question answering: can check whether a statement in a corpus entails a question-answer pair:
    “Bob sings and dances”
    Q: “Who sings?” has answer A: “Bob”
  – Can chain together facts for story comprehension
Grounding
• What does the translation likes : λx. λy. likes(y,x) have to do with actual liking?
• Nothing! (unless the denotation model says it does)
• Grounding: relating linguistic symbols to perceptual referents
  – Sometimes a connection to a database entry is enough
  – Other times, you might insist on connecting “blue” to the appropriate portion of the visual EM spectrum
  – Or connect “likes” to an emotional sensation
• Alternative to grounding: meaning postulates
  – You could insist, e.g., that likes(y,x) => knows(y,x)
More representation issues
• Tense and events
  – In general, you don’t get far with verbs as predicates
  – Better to have event variables e
    • “Alice danced” : danced(Alice) vs.
    • “Alice danced” : ∃e.dance(e) ∧ agent(e, Alice) ∧ (time(e) < now)
  – Event variables let you talk about non-trivial tense/aspect structures:
    “Alice had been dancing when Bob sneezed”
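The event-variable translation can be checked against a small event store. This is a sketch assuming integer timestamps and a made-up `Event` record; it covers only the simple past-tense formula above, not the aspect structure of the last example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Event:
    kind: str   # e.g. "dance"
    agent: str  # who performed the event
    time: int   # hypothetical integer timestamp

NOW = 10
EVENTS = [Event("dance", "Alice", 3), Event("sneeze", "Bob", 5)]

def happened(kind, agent):
    """∃e. kind(e) ∧ agent(e, agent) ∧ (time(e) < now)"""
    return any(
        e.kind == kind and e.agent == agent and e.time < NOW
        for e in EVENTS
    )

print(happened("dance", "Alice"))  # "Alice danced"
print(happened("dance", "Bob"))    # "Bob danced"
```

Because each event is an object of its own, further conjuncts (place, manner, a second event's time) can be added without changing the arity of any predicate.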
More representation issues
• Propositional attitudes (modal logic)
  – “Bob thinks that I am a gummi bear”
    • thinks(bob, gummi(me))?
    • thinks(bob, “He is a gummi bear”)?
  – Usually, the solution involves intensions (^p) which are, roughly, the set of possible worlds in which predicate p is true.
    • thinks(bob, ^gummi(me))
  – Computationally challenging
    • Each agent has to model every other agent’s mental state
    • This comes up all the time in language
  – E.g., if you want to talk about what your bill claims that you bought, vs. what you think you bought, vs. what you actually bought.
More representation issues
• Multiple quantifiers:
  “In this country, a woman gives birth every 15 minutes. Our job is to find her, and stop her.”
  -- Groucho Marx
• Deciding between readings
  – “Bob bought a pumpkin every Halloween.”
  – “Bob put a warning in every window.”
More representation issues
• Other tricky stuff
  – Adverbs
  – Non-intersective adjectives
  – Generalized quantifiers
  – Generics
    • “Cats like naps.”
    • “The players scored a goal.”
  – Pronouns and anaphora
    • “If you have a dime, put it in the meter.”
  – … etc., etc.
Mapping Sentences to Logical Forms
CCG Parsing
• Combinatory Categorial Grammar
  – Lexicalized PCFG
  – Categories encode argument sequences
    • A/B means a category that can combine with a B to the right to form an A
    • A\B means a category that can combine with a B to the left to form an A
  – A syntactic parallel to the lambda calculus
Learning to map sentences to logical form
• Zettlemoyer and Collins (IJCAI 05, EMNLP 07)
Some Training Examples
CCG Lexicon
Parsing Rules (Combinators)
Application
Right: X : f(a) → X/Y : f  Y : a
Left: X : f(a) → Y : a  X\Y : f
Additional rules:
• Composition
• Type-raising
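The two application rules are small enough to execute. The following is a sketch only, assuming categories are encoded as nested tuples `(result, slash, argument)` and semantics as Python functions that build strings; `Entry`, `forward`, and `backward` are hypothetical names, and composition and type-raising are not implemented.

```python
from typing import Any, NamedTuple

class Entry(NamedTuple):
    cat: Any  # atomic category such as "NP", or (result, slash, argument)
    sem: Any  # a constant, or a Python function standing in for a λ-term

def forward(left, right):
    # Right application: X/Y : f followed by Y : a yields X : f(a).
    if isinstance(left.cat, tuple):
        result, slash, arg = left.cat
        if slash == "/" and arg == right.cat:
            return Entry(result, left.sem(right.sem))
    return None

def backward(left, right):
    # Left application: Y : a followed by X\Y : f yields X : f(a).
    if isinstance(right.cat, tuple):
        result, slash, arg = right.cat
        if slash == "\\" and arg == left.cat:
            return Entry(result, right.sem(left.sem))
    return None

texas = Entry("NP", "texas")
kansas = Entry("NP", "kansas")
# (S\NP)/NP : λx.λy.borders(y,x)
borders = Entry(
    (("S", "\\", "NP"), "/", "NP"),
    lambda x: lambda y: f"borders({y},{x})",
)

vp = forward(borders, kansas)  # S\NP : λy.borders(y,kansas)
s = backward(texas, vp)        # S : borders(texas,kansas)
print(s.cat, s.sem)
```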
CCG Parsing Example
Parsing a Question
Lexical Generation
Input Training Example
Sentence: Texas borders Kansas.
Logical form: borders(Texas, Kansas)
GENLEX
• Input: a training example (S_i, L_i)
• Computation:
  – Create all substrings of consecutive words in S_i
  – Create categories from L_i
  – Create lexical entries that are the cross products of these two sets
• Output: Lexicon Λ
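The cross-product step can be sketched in a few lines. This is an illustration under assumptions: the categories are passed in by hand rather than derived from the logical form (the rules that derive them are not given in these slides), and `substrings` and `genlex` are hypothetical helper names.

```python
from itertools import product

def substrings(sentence):
    """All substrings of consecutive words in the sentence."""
    words = sentence.rstrip(".").split()
    return [
        " ".join(words[i:j])
        for i in range(len(words))
        for j in range(i + 1, len(words) + 1)
    ]

def genlex(sentence, categories):
    """Pair every word span with every candidate category."""
    return list(product(substrings(sentence), categories))

# Categories for borders(Texas, Kansas), supplied by hand in this sketch.
categories = [
    "NP : texas",
    "NP : kansas",
    "(S\\NP)/NP : λx.λy.borders(y,x)",
]

lexicon = genlex("Texas borders Kansas.", categories)
print(len(lexicon))  # 6 spans x 3 categories = 18 entries
for words, category in lexicon[:3]:
    print(words, "|", category)
```

Most of the generated entries are wrong (for example, Texas paired with NP : kansas); the learning procedure is what is expected to keep only the useful ones.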
GENLEX Cross Product
Input Training Example
Sentence: Texas borders Kansas.
Logical form: borders(Texas, Kansas)
Output Lexicon = Output Substrings X (cross product) Output Categories
Output Substrings
  Texas
  borders
  Kansas
  Texas borders
  borders Kansas
  Texas borders Kansas
Output Categories
  NP : texas
  NP : kansas
  (S\NP)/NP : λx.λy.borders(y,x)
GENLEX Output Lexicon
Words                   Category
Texas                   NP : texas
Texas                   NP : kansas
Texas                   (S\NP)/NP : λx.λy.borders(y,x)
borders                 NP : texas
borders                 NP : kansas
borders                 (S\NP)/NP : λx.λy.borders(y,x)
…                       …
Texas borders Kansas    NP : texas
Texas borders Kansas    NP : kansas
Texas borders Kansas    (S\NP)/NP : λx.λy.borders(y,x)
Weighted CCG
Given a log-linear model with a CCG lexicon Λ, a feature vector f, and weights w:
The best parse is:
y* = argmax_y w · f(x, y)
where we consider all possible parses y for the sentence x given the lexicon Λ.
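Choosing the best parse is a dot product followed by an argmax. The sketch below assumes sparse feature vectors stored as dictionaries and two invented candidate parses; the feature names are hypothetical and a real parser would enumerate candidates with a chart rather than a hand-written table.

```python
def score(w, feats):
    """w · f(x, y) for sparse feature dictionaries."""
    return sum(w.get(name, 0.0) * value for name, value in feats.items())

# Hypothetical weights over lexical-entry features.
w = {"borders := (S\\NP)/NP": 1.5, "borders := NP": -0.5}

# Hypothetical candidate parses y for one sentence x, each with f(x, y).
parses = {
    "parse_1": {"borders := (S\\NP)/NP": 1.0},
    "parse_2": {"borders := NP": 1.0},
}

# y* = argmax_y w · f(x, y)
best_parse = max(parses, key=lambda y: score(w, parses[y]))
print(best_parse, score(w, parses[best_parse]))
```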
Parameter Estimation for Weighted CCG Parsing
Inputs: Training set {(S_i, L_i) | i = 1, …, n}
        Initial lexicon Λ, initial weights w, number of iterations T
Computation: For t = 1 … T, i = 1 … n:
  Step 1: Check correctness
    Let y* = argmax_y w · f(S_i, y)
    If L(y*) = L_i, skip to next i
  Step 2: Lexical generation
    Set λ = Λ ∪ GENLEX(S_i, L_i)
    Let y’ = argmax_{y s.t. L(y) = L_i} w · f(S_i, y), parsing with the lexicon λ
    Define λ_i to be the lexical entries in y’
    Set Λ = Λ ∪ λ_i
  Step 3: Update parameters
    Let y’’ = argmax_y w · f(S_i, y)
    If L(y’’) ≠ L_i
      Set w = w + f(S_i, y’) – f(S_i, y’’)
Output: Lexicon Λ and parameters w
Here L(y) is the logical form produced by parse y.
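The parameter update in Step 3 is a perceptron-style step: move the weights toward the features of the best correct parse and away from the features of the best parse overall. The sketch below illustrates only that step, assuming a fixed, hand-written candidate list in place of a parser; the feature names and the helpers `best` and `update` are hypothetical.

```python
def score(w, feats):
    """w · f for sparse feature dictionaries."""
    return sum(w.get(k, 0.0) * v for k, v in feats.items())

def best(w, candidates, require_lf=None):
    """Highest-scoring candidate, optionally restricted to L(y) = require_lf."""
    pool = [c for c in candidates if require_lf is None or c["lf"] == require_lf]
    return max(pool, key=lambda c: score(w, c["feats"]))

def update(w, candidates, gold_lf):
    y2 = best(w, candidates)            # y'': best parse overall
    if y2["lf"] == gold_lf:
        return dict(w)                  # already correct, no change
    y1 = best(w, candidates, gold_lf)   # y': best parse with the gold logical form
    new_w = dict(w)
    for k, v in y1["feats"].items():    # w = w + f(S_i, y')
        new_w[k] = new_w.get(k, 0.0) + v
    for k, v in y2["feats"].items():    # ... - f(S_i, y'')
        new_w[k] = new_w.get(k, 0.0) - v
    return new_w

# Hypothetical candidates for "Texas borders Kansas."
candidates = [
    {"lf": "borders(texas,kansas)", "feats": {"Texas := NP : texas": 1.0}},
    {"lf": "borders(kansas,texas)", "feats": {"Texas := NP : kansas": 1.0}},
]
gold = "borders(texas,kansas)"
w = {"Texas := NP : kansas": 0.5}       # initial weights favor the wrong entry

print(best(w, candidates)["lf"])        # prediction before the update
w = update(w, candidates, gold)
print(best(w, candidates)["lf"])        # prediction after the update
```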
Example Learned Lexical Entries
Challenge Revisited
Disharmonic Application
Missing Content Words
Missing content-free words
A complete parse
Geo880 Test Set
System                        Precision   Recall   F1
Zettlemoyer & Collins 2007    95.49       83.20    88.93
Zettlemoyer & Collins 2005    96.25       79.29    86.95
Wong & Mooney 2007            93.72       80.00    86.31
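Assuming F1 here is the usual harmonic mean of precision and recall, F1 = 2PR / (P + R), the last column can be recomputed from the first two. The sketch below does this for the three rows; the recomputed values agree with the reported ones only up to rounding of the published precision and recall.

```python
def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) copied from the table above.
rows = {
    "Zettlemoyer & Collins 2007": (95.49, 83.20, 88.93),
    "Zettlemoyer & Collins 2005": (96.25, 79.29, 86.95),
    "Wong & Mooney 2007": (93.72, 80.00, 86.31),
}

for system, (p, r, reported) in rows.items():
    print(f"{system}: recomputed {f1(p, r):.2f}, reported {reported:.2f}")
```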
Summing Up
• Hypothesis: Principle of Compositionality
  – Semantics of NL sentences and phrases can be composed from the semantics of their subparts
• Rules can be derived which map syntactic analysis to semantic representation (Rule-to-Rule Hypothesis)
  – Lambda notation provides a way to extend FOPL to this end
  – But coming up with rule-to-rule mappings is hard
• Idioms, metaphors, and other non-compositional aspects of language make things tricky (e.g., fake gun)