March 5, 2008 Companions Semantic Representation and Dialog Interfacing Workshop - Tectogrammatics
PDT: Tectogrammatical Representation
Jan Hajič
Institute of Formal and Applied Linguistics
School of Computer Science
Faculty of Mathematics and Physics
Charles University, Prague
Czech Republic
Tectogrammatical Annotation (t-layer)
Underlying (deep) syntax
4 sublayers (integrated):
• dependency structure, (detailed) functors
  • valency annotation
• topic/focus and deep word order
• coreference (mostly grammatical only)
• all the rest (“grammatemes”):
  • detailed functors, underlying gender, number, ...
Total 39 attributes (vs. 5 at m-layer, 2 at a-layer)
Analytical vs. Tectogrammatical representation
[Figure: analytical tree vs. tectogrammatical tree; TR: sublayer 1 only shown]
Figure callouts:
• Underlying verb + tense
• Deep function
• Elided Actor in
• Prepositions out
• Another ellipsis...
Layer 3: Tectogrammatical
Underlying (deep) syntax
4 sublayers:
• dependency structure, (detailed) functors
• topic/focus and deep word order
• coreference (mostly grammatical only)
• all the rest (grammatemes):
  • detailed functors
  • underlying gender, number, ...
Tectogrammatical Functors
• “Actants”: ACT, PAT, EFF, ADDR, ORIG
  • modify: verbs, nouns, adjectives
  • cannot repeat in a clause, usually obligatory
• Free modifications (~ 50), semantically defined
  • can repeat; optional, sometimes obligatory
  • Ex.: LOC, DIR1, ...; TWHEN, TTILL, ...; RSTR; BEN, ATT, ACMP, INTT, MANN; MAT, APP; ID, DPHR, ...
• Special
  • Coordination, Rhematizers, Foreign phrases, ...
(slide axis labels: syntactic ... semantic)
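The split between actants and free modifications can be sketched as a simple lookup. This is only an illustration: the free-modification set contains just the functors given as examples on the slide, and the names `ACTANTS`, `FREE_MODIFICATIONS`, `classify_functor` and `repeated_actants` are hypothetical, not part of any PDT tool.

```python
# Functor groups as listed on the slide. The free-modification set holds only
# the examples named there (the slide says the full inventory has about 50).
ACTANTS = {"ACT", "PAT", "EFF", "ADDR", "ORIG"}
FREE_MODIFICATIONS = {
    "LOC", "DIR1", "TWHEN", "TTILL", "RSTR", "BEN", "ATT", "ACMP",
    "INTT", "MANN", "MAT", "APP", "ID", "DPHR",
}


def classify_functor(functor):
    """Return 'actant', 'free' or 'other' for a functor label."""
    if functor in ACTANTS:
        return "actant"
    if functor in FREE_MODIFICATIONS:
        return "free"
    return "other"


def repeated_actants(functors):
    """Return, sorted, the actants that occur more than once under one head."""
    seen, repeated = set(), set()
    for functor in functors:
        if functor in ACTANTS:
            if functor in seen:
                repeated.add(functor)
            seen.add(functor)
    return sorted(repeated)
```

Free modifications may repeat, so `repeated_actants(["ACT", "PAT", "LOC", "LOC"])` reports nothing, while a second ACT under the same head would be flagged.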
Tectogrammatical Example
Analytical verb form: “he would be allowed to be enrolled”
• Additional attributes (grammatemes): conditional + “allow”
• Collapsed
Tectogrammatical Example
Predicate with copula (state): “you were fired”
Tectogrammatical Example
Passive construction (action): “(The) book has been translated [by Mr. X]”
(figure callouts: Disappeared, Added)
Tectogrammatical Example
Object: “he gave Mary a book”
• Obj goes into ACT, PAT, ADDR, EFF or ORIG based on governor’s valency frame
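A minimal sketch of that lookup, assuming a simplified frame for “give” whose slots pair a functor with an analytical function and a form label. The frame, the form labels (`nom`, `dat`, `acc`) and the function name `assign_functors` are invented for illustration and are not taken from EngVallex.

```python
# Hypothetical, simplified valency frame for "give": (functor, afun, form).
# Both objects carry afun=Obj; only the frame tells them apart.
GIVE_FRAME = [
    ("ACT", "Sb", "nom"),
    ("ADDR", "Obj", "dat"),
    ("PAT", "Obj", "acc"),
]


def assign_functors(dependents, frame):
    """Map (word, afun, form) triples to (word, functor) using the frame."""
    assigned = []
    for word, afun, form in dependents:
        functor = next(
            (f for f, slot_afun, slot_form in frame
             if slot_afun == afun and slot_form == form),
            None,  # no matching slot: leave the functor undecided
        )
        assigned.append((word, functor))
    return assigned


# "he gave Mary a book" with assumed form labels on each dependent.
dependents = [("he", "Sb", "nom"), ("Mary", "Obj", "dat"), ("book", "Obj", "acc")]
assigned = assign_functors(dependents, GIVE_FRAME)
```

The point of the sketch is that the analytical label Obj alone does not decide the functor; the governor's frame does.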
Tectogrammatical Example
Relative clause (embedded): “the woman, who had a French accent, was very pretty”
Tectogrammatical Example
Incomplete phrases: “Peter works well, but Paul badly”
(figure callout: Added)
Deep Word Order, Topic/Focus
Example:
Baker bakes rolls. vs. Baker_IC bakes rolls. (IC = intonation center)
Analytical dep. tree:
Deep Word Order, Topic/Focus
• Deep word order:
  • from “old” information to the “new” one (left-to-right) at every level (head included)
  • projectivity by definition (almost...)
  • i.e., partial level-based order -> total d.w.o.
• Topic/focus/contrastive topic
  • attribute of every node (t, f, c)
  • restricted by d.w.o. and other constraints
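As a rough illustration of how a per-node topic/focus value and a deep-order index could be used together, the sketch below sorts nodes by deep order and splits them by their t/f/c value. The values assigned to “Baker bakes rolls” are illustrative only, not taken from the treebank, and the flat split ignores the tree structure that a real topic/focus computation would consult. `Node`, `deep_order` and `topic_and_focus` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Node:
    t_lemma: str
    tfa: str      # 't' topic, 'f' focus, 'c' contrastive topic
    deepord: int  # position in the deep word order


def deep_order(nodes):
    """Return lemmas in deep word order (old information first)."""
    return [n.t_lemma for n in sorted(nodes, key=lambda n: n.deepord)]


def topic_and_focus(nodes):
    """Flat split into topic (t, c) and focus (f) lemmas, in deep order."""
    ordered = sorted(nodes, key=lambda n: n.deepord)
    topic = [n.t_lemma for n in ordered if n.tfa in ("t", "c")]
    focus = [n.t_lemma for n in ordered if n.tfa == "f"]
    return topic, focus


# Illustrative annotation, assuming the intonation center falls on "rolls".
nodes = [Node("bake", "f", 2), Node("baker", "t", 1), Node("roll", "f", 3)]
```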
Coreference (intro only: see Silvie’s part)
• Grammatical (easy)
  • relative clauses
    • which, who
    • Peter and Paul, who ...
  • control
    • infinitival constructions
    • John promised to go ...
  • reflexive pronouns
    • {him,her,them}self(-ves)
    • Mary saw herself in ...
promise PRED
  John ACT
  go PAT
    he ACT
    home DIR3
Coreference
• Textual
  • Ex.: Peter moved to Iowa after he finished his PhD.
move PRED
  Peter ACT
  Iowa DIR1
  finish TWHEN
    he ACT
    PhD PAT
      he APP
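Coreference links of this kind can be followed mechanically. The sketch below stores one antecedent reference per node under the key `coref_text.rf` and walks the chain to its end. The node ids and the link targets are assumptions for illustration (the arrows of the original figure are not recoverable from the transcript), and `resolve` is a hypothetical helper.

```python
# Nodes of "Peter moved to Iowa after he finished his PhD." that take part in
# coreference. Ids and link targets are assumed for illustration.
nodes = {
    "t1": {"t_lemma": "Peter", "coref_text.rf": None},
    "t2": {"t_lemma": "he", "coref_text.rf": "t1"},  # ACT of "finish"
    "t3": {"t_lemma": "he", "coref_text.rf": "t2"},  # APP of "PhD" ("his")
}


def resolve(node_id, nodes, link="coref_text.rf"):
    """Follow coreference links until a node without an antecedent is reached."""
    seen = set()
    while nodes[node_id].get(link) is not None and node_id not in seen:
        seen.add(node_id)  # guards against accidental cycles
        node_id = nodes[node_id][link]
    return nodes[node_id]["t_lemma"]
```

The same function works for grammatical coreference if called with `link="coref_gram.rf"` on nodes that carry that key.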
“Grammatemes”
• Detailed functors (subfunctors)
  • only for some functors:
    • TWHEN: before/after
    • LOC: next-to, behind, in-front-of, ...
    • also: ACMP, BEN, CPR, DIR1, DIR2, DIR3, EXT
• Lexical (underlying)
  • number (SG/PL), tense, modality, degree of comparison, ...
  • strictly only where necessary (agreement!)
Tectogrammatical attributes I
• node typing
  • complex, coap, qcomplex, root, atom, ...
• functor, subfunctor
  • TWHEN: TWHEN.basic, TWHEN.before
• is_member, is_generated, is_parenthesis, is_dsp_root, is_state, quot_type, ...
• grammatemes (16):
  • aspect, degcmp, deontmod, sempos, tense, indeftype, politeness, person, ...
Tectogrammatical attributes II
• topic/focus: tfa, deepord
• valency: t_lemma, val_frame.rf
• bookkeeping: id
• coref_gram.rf, coref_text.rf, compl.rf
  • reference to TR node, type of coreference
• sentmod
• Linking to analytical layer
  • a.lex.rf (“main” anal. node), a.aux.rf (others)
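To make the attribute inventory concrete, here is a sketch of a node record covering a subset of the attributes named on these two slides. Field names replace the dots of the original attribute names with underscores; the field types, the defaults, the `nodetype` and `gram` field names and the node ids in the example are assumptions, not the PDT schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class TNode:
    id: str
    t_lemma: str
    functor: str
    subfunctor: Optional[str] = None
    nodetype: str = "complex"            # complex, coap, qcomplex, root, atom, ...
    tfa: Optional[str] = None            # t, f or c
    deepord: Optional[int] = None
    val_frame_rf: Optional[str] = None   # reference into the valency lexicon
    coref_gram_rf: List[str] = field(default_factory=list)
    coref_text_rf: List[str] = field(default_factory=list)
    a_lex_rf: Optional[str] = None       # "main" analytical node
    a_aux_rf: List[str] = field(default_factory=list)  # other analytical nodes
    is_generated: bool = False
    gram: Dict[str, str] = field(default_factory=dict)  # grammatemes


# "to go" in "John promised to go home": one t-node, two analytical nodes.
go = TNode(id="t-go", t_lemma="go", functor="PAT",
           a_lex_rf="a-go", a_aux_rf=["a-to"])
# The elided actor of "go": generated node with a grammatical coreference link.
he = TNode(id="t-he", t_lemma="he", functor="ACT",
           is_generated=True, coref_gram_rf=["t-John"])
```

A generated node has no analytical counterpart, so its `a_lex_rf` stays empty.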
Fully Annotated Sentence
He spends his days sketching passers-by, or trying to.
Definition of Valency
• Ability (“desire”) of words (verbs, nouns, adjectives) to combine themselves with other units of meaning
• Properties of valency:
  • Specific for every word meaning (in general)
    • leave: sb left sth for sb vs. sb left from somewhere
    • same as in PropBank: leave.02 vs. leave.01
  • Typically strongly correlates with surface form
    • morphological case (~ ending), preposition+case, ...
  • Semantic constraints are very dangerous
Structure of Valency
Schema:
word (lemma)
  word sense group 1
    valency frame: slot1 slot2 slot3
    surface expression
  word sense group 2
    ...
PDT VALLEX (Cz), EngVallex (En)
Example:
vyměnit (to replace)
  vyměnit1
    ACT    PAT    EFF
    Nom.   Acc.   za+Acc.
  vyměnit2
    ...
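The lemma, sense, frame and surface-expression hierarchy maps naturally onto nested dictionaries. The sketch below encodes only the vyměnit1 frame shown on the slide; the second sense is elided there and is left out here. The dictionary layout and the name `surface_form` are illustrative assumptions, not the storage format of PDT-VALLEX.

```python
# lemma -> word sense -> ordered list of slots (functor + surface expression)
LEXICON = {
    "vyměnit": {
        "vyměnit1": [
            {"functor": "ACT", "form": "Nom."},
            {"functor": "PAT", "form": "Acc."},
            {"functor": "EFF", "form": "za+Acc."},
        ],
    },
}


def surface_form(lexicon, lemma, sense, functor):
    """Return the surface expression of one slot, or None if there is no such slot."""
    for slot in lexicon.get(lemma, {}).get(sense, []):
        if slot["functor"] == functor:
            return slot["form"]
    return None
```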
PDT-VALLEX Entry
dosáhnout: “to reach”, “to get [sb to do sth]”
browser/user-formatted example:
Corpus <-> Valency Lexicon
Lexicon:
ENTRY: uzavřít
  vf1: ACT(.1) CPHR({smlouva}.4)
    ex.: u. dohodu (close a contract)
  vf2: ACT(.1) PAT(.4)
    ex.: u. pokoj (close a room, house)
Corpus:
Sentence 2035:
Sentence 15345:
Sentence 51042:
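The linear frame notation on this slide can be split into (functor, form) pairs with a single regular expression. The sketch assumes that forms never contain parentheses, which holds for the frames shown in these slides; it does not interpret the form string itself. `parse_frame` is a hypothetical helper name.

```python
import re

# One slot: an upper-case functor label followed by a parenthesized form.
SLOT = re.compile(r"([A-Z][A-Z0-9]*)\(([^()]*)\)")


def parse_frame(text):
    """Split a linear frame such as 'ACT(.1) PAT(.4)' into (functor, form) pairs."""
    return SLOT.findall(text)


vf1 = parse_frame("ACT(.1) CPHR({smlouva}.4)")
vf2 = parse_frame("ACT(.1) PAT(.4)")
```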
Valency & Form: Constraints
Tree structure:
n1
  n2
  n3
    n4
(Sets of) Constraints:
n1: lemma=uvažovat mode=active
n2: case=Nom afun=Sb
n3: lemma=o afun=AuxP
n4: case=Loc afun=Obj
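One way to read this slide is as a small tree pattern checked against an analytical tree. The sketch below encodes the four constraint sets and tests whether a tree satisfies them. The dictionary layout, the function name `matches` and the example tree (including the lemmas “Petr” and “plán”) are assumptions for illustration. For brevity the matcher does not require different pattern children to match different tree children.

```python
# Constraint tree from the slide: n1 governs n2 and n3; n3 governs n4.
PATTERN = {
    "constraints": {"lemma": "uvažovat", "mode": "active"},              # n1
    "children": [
        {"constraints": {"case": "Nom", "afun": "Sb"}, "children": []},  # n2
        {"constraints": {"lemma": "o", "afun": "AuxP"},                  # n3
         "children": [
             {"constraints": {"case": "Loc", "afun": "Obj"}, "children": []},  # n4
         ]},
    ],
}


def matches(node, pattern):
    """True if node meets the pattern's constraints and every pattern child
    is matched by some child of node."""
    attrs = node["attrs"]
    if any(attrs.get(key) != value for key, value in pattern["constraints"].items()):
        return False
    return all(
        any(matches(child, sub) for child in node["children"])
        for sub in pattern["children"]
    )


# Hypothetical analytical tree for a clause with "uvažovat o" + locative.
tree = {
    "attrs": {"lemma": "uvažovat", "mode": "active"},
    "children": [
        {"attrs": {"lemma": "Petr", "case": "Nom", "afun": "Sb"}, "children": []},
        {"attrs": {"lemma": "o", "afun": "AuxP"},
         "children": [
             {"attrs": {"lemma": "plán", "case": "Loc", "afun": "Obj"}, "children": []},
         ]},
    ],
}
```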
Example: Valency & Form
1:2 relative clause
to_say: ACT EFF
lemma=say mode=active
  afun=AuxC lemma=that
    afun=Obj POS=verb
  afun=Sb case=Nom
• linear representation: EFF(that[.v])
Valency and Text Generation
Using valency for...
• ...getting the correct (lemma, tag) of verb arguments
Example:
Tectogrammatical tree:
starat_se PRED (“to take care of”)
  Martin ACT
  tygr PAT (“tiger”)
VALLEX entry: starat (se) ACT(.1) PAT(o.[.4])
Surface (lemma, tag) pairs:
Martin  ....1..........
se      ...............
starat  V..............
o       ...............
tygr    ....4..........
Martin se stará o tygry.
“Martin takes care of tigers.”
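The argument part of this example can be sketched in a few lines: read the frame, and for each functor emit the (lemma, tag) pairs that its form prescribes. The tag layout follows the pattern visible on the slide (15 characters, case digit in the fifth position, dots for unspecified positions). The reading of `o.[.4]` as “preposition o plus a dependent in case 4”, and the helper names, are assumptions; the verb, the reflexive “se” and the word order are not handled.

```python
import re

FRAME = "ACT(.1) PAT(o.[.4])"  # VALLEX entry for starat (se), from the slide


def tag_for_case(case):
    """15-character tag with only the case position (the fifth) filled in."""
    return "." * 4 + case + "." * 10


def realize(form, lemma):
    """Turn a form such as '.1' or 'o.[.4]' into a list of (lemma, tag) pairs."""
    m = re.fullmatch(r"(?:(\w+)\.\[)?\.(\d)\]?", form)
    if m is None:
        raise ValueError(f"unsupported form: {form!r}")
    preposition, case = m.groups()
    pairs = []
    if preposition:
        pairs.append((preposition, "." * 15))  # fully underspecified tag
    pairs.append((lemma, tag_for_case(case)))
    return pairs


def generate_arguments(frame, arguments):
    """Map each functor in the frame to the surface pairs of its argument."""
    return {
        functor: realize(form, arguments[functor])
        for functor, form in re.findall(r"([A-Z]+)\(([^()]*)\)", frame)
    }


result = generate_arguments(FRAME, {"ACT": "Martin", "PAT": "tygr"})
```

A morphological generator would then turn each underspecified (lemma, tag) pair into a word form such as “tygry”.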