Linguistics 187/287 Week 6
Transcript of Linguistics 187/287 Week 6
Martin Forst, Ron Kaplan, and Tracy King

Topics: Generation, Term-rewrite System, Machine Translation
Generation

Parsing: string to analysis
Generation: analysis to string
What type of input?
How to generate?
Why generate?

Machine translation: Lang1 string -> Lang1 f-structure -> Lang2 f-structure -> Lang2 string
Sentence condensation: long string -> f-structure -> smaller f-structure -> new string
Question answering
Production of NL reports
– State of machine or process
– Explanation of logical deduction
Grammar debugging
F-structures as input

Use f-structures as input to the generator
May parse sentences that shouldn't be generated
May want to constrain the number of generated options
Input f-structure may be underspecified
XLE generator

Use the same grammar for parsing and generation
Advantages:
– maintainability
– write rules and lexicons once
But:
– special generation tokenizer
– different OT ranking
Generation tokenizer/morphology

White space
– Parsing: multiple white space becomes a single TB
  John appears. -> John TB appears TB . TB
– Generation: a single TB becomes a single space (or nothing)
  John TB appears TB . TB -> John appears.   *John appears .
Suppress variant forms
– Parse both favor and favour
– Generate only one
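The whitespace policy can be sketched in a few lines of Python (a simplified illustration, not XLE's actual tokenizer; the TB marker symbol and the punctuation set are assumptions):

```python
TB = "\x01"  # stand-in symbol for a token boundary (TB)

def parse_tokenize(s):
    """Parsing direction: runs of whitespace collapse to a single TB."""
    return TB.join(s.split()) + TB

def gen_detokenize(tokenized):
    """Generation direction: each TB becomes one space, or nothing
    before punctuation (assumed punctuation set)."""
    text = ""
    for tok in tokenized.rstrip(TB).split(TB):
        if text and tok not in {".", ",", "!", "?"}:
            text += " "
        text += tok
    return text

print(gen_detokenize(parse_tokenize("John   appears .")))  # John appears.
```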
Morphconfig for parsing & generation

STANDARD ENGLISH MORPHOLOGY (1.0)
TOKENIZE:
P!eng.tok.parse.fst G!eng.tok.gen.fst
ANALYZE:
eng.infl-morph.fst G!amerbritfilter.fst
G!amergen.fst
----
Reversing the parsing grammar

The parsing grammar can be used directly as a generator
Adapt the grammar with a special OT ranking: GENOPTIMALITYORDER
Why do this?
– parse ungrammatical input
– have too many options
Ungrammatical input

Linguistically ungrammatical
– They walks.
– They ate banana.
Stylistically ungrammatical
– No ending punctuation: They appear
– Superfluous commas: John, and Mary appear.
– Shallow markup: [NP John and Mary] appear.
Too many options

All the generated options can be linguistically valid, but too many for applications
Occurs when more than one string has the same, legitimate f-structure
PP placement:
– In the morning I left. / I left in the morning.
Using the Gen OT ranking

Generally much simpler than in the parsing direction
– Usually only use standard marks and NOGOOD; no * marks, no STOPPOINT
– Can have a few marks that are shared by several constructions
  one or two for dispreferred
  one or two for preferred
Example: Prefer initial PP
S --> (PP: @ADJUNCT @(OT-MARK GenGood))
NP: @SUBJ;
VP.
VP --> V
(NP: @OBJ)
(PP: @ADJUNCT).
GENOPTIMALITYORDER NOGOOD +GenGood.
parse: they appear in the morning.
generate: without OT: In the morning they appear.
They appear in the morning.
with OT: In the morning they appear.
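The effect of a ranking like GENOPTIMALITYORDER NOGOOD +GenGood can be mimicked with a small filter (a sketch under assumptions — XLE evaluates OT marks inside its packed representations, not over an explicit candidate list):

```python
def ot_filter(candidates, nogood=frozenset(), preferred=frozenset()):
    """candidates: list of (string, marks) pairs.
    Drop candidates carrying a NOGOOD mark, then keep only those
    with the highest count of preferred marks (e.g. GenGood)."""
    survivors = [(s, m) for s, m in candidates if not (m & set(nogood))]
    if not survivors:
        return []
    best = max(len(m & set(preferred)) for _, m in survivors)
    return [s for s, m in survivors if len(m & set(preferred)) == best]

candidates = [("In the morning they appear.", {"GenGood"}),
              ("They appear in the morning.", set())]
print(ot_filter(candidates, preferred={"GenGood"}))
# ['In the morning they appear.']
```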
Debugging the generator

When generating from an f-structure produced by the same grammar, XLE should always generate
Unless:
– OT marks block the only possible string
– something is wrong with the tokenizer/morphology
  regenerate-morphemes: if this gets a string, the tokenizer/morphology is not the problem
Hard to debug: XLE has robustness features to help
Underspecified Input

F-structures provided by applications are not perfect
– may be missing features
– may have extra features
– may simply not match the grammar coverage
Missing and extra features are often systematic
– specify in XLE which features can be added and deleted
Not matching the grammar is a more serious problem
Adding features

English to French translation:
– English nouns have no gender
– French nouns need gender
– Solution: have XLE add gender; the French morphology will control the value
Specify additions in xlerc:
– set-gen-adds add "GEND"
– can add multiple features: set-gen-adds add "GEND CASE PCASE"
– XLE will optionally insert the feature
Note: Unconstrained additions make generation undecidable
Example

Input f-structure (no GEND):
[ PRED 'dormir<SUBJ>'
  SUBJ [ PRED 'chat'
         NUM sg
         SPEC def ]
  TENSE present ]

After GEND is added:
[ PRED 'dormir<SUBJ>'
  SUBJ [ PRED 'chat'
         NUM sg
         GEND masc
         SPEC def ]
  TENSE present ]

The cat sleeps. -> Le chat dort.
Deleting features

French to English translation
– delete the GEND feature
Specify deletions in xlerc
– set-gen-adds remove "GEND"
– can remove multiple features: set-gen-adds remove "GEND CASE PCASE"
– XLE obligatorily removes the features: no GEND feature will remain in the f-structure
– if a feature takes an f-structure value, that f-structure is also removed
Changing values

If the values of a feature do not match between the input f-structure and the grammar:
– delete the feature and then add it
Example: case assignment in translation
– set-gen-adds remove "CASE"
  set-gen-adds add "CASE"
– allows dative case in input to become accusative
  e.g., an exceptional case marking verb in the input language but regular case in the output language
Generation for Debugging

Checking for grammar and lexicon errors
– create-generator english.lfg
– reports ill-formed rules, templates, feature declarations, lexical entries
Checking for ill-formed sentences that can be parsed
– parse a sentence
– see if all the results are legitimate strings
– regenerate "they appear."
Rewriting/Transfer System
Why a Rewrite System

Grammars produce c-/f-structure output
Applications may need to manipulate this
– Remove features
– Rearrange features
– Continue linguistic analysis (semantics, knowledge representation – next week)
XLE has a general-purpose rewrite system (aka "transfer" or "xfr" system)
Sample Uses of Rewrite System

Sentence condensation
Machine translation
Mapping to logic for knowledge representation and reasoning
Tutoring systems
What does the system do?

Input: set of "facts"
Apply a set of ordered rules to the facts
– this gradually changes the set of input facts
Output: new set of facts
The rewrite system uses the same ambiguity management as XLE
– can efficiently rewrite packed structures, maintaining the packing
Example F-structure Facts

PERS(var(1),3)
PRED(var(1),girl)
CASE(var(1),nom)
NTYPE(var(1),common)
NUM(var(1),pl)
SUBJ(var(0),var(1))
PRED(var(0),laugh)
TNS-ASP(var(0),var(2))
TENSE(var(2),pres)
arg(var(0),1,var(1))
lex_id(var(0),1)
lex_id(var(1),0)

F-structures get var(#)
Special arg facts
lex_id for each PRED
Facts have two arguments (except arg)
The rewrite system allows for any number of arguments
Rule format

Obligatory rule: LHS ==> RHS.
Optional rule: LHS ?=> RHS.
Unresourced fact: |- clause.

LHS:
clause : match and delete
+clause : match and keep
-LHS : negation (don't have fact)
LHS, LHS : conjunction
( LHS | LHS ) : disjunction
{ ProcedureCall } : procedural attachment

RHS:
clause : replacement facts
0 : empty set of replacement facts
stop : abandon the analysis
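As a toy illustration (an assumption-laden sketch, not XLE's pattern matcher — real rules use variables like %F and keep alternatives packed), facts can be modeled as tuples and the two rule types as set transformations:

```python
# Facts are tuples such as ("NTYPE", 1, "common"), writing var(1) as 1.

def add_spec_if_noun(facts):
    """Obligatory rule: +NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F,def).
    The +clause keeps NTYPE; the -clause requires SPEC to be absent."""
    nouns = {f[1] for f in facts if f[0] == "NTYPE"}
    specified = {f[1] for f in facts if f[0] == "SPEC"}
    return facts | {("SPEC", n, "def") for n in nouns - specified}

def pluralize_optionally(facts):
    """Optional rule: NUM(%F,pl) ?=> NUM(%F,sg).
    Each match splits the choice space: the original analysis is
    kept alongside the rewritten one (simplified to a flat list)."""
    analyses = [facts]
    for fact in list(facts):
        if fact[0] == "NUM" and fact[2] == "pl":
            analyses.append((facts - {fact}) | {("NUM", fact[1], "sg")})
    return analyses

facts = {("NTYPE", 1, "common"), ("NUM", 1, "pl"), ("PRED", 1, "girl")}
facts = add_spec_if_noun(facts)         # adds ("SPEC", 1, "def")
analyses = pluralize_optionally(facts)  # two alternatives: pl and sg
```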
Example rules

"PRS (1.0)"
grammar = toy_rules.

"obligatorily add a determiner if there is a noun with no spec"
+NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F,def).

"optionally make plural nouns singular; this will split the choice space"
NUM(%F, pl) ?=> NUM(%F, sg).

Input facts:
PERS(var(1),3)
PRED(var(1),girl)
CASE(var(1),nom)
NTYPE(var(1),common)
NUM(var(1),pl)
SUBJ(var(0),var(1))
PRED(var(0),laugh)
TNS-ASP(var(0),var(2))
TENSE(var(2),pres)
arg(var(0),1,var(1))
lex_id(var(0),1)
lex_id(var(1),0)
Example Obligatory Rule

"obligatorily add a determiner if there is a noun with no spec"
+NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F,def).

Input facts:
PERS(var(1),3)
PRED(var(1),girl)
CASE(var(1),nom)
NTYPE(var(1),common)
NUM(var(1),pl)
SUBJ(var(0),var(1))
PRED(var(0),laugh)
TNS-ASP(var(0),var(2))
TENSE(var(2),pres)
arg(var(0),1,var(1))
lex_id(var(0),1)
lex_id(var(1),0)

Output facts: all the input facts plus SPEC(var(1),def)
Example Optional Rule

"optionally make plural nouns singular; this will split the choice space"
NUM(%F, pl) ?=> NUM(%F, sg).

Input facts:
PERS(var(1),3)
PRED(var(1),girl)
CASE(var(1),nom)
NTYPE(var(1),common)
NUM(var(1),pl)
SPEC(var(1),def)
SUBJ(var(0),var(1))
PRED(var(0),laugh)
TNS-ASP(var(0),var(2))
TENSE(var(2),pres)
arg(var(0),1,var(1))
lex_id(var(0),1)
lex_id(var(1),0)

Output facts: all the input facts plus a choice split:
A1: NUM(var(1),pl)
A2: NUM(var(1),sg)
Output of example rules

Output is a packed f-structure
Generation gives two sets of strings
– The girls {laugh.|laugh!|laugh}
– The girl {laughs.|laughs!|laughs}
Manipulating sets

Sets are represented with an in_set feature
– He laughs in the park with the telescope
  ADJUNCT(var(0),var(2))
  in_set(var(4),var(2))
  in_set(var(5),var(2))
  PRED(var(4),in)
  PRED(var(5),with)
Might want to optionally remove adjuncts
– but not negation
Example Adjunct Deletion Rules

"optionally remove member of adjunct set"
+ADJUNCT(%%, %AdjSet), in_set(%Adj, %AdjSet), -PRED(%Adj, not) ?=> 0.

"obligatorily remove adjunct with nothing in it"
ADJUNCT(%%, %Adj), -in_set(%%, %Adj) ==> 0.

He laughs with the telescope in the park.
He laughs in the park with the telescope.
He laughs with the telescope.
He laughs in the park.
He laughs.
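The choice space these rules create can be counted with a toy sketch (an assumed simplification: each subset of deletable adjuncts is one alternative; the four f-structure variants below correspond to the five strings above because the two-adjunct variant can be realized with either PP order):

```python
from itertools import combinations

def adjunct_variants(adjuncts):
    """All analyses reachable by optionally deleting adjuncts;
    'not' is protected, mirroring the -PRED(%Adj, not) condition."""
    protected = [a for a in adjuncts if a == "not"]
    removable = [a for a in adjuncts if a != "not"]
    variants = []
    for r in range(len(removable) + 1):
        for combo in combinations(removable, r):
            variants.append(protected + list(combo))
    return variants

print(len(adjunct_variants(["in the park", "with the telescope"])))  # 4
```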
Manipulating PREDs

Changing the value of a PRED is easy
– PRED(%F,girl) ==> PRED(%F,boy).
Changing the argument structure is trickier
– Make any changes to the grammatical functions
– Make the arg facts correlate with these
Example Passive Rule

"make actives passive:
make the subject NULL; make the object the subject;
put in features"

SUBJ( %Verb, %Subj), arg( %Verb, %Num, %Subj),
OBJ( %Verb, %Obj), CASE( %Obj, acc)
==>
SUBJ( %Verb, %Obj), arg( %Verb, %Num, NULL), CASE( %Obj, nom),
PASSIVE( %Verb, +), VFORM( %Verb, pass).

the girls saw the monkeys ==> The monkeys were seen.
in the park the girls saw the monkeys ==> In the park the monkeys were seen.
Templates and Macros

Rules can be encoded as templates:
n2n(%Eng,%Frn) ::
  PRED(%F,%Eng), +NTYPE(%F,%%)
  ==> PRED(%F,%Frn).
@n2n(man, homme).
@n2n(woman, femme).

Macros encode groups of clauses/facts:
sg_noun(%F) :=
  +NTYPE(%F,%%), +NUM(%F,sg).
@sg_noun(%F), -SPEC(%F)
  ==> SPEC(%F,def).
Unresourced Facts

Facts can be stipulated in the rules and referred to
– Often used as a lexicon of information not encoded in the f-structure
For example, a list of days and months for the manipulation of dates:
|- day(Monday). |- day(Tuesday). etc.
|- month(January). |- month(February). etc.

+PRED(%F,%Pred), ( day(%Pred) | month(%Pred) ) ==> …
Rule Ordering

Rewrite rules are ordered (unlike LFG syntax rules, but like finite-state rules)
– Output of rule1 is input to rule2
– Output of rule2 is input to rule3
This allows for feeding and bleeding
– Feeding: insert facts used by later rules
– Bleeding: remove facts needed by later rules
Can make debugging challenging
Example of Rule Feeding

Early rule: insert SPEC on nouns
+NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F, def).

Later rule: allow plural nouns to become singular only if they have a specifier (to avoid bad count nouns)
NUM(%F,pl), +SPEC(%F,%%) ==> NUM(%F,sg).
Example of Rule Bleeding

Early rule: turn actives into passives (simplified)
SUBJ(%F,%S), OBJ(%F,%O) ==> SUBJ(%F,%O), PASSIVE(%F,+).

Later rule: impersonalize actives
SUBJ(%F,%%), -PASSIVE(%F,+) ==> SUBJ(%F,%S), PRED(%S,they), PERS(%S,3), NUM(%S,pl).

– will apply to intransitives and verbs with (X)COMPs, but not transitives
Debugging

XLE command line: tdbg
– steps through the rules, stating how they apply

Input: girls laughed

============================================
Rule 1: +(NTYPE(%F,A)), -(SPEC(%F,B)) ==> SPEC(%F,def)
File /tilde/thking/courses/ling187/hws/thk.pl, lines 4-10
Rule 1 matches: [+(2)] NTYPE(var(1),common)
1 --> SPEC(var(1),def)
============================================
Rule 2: NUM(%F,pl) ?=> NUM(%F,sg)
File /tilde/thking/courses/ling187/hws/thk.pl, lines 11-17
Rule 2 matches: [3] NUM(var(1),pl)
1 --> NUM(var(1),sg)
============================================
Rule 5: SUBJ(%Verb,%Subj), arg(%Verb,%Num,%Subj), OBJ(%Verb,%Obj), CASE(%Obj,acc) ==> SUBJ(%Verb,%Obj), arg(%Verb,%Num,NULL), CASE(%Obj,nom), PASSIVE(%Verb,+), VFORM(%Verb,pass)
File /tilde/thking/courses/ling187/hws/thk.pl, lines 28-37
Rule does not apply
Running the Rewrite System

create-transfer : adds menu items
load-transfer-rules FILE : loads rules from a file
The f-structure window under commands has:
– transfer : prints the output of the rules in the XLE window
– translate : runs the output through the generator

Need to do (where the path is $XLEPATH/lib):
setenv LD_LIBRARY_PATH /afs/ir.stanford.edu/data/linguistics/XLE/SunOS/lib
Rewrite Summary

The XLE rewrite system lets you manipulate the output of parsing
– Creates versions of the output suitable for applications
– Can involve significant reprocessing
Rules are ordered
Ambiguity management is as with parsing
Grammatical Machine Translation

Stefan Riezler & John Maxwell
Translation System

Source string
  -> XLE Parsing (German LFG)
  -> Source f-structures
  -> Transfer (translation rules)
  -> Target f-structures
  -> XLE Generation (English LFG)
  -> Target string
+ Lots of statistics
Transfer-Rule Induction from aligned bilingual corpora

1. Use standard techniques to find many-to-many candidate word alignments in source-target sentence pairs
2. Parse source and target sentences using LFG grammars for German and English
3. Select the most similar f-structures in source and target
4. Define many-to-many correspondences between substructures of the f-structures based on the many-to-many word alignment
5. Extract primitive transfer rules directly from aligned f-structure units
6. Create the powerset of possible combinations of basic rules and filter according to contiguity and type-matching constraints
Induction

Example sentences:
  Dafür bin ich zutiefst dankbar.
  I have a deep appreciation for that.
Many-to-many word alignment:
  Dafür{6 7} bin{2} ich{1} zutiefst{3 4 5} dankbar{5}
F-structure alignment:
Extracting Primitive Transfer Rules

Rule (1) maps lexical predicates
Rule (2) maps lexical predicates and interprets the subj-to-subj link as an indication to map the subj of the source with this predicate into the subject of the target, and the xcomp of the source into the object of the target

%X1, %X2, %X3, … are variables for f-structures

(1) PRED(%X1, ich) ==> PRED(%X1, I)

(2) PRED(%X1, sein), SUBJ(%X1,%X2), XCOMP(%X1,%X3) ==>
    PRED(%X1, have), SUBJ(%X1,%X2), OBJ(%X1,%X3)
Extracting Complex Transfer Rules

Complex rules are created by taking all combinations of primitive rules, and filtering

(4) zutiefst dankbar sein ==> have a deep appreciation
(5) zutiefst dankbar dafür sein ==> have a deep appreciation for that
(6) ich bin zutiefst dankbar dafür ==> I have a deep appreciation for that
Transfer Contiguity constraint

Transfer contiguity constraint:
1. Source and target f-structures each have to be connected
2. F-structures in the transfer source can only be aligned with f-structures in the transfer target, and vice versa
Analogous to the constraint on contiguous and alignment-consistent phrases in phrase-based SMT
Prevents extraction of a rule that would translate dankbar directly into appreciation, since appreciation is also aligned to zutiefst
Transfer contiguity allows learning idioms like es gibt - there is from configurations that are local in the f-structure but non-local in the string, e.g., es scheint […] zu geben - there seems […] to be
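The alignment-consistency half of the constraint can be sketched as follows (a simplified assumption — the full check also requires both f-structure sides to be connected). Using the alignment from the Dafür example (German positions 1-5, English positions 1-7):

```python
def consistent(src, tgt, links):
    """A source/target unit pair is extractable only if every
    alignment link is either entirely inside or entirely outside
    the pair -- no link may cross its boundary."""
    return all((i in src) == (j in tgt) for i, j in links)

# Dafür{6 7} bin{2} ich{1} zutiefst{3 4 5} dankbar{5} as link pairs:
links = {(1, 6), (1, 7), (2, 2), (3, 1), (4, 3), (4, 4), (4, 5), (5, 5)}

# dankbar (5) alone cannot pair with appreciation (5): zutiefst (4)
# is also linked to English position 5, so link (4, 5) crosses.
print(consistent({5}, {5}, links))           # False
print(consistent({4, 5}, {3, 4, 5}, links))  # True
```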
Linguistic Filters on Transfer Rules

Morphological stemming of PRED values
(Optional) filtering of f-structure snippets based on consistency of linguistic categories
– Extraction of the snippet that translates zutiefst dankbar into a deep appreciation maps incompatible categories, adjectival and nominal; valid in a string-based world
– The translation of sein to have might be discarded because of the adjectival vs. nominal types of their arguments
– The larger rule mapping zutiefst dankbar sein to have a deep appreciation is OK, since the verbal types match
Transfer

Parallel application of transfer rules in non-deterministic fashion
– Unlike the XLE ordered-rule rewrite system
Each fact must be transferred by exactly one rule
A default rule transfers any fact as itself
Transfer works on a chart, using the parser's unification mechanism for consistency checking
Selection of the most probable transfer output is done by beam decoding on the transfer chart
Generation

Bi-directionality allows us to use the same grammar for parsing the training data and for generation in the translation application
The generator has to be fault-tolerant in cases where the transfer system operates on a FRAGMENT parse or produces invalid f-structures from valid input f-structures
Robust generation from unknown (e.g., untranslated) predicates and from unknown f-structures
Robust Generation

Generation from unknown predicates:
– The unknown German word "Hunde" is analyzed by the German grammar to extract the stem (e.g., PRED = Hund, NUM = pl) and is then inflected using English default morphology ("Hunds")
Generation from unknown constructions:
– A default grammar that allows any attribute to be generated in any order is mixed in as a suboptimal option in the standard English grammar; e.g., if the SUBJ cannot be generated as a sentence-initial NP, it will be generated in any position as any category
  » an extension/combination of set-gen-adds and OT ranking
Statistical Models

1. Log-probability of source-to-target transfer rules, where the probability r(e|f) of a rule that transfers source snippet f into target snippet e is estimated by relative frequency:

$$r(e \mid f) = \frac{\mathrm{count}(f \rightarrow e)}{\sum_{e'} \mathrm{count}(f \rightarrow e')}$$

2. Log-probability of target-to-source transfer rules, estimated by relative frequency
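The relative-frequency estimate can be illustrated with toy counts (the extracted-rule pairs below are invented for the example):

```python
from collections import Counter

# Invented observations of (source snippet, target snippet) rule pairs:
observed = [("sein", "be"), ("sein", "have"), ("sein", "be")]
pair_counts = Counter(observed)
source_counts = Counter(f for f, _ in observed)

def r(e, f):
    """r(e|f) = count(f -> e) / sum over e' of count(f -> e')."""
    return pair_counts[(f, e)] / source_counts[f]

print(r("be", "sein"))  # 2/3
```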
Statistical Models, cont.

3. Log-probability of lexical translations l(e|f) from source to target snippets, estimated from Viterbi alignments a* between source word positions i = 1, …, n and target word positions j = 1, …, m for stems f_i and e_j in snippets f and e, with relative word-translation frequencies t(e_j|f_i):

$$l(e \mid f) = \prod_{j} \frac{1}{|\{i \mid (i,j) \in a^{*}\}|} \sum_{(i,j) \in a^{*}} t(e_j \mid f_i)$$

4. Log-probability of lexical translations from target to source snippets
Statistical Models, cont.

5. Number of transfer rules
6. Number of transfer rules with frequency 1
7. Number of default transfer rules
8. Log-probability of strings of predicates from the root to the frontier of the target f-structure, estimated from predicate trigrams in English f-structures
9. Number of predicates in the target f-structure
10. Number of constituent movements during generation, based on the original order of the head predicates of the constituents
Statistical Models, cont.

11. Number of generation repairs
12. Log-probability of the target string as computed by a trigram language model
13. Number of words in the target string
Experimental Evaluation

Experimental setup:
– German-to-English on the Europarl parallel corpus (Koehn '02)
– Training and evaluation on sentences of length 5-15, for quick experimental turnaround
– Resulting in a training set of 163,141 sentences, a development set of 1,967 sentences, and a test set of 1,755 sentences (used in Koehn et al. HLT '03)
– Improved bidirectional word alignment based on GIZA++ (Och et al. EMNLP '99)
– LFG grammars for German and English (Butt et al. COLING '02; Riezler et al. ACL '02)
– SRI trigram language model (Stolcke '02)
– Comparison with PHARAOH (Koehn et al. HLT '03) and IBM Model 4 as produced by GIZA++ (Och et al. EMNLP '99)
Experimental Evaluation, cont.

Around 700,000 transfer rules extracted from f-structures chosen by a dependency similarity measure
The system operates on n-best lists of parses (n=1), transferred f-structures (n=10), and generated strings (n=1,000)
Selection of the most probable translations in two steps:
– Most probable f-structure by beam search (n=20) on the transfer chart using features 1-10
– Most probable string selected from the strings generated from the selected n-best f-structures using features 11-13
Feature weights for the modules trained by MER on 750 in-coverage sentences of the development set
Automatic Evaluation

NIST scores (ignoring punctuation) & Approximate Randomization for significance testing (see above)
44% in coverage of the grammars; 51% FRAGMENT parses and/or generation repair; 5% timeouts
– In coverage: the difference between LFG and P is not significant
– Suboptimal robustness techniques decrease overall quality

|               | M4    | LFG   | P     |
|---------------|-------|-------|-------|
| in-coverage   | 5.13  | *5.82 | *5.99 |
| full test set | *5.57 | *5.62 | 6.40  |
Manual Evaluation

Closer look at in-coverage examples:
– Random selection of 500 in-coverage examples
– Two independent judges indicated a preference for LFG or PHARAOH, or equality, in a blind test
– Separate evaluation under the criteria of grammaticality/fluency and translational/semantic adequacy
– Significance assessed by Approximate Randomization via stratified shuffling of preference ratings between systems
Manual Evaluation

Result differences on agreed-on ratings are statistically significant at p < 0.0001
Net improvement in translational adequacy on agreed-on examples is 11.4% on 500 sentences (57/500), amounting to a 5% overall improvement in the hybrid system (44% of 11.4%)
Net improvement in grammaticality on agreed-on examples is 15.4% on 500 sentences, amounting to a 6.7% overall improvement in the hybrid system

| j1\j2 | adequacy: P | adequacy: LFG | adequacy: eq | grammaticality: P | grammaticality: LFG | grammaticality: eq |
|-------|----|-----|-----|----|-----|-----|
| P     | 48 | 8   | 7   | 36 | 2   | 9   |
| LFG   | 10 | 105 | 18  | 6  | 113 | 17  |
| equal | 53 | 60  | 192 | 51 | 44  | 223 |
Examples: LFG > PHARAOH

src: in diesem fall werde ich meine verantwortung wahrnehmen
ref: then i will exercise my responsibility
LFG: in this case i accept my responsibility
P: in this case i shall my responsibilities

src: die politische stabilität hängt ab von der besserung der lebensbedingungen
ref: political stability depends upon the improvement of living conditions
LFG: the political stability hinges on the recovery the conditions
P: the political stability is rejects the recovery of the living conditions
Examples: PHARAOH > LFG
src: das ist schon eine seltsame vorstellung von gleichheit
ref: a strange notion of equality
LFG: equality that is even a strange idea
P: this is already a strange idea of equality
src: frau präsidentin ich beglückwünsche herrn nicholson zu seinem ausgezeichneten bericht
ref: madam president I congratulate mr nicholson on his excellent report
LFG: madam president I congratulate mister nicholson on his report excellented
P: madam president I congratulate mr nicholson for his excellent report
![Page 64: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/64.jpg)
Discussion

High percentage of out-of-coverage examples
– Accumulation of 2 x 20% error rates in parsing training data
– Errors in rule extraction
– Together result in ill-formed transfer rules causing high number of generation failures/repairs
Propagation of errors through the system also for in-coverage examples
– Error analysis: 69% transfer errors, 10% due to parse errors

Discrepancy between NIST and manual evaluation
– Suboptimal integration of the generator, making training and translation with large n-best lists infeasible
– Language and distortion models applied after generation
![Page 65: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/65.jpg)
Conclusion
Integration of grammar-based generator into dependency-based SMT system achieves state-of-the-art NIST and improved grammaticality and adequacy on in-coverage examples
A hybrid system is possible, since it can be determined whether a sentence is within the coverage of the system
![Page 66: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/66.jpg)
Grammatical Machine Translation II
Ji Fang, Martin Forst, John Maxwell, and Michael Tepper
![Page 67: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/67.jpg)
Overview of different approaches to MT

| Approach | Level of transfer | Transfer | Disambiguation |
|---|---|---|---|
| “Traditional” MT (e.g. Systran) | String (with minimal analysis) | Mainly hand-developed rules | Heuristics |
| Statistical MT (e.g. Google) | String (morphological analysis, syntactic rearrangements) | Phrase correspondences with statistics acquired on bitexts | Machine-learned (transfer probabilities, LM) |
| Grammatical MT I (2006) | F-structure | Term-rewriting rules with statistics, induced from parsed bitexts | Machine-learned (ME models, LM) |
| Context-Based MT (Meaningful Machines) | String | Semi-automatically developed phrase pairs | Machine-learned (LM) |
| Grammatical MT II (2008) | F-structure | Term-rewriting rules without statistics, induced from semi-automatically developed phrase pairs, potentially bitexts | Machine-learned (ME models, LM) |
![Page 68: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/68.jpg)
Limitations of string-based approaches

Transfer rules/correspondences of little generality
Problems with long-distance dependencies
Perform less well for morphologically rich (target) languages
N-gram LM-based disambiguation seems to have leveled out
![Page 69: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/69.jpg)
Limitations of string-based approaches - little generality

From Europarl: Das tut mir leid. = I’m sorry [about that].
Google (SMT): I’m sorry. Perfect! But: as soon as the input changes a bit, we get garbage.

Das tut ihr leid. ‘She is sorry about that.’ → It does their suffering.
Der Tod deines Vaters tut mir leid. ‘I am sorry about the death of your father.’ → The death of your father I am sorry.
Der Tod deines Vaters tut ihnen leid. ‘They are sorry about the death of your father.’ → The death of your father is doing them sorry.
![Page 70: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/70.jpg)
Limitations of string-based approaches - problems with LDDs

From Europarl: Dies stellt eine der großen Herausforderungen für die französische Präsidentschaft dar. = This is one of the major issues of the French Presidency.
Google (SMT): This is one of the major challenges for the French presidency represents.
The particle verb is identified and translated correctly
But: the two verbs are ungrammatical; they seem to be too far apart to be filtered out by the LM
![Page 71: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/71.jpg)
Limitations of string-based approaches - rich morphology

Language pairs involving morphologically rich languages, e.g. Finnish, are hard
From Koehn (2005, MT Summit)
![Page 72: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/72.jpg)
Limitations of string-based approaches - rich morphology

Morphologically rich, free word order languages, e.g. German, are particularly hard as target languages.
Again from Koehn (2005, MT Summit)
![Page 73: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/73.jpg)
Limitations of string-based approaches - n-gram LMs

Even for morphologically poor languages, improving n-gram LMs becomes increasingly expensive.
Adding data helps improve translation quality (BLEU scores), but not enough.
Assuming the best improvement rate observed in Brants et al. (2007), ~400 million times the available data would be needed to attain human translation quality by LM improvement alone.
![Page 74: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/74.jpg)
Limitations of string-based approaches - n-gram LMs

From Brants et al. (2007)
Best improvement rate: +0.7 BLEU points per doubling of data
Would need 40 more doublings to reach human translation quality (42 + 0.7 × 40 ≈ 70)
Necessary training data in tokens: 1e22 (1e10 × 2^40 ≈ 1e22)
That is ~4e8 times the current English web (estimate: 2.5e13 × 4e8 = 1e22)
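The extrapolation can be reproduced in a few lines (a back-of-the-envelope sketch; the constants are the rough figures quoted on the slide):

```python
current_bleu = 42.0        # approximate current system score
human_bleu = 70.0          # rough BLEU level of human translation
gain_per_doubling = 0.7    # best rate observed in Brants et al. (2007)

# Doublings of training data needed at that (optimistic) rate.
doublings = (human_bleu - current_bleu) / gain_per_doubling   # = 40
tokens_now = 1e10          # current LM training data in tokens
tokens_needed = tokens_now * 2 ** doublings                   # ≈ 1.1e22
web_size = 2.5e13          # rough token count of the English web
print(round(doublings), f"{tokens_needed:.1e}", f"{tokens_needed / web_size:.0e}")
```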
![Page 75: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/75.jpg)
Limitations of bitext-based approaches

Generally available bitexts are limited in size and specialized in genre
– Parliament proceedings
– UN texts
– Judiciary texts (from multilingual countries)
Makes it hard to repurpose bitext-based systems to new genres
Induced transfer rules/correspondences often of mediocre quality
– “Loose” translations
– Bad alignments
![Page 76: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/76.jpg)
Limitations of bitext-based approaches - availability and quality

Readily available bitexts are limited in size and specialized in genre
Approaches to auto-extracting bitexts from the web exist.
Additional data help to some degree, but then the effect levels out.
– Still a genre bias in bitexts, despite automatic acquisition?
– Still more general problems with alignment quality etc.?
![Page 77: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/77.jpg)
Limitations of bitext-based approaches - availability and quality

Much more data needed to attain human translation quality
Logarithmic gains (at best) by adding bitext data
From Munteanu & Marcu (2005)
Baseline: 100K - 95M English words
Mid line (+auto): + 90K - 2.1M
Top line (+oracle): + 90K - 2.1M
![Page 78: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/78.jpg)
Context-Based MT / Meaningful Machines

Combines example-based MT (EBMT) and SMT
Very large (target) language model; large amount of monolingual text required
No transfer statistics, thus no parallel text required
Translation lexicon is developed semi-automatically (i.e. hand-validated)
Lexicon has slotted phrase pairs (like EBMT), e.g. “NP1 biss ins Gras.” = “NP1 bit the dust.”
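A slotted phrase pair of this kind can be pictured as a pair of templates sharing named slots; the sketch below is purely illustrative and not the actual Meaningful Machines lexicon format (PhrasePair and its fields are invented names):

```python
from dataclasses import dataclass

@dataclass
class PhrasePair:
    src: str     # source template with named slots
    tgt: str     # target template reusing the same slot names
    slots: dict  # slot name -> category constraint on the filler

entry = PhrasePair(
    src="{NP1} biss ins Gras.",
    tgt="{NP1} bit the dust.",
    slots={"NP1": "NP"},
)

def instantiate(pair: PhrasePair, fillers: dict) -> str:
    """Fill the target template with (already translated) slot fillers."""
    return pair.tgt.format(**fillers)

print(instantiate(entry, {"NP1": "The old dog"}))  # The old dog bit the dust.
```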
![Page 79: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/79.jpg)
Context-Based MT / Meaningful Machines - pros

High-quality translation lexicon seems to allow for
– Easier repurposing of system(s) to new genres
– Better translation quality
From Carbonell (2006)
![Page 80: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/80.jpg)
Context-Based MT / Meaningful Machines - cons

Works really well for English-Spanish. How about other language pairs?
Same problems with n-gram LMs as “traditional” SMT; probably affects pairs involving a morphologically rich (target) language particularly badly.
How much manual labor is involved in the development of the translation lexicon?
Computationally expensive
![Page 81: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/81.jpg)
Grammatical Machine Translation

Syntactic transfer-based approach
Parsing and generation identical/similar between GMT I and GMT II

[Translation pyramid: parse source and score f-structures (ascending side); transfer and score target FSs via f-structure transfer rules (top); generate and pick the best realization (descending side); string-level statistical methods along the base]
![Page 82: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/82.jpg)
Grammatical Machine Translation: GMT I vs. GMT II

GMT I
– Transfer rules induced from parsed bitexts
– Target f-structures ranked using individual transfer rule statistics

GMT II
– Transfer rules induced from manually/semi-automatically constructed phrase lexicon
– Target f-structures ranked using monolingually trained bilexical dependency statistics and general transfer rule statistics
![Page 83: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/83.jpg)
GMT II

Where do the transfer rules come from? Where do statistics/machine learning come in?

[Translation pyramid, annotated:
– parse source, score f-structures: log-linear model trained on a syntactically annotated monolingual corpus
– transfer, score target FSs: log-linear model trained on bitext data; includes the score from the parse ranking model and very general transfer features
– generate, pick best realization: log-linear model trained on bitext data; includes the scores from the other two models and the features/score of a monolingually trained model for realization ranking
– f-structure transfer rules: induced from manually/semi-automatically compiled phrase pairs with “slots”; potentially, but not necessarily, from bitexts]
![Page 84: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/84.jpg)
GMT II - The phrase dictionary

Contains phrase pairs with “slot” categories (Ddeff, Ddef, NP1nom, NP1, etc.) that allow for well-formed phrases without being included in induced rules
Currently hand-written
Will hopefully be compiled (semi-)automatically from bilingual dictionaries
Bitexts might also be used; how exactly remains to be defined.
![Page 85: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/85.jpg)
GMT II - Rule induction from the phrase dictionary

Sub-FSs of “slot” variables are not included
FS attributes can be defined as irrelevant for translation, e.g. CASE (in both en and de), GEND (in de). Attributes so defined are never included in induced rules.
  set-gen-adds remove CASE GEND
FS attributes can be defined as “remove_equal_features”. Attributes defined as such are not included in induced rules when they are equal.
  set remove_equal_features NUM OBJ OBL-AG PASSIVE SUBJ TENSE
→ more general rules
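The effect of these two settings on rule induction can be sketched as a filter over attribute-value facts (a toy illustration; filter_facts and the dict-based facts are invented, and the real induction operates on XLE f-structure terms):

```python
# Attributes never included in induced rules (cf. set-gen-adds remove ...)
ALWAYS_REMOVE = {"CASE", "GEND"}
# Attributes dropped only when source and target values agree
REMOVE_IF_EQUAL = {"NUM", "OBJ", "OBL-AG", "PASSIVE", "SUBJ", "TENSE"}

def filter_facts(src_facts: dict, tgt_facts: dict):
    """Return the (source, target) attribute facts an induced rule keeps."""
    keep_src, keep_tgt = {}, {}
    for feats, keep in [(src_facts, keep_src), (tgt_facts, keep_tgt)]:
        for attr, val in feats.items():
            if attr in ALWAYS_REMOVE:
                continue  # irrelevant for translation
            if attr in REMOVE_IF_EQUAL and src_facts.get(attr) == tgt_facts.get(attr):
                continue  # equal on both sides, so omitted
            keep[attr] = val
    return keep_src, keep_tgt

src = {"PRED": "Verfassung", "CASE": "nom", "GEND": "fem", "NUM": "sg"}
tgt = {"PRED": "constitution", "NUM": "sg"}
print(filter_facts(src, tgt))
# ({'PRED': 'Verfassung'}, {'PRED': 'constitution'})
```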
![Page 86: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/86.jpg)
GMT II - Rule induction from the phrase dictionary (noun)

Ddeff Verfassung = Ddef constitution

PRED(%X1, Verfassung),
NTYPE(%X1, %Z2),
NSEM(%Z2, %Z3),
COMMON(%Z3, count),
NSYN(%Z2, common)
==>
PRED(%X1, constitution),
NTYPE(%X1, %Z4),
NSYN(%Z4, common).
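Read the rule as: match the left-hand facts on the source f-structure, delete them, and assert the right-hand facts, with %X1 shared between both sides. A toy version over facts represented as (attribute, node, value) triples (illustrative only; XLE's transfer component handles variables and rule sets far more generally):

```python
# Source f-structure for "Verfassung" as (attribute, node, value) facts.
facts = {
    ("PRED", "f1", "Verfassung"),
    ("NTYPE", "f1", "f2"),
    ("NSEM", "f2", "f3"),
    ("COMMON", "f3", "count"),
    ("NSYN", "f2", "common"),
}

def rewrite_verfassung(facts):
    """Apply the Verfassung -> constitution rule if its left side matches."""
    pred = next((f for f in facts if f[0] == "PRED" and f[2] == "Verfassung"), None)
    if pred is None:
        return facts                      # rule does not apply
    x1 = pred[1]                          # the shared variable %X1
    ntype = next(f for f in facts if f[0] == "NTYPE" and f[1] == x1)
    nsem = next(f for f in facts if f[0] == "NSEM" and f[1] == ntype[2])
    consumed = {pred, ntype, nsem,
                ("COMMON", nsem[2], "count"), ("NSYN", ntype[2], "common")}
    # Replace the matched facts by the (smaller) target-side facts.
    return (facts - consumed) | {
        ("PRED", x1, "constitution"),     # %X1 carried over
        ("NTYPE", x1, "g1"),              # fresh node for %Z4
        ("NSYN", "g1", "common"),
    }

out = rewrite_verfassung(facts)
```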
![Page 87: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/87.jpg)
GMT II - Rule induction from the phrase dictionary (adjective)

europäische = European

PRED(%X1, europäisch) ==> PRED(%X1, European).

To accommodate certain non-parallelisms with respect to SUBJs of adjectives etc., a special mechanism removes SUBJs of non-verbs and makes them addable in generation.
![Page 88: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/88.jpg)
GMT II - Rule induction from the phrase dictionary (verb)

NP1nom koordiniert NP2acc. = NP1 coordinates NP2.

PRED(%X1, koordinieren),
arg(%X1, 1, %A2),
arg(%X1, 2, %A3),
VTYPE(%X1, main)
==>
PRED(%X1, coordinate),
arg(%X1, 1, %A2),
arg(%X1, 2, %A3),
VTYPE(%X1, main).
![Page 89: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/89.jpg)
GMT II - Rule induction (argument switching)

NP1nom tut NP2dat leid. = NP2 is sorry about NP1.

PRED(%X1, leid#tun),
SUBJ(%X1, %A2),
OBJ-TH(%X1, %A3),
VTYPE(%X1, main)
==>
PRED(%X1, be),
SUBJ(%X1, %A3),
XCOMP-PRED(%X1, %Z1),
PRED(%Z1, sorry),
OBL(%Z1, %Z2),
PRED(%Z2, about),
OBJ(%Z2, %A2),
VTYPE(%X1, copular).
![Page 90: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/90.jpg)
GMT II - Rule induction (head switching)

Ich versuche nur, mich jeder Demagogie zu enthalten. = It is just that I am trying not to indulge in demagoguery.

NP1nom Vfin nur. = It is just that NP1 Vs.

+ADJUNCT(%X1,%Z2), in_set(%X3,%Z2), PRED(%X3,nur), ADV-TYPE(%X3,unspec)
==>
PRED(%Z4,be), SUBJ(%Z4,%X3), NTYPE(%X3,%Z5), NSYN(%Z5,pronoun), GEND-SEM(%Z5,nonhuman), HUMAN(%Z5,-), NUM(%Z5,sg), PERS(%Z5,3), PRON-FORM(%Z5,it), PRON-TYPE(%Z5,expl_), arg(%Z4,1,%Z6), PRED(%Z6,just), SUBJ(%Z6,%Z7), arg(%Z6,1,%A1), COMP-FORM(%A1,that), COMP(%Z6,%A1), nonarg(%Z6,1,%Z7), ATYPE(%Z6,predicative), DEGREE(%Z6,positive), nonarg(%Z4,1,%X3), TNS-ASP(%Z4,%Z8), MOOD(%Z8,indicative), TENSE(%Z8,pres), XCOMP-PRED(%Z4,%Z6), CLAUSE-TYPE(%Z4,decl), PASSIVE(%Z4,-), VTYPE(%A2,copular).
![Page 91: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/91.jpg)
GMT II - Rule induction (more on head switching)

In addition to rewriting terms, the system re-attaches the rewritten FS if necessary. Here, this might be the case for %X1.

+ADJUNCT(%X1,%Z2), in_set(%X3,%Z2), PRED(%X3,nur), ADV-TYPE(%X3,unspec)
==>
PRED(%Z4,be), SUBJ(%Z4,%X3), NTYPE(%X3,%Z5), NSYN(%Z5,pronoun), GEND-SEM(%Z5,nonhuman), HUMAN(%Z5,-), NUM(%Z5,sg), PERS(%Z5,3), PRON-FORM(%Z5,it), PRON-TYPE(%Z5,expl_), arg(%Z4,1,%Z6), PRED(%Z6,just), SUBJ(%Z6,%Z7), arg(%Z6,1,%A1), COMP-FORM(%A1,that), COMP(%Z6,%A1), nonarg(%Z6,1,%Z7), ATYPE(%Z6,predicative), DEGREE(%Z6,positive), nonarg(%Z4,1,%X3), TNS-ASP(%Z4,%Z8), MOOD(%Z8,indicative), TENSE(%Z8,pres), XCOMP-PRED(%Z4,%Z6), CLAUSE-TYPE(%Z4,decl), PASSIVE(%Z4,-), VTYPE(%A2,copular).
![Page 92: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/92.jpg)
GMT II - Pros and cons of rule induction from a phrase dictionary

Development of phrase pairs can be carried out by someone with little knowledge of the grammar and transfer system; manual development of transfer rules would require experts (for boring, repetitive labor).
Phrase pairs can remain stable while grammars keep evolving. Since transfer rules are induced fully automatically, they can easily be kept in sync with grammars.
Induced rules are of much higher quality than rules induced from parsed bitexts (GMT I).
Although there is hope that phrase pairs can be constructed semi-automatically from bilingual dictionaries, it is not yet clear to what extent this can be automated.
If rule induction from parsed bitexts can be improved, the two approaches might well be complementary.
![Page 93: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/93.jpg)
Lessons Learned for Parallel Grammar Development

Absence of a feature like PERF=+/- is not equivalent to PERF=-.
FS-internal features should not say anything about the function of the FS
– Example: PRON-TYPE=poss instead of PRON-TYPE=pers
Compounds should be analyzed similarly, whether spelt together (de) or apart (en)
– Possible with SMOR
– Very hard or even impossible with DMOR
![Page 94: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/94.jpg)
Absence of PERF ≠ PERF=-
![Page 95: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/95.jpg)
No function info in FS-internal features

I think NP1 Vs. = In my opinion NP1 Vs.
![Page 96: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/96.jpg)
Parallel analysis of compounds
![Page 97: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/97.jpg)
More Lessons Learned for Parallel Grammar Development

ParGram needs to agree on a parallel PRED value for (personal) pronouns
We need an “interlingua” for numbers, clock times, dates, etc.
Guessers should analyze (composite) names similarly
![Page 98: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/98.jpg)
Parallel PRED values for (personal) pronouns

Otherwise the number of rules we have to learn for them explodes.
de-en: pro/er → he, pro/er → it, pro/sie → she, pro/sie → it, pro/es → it, pro/es → he, pro/es → she
Also: the PRED-NUM-PERS combination may make no sense! Result: a lot of generator effort for nothing…
en-de: he → pro/er, she → pro/sie, it → pro/es, it → pro/er, it → pro/sie, …
![Page 99: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/99.jpg)
Interlingua for numbers, clock times, dates, etc.

We cannot possibly learn transfer rules for all dates.
![Page 100: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/100.jpg)
Guessed (composite) names

We cannot possibly learn transfer rules for all proper names in this world.
![Page 101: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/101.jpg)
And Yet More Lessons Learned for Grammar Development

Reflexive pronouns: PERS and NUM agreement should be ensured via inside-out function application, e.g. ((SUBJ ^) PERS) = (^ PERS).
Semantically relevant features should not be hidden in CHECK
![Page 102: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/102.jpg)
Reflexive pronouns

Introduce their own values for PERS and NUM
– Overgeneration: *Ich wasche sich.
– NUM ambiguity for (frequent) “sich”
– Less generalization possible in transfer rules for inherently reflexive verbs: 6 rules necessary instead of 1.
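The recommended inside-out treatment can be mimicked on a toy f-structure: instead of the reflexive contributing its own PERS/NUM, the values are copied from the enclosing SUBJ (resolve_reflexive and the dict encoding are invented for illustration):

```python
# Toy f-structure for a clause like "Ich wasche mich": a SUBJ and a
# reflexive OBJ that carries no PERS/NUM of its own.
clause = {
    "SUBJ": {"PRED": "pro", "PERS": 1, "NUM": "sg"},   # "ich"
    "OBJ":  {"PRED": "pro", "PRON-TYPE": "refl"},      # "mich"/"sich"
}

def resolve_reflexive(clause):
    """Copy PERS/NUM from the enclosing SUBJ onto the reflexive,
    mimicking ((SUBJ ^) PERS) = (^ PERS), rather than letting the
    pronoun introduce its own (possibly conflicting) values."""
    obj = clause["OBJ"]
    if obj.get("PRON-TYPE") == "refl":
        obj["PERS"] = clause["SUBJ"]["PERS"]
        obj["NUM"] = clause["SUBJ"]["NUM"]
    return clause

resolved = resolve_reflexive(clause)
print(resolved["OBJ"])
```

With agreement enforced this way, *Ich wasche sich is ruled out because the reflexive cannot carry third-person features under a first-person SUBJ.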
![Page 103: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/103.jpg)
Reflexive pronouns
![Page 104: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/104.jpg)
Semantically relevant features in CHECK

sie = they; Sie = you (formal)
Since CHECK features are not used for translation, the distinction between “sie” and “Sie” is lost.
![Page 105: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/105.jpg)
Planned experiments - Motivation

We do not have the resources to develop a “general purpose” phrase dictionary in the short or medium term.
Nevertheless, we want to get an idea of how well our new approach may scale.
![Page 106: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/106.jpg)
Planned Experiments 1

Manually develop a phrase dictionary for a few hundred Europarl sentences
Train the target FS ranking model and realization ranking model on those sentences
Evaluate output in terms of BLEU, NIST, and manually
Can we make this new idea work under ideal conditions? It seems we can.
![Page 107: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/107.jpg)
Planned Experiments 2

Manually develop a phrase dictionary for a few hundred Europarl sentences
Use a bilingual dictionary to add possible phrase pairs that may distract the system
Train the target FS ranking model and realization ranking model on those sentences
Evaluate output in terms of BLEU, NIST, and manually
How well can our system deal with the “distractors”?
![Page 108: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/108.jpg)
Planned Experiments 3

Manually develop a phrase dictionary for a few hundred Europarl sentences
Use a bilingual dictionary to add possible phrase pairs that may distract the system
Degrade the phrase dictionary at various levels of severity
– Take out a certain percentage of phrase pairs
– Shorter phrases may be penalized less than longer ones
Train the target FS ranking model and realization ranking model on those sentences
Evaluate output in terms of BLEU, NIST, and manually
How good or bad is the output of the system when the bilingual phrase dictionary lacks coverage?
![Page 109: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/109.jpg)
Main Remaining Challenges

Get a comprehensive and high-quality dictionary of phrase pairs
Get more and better (i.e. more normalized and parallel) analyses from grammars
Improve ranking models, in particular on the source side
Improve generation behavior of grammars - so far, grammar development has mostly been “parsing-oriented”
Efficiency, in particular on the generation side, e.g. packed transfer and generation
![Page 110: Linguistics 187/287 Week 6](https://reader030.fdocuments.net/reader030/viewer/2022033019/568144ff550346895db1cb78/html5/thumbnails/110.jpg)