Co-funded by the European Union
Information Access through Textual Entailment: The Experience of the QALL-ME Project
Bernardo Magnini, FBK-irst, Trento, Italy
First Kyoto Workshop, February 3, 2009
Outline
• The QALL-ME scenario
• Semantic interpretation of user queries. Suggested direction: textual entailment engines
• Interacting with the user. Suggested direction: provide answers with as much structure as possible (RDF)
• Porting the system. Suggested direction: learn as much as possible from data (user questions)
• Conclusions
QALL-ME: Question Answering Learning Technologies in a Multilingual and Multimodal Environment
Reference: FP6 IST-033860. Contract type: STREP. Start date: October 1st, 2006. Duration: 36 months. Project funding: 2.82 M euros.
http://qallme.itc.it
Partners:
• FBK-irst, Italy
• DFKI, Germany
• University of Alicante, Spain
• University of Wolverhampton, UK
• Comdata S.p.A., Italy
• Ubiest S.p.A., Italy
• Waycom S.r.l., Italy
Query Driven vs Answer Driven Information Access
• "How many people live in Trento?": no answer in the first ten documents returned by Google.
• "When did Hitler attack the Soviet Union?": we find documents containing the question itself, whether or not the answer is actually provided.
Current information access is query driven. Question Answering proposes an answer driven approach to information access.
See how Google and Yahoo answer "Who is Bill Clinton?"
QALL-ME Scenario
[Diagram: input and output channels of the system: SMS, MMS, voice, text, video, digital assistant.]
Mobile devices: mobile phones and PDAs
Question input: voice / SMS
Answer output: voice / SMS / MMS / digital assistant (images, audio, video, maps and geo-referenced interactive maps)
QALL-ME Requests
From the QALL-ME benchmark:
"hallo I am in Trento and I would like to visit a church in the centre of the town I would like to know the name and the location of one of these churches thanks"
Speech acts annotated in the request:
• To greet: "hallo"
• To contextualise: "I am in Trento" (this is explicit context; time is implicit)
• To ask: "I would like to know the name and the location of one of these churches"
• To thank: "thanks"
QALL-ME Resources
QALL-ME benchmark: acquisition for four languages (about 12,000 requests in total). Semantic annotations: transcriptions, speech acts, EAT, translations.

          audio  transcr.  Eng. translat.  speech acts  EAT Sekine       EAT ontology
ITALIAN   X      X         X               X            X                X
SPANISH   X      X         X               X            X                X
ENGLISH   X      X         ---             X            almost finished  almost finished
GERMAN    X      X         X               X            in progress      in progress

QALL-ME Ontology: version 4.
Both the QALL-ME benchmark and the QALL-ME ontology are being made incrementally available at the project website (http://qallme.fbk.eu) under a Creative Commons license. Two papers at LREC 2008.
QALL-ME Mobile Infrastructure (Waycom S.r.l., demo prototype)
[Architecture diagram: a front-end application uses a client library with a virtual phone engine; ASR and TTS managers drive the ASR/TTS engines through resource interfaces (German/English); voice data and API calls travel over IP to the server-side application, which exposes the QALL-ME web services and the application data (e.g. town: TRENTO, address: VIA VERDI 3).]
Showcases
• Cinema and accommodation domain (Trento): automatic procedures for daily updating, distributed services, cross-language, more complex questions
• Mobile showcase: infrastructure has been consolidated, runs on a Comdata server, Nokia N95 with GPS, speech input (Italian only), cross-language (SMS only), navigation, text to speech
QALL-ME Architecture
[Diagram: speech recognizers map the question into a shared semantic representation; the QALL-ME central QA planner dispatches it to language-specific answer extractors (Italian, Spanish, German, English) over local information sources, supported by the question type ontology, the answer type ontology, dialog models, and service providers.]
Structured and Unstructured Data

MOVIE
title:           007 Casino Royale
Date:            from 01/26/2007 to 02/01/2007
Hours:           from 01/26 to 01/30: 19.30; 02/01: 18.20
Original title:  Casino Royale
Director:        Martin Campbell
Genre:           Action
Characters:      James Bond
QALL-ME in a Nutshell
[Diagram: user questions are collected and annotated against the QALL-ME ontology; the annotated data train the entailment engine; at run time the engine maps an input question to an answer representation, which a presentation template turns into the output.]
QALL-ME Architecture
Outline
• The QALL-ME scenario
• Semantic interpretation of user queries. Suggested direction: entailment engine
• Presenting information
• How to build the system
• Conclusions
Question Interpretation (entailment-based Relation Extraction)
Given:
1. A domain ontology describing binary relations of interest
2. A natural language question
Determine ALL the relations of interest expressed by the question. Questions that express no relation of the ontology (e.g. Q9 among questions Q1–Q10 in the example) are out-of-domain questions.
The task: example

INPUT: "What science fiction movie can I see today at cinema Astra in Trento?"

Relations of interest:
R1: HasDirector(Movie, Director)
R2: HasGenre(Movie, Genre)
R3: HasPhoneNumber(Cinema, Phone)
R4: HasActor(Movie, Actor)
R5: IsInDestination(Cinema, Destination)
R6: HasDescription(Movie, Description)
R7: IsInSite(Movie, Site)
R8: HasDate(Movie, Date)
…
Rn: HasStartTime(Movie, StartTime)

OUTPUT: R2, R5, R7, R8
• R2: HasGenre(Movie, Genre), expressed by "science fiction"
• R5: IsInDestination(Cinema, Destination), expressed by "in Trento"
• R7: IsInSite(Movie, Site), expressed by "at cinema Astra"
• R8: HasDate(Movie, Date), expressed by "today"
Textual Entailment
t: The technological triumph known as GPS … was incubated in the mind of Ivan Getting.
h: Ivan Getting invented the GPS.
(TE tutorial at ACL 2007: Dagan, Roth, Zanzotto)
Applied Textual Entailment
A directional relation between two text fragments, Text (t) and Hypothesis (h):
t entails h (t ⇒ h) if humans reading t will infer that h is most likely true.
Operational (applied) definition:
• Human gold standard, as in NLP applications
• Assuming common background knowledge, which is indeed expected from applications
(TE tutorial at ACL 2007: Dagan, Roth, Zanzotto)
Distance-Based TE Engine
• Determines the best (least costly) sequence of edit operations that transforms t into h: linear distance or tree edit distance.
• Determines the cost of the three edit operations (insertion, deletion, substitution).
• Each rule has a probability representing the degree of confidence in the rule. Rules can operate at different levels (e.g. lexical, syntactic).
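The distance-based idea can be sketched as a token-level edit distance with asymmetric costs: deleting extra material from t is cheap, inserting content that t does not supply is expensive. This is only an illustrative sketch, not the project's actual EDITS engine; the cost values and the entailment threshold are assumptions.

```python
def entailment_cost(t, h, ins=1.0, dele=0.1, sub=1.0):
    """Cost of transforming the text t into the hypothesis h.
    Deletions (dropping extra material from t) are cheap; insertions
    (adding content t does not supply) are expensive."""
    t, h = t.lower().split(), h.lower().split()
    n, m = len(t), len(h)
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = d[i - 1][0] + dele
    for j in range(1, m + 1):
        d[0][j] = d[0][j - 1] + ins
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = 0.0 if t[i - 1] == h[j - 1] else sub
            d[i][j] = min(d[i - 1][j] + dele,       # delete from t
                          d[i][j - 1] + ins,        # insert into t
                          d[i - 1][j - 1] + match)  # keep / substitute
    return d[n][m] / max(m, 1)  # normalise by hypothesis length

def entails(t, h, threshold=0.5):
    return entailment_cost(t, h) < threshold

entails("give me the address of cinema Astra in Trento",
        "give me the address of cinema Astra")  # True: only cheap deletions
```

A real engine would add entailment rules (lexical, syntactic) so that, for instance, "located" can match "address" despite the surface mismatch.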
Entailment-based QA over structured data

Pattern repository (each natural language pattern Pi is paired with a SPARQL query PiSPARQL):
P1: What is the telephone number of Cinema:X? → P1SPARQL
P2: Who is the director of Movie:X? → P2SPARQL
P3: What is the ticket price of Cinema:X? → P3SPARQL
P4: Give me the address of Cinema:X. → P4SPARQL
…
Pn → PnSPARQL

The entailment engine checks which stored pattern is entailed by the input question. For Q: "Where is cinema Astra located?", Q ⇒ P4, so the SPARQL query associated with P4 is instantiated and executed:

CONSTRUCT ?address
WHERE {
  ?cinema rdf:type tourism:Cinema .
  ?cinema tourism:name "Astra" .
  ?cinema tourism:hasPostalAddress ?addr .
  ?addr tourism:street ?address
}

Answer: A: Corso Buonarroti, 16 - Trento

The same pattern P4 is matched, and the same answer returned, for variants of the question:
• "What's the address of Astra?"
• "Where can I find a cinema in the city centre?"
• "I want to see a movie at Astra. Where is it?"
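The pattern-repository lookup described above can be sketched as follows. The `Pattern` class, the `PATTERNS` contents, and the crude token-overlap score are illustrative assumptions (a real engine would use the entailment component for the matching step):

```python
from dataclasses import dataclass

@dataclass
class Pattern:
    text: str    # natural-language question pattern
    sparql: str  # SPARQL template with an :X placeholder

# Toy repository in the spirit of the slide (P1 query elided).
PATTERNS = [
    Pattern("What is the telephone number of Cinema:X?", "...P1 SPARQL..."),
    Pattern("Give me the address of Cinema:X.",
            'CONSTRUCT ?address WHERE { ?cinema tourism:name ":X" . '
            "?cinema tourism:hasPostalAddress ?addr . "
            "?addr tourism:street ?address }"),
]

def score(question: str, pattern: str) -> float:
    """Toy entailment score: token overlap with the pattern."""
    q = set(question.lower().strip("?.").split())
    p = set(pattern.lower().strip("?.").split())
    return len(q & p) / len(p)

def best_pattern(question: str) -> Pattern:
    """Pick the stored pattern best entailed by the input question."""
    return max(PATTERNS, key=lambda p: score(question, p.text))

query = best_pattern("Give me the address of cinema Astra.")
# Instantiate the template with the entity recognised in the question.
sparql = query.sparql.replace(":X", "Astra")
```

The key design point carried over from the slides: language understanding happens entirely at the textual level (question vs pattern); only the winning pattern's SPARQL query ever touches the database.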
Entailment-Based QA
• Language variations are handled at the textual level: alleviates the need for lexical mapping (as in traditional NLI systems).
• Any textual entailment approach/algorithm can be used: distance-based, machine learning based, entailment rules with lexical and syntactic information.
• Linguistic phenomena are independent of the database organization: re-usable across different tasks (e.g. Relation Extraction), and unchanged in the case of open domain QA.
Outline
• The QALL-ME scenario
• Semantic interpretation of user queries
• Presenting information. Suggested direction: provide answers with as much structure as possible (RDF)
• How to build the system
• Conclusions
QALL-ME: RDF-based output
• RDF is a standard for representing knowledge in the Semantic Web.
• RDF is independent both of languages and of media, allowing specific presentation components to be designed on top of it.
• All reasoning capabilities allowed by RDF will be available in order to draw inferences from answers.
• To represent the informative content of an answer, it seems natural to re-use concepts and relations already defined for the QALL-ME ontology, rather than define a new set of predicates.
• However, the informative content alone is not adequate for generating interactive QA presentations.
A closer look at SPARQL queries

CONSTRUCT {
  …
}
WHERE {
  …
}

"CONSTRUCT" portion: returns fragments of the ontology in the form of an RDF graph that represent the "answer" (the core answer PLUS relevant additional information, useful for different answer presentation strategies).
"WHERE" portion: represents the constraints necessary for answer extraction.
CONSTRUCT portion
IN: What's on at Modena?

CONSTRUCT {
  ?event qmo:hasPeriod ?period .
  ?event qmo:isInSite ?cinema .
  ?event qmo:hasEventContent ?movie .
  ?movie rdf:type ?movietype .
  ?movie qmo:name ?moviename .
  ?cinema qmo:hasGPSCoordinate ?coordinate .
  ?cinema qmo:name ?cinemaname .
  ?cinema qmo:hasPostalAddress ?postaladdress .
  ?postaladdress qmo:isInDestination ?destination .
  …
  qma:AnswerInstance a qma:AnswersObject ;
    qma:hasAnswerValue ?movie
}
[Graph view of the CONSTRUCT template: the event node links to period (hasPeriod), cinema (isInSite) and movie (hasEventContent); the movie carries its type and name; the cinema carries its GPS coordinate, name and postal address, which is linked to a destination (isInDestination).]
WHERE portion
IN: What's on at Modena?

CONSTRUCT { … }
WHERE {
  ?event qmo:hasPeriod ?period .
  ?event qmo:isInSite ?cinema .
  …
  { ?cinema qmo:name "Supercinema Modena" }
  UNION
  { ?cinema qmo:name "Multisala Modena" }
  …
  FILTER (xsd:dateTime("2008-12-05T14:19:55") <=
          xsd:dateTime(fn:string-join(fn:string-join(xsd:string(?date), "T"),
                                      xsd:string(?time))))
  …
}

The constraints state that the name of the cinema is "SUPERCINEMA MODENA" or "MULTISALA MODENA", and that the movie should be shown TODAY, AFTER THE TIME OF THE QUERY.
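The FILTER above joins the event's date and time into an xsd:dateTime and compares it with the query timestamp. The same check, written in plain Python as a sketch (the function name and timestamp formats are assumptions, not project code):

```python
from datetime import datetime

def still_to_come(query_ts: str, event_date: str, event_time: str) -> bool:
    """Mirror of the SPARQL FILTER: keep only events starting at or
    after the moment the question was asked."""
    # fn:string-join(date, "T") then the time -> "2008-12-05T19:30:00"
    event_ts = f"{event_date}T{event_time}"
    fmt = "%Y-%m-%dT%H:%M:%S"
    return datetime.strptime(query_ts, fmt) <= datetime.strptime(event_ts, fmt)

still_to_come("2008-12-05T14:19:55", "2008-12-05", "19:30:00")  # True
still_to_come("2008-12-05T14:19:55", "2008-12-05", "11:00:00")  # False
```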
Resulting RDF graph
IN: What's on at Modena?
[Instantiated graph: the event has a period (hasDatePeriod: StartDate 12/11/2008; hasTimePeriod: StartTime 21.00); it isInSite the cinema "Modena", which has a GPS coordinate (Longitude 11°7′0′′E, Latitude 46°4′0′′N) and a postal address isInDestination Trento; its event content is the movie "La Fuga", of type Crime.]
Using RDF for Presentations
Annotations over RDF triples for interactive/presentation purposes:
• Core vs complementary information with respect to the question
• Typical follow-up questions in specific domains
• Explanations for errors
• Aggregation of redundant information
• Natural messages
Three steps:
1. Produce RDF output
2. Annotate RDF with presentation metadata (in progress)
3. Generate a presentation for a specific media/language/device/user/… (in progress)
Adding metadata to the RDF graph
IN: What's on at Modena?
The resulting graph is annotated with presentation metadata marking the role of each fragment:
• CORE ANSWER: the movie ("La Fuga", type Crime)
• DEFAULT TIME: the period (StartDate 12/11/2008, StartTime 21.00)
• NAMED BY USER: the cinema ("Modena")
• COMPLEMENTARY INFORMATION: the cinema's postal address, destination (Trento) and GPS coordinate
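One way to use such role annotations is to let each presentation strategy select the triples it needs. The tuple representation, the role names as Python strings, and the `render` helper below are illustrative assumptions, not the project's actual annotation vocabulary:

```python
# Triples of the answer graph, each tagged with a presentation role
# following the labels on the slide.
ANSWER = [
    (("event", "hasEventContent", "La Fuga"), "CORE_ANSWER"),
    (("La Fuga", "type", "Crime"), "CORE_ANSWER"),
    (("period", "startDate", "12/11/2008"), "DEFAULT_TIME"),
    (("period", "startTime", "21.00"), "DEFAULT_TIME"),
    (("event", "isInSite", "Modena"), "NAMED_BY_USER"),
    (("Modena", "isInDestination", "Trento"), "COMPLEMENTARY"),
]

def render(answer, roles=("CORE_ANSWER",)):
    """Select the triples a given presentation strategy wants to show."""
    return [triple for triple, role in answer if role in roles]

# An SMS answer might show only the core answer; a map-enabled client
# could additionally request the COMPLEMENTARY triples.
render(ANSWER)
render(ANSWER, roles=("CORE_ANSWER", "COMPLEMENTARY"))
```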
Outline
• The QALL-ME scenario
• Semantic interpretation of user queries
• Presenting information
• How to port the system to a different domain: a 101-hours experiment
• Conclusions
Porting through domains
Methodological issues:
• Acquisition of domain-specific questions
• Question annotation
• Estimating domain complexity
• Estimating the costs of domain portability
Porting to ACCOMMODATION: system training and evaluation.
Focus on two crucial aspects of the QALL-ME approach:
• Expected Answer Type (EAT) recognition
• Entailment-based relation extraction from questions
STRUCTURED FIELDS: name, category, rating, street, postal code, town, region, telephone number, fax number, email, website
UNSTRUCTURED FIELDS: website, description, email, year of construction, number of junior suites, number of single rooms, category (recommended), services, languages, federal state, stars, room, …
Domain Complexity
Cinema vs Accommodation: a look at the data

                  CINEMA   ACCOMMODATION
Class instances   1010     4595
Relations         1692     6895
Data properties   1107     7912
Total             3809     19402
Domain Complexity
1. Number of possible Expected Answer Types (EATs): the more EATs, the harder the EAT recognition task.
2. Number of valid relations: impact on RTE performance; the more relations, the harder the task.
3. Size of the vocabulary: impact on EAT and RTE performance (language variability).
4. Average number of relations per question: impact on RTE performance (question complexity).
5. Average length of the collected questions: impact on RTE performance (question complexity).
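Indicators 3 to 5 can be computed directly from a collected question set. A minimal sketch, assuming whitespace tokenisation with punctuation stripping and externally supplied relation counts (the example questions are invented):

```python
def complexity_indicators(questions, relations_per_question):
    """Indicators 3-5 above: vocabulary size, average number of
    relations per question, average question length in tokens."""
    tokens = [w.strip("?.,").lower() for q in questions for w in q.split()]
    vocab = len(set(tokens))
    avg_rels = sum(relations_per_question) / len(questions)
    avg_len = len(tokens) / len(questions)
    return vocab, avg_rels, avg_len

qs = ["Which hotels in Trento have a sauna?",
      "What is the phone number of hotel Garni?"]
complexity_indicators(qs, [2, 1])  # -> (15, 1.5, 7.5)
```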
101 hours experiment: porting from MOVIES to ACCOMMODATION
• 1 Italian speaker, hired for 40 hours of work: ~32 hours for question collection, ~8 hours for question annotation (EAT, relations, MRPs). Result: 232 questions collected and annotated; 144 MRPs (13 on average per relation, min 5, max 57; 18 per hour).
• System training (182 questions): development of a rule-based EAT recognizer; training of the RTE engine.
• System evaluation (50 questions): EAT recognition; entailment-based relation extraction.
Development and evaluation
(Matteo Negri, Domain Portability, Dec. 11, 2008)

EAT recognition, on Italian data:
• ML results confirm that ACCO is slightly easier than CIN
• On ACCO, +9% accuracy over ML with only 120 rules
• Limited impact on performance for the combined system: in the combined test set, most of the errors are still due to wrong assignments for the ACCO domain

Domain         #Questions  EATs  Approach     Person hours  Accuracy
CIN (ALL)      283         18    Rules (257)  ~100          75%
CIN (correct)  86          18    Rules (257)  ~100          92%
CIN (correct)  86          18    ML1          ~30           52%
ACCO           232         14    Rules (120)  ~30           68%
ACCO           232         14    ML2          ~1            59%
CIN+ACCO       318         28    Rules (437)  ~130          78%
Accuracy: CINEMA vs ACCO
[Plot: accuracy (y-axis) against number of patterns / hours of work (x-axis). Accuracy is measured in terms of exact matches (proportion of questions for which the system recognized ALL and ONLY the correct relations), using different amounts of patterns/person hours.]
Accuracy: ACCO vs CINEMA
• The curves confirm the previous conclusions: ACCO is slightly easier to handle than CIN (92% vs 82% with IDF, 80 vs 48 with WOLP); in the CIN domain, higher accuracy differences reflect differences between questions and patterns.
• They also show that, more than the number of instantiated classes/relations/properties, the number of different classes/relations/properties (the database schema) is a relevant indicator of domain complexity.
101 hours experiment: summary
40 hours of acquisition/annotation:
• 232 questions featuring:
  1. Number of possible EATs: 14
  2. Number of valid relations: 11
  3. Size of the vocabulary: 120
  4. Average number of relations per question: 1.44
  5. Average length of the collected questions: 9.45
• 144 MRPs (13 on average per relation)
30 hours on EAT rule development: 68% accuracy.
31 hours on the ML-based EAT recognizer (CIN): 59% accuracy on ACCO.
0 hours on entailment-based RE: 92% exact matches in RE.
Conclusion
The QALL-ME project:
• Entailment-based QA: EDITS system, open source release expected March 2009
• Consolidated web service architecture: open source release expected March 2009
• Interactive QA based on richer structured output: from RDF to annotated RDF for interaction/presentations
• Porting through domains: learn from data (questions)
Interesting perspectives for deployment:
• Several application scenarios (mobile services in a town, FAQ)
• Integration with digital assistants