Danica Damljanović University of Sheffield danica@dcs.shef.ac.uk.

Usability Enhancement Methods in Natural Language Interfaces for Querying Ontologies
Birmingham, 12 April, 2011
Danica Damljanović, University of Sheffield, danica@dcs.shef.ac.uk


Page 1

Usability Enhancement Methods in Natural Language Interfaces for Querying Ontologies

Birmingham, 12 April, 2011

Danica Damljanović, University of Sheffield, danica@dcs.shef.ac.uk

Page 2

Outline

Background:
  What are Ontologies?
  What are Natural Language Interfaces (NLIs)?
  What are Usability Enhancement Methods?
Objective:
  Improve NLIs to ontologies with usability enhancement methods
Our approach:
  Two NLI systems for querying ontologies: QuestIO and FREyA
  Two usability studies to test the usability enhancement methods
Findings
Demo
Conclusion

Page 3

Mary works for the University of Sheffield, which is located in Sheffield. Sheffield is located in the United Kingdom. Mary lives in Sheffield.

MARY <is a> PERSON
UNIVERSITY OF SHEFFIELD <is an> ORGANISATION
MARY <works for> UNIVERSITY OF SHEFFIELD
SHEFFIELD <is a> CITY
UNIVERSITY OF SHEFFIELD <is located in> SHEFFIELD
UNITED KINGDOM <is a> COUNTRY
SHEFFIELD <is located in> UNITED KINGDOM
MARY <lives in> SHEFFIELD

SELECT ?country
WHERE {
  ?person <lives in> ?city .
  ?city <is located in> ?country .
  FILTER (?person = MARY)
}
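The query above joins two triple patterns on the shared variable ?city. As a rough illustration only, the same join can be sketched in Python over a toy in-memory set of triples. The names `TRIPLES`, `objects` and `country_of_residence` are assumptions made for this sketch; it is not a real RDF store or SPARQL engine.

```python
# Toy triple store holding the statements from the slide (not a real RDF library).
TRIPLES = {
    ("MARY", "is a", "PERSON"),
    ("UNIVERSITY OF SHEFFIELD", "is an", "ORGANISATION"),
    ("MARY", "works for", "UNIVERSITY OF SHEFFIELD"),
    ("SHEFFIELD", "is a", "CITY"),
    ("UNIVERSITY OF SHEFFIELD", "is located in", "SHEFFIELD"),
    ("UNITED KINGDOM", "is a", "COUNTRY"),
    ("SHEFFIELD", "is located in", "UNITED KINGDOM"),
    ("MARY", "lives in", "SHEFFIELD"),
}


def objects(subject, predicate):
    """Return every object o such that (subject, predicate, o) is stored."""
    return {o for s, p, o in TRIPLES if s == subject and p == predicate}


def country_of_residence(person):
    """Join '?person lives in ?city' with '?city is located in ?country'."""
    countries = set()
    for city in objects(person, "lives in"):
        countries |= objects(city, "is located in")
    return countries


print(country_of_residence("MARY"))  # {'UNITED KINGDOM'}
```

A natural language interface has to produce this kind of structured lookup from a plain question, which is what the following slides address.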

Page 4

In which country does Mary live?

Page 5

What are Usability Enhancement Methods?

Who are the users?
  application developers
  end users

Page 6

The Objective

Increase usability of Natural Language Interfaces to ontologies:
  For end users: increase precision and recall
  For application developers: decrease the time for customisation

Page 7

Our Approach

QuestIO
  Scope: one ontology
  Ranking: ontology structure, string similarity
  Lexicon: ontology lexicalisations
  Usability methods: none (automatic)
  Grammar analysis: shallow (morphological analysis)
  Supported language: grammatically correct/question fragments

FREyA
  Scope: a set of ontologies/repository
  Ranking: string similarity, synonym detection, user
  Lexicon: ontology lexicalisations, WordNet, user vocabulary
  Usability methods: feedback, clarification dialogs (user interaction)
  Grammar analysis: deeper (parsing)
  Supported language: grammatically correct/question fragments

Page 8

QuestIO

[Figure: QuestIO example in which the scores 1.15 and 1.19 are compared]

Page 9

QuestIO prototype

Page 10

QuestIO: User Evaluation

Usability testing:
  effectiveness: could the tasks be finished using QuestIO?
  efficiency: how quickly?
  user satisfaction:
    System Usability Scale (SUS)
    subjective (was it easy to formulate a query?, etc.)

Experimental setup:
  a complete counterbalanced, repeated measures, task-based evaluation design
  baseline (search engines) vs. QuestIO
  12 subjects familiar with the domain (GATE software)
  four tasks:
    three defined, e.g. ...find parameters of Cebuano gazetteer...
    one undefined task, ...find anything you want about GATE software...
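For readers unfamiliar with the System Usability Scale, the standard scoring rule can be sketched as follows: ten items answered on a 1 to 5 scale, odd-numbered items contribute (answer - 1), even-numbered items contribute (5 - answer), and the sum is multiplied by 2.5 to give a score between 0 and 100. The function name `sus_score` and the example answers are assumptions for illustration and are not data from the study.

```python
def sus_score(responses):
    """Standard SUS scoring for ten answers on a 1-5 scale; result is in 0-100."""
    if len(responses) != 10 or any(not 1 <= r <= 5 for r in responses):
        raise ValueError("SUS needs ten responses, each between 1 and 5")
    total = 0
    for i, r in enumerate(responses):
        # Items 1, 3, 5, ... (index 0, 2, 4, ...) are positively worded.
        total += (r - 1) if i % 2 == 0 else (5 - r)
    return total * 2.5


example = [4, 2, 4, 2, 4, 2, 4, 2, 4, 2]  # hypothetical questionnaire, not study data
print(sus_score(example))  # 75.0
```

A study-level SUS score is then typically the mean of the per-subject scores.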

Page 11

QuestIO User Evaluation: Results

Effectiveness: on a scale from 0 (easy) to 2 (impossible), 0.355 for QuestIO in comparison to 0.895 for the baseline, p = 0.001

Efficiency: the subjects were significantly slower when using the baseline (157s) in comparison to QuestIO (107s), p = 0.001

User satisfaction: SUS score satisfactory (69.38)

Tasks:
  defined tasks: user satisfaction reaching 90%
  undefined tasks: user satisfaction low (~44%)

Page 12

QuestIO: weaknesses

Lexical failures: Tokenizer vs. Tokeniser
Conceptual failures: missing concepts, relations, or both
The users were not aware of why the failures happened

Can this be improved with usability enhancement methods such as feedback and clarification dialogs?
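The lexical failure above (Tokenizer vs. Tokeniser) is a spelling-variant mismatch. The sketch below shows, under assumptions, how approximate string matching could bridge such a variant. It uses Python's standard `difflib.SequenceMatcher`; the label list, the 0.8 threshold and the helper names are hypothetical, and this is not QuestIO's actual matcher.

```python
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity ratio between two strings, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def closest_label(term, labels, threshold=0.8):
    """Return the most similar ontology label, or None if nothing is close enough."""
    best = max(labels, key=lambda label: similarity(term, label))
    return best if similarity(term, best) >= threshold else None


labels = ["Tokeniser", "Gazetteer", "Sentence Splitter"]  # hypothetical ontology labels
print(closest_label("Tokenizer", labels))  # Tokeniser
```

A fixed threshold trades missed matches against false ones, which is one reason to involve the user when the system is unsure.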

Page 13

FREyA - Feedback, Refinement, Extended Vocabulary Aggregator

Feedback: showing the user the system's interpretation of the query

Refinement:
  resolving ambiguity: generating a dialog whenever one term refers to more than one concept in the ontology (precision)

Extended Vocabulary:
  expressiveness: generating a dialog whenever an “unknown” term appears in the question (recall)
  portability: no need for customisation from application developers

The dialog:
  generated by combining the syntactic parsing and ontology-based lookup
  the system learns from the user’s selections
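The dialog behaviour described on this slide can be sketched roughly as follows: a term with exactly one candidate concept is mapped directly, an ambiguous or unknown term triggers a dialog, and each selection is stored so that earlier choices are ranked first next time. Everything here (the `LEXICON` contents, the `geo:` names, the vote counter) is a hypothetical simplification that leaves out syntactic parsing, not FREyA's implementation.

```python
from collections import defaultdict

# Hypothetical lexicon: term -> ontology concepts it may refer to.
LEXICON = {
    "mississippi": ["geo:MississippiRiver", "geo:MississippiState"],
    "texas": ["geo:TexasState"],
}

# Counts learned from earlier user selections: (term, concept) -> votes.
votes = defaultdict(int)


def interpret(term, ask_user):
    """Map a question term to one concept, asking the user only when needed."""
    key = term.lower()
    candidates = LEXICON.get(key, [])
    if len(candidates) == 1:
        return candidates[0]  # unambiguous: no dialog
    if not candidates:
        # Unknown term: offer every known concept (a real system would narrow this).
        candidates = sorted({c for cs in LEXICON.values() for c in cs})
    # Stable sort, so previously chosen concepts move to the front.
    ranked = sorted(candidates, key=lambda c: -votes[(key, c)])
    choice = ask_user(term, ranked)
    votes[(key, choice)] += 1  # learn from the selection
    return choice


shown = []  # records the option lists presented by pick_first


def pick_first(term, options):
    """Stand-in for a user who always accepts the top suggestion."""
    shown.append(list(options))
    return options[0]


# First dialog: the user picks the state; the second dialog then ranks it first.
interpret("Mississippi", lambda term, options: "geo:MississippiState")
print(interpret("Mississippi", pick_first))  # geo:MississippiState
```

Passing the dialog in as a callback keeps the sketch testable without a user interface.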

Page 14

Feedback: answer is found

Page 15

Feedback: No answer is found

Page 16

Feedback: User Evaluation

Usability testing:
  effectiveness
  efficiency
  user satisfaction:
    System Usability Scale (SUS)
    subjective (was it easy to formulate a query?, etc.)

Experimental setup:
  30 subjects outside Sheffield, two domains (GATE software and US geography)
  four tasks:
    three defined: two repeated from the previous study, and one where the answer was not available, e.g. ...find states bordering hawaii...
    one undefined task, ...find anything you want about GATE software or rivers, cities, ... in the United States...

Page 17

Does the feedback make any difference?

Effectiveness: yes, p=0.01, 0.67 for QuestIO, 0.13 for FREyA

Efficiency: no. Although the overall result differs (180.5 seconds for QuestIO, 155.27 seconds for FREyA), a 2-tailed independent t-test reveals that this difference is not significant (p=0.852)

Query Formulation: for the defined tasks there is no difference in the perception of the difficulty of the supported language (F=5.255, p=0.071), but for the undefined tasks the users believed that the language supported by FREyA is easier! (F=8.016, p=0.015)

The feedback showing that the system knows about certain concepts, but cannot find any relation between them, was not clear to the users.

Interactive features were well accepted.
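The efficiency comparison above relies on a 2-tailed independent t-test. As a minimal sketch of what that test computes, the following derives Student's t statistic with pooled variance and its degrees of freedom using only the standard library. The task times are made up for illustration and are not the study's data; turning t into a p-value requires the t distribution, which is assumed to come from a statistics package and is not shown.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Student's t statistic (pooled variance) and degrees of freedom."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))
    return t, na + nb - 2


# Made-up task times in seconds, NOT the measurements reported on the slide.
questio_times = [150, 210, 190, 170]
freya_times = [140, 160, 180, 150]

t, df = independent_t(questio_times, freya_times)
print(round(t, 3), df)
```

A small |t| relative to the critical value for the given degrees of freedom is what "not significant" means in the result above.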

Page 18

FREyA Workflow

Page 19

Demo

http://gate.ac.uk/freya

ESWC 2010, 03 June 2010

Page 20

Evaluation: correctness

Mooney GeoQuery dataset, 250 questions
34 no dialog, 14 failed to be answered
Precision=recall=94.4%
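The reported figure is consistent with simple arithmetic on the numbers above: 250 questions minus 14 failures leaves 236 correct answers, and 236/250 = 94.4%. The sketch below reproduces this under the assumption (not stated on the slide) that every question counts in both denominators, which is what makes precision and recall equal.

```python
total_questions = 250
failed = 14
correct = total_questions - failed  # 236

# Assumption: every question receives an answer, so both denominators are 250.
precision = correct / total_questions
recall = correct / total_questions

print(f"{precision:.1%}")  # 94.4%
```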

Page 21

Evaluation: Learning

10-fold cross-validation
202 Mooney GeoQuery questions that could be correctly mapped into SPARQL and required dialog
improvement from 0.25 to 0.48
Errors: ambiguity and sparseness
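As a reminder of how 10-fold cross-validation partitions data, the sketch below splits 202 item indices into ten folds whose sizes differ by at most one; each fold serves once as the held-out set while the rest is used for learning. The function name `k_fold_indices` is hypothetical, and the absence of shuffling is an assumption of this sketch, not a detail taken from the evaluation.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists; fold sizes differ by at most one."""
    indices = list(range(n_items))
    base, extra = divmod(n_items, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(202))
print([len(test) for _, test in folds])
```

With 202 questions, two folds hold 21 questions and eight hold 20.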

Page 22

Evaluation: Ranking

Mean Reciprocal Rank: 0.76
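Mean Reciprocal Rank averages 1/rank of the correct item over all queries, so a value near 1 means the correct suggestion is usually at the top of the list. A minimal sketch follows, with made-up ranks rather than the evaluation's data; the function name is an assumption.

```python
def mean_reciprocal_rank(ranks):
    """ranks: 1-based position of the correct item per query, or None if absent."""
    return sum(1 / r if r else 0.0 for r in ranks) / len(ranks)


# Hypothetical ranks of the correct suggestion in four dialogs.
print(mean_reciprocal_rank([1, 2, 1, 4]))  # 0.6875
```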

Page 23

Learning the Correct Ranking

Randomly selected 103 dialogs from 202 questions (343 dialogs)
MRR increased by 0.06 (from 0.72 to 0.78)

Page 24

Evaluation: Answer Type

[Figure: chart of answer type results; the recoverable values are 45.60%, 53.20% and 0.01%, and one label reads "Answer Type Correct (1 dialog)"]

Page 25

Conclusion

Combining syntactic parsing with ontology-based lookup in an interactive process of feedback and query refinement can increase the precision and recall of NLIs to ontologies, while reducing the time for customisation by shifting some tasks from application developers to end users.

Page 27

More information...

D. Damljanovic, M. Agatonovic, H. Cunningham: FREyA: an Interactive Way of Querying Linked Data. 1st Workshop on Question-Answering over Linked Data, in conjunction with ESWC’11, 2011. (to appear)

D. Damljanovic, M. Agatonovic, H. Cunningham: Natural Language Interfaces to Ontologies: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Extended Semantic Web Conference (ESWC 2010), Springer Verlag, Heraklion, Greece, May 31-June 3, 2010.

D. Damljanovic, M. Agatonovic, H. Cunningham: Identification of the Question Focus: Combining Syntactic Analysis and Ontology-based Lookup through the User Interaction. In Proceedings of the 7th Language Resources and Evaluation Conference (LREC 2010), ELRA 2010, La Valletta, Malta, May 17-23, 2010.

D. Damljanovic: Towards portable controlled natural languages for querying ontologies. In Rosner, M., Fuchs, N., eds.: Proceedings of the 2nd Workshop on Controlled Natural Language. Lecture Notes in Computer Science. Springer Berlin/Heidelberg, Marettimo Island, Sicily (September 2010)