Louisa Casely-Hayford e-Science The ISIS Facilities Ontology and OntoMaintainer Louisa Casely-Hayford and Shoaib Sufi


Louisa Casely-Hayford

e-Science

The ISIS Facilities Ontology and OntoMaintainer

Louisa Casely-Hayford and Shoaib Sufi


ISIS: a CCLRC Neutron & Muon Facility

• ISIS is the world's leading pulsed Neutron & Muon source, situated at the CCLRC Rutherford Appleton Laboratory. ISIS supports an international community of around 1600 scientists in a range of scientific disciplines.

• Currently ISIS produces about 700GB of combined Neutron & Muon data each year and this figure is set to rise with the addition of a new target station.

• The ISIS Metadata Catalogue (ICAT) is a twenty-year back catalogue of experiments conducted at ISIS. It contains approximately 3GB of metadata, which references 3TB of data.

• In order to maximise the value of data produced from the ISIS facility, it must be fully searchable.

• To address this problem, e-Science is developing a number of software solutions; ontologies are seen as one useful approach.


Why are ontologies a useful solution?

• Ontologies offer a powerful means to formally express the nature of a domain.

• To share common understanding of the structure of information among people

• To enable reuse of domain knowledge

• To make domain assumptions explicit

• They provide central controlled vocabularies that can be integrated into catalogues, databases, web publications and knowledge management applications


Why Ontologies?

• Currently ICAT contains over 10,000 keywords describing experiments that are used to index experimental studies

• However, this is seen as a limited method.

Reasons why keywords are limited

• These free-text keywords:

– Have no context

– Carry a great deal of meaning that is implicit to ISIS users

– Are hard for non-experts to map to terms used by facilities in the same domain, and harder still to those outside

• E.g. the keyword HRPD denotes a ‘powder diffractometer’ to ISIS users; other collaborating neutron facilities, such as SNS at ORNL, would understand ‘powder diffractometer’ but not the cryptic HRPD.
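The HRPD example suggests how a shared vocabulary could resolve facility-local keywords. The following is a minimal sketch only, not the ISIS implementation: the lookup table, the function names and the SNS entry ("EXAMPLE-PD") are hypothetical placeholders introduced for illustration.

```python
# Hypothetical table mapping (facility, local keyword) to a shared concept.
# "HRPD" and "powder diffractometer" come from the slide; "EXAMPLE-PD" is a
# made-up placeholder, not a real SNS instrument name.
LOCAL_TO_CONCEPT = {
    ("ISIS", "HRPD"): "powder diffractometer",
    ("SNS", "EXAMPLE-PD"): "powder diffractometer",
}


def concept_for(facility, keyword):
    """Return the shared concept for a local keyword, or None if unmapped."""
    return LOCAL_TO_CONCEPT.get((facility, keyword))


def local_names(concept):
    """Return every (facility, keyword) pair that maps to the given concept."""
    return sorted(key for key, value in LOCAL_TO_CONCEPT.items() if value == concept)
```

With such a table, a user who only knows the general term could ask for `local_names("powder diffractometer")` and obtain the cryptic local names at each facility.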


Why are ontologies a useful solution?

• The creation of ontologies at ISIS will aid in mapping concrete manifestations of familiar terms in one domain, as well as related concepts in different domains.

• This will facilitate searching of data by category and grouping of data into keywords across studies.

• This could aid cross-facility searching of related scientific data from the various scientific facilities housed at CCLRC, e.g. CLF and DL.


Building an Ontology

• Defining terms in a domain and relations between them.

– Defining concepts in the domain (classes)

– Arranging the concepts in a hierarchy (subclass-superclass hierarchy)

– Defining which attributes and properties (slots) classes can have, and constraints on their values

– Defining individuals

• Involves collaboration between domain experts and ontology builders.

• Ontologies are expressed in a formal language and developed within an editing environment.
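The four modelling steps listed above (classes, a subclass hierarchy, slots with value constraints, and individuals) can be illustrated with a small sketch. This is plain Python rather than a formal ontology language, and it is not how Protégé or OWL represent these ideas internally; the class name `PowderDiffractometer` and the slot `full_name` are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class OntClass:
    """A concept in the domain, optionally placed under a superclass."""
    name: str
    parent: Optional["OntClass"] = None
    slots: dict = field(default_factory=dict)  # slot name -> required Python type

    def is_a(self, other: "OntClass") -> bool:
        """True if this class is `other` or one of its descendants."""
        node = self
        while node is not None:
            if node is other:
                return True
            node = node.parent
        return False

    def all_slots(self) -> dict:
        """Slots declared here plus those inherited from superclasses."""
        inherited = self.parent.all_slots() if self.parent is not None else {}
        return {**inherited, **self.slots}


@dataclass
class Individual:
    """A named instance of a class, holding values for that class's slots."""
    name: str
    cls: OntClass
    values: dict = field(default_factory=dict)

    def set_value(self, slot: str, value) -> None:
        """Store a slot value after checking the slot exists and the type fits."""
        allowed = self.cls.all_slots()
        if slot not in allowed:
            raise KeyError(f"{self.cls.name} has no slot {slot!r}")
        if not isinstance(value, allowed[slot]):
            raise TypeError(f"{slot!r} expects {allowed[slot].__name__}")
        self.values[slot] = value


# Hypothetical hierarchy: Instrument > PowderDiffractometer, with one slot.
instrument = OntClass("Instrument", slots={"full_name": str})
powder_diffractometer = OntClass("PowderDiffractometer", parent=instrument)
hrpd = Individual("HRPD", powder_diffractometer)
hrpd.set_value("full_name", "High Resolution Powder Diffractometer")
```

The type check in `set_value` stands in for the much richer constraints (cardinality, value ranges, class restrictions) that a formal ontology language allows.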


A Protégé-OWL Ontology

• Classes

• Individuals

• Properties

A class is a concept in the domain, for example a class of People, a class of Pets or a class of Countries.

A class is a collection of elements with similar properties.

Individuals are instances of classes: America can be an instance of the class Country.

[Figure: example ontology with the classes Person, Pet and Country; the individuals Gemma, Mathew, Fluffy, Fido, Italy, America and England; and the properties livesIn, hasSibling and hasPet.]


Building of the ISIS Facilities Ontology

• The ISIS Facilities Ontology is based on keywords in the ISIS Metadata Catalogue (ICAT).

• Over 10,000 keywords are housed in ICAT, and many are synonyms.

• Keywords in ICAT were grouped into 5 main categories:

1. Datafile name

2. Instrument

3. Investigation title

4. Investigator

5. Year

Examples of keywords in these five categories are:

• HRP00145.RAW - a datafile name.

• HRPD - a High Resolution Powder Diffractometer, one of the many instruments used in experiments at the ISIS facility.

• Hydrazinium - an investigation title; chemical names and compounds were used as investigation titles of experiments in ICAT.

• JINR (Joint Institute for Nuclear Research) - the name of an investigator.

• 1986 - the year in which a particular experiment was conducted.
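As a rough illustration of sorting ICAT keywords into the five categories on this slide, the sketch below applies a few simple rules. The slide does not say how the grouping was actually performed, so everything here is an assumption: the regular expressions are guessed from the single examples given (HRP00145.RAW, 1986), the lookup sets contain only the names shown on the slide, and treating every remaining keyword as an investigation title is a simplification.

```python
import re

# Lookup sets seeded only with the examples from the slide (assumption:
# a real system would load complete lists from the catalogue).
KNOWN_INSTRUMENTS = {"HRPD"}
KNOWN_INVESTIGATORS = {"JINR"}

# Patterns guessed from the examples "HRP00145.RAW" and "1986".
DATAFILE_PATTERN = re.compile(r"[A-Z]{3}\d{5}\.RAW")
YEAR_PATTERN = re.compile(r"(19|20)\d{2}")


def categorise(keyword):
    """Assign a keyword to one of the five categories using simple rules."""
    if DATAFILE_PATTERN.fullmatch(keyword):
        return "Datafile name"
    if YEAR_PATTERN.fullmatch(keyword):
        return "Year"
    if keyword in KNOWN_INSTRUMENTS:
        return "Instrument"
    if keyword in KNOWN_INVESTIGATORS:
        return "Investigator"
    # Fallback: anything unrecognised is treated as an investigation title.
    return "Investigation title"
```

Rules like these could only ever give a first pass; synonyms and ambiguous keywords would still need review by domain experts.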


ISIS Facilities Ontology Hierarchy


ISIS Facilities Ontology

[Figure: ISIS Facilities Ontology diagram. Classes: ISISExperiment, CrystallographyGroupExperiment, DataFile, Year, Instrument, Investigator, InvestigationTitle. Properties: hasDataFileName, wasConductedIn, hasInvestigator, hasUsedInstrument, hasTitle. Example individuals: HRP00145.RAW, 1986, Pete Jones, HRPD, Hydrazinium, Protein Crystallography Group Experiment.]
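The properties and individuals named in the ISIS Facilities Ontology diagram can be sketched as subject-property-object triples. This is an illustrative sketch only: the identifier "Experiment-1" is hypothetical, and linking all five example values to one experiment is an assumption, since the diagram's exact connections are not recoverable from the transcript.

```python
# Each triple is (subject, property, object). Property names are taken from
# the diagram; "Experiment-1" is a made-up identifier for illustration.
TRIPLES = [
    ("Experiment-1", "hasTitle", "Hydrazinium"),
    ("Experiment-1", "hasUsedInstrument", "HRPD"),
    ("Experiment-1", "hasInvestigator", "Pete Jones"),
    ("Experiment-1", "wasConductedIn", "1986"),
    ("Experiment-1", "hasDataFileName", "HRP00145.RAW"),
]


def objects(subject, prop):
    """Return all values that `subject` has for the property `prop`."""
    return [o for s, p, o in TRIPLES if s == subject and p == prop]


def subjects(prop, obj):
    """Return all subjects whose property `prop` has the value `obj`."""
    return [s for s, p, o in TRIPLES if p == prop and o == obj]
```

For example, `subjects("hasUsedInstrument", "HRPD")` finds the experiments that used HRPD, which is the kind of structured question that free-text keywords alone cannot answer.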


Further Application of Ontologies to ICAT-ISIS Online Proposal System

• Scientists can submit applications for beamtime at ISIS through an online application form which is known as the ISIS Online Proposal System.

• ICAT (the metadata catalogue) not only holds the 20-year back catalogue of data, but will also hold data from approved proposals and data generated from experiments conducted at ISIS.

• Three separate modular ontologies for Sample, Investigator and Experiment are being developed to mark up the Proposal system.

• These ontologies are partly based on the proposal system database schema.


Sample, Investigator and Experiment Ontologies

[Figure: the Sample, Investigator and Experiment ontologies.]


OntoMaintainer

• Consensus on the concepts modelled in the ISIS Facilities Ontology was achieved through a series of interviews with domain experts.

• During the design and creation process, it was difficult to share current versions of the ontology with our collaborators at ISIS.

• This is because, to view the hierarchical structure of the ontology, scientists would have to download and install Protégé locally.

• The Ontology Maintainer was developed so that the community can view current versions of the ontology remotely.
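To make "viewing the hierarchical structure" concrete, the sketch below renders a subclass table as an indented text tree. It is not the OntoMaintainer's code, which the slides do not show; the function name is hypothetical, and placing CrystallographyGroupExperiment under ISISExperiment is an assumption based on the class names in the ontology diagram.

```python
# Child class -> parent class (None marks a top-level class).
# The subclass relationship shown here is assumed for illustration.
SUBCLASS_OF = {
    "ISISExperiment": None,
    "CrystallographyGroupExperiment": "ISISExperiment",
    "Instrument": None,
    "Investigator": None,
}


def render_tree(subclass_of):
    """Return the class hierarchy as text, one class per line, indented by depth."""
    children = {}
    for child, parent in subclass_of.items():
        children.setdefault(parent, []).append(child)

    lines = []

    def walk(parent, depth):
        # Visit children alphabetically so the output is stable.
        for name in sorted(children.get(parent, [])):
            lines.append("  " * depth + name)
            walk(name, depth + 1)

    walk(None, 0)
    return "\n".join(lines)


print(render_tree(SUBCLASS_OF))
```

A web front end could emit the same traversal as nested HTML lists instead of indented text, so that collaborators need only a browser.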


Screen Shot of OntoMaintainer


Benefits of OntoMaintainer

• It is easily accessible because it is available over the web

• Allows domain experts to contribute towards the maintenance of the ontologies

• Encourages collaboration between domain experts (scientists) and ontology builders by allowing members of the community to be involved in the development and maintenance of ontologies

• Makes collaboration between domain experts and ontology builders more efficient


Future Work

• Completion of Sample, Investigator, Experiment and ISIS Facilities Ontologies

• Ontologies will be used to mark up the ICAT back catalogue and new approved studies submitted through the Online Proposal System.

• The Ontology Maintainer will be improved through the addition of properties, so that relationships between individuals in classes can be shown.

• A graphical view of the total hierarchy of the ontology will be added to the user interface of the Ontology Maintainer.

• The tree hierarchy will be made more dynamic through automatic updating of classes.

• A mapping tool is currently being created to map between the newly created ontologies and existing databases.


Conclusion

• Ontologies will help maximise the value of data collected at ISIS and other CCLRC facilities by improving the access, navigation and reuse of data.

• Ontologies will facilitate the mapping of terms across CCLRC facilities, which will allow cross-facility searching. For example, external users will be able to search for all experiments carried out across CCLRC using a powder diffractometer (instrument), even if they do not know the local names of the specific instruments.

• The OntoMaintainer will facilitate the process of creating and maintaining ontologies by providing a means of getting feedback directly from domain experts. It will improve the social aspect of building ontologies by allowing all members of the community to provide input in the building of ontologies.

• Major challenges: scope, modularity and integration of ontologies.


References

• Protégé web site: http://protege.stanford.edu

– Documentation

– User’s Guide

– Tutorial

– Protégé-discussion mailing list

– Ontology library

• CO-ode: http://www.co-ode.org/


Questions