RightField: The Semantic Annotation of Experimental Data using Spreadsheets

Katy Wolstencroft, Stuart Owen, Matthew Horridge, Olga Krebs, Wolfgang Mueller, Carole Goble

Page 1:

RightField: The Semantic Annotation of Experimental Data using Spreadsheets

Katy Wolstencroft, Stuart Owen, Matthew Horridge, Olga Krebs, Wolfgang Mueller, Carole Goble

Page 2:

RightField

A tool for embedding ranges of ontology terms into spreadsheets to allow the users of those spreadsheets to add semantic annotations from simple drop-down lists

Page 3:

RightField

A tool for embedding ranges of ontology terms into spreadsheets to allow the users of those spreadsheets to add semantic annotations from simple drop-down lists

Why?

Makes annotation quicker and more efficient

Standardises annotation

Hides the ontology complexity from the users

Page 4:

Managing Biological Data

Describe experiments and results of experiments

Minimal Information Models

Guidelines, checklists, vocabularies

Necessary for publication, submission to public databases and sharing

Page 5:

Managing Biological Data

Describe experiments and results of experiments

Minimal Information Models

Guidelines, checklists

MIACA: Minimal Information About a Cellular Assay

MIAME: Minimum Information About a Microarray Experiment

MIAPE: Minimum Information About a Proteomics Experiment

MIARE: Minimum Information About an RNAi Experiment

MIASE: Minimum Information About a Simulation Experiment

MIBBI: >30

Page 6:

Managing Biological Data

Describe experiments and results of experiments

Ontologies and Vocabularies for Annotation

Gene Ontology, ChEBI, MGED, SBO

BioPortal: >270 biomedical ontologies

Page 7:

| Data | MIBBI Model | Ontologies |
| Microarray | MIAME: Minimum Information about a Microarray Experiment | MGED |
| Proteomics | MIAPE: Minimum Information about a Proteomics Experiment | PSI-MI, PSI-MS, PSI-MOD |
| Interaction experiments | MIMIX: Minimum Information about a Molecular Interaction Experiment | PSI-MI, Protein-Protein Interaction |
| Systems Biology Models | MIRIAM: Minimal Information Required In the Annotation of biochemical Models | SBO: Systems Biology Ontology |
| Systems Biology Model Simulation | MIASE: Minimum Information About a Simulation Experiment | KISAO: Kinetic Simulation Algorithm Ontology |

Page 8:

SysMO: Systems Biology of Micro-Organisms

SysMO Consortium

Pan-European consortium

> 100 research groups

> 320 scientists

Distributed, interdisciplinary projects

Expected to pool data and results and disseminate

Microbiologists, molecular biologists, biochemists, mathematicians....not many informaticians

SysMO-DB

SysMO-SEEK – a platform for systems biology data sharing

Web-based environment for sharing in the consortium and disseminating to the community

Used in other consortia: Virtual Liver, EraSysBio+, UNICELLSYS and more....

Page 9:

Associating Experiments

Investigation, Study, Assay

[Diagram labels: Construction, Validation, SOP]

http://isatab.sourceforge.net/

Page 10:

Data Templates and Vocabularies

[Diagram labels: Construction, Validation, SOP]

Metabolomics

Mass Spec

Transcriptomics

Proteomics

Fluxomics

Page 11:

Fitting in with Laboratory practices

Scientists can continue to do what they have always done

Embedding semantics into the tools already in use

Excel, excel, excel.....

Page 12:

The End Result

Ontology terms for marked-up cells in drop-down boxes

Page 13:

How it Works

Informaticians/ontologists: RightField Client

Excel Workbook

Ontology: “Portion” of ontology terms

Terms embedded into Excel workbook

End users: marked-up workbook, saved in plain Excel

Page 14:

RightField Application

Page 15:

Page 16:

Loading Ontologies

Published ontologies

Multiple versions

You can also load local ontologies from file or URL

Page 17:

Loading Ontologies

Page 18:

Excel workbook loaded into RightField with multiple worksheets

Page 19:

Class hierarchies of loaded ontologies

Page 20:

Term lists for selected cells

Methods for specifying ontology terms

Selected parent term from the ontology
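
The term list for a cell can be thought of as derived from the selected parent term. A minimal sketch in Python, assuming a toy class hierarchy held in a dictionary; the hierarchy, the function name `allowed_terms` and the two selection modes are illustrative assumptions, not RightField's actual API:

```python
from collections import deque

# Hypothetical toy hierarchy: parent term -> direct subclasses.
HIERARCHY = {
    "assay": ["omics assay", "enzyme assay"],
    "omics assay": ["transcriptomics", "proteomics", "metabolomics"],
    "enzyme assay": [],
    "transcriptomics": [],
    "proteomics": [],
    "metabolomics": [],
}


def allowed_terms(parent, mode="subclasses"):
    """Return the sorted term list a cell would offer for a chosen parent term.

    mode="direct subclasses" keeps only the immediate children;
    mode="subclasses" walks the whole subtree below the parent.
    """
    children = HIERARCHY.get(parent, [])
    if mode == "direct subclasses":
        return sorted(children)
    if mode != "subclasses":
        raise ValueError(f"unknown mode: {mode}")
    found, queue = set(), deque(children)
    while queue:
        term = queue.popleft()
        if term not in found:
            found.add(term)
            queue.extend(HIERARCHY.get(term, []))
    return sorted(found)
```

Calling `allowed_terms("omics assay")` would give the three leaf terms under that parent, which is the kind of list an end user might then see in a drop-down box.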

Page 21:

Excel workbook with marked-up cells

Page 22:

Marking-up Columns or Rows

Page 23:

The User View

Ontology terms for marked-up cells in drop-down boxes

Page 24:

Ontology Information

Ontologies encapsulated

Scientists can work offline

Ensures same versions of ontologies used for a series of experiments

No special macros or plugins required, just Excel or Open Office

Versions and URIs captured in hidden worksheets

Provenance

Comparisons between sheets

Linking back to the vocabularies

Page 25:

Provenance

Term Label: The human readable term label

Term IRI: The (unique) term identifier

Ontology IRI: The ontology that defines the term

Ontology Version: The version of the ontology

Physical Location: The (web) location of the ontology
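
The five provenance fields can be pictured as one record per embedded term. A minimal sketch in Python, assuming a plain dataclass; the class name `TermProvenance`, the helper `same_source` and the example.org values are hypothetical and do not reflect how RightField stores this information internally:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermProvenance:
    term_label: str         # the human readable term label
    term_iri: str           # the (unique) term identifier
    ontology_iri: str       # the ontology that defines the term
    ontology_version: str   # the version of the ontology
    physical_location: str  # the (web) location of the ontology


def same_source(a, b):
    """True when two annotations come from the same ontology and version."""
    return (a.ontology_iri, a.ontology_version) == (b.ontology_iri, b.ontology_version)


# Hypothetical example record; none of these IRIs are real.
example = TermProvenance(
    term_label="proteomics",
    term_iri="http://example.org/onto#proteomics",
    ontology_iri="http://example.org/onto",
    ontology_version="1.0",
    physical_location="http://example.org/onto.owl",
)
```

A check like `same_source` is one way the "comparisons between sheets" idea could be approached: two sheets are only directly comparable if their annotations point at the same ontology version.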

Page 26:

RightField Technologies

OWL API: Loading ontologies and reasoning

Apache POI HSSF libraries: Loading and saving of Excel spreadsheets

Java: Platform independent

Page 27:

Ontology Languages

RDFS - RDF Schema

OBO - Open Biomedical Ontologies

OWL - Web Ontology Language

Page 28:

RightField in Use

SysMO – Systems Biology of Micro-Organisms

E-Lico: a virtual laboratory for interdisciplinary collaborative research in data mining and data-intensive sciences

Case studies in kidney research

BioBanking in the Netherlands

Outside biology:

Oil and gas industry

Egyptology specimen classification

Page 29:

Using RightField Spreadsheets

Populate

Store / Reuse

Extract: RDF Graph
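
The extraction step can be illustrated by turning annotated cells into RDF triples. A minimal sketch in Python that emits N-Triples lines, assuming annotations are already available as a mapping from cell reference to term IRI; the function name `cell_triples`, the `annotatedWith` predicate and all example.org IRIs are hypothetical, and this is not RightField's own extraction code:

```python
def cell_triples(sheet_iri, annotations,
                 predicate="http://example.org/annotatedWith"):
    """Return N-Triples lines linking each annotated cell to its term IRI.

    annotations maps a cell reference such as "B2" to a term IRI.
    Cells are emitted in sorted order of their reference strings.
    """
    lines = []
    for cell, term_iri in sorted(annotations.items()):
        subject = f"{sheet_iri}#{cell}"
        lines.append(f"<{subject}> <{predicate}> <{term_iri}> .")
    return lines


# Hypothetical usage: one annotated cell on one worksheet.
triples = cell_triples(
    "http://example.org/workbook/sheet1",
    {"B2": "http://example.org/onto#proteomics"},
)
```

Each resulting line is a subject, predicate, object statement, so a populated workbook could in principle be loaded into any RDF store alongside the ontology it was annotated with.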

Page 30:

Future Developments

Auto-complete

Validation of annotation

Populating ontology content: Populous

Page 31:

Populous

Generic tool for populating ontology templates

Supports validation at the point of data entry

Expressive pattern language for OWL ontology generation

Helps biologists with ontology design patterns

http://www.e-lico.eu/populous

Simon Jupp, Robert Stevens, University of Manchester

Page 32:

Availability

Open source

http://www.rightfield.org.uk

Page 33:

Acknowledgements

Stuart Owen, Katy Wolstencroft, Carole Goble

Wolfgang Mueller, Olga Krebs, Matthew Horridge