Collaborative ontology development

Collaborative Ontology Development. Natasha Noy, Stanford University. Monday, July 15, 2013.

description

Natasha Noy's presentation at the SSSW 2013 summer school

Transcript of Collaborative ontology development

Page 1: Collaborative ontology development

Collaborative Ontology Development

Natasha Noy, Stanford University

Monday, July 15, 2013

Page 2: Collaborative ontology development

The ontology development that we grew up with

Courtesy of Mark Musen

Page 3: Collaborative ontology development

Lots of databases and sources

The data is in different silos

Need to integrate them

Considerable benefit if you can integrate the data

Ontologies are essential to science

Page 4: Collaborative ontology development

Many ontologies today are large and there are lots of them

• Gene Ontology: 28K classes
• Foundational Model of Anatomy: >80K classes
• NCI Thesaurus: 80K classes
• SNOMED CT: >300K classes

Page 5: Collaborative ontology development

There are lots of ontologies and more to come

BioPortal has more than 350 ontologies in the field of biomedicine alone

Users uploaded more than 230 ontologies to WebProtégé in the first two months after its release

Page 6: Collaborative ontology development

Scientists have adopted ontologies

To provide canonical representation of scientific knowledge

To annotate experimental data to enable interpretation, comparison, and discovery across databases

To facilitate knowledge-based applications for decision support, natural-language processing, data integration, and other applications

Page 7: Collaborative ontology development

Ontology development has changed, too

from a lone knowledge engineer

to a few distributed users

or to any number of users anywhere in the world

Page 8: Collaborative ontology development

Courtesy of Mark Musen

Page 9: Collaborative ontology development

Collaborative Ontology Development

• Collaborative
  • Several users contribute to a single developing ontology
  • There are mechanisms to carry out discussions and to reach consensus
• Ontologies
  • From simple taxonomies
  • To expressive OWL ontologies

Page 10: Collaborative ontology development

Ontologies That Are Being Developed Collaboratively

Page 11: Collaborative ontology development

Gene Ontology (GO)

• Developed by the Gene Ontology Consortium
• Goal: create a single terminological resource for annotating genes and gene function from different model organisms:
  • Drosophila, mouse, E. coli, Homo sapiens, ...
• GO: 38,000 classes

Page 12: Collaborative ontology development

Page 13: Collaborative ontology development

Key Resource: GO Annotations

Manually curated over the past 10 years

Publicly available

345,000 annotations for Homo sapiens

Manual GO annotation (diagram): gene product TP53, GO term GO:0007569 (cell aging), supported by a PubMed article
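The annotation on this slide can be read as a small record: a gene product, a GO term, and the article that supports the link. The following is a minimal sketch of that reading, not the Gene Ontology Consortium's file format. The class name `GOAnnotation`, its field names, and the helper `terms_for` are hypothetical, and the reference string is a placeholder because the slide does not give the article's identifier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GOAnnotation:
    """One manual annotation linking a gene product to a GO term (hypothetical shape)."""
    gene_product: str  # e.g. "TP53"
    go_id: str         # e.g. "GO:0007569"
    go_label: str      # e.g. "cell aging"
    reference: str     # supporting PubMed article; placeholder value below


def terms_for(gene_product, annotations):
    """Return the sorted GO ids annotated to one gene product."""
    return sorted({a.go_id for a in annotations if a.gene_product == gene_product})


# The single example from the slide; the reference is a placeholder, not a real PMID.
annotations = [
    GOAnnotation("TP53", "GO:0007569", "cell aging",
                 reference="PMID:<not given on the slide>"),
]

tp53_terms = terms_for("TP53", annotations)
```

Keeping the reference on every record is what lets a curated annotation be traced back to its evidence.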

Page 14: Collaborative ontology development

Page 15: Collaborative ontology development

The Gene Ontology

Terminology for consistent description of gene products

Issue Tracker

Curators of biomedical databases

GO Curators: 3 full-time curators have access to edit GO

Anyone in the community can submit an issue or request

Page 16: Collaborative ontology development

Page 17: Collaborative ontology development

The NCI Thesaurus

A reference ontology for cancer biology, translational science, and clinical oncology

~20 full-time editors making changes

Changes are not immediately visible

A “lead editor” approves the changes and assigns new tasks
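The workflow on this slide (editors make changes, the changes are not immediately visible, a lead editor approves them) can be sketched as a small review queue. This is an illustration under stated assumptions, not the NCI Thesaurus tooling. The names `ProposedChange`, `ReviewQueue`, and the status values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedChange:
    editor: str
    description: str
    status: str = "pending"  # hypothetical states: pending, approved, rejected


@dataclass
class ReviewQueue:
    lead_editor: str
    changes: list = field(default_factory=list)

    def submit(self, editor, description):
        """Record a change; it stays invisible until it is reviewed."""
        change = ProposedChange(editor, description)
        self.changes.append(change)
        return change

    def review(self, reviewer, change, approve):
        """Only the lead editor may approve or reject a change."""
        if reviewer != self.lead_editor:
            raise PermissionError("only the lead editor can review changes")
        change.status = "approved" if approve else "rejected"

    def published(self):
        """Return the changes that are visible, i.e. the approved ones."""
        return [c for c in self.changes if c.status == "approved"]


queue = ReviewQueue(lead_editor="lead")
change = queue.submit("editor_1", "add a new class")
visible_before = queue.published()   # nothing is visible before review
queue.review("lead", change, approve=True)
visible_after = queue.published()
```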

Page 18: Collaborative ontology development

International Classification of Diseases (ICD)

Have you looked at your medical insurance bill lately?

Page 19: Collaborative ontology development

International Classification of Diseases

Page 20: Collaborative ontology development

ICD – Why should you care?

Certificate of death

Policy making

Medical bills

Page 21: Collaborative ontology development

Developing ICD-10: Revision process in the 20th century

8 Annual Revision Conferences (1982–89)

17–58 countries participated

1–5 person delegations

Mainly Health Statisticians

Manual curation

List exchange

Index was done later

“Decibel” method of discussion

Output: Paper Copy

Work in English only

Limited testing in the field

Page 22: Collaborative ontology development

ICD-11: the 21st century

• ICD-11 is being developed as an OWL ontology
• Being developed collaboratively, in an open editing process
• Links to other ontologies, such as SNOMED CT
• 33,000 classes

Page 24: Collaborative ontology development

ICD-11 development process

• Each night a snapshot of the jointly edited ontology is published on a public platform to encourage feedback from the larger community: http://apps.who.int/classifications/icd11/browse/f/en
• Editorial workflow
  • Centrally overseen by WHO
  • Peer-reviewed process for the content and structure
  • Experts may add change proposals
• WebProtégé used as the collaborative ontology development platform

Page 25: Collaborative ontology development

Modeling ICD-11: Different views

Page 26: Collaborative ontology development

Linearization

Foundation: ICD categories with

• Definitions, synonyms
• Clinical descriptions
• Diagnostic criteria
• Causal mechanism
• Functional impact

Primary care

Morbidity

Mortality
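One way to picture the relation between the foundation and the three linearizations listed above is as a per-view choice of parent for each category. The sketch below assumes that each category has at most one parent within a given view. That assumption, the class `LinearizationSpec`, the function `linearize`, and the category and chapter names are all illustrative and do not come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearizationSpec:
    category: str  # hypothetical category id
    view: str      # e.g. "Primary care", "Morbidity", "Mortality"
    parent: str    # parent of the category within that view


def linearize(specs, view):
    """Return {category: parent} for one view, rejecting a second parent."""
    tree = {}
    for spec in specs:
        if spec.view != view:
            continue
        if spec.category in tree:
            raise ValueError(f"{spec.category} has two parents in {view}")
        tree[spec.category] = spec.parent
    return tree


# Made-up categories: the same category sits under different parents per view.
specs = [
    LinearizationSpec("cat_1", "Morbidity", "chapter_A"),
    LinearizationSpec("cat_1", "Mortality", "chapter_B"),
    LinearizationSpec("cat_2", "Morbidity", "chapter_A"),
]

morbidity_tree = linearize(specs, "Morbidity")
mortality_tree = linearize(specs, "Mortality")
```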

Page 27: Collaborative ontology development

Multi-Linguality

Page 28: Collaborative ontology development

Links to Other Terminologies

Search in BioPortal

Page 29: Collaborative ontology development

All properties are reified

Multi-linguality

External references

Metadata

Evidence

Page 30: Collaborative ontology development

Class diagram (edge labels: “related to”, “subclass of”)

ICDCategory
    id : xsd:string
    linearizationSpecification* : LinearizationSpecification
    definition : DefinitionTerm
    synonym* : LanguageTerm
    bodyPart* : BodyPartTerm
    ...

LinearizationSpecification
    linearizationView : LinearizationValueSet
    linearizationParent : ICDCategoryType
    ...

Term
    id : xsd:string

LanguageTerm
    linguisticEntity : LinguisticEntity

ReferenceTerm
    source : xsd:string
    label : LinguisticEntity
    ...

LinguisticEntity
    label : xsd:string
    language : xsd:string

DomainConcept

Courtesy of Tania Tudorache
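The diagram shows what reifying a property means in practice: a definition or synonym is not a plain string on the category but an object of its own that can carry a language. The sketch below illustrates this with a few of the classes from the diagram. It is a simplification under stated assumptions: a term holds a list of linguistic entities so one term can carry several languages, the definition is modeled as a `LanguageTerm` rather than a separate `DefinitionTerm`, and the method `label_in` plus all ids and label texts are made up.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class LinguisticEntity:
    label: str
    language: str


@dataclass
class LanguageTerm:
    id: str
    # Assumption: a list, so one reified term can hold several languages.
    linguistic_entities: List[LinguisticEntity] = field(default_factory=list)

    def label_in(self, language: str) -> Optional[str]:
        """Return the label in the requested language, or None if absent."""
        for entity in self.linguistic_entities:
            if entity.language == language:
                return entity.label
        return None


@dataclass
class ICDCategory:
    id: str
    definition: Optional[LanguageTerm] = None
    synonyms: List[LanguageTerm] = field(default_factory=list)


# Hypothetical category with a definition available in two languages.
category = ICDCategory(
    id="example_category",
    definition=LanguageTerm(
        id="term_1",
        linguistic_entities=[
            LinguisticEntity("example definition", "en"),
            LinguisticEntity("définition d'exemple", "fr"),
        ],
    ),
)

english = category.definition.label_in("en")
german = category.definition.label_in("de")
```

Because the value is an object, further slots such as a source or supporting evidence could be attached to it without changing `ICDCategory`.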

Page 31: Collaborative ontology development

Page 32: Collaborative ontology development

Ontology Development as a Collaborative Process

• Ontology development is an inherently collaborative process
• It is also inherently modular, so “stepping on someone else’s toes” is not a big issue
• Users expect Web 2.0-style interaction:
  • feeds, emails
  • watched entities
  • Web interface
  • social-networking features

Page 33: Collaborative ontology development

Dimensions of Collaborative Workflows

• Ontology size
  • from 100s to 10,000s of concepts
• Size of the community
  • Contributors (in some form): from 2-3 to dozens
  • Editors: from 1-2 to 20
• Control mechanisms
  • Variety of roles
  • Gatekeepers, etc.
  • Client-server editing
• Discussion tools
  • mailing lists, message boards
  • face-to-face meetings, telecons
• Synchronization and editing mechanisms
  • CVS, SVN

Page 34: Collaborative ontology development

WebProtégé

Page 35: Collaborative ontology development

“Google docs” for ontologies

Page 36: Collaborative ontology development

Collaboration Features

• Simultaneous editing
• Change tracking
• Threaded discussions for ontology entities and changes (notes, discussions, proposals, reviews)
• Watching ontology entities and branches, with notifications
• Upload and sharing of ontologies
• Download any revision of the ontology
• Access policies
• User interface customization for domain experts
• Change analysis and statistics

Page 37: Collaborative ontology development

Page 38: Collaborative ontology development

Notes and discussions

Page 39: Collaborative ontology development

Page 40: Collaborative ontology development

Change tracking

Page 41: Collaborative ontology development

Watching entities and branches
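The difference between watching an entity and watching a branch can be sketched as follows: an entity watch fires only for that entity, while a branch watch fires for the watched class and everything below it. This is a hypothetical illustration and not WebProtégé's implementation. It assumes a single-parent hierarchy stored as a child-to-parent map, and the function names and user names are made up.

```python
def ancestors(entity, parent_of):
    """Yield the entity itself, then each ancestor up to the root."""
    while entity is not None:
        yield entity
        entity = parent_of.get(entity)


def users_to_notify(changed_entity, parent_of, entity_watches, branch_watches):
    """Collect watchers of the entity plus watchers of any enclosing branch."""
    notified = set(entity_watches.get(changed_entity, ()))
    for node in ancestors(changed_entity, parent_of):
        notified |= set(branch_watches.get(node, ()))
    return notified


# Hypothetical hierarchy: C is a subclass of B, which is a subclass of A.
parent_of = {"B": "A", "C": "B"}
entity_watches = {"C": {"alice"}}
branch_watches = {"A": {"bob"}, "X": {"carol"}}

notified_for_c = users_to_notify("C", parent_of, entity_watches, branch_watches)
notified_for_b = users_to_notify("B", parent_of, entity_watches, branch_watches)
```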

Page 42: Collaborative ontology development

Download any snapshot in time
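If every change is tracked, any past snapshot can in principle be rebuilt by replaying the change log up to the requested revision. The sketch below shows that idea on a toy log of class additions and removals. It is an assumption about how such a feature could work, not a description of WebProtégé's storage, and the log format and the function `snapshot` are hypothetical.

```python
def snapshot(change_log, revision):
    """Replay the first `revision` changes and return the resulting class set."""
    if not 0 <= revision <= len(change_log):
        raise ValueError("unknown revision")
    classes = set()
    for action, name in change_log[:revision]:
        if action == "add":
            classes.add(name)
        elif action == "remove":
            classes.discard(name)
        else:
            raise ValueError(f"unknown action: {action}")
    return classes


# Toy change log; revision n is the state after the first n changes.
change_log = [
    ("add", "Disease"),
    ("add", "Infection"),
    ("remove", "Infection"),
]

revision_2 = snapshot(change_log, 2)
revision_3 = snapshot(change_log, 3)
```

Replaying from the start is the simplest design; a real system with long histories would likely store periodic full snapshots and replay only from the nearest one.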

Page 43: Collaborative ontology development

Research Challenges

• Human-Computer Interaction:
  • How do we enable domain experts to contribute effectively?
  • What are the minimal sets of constructs necessary?
• Change analysis:
  • Are there patterns in how users edit ontologies?
  • Can we use these patterns to guide user interfaces?
• Community dynamics:
  • What are the dynamics in groups that develop ontologies collaboratively?
  • Are there explicit or implicit roles?
  • Do roles change over time?
