A Tool for Collaborative Construction of Large Biological Ontologies

Iowa State University, Department of Computer Science, Artificial Intelligence Research Laboratory

Jie Bao (a), Zhiliang Hu (b), Doina Caragea (a), James Reecy (b), Vasant G Honavar (a)

(a) Artificial Intelligence Research Laboratory, Department of Computer Science; Center for Computational Intelligence, Learning, and Discovery

(b) Department of Animal Science

Iowa State University, Ames, IA 50011, USA

Email: {baojie, zhu, dcaragea, jreecy, honavar}@iastate.edu

Page 1: A Tool for Collaborative Construction of Large Biological Ontologies

Page 2: A Tool for Collaborative Construction of Large Biological Ontologies

Outline

• Collaborative Ontology Building (COB) Desiderata

• Limitations of CVS-based Collaboration

• COB based on Modular Ontologies

• The COB Editor

Page 3: A Tool for Collaborative Construction of Large Biological Ontologies

Large Biological Ontologies

Gramineae Taxonomy

Plant Ontology

Gene Ontology

MGED Ontology (microarray)

Page 4: A Tool for Collaborative Construction of Large Biological Ontologies

Non-collaborative Ontology Building

[Figure: Download Ontology, Local Editing (single curator), Upload Ontology; tools shown: Protégé, OBO-Edit]

Page 5: A Tool for Collaborative Construction of Large Biological Ontologies

Collaboration In Need

Example: Gene Ontology Consortium

Page 6: A Tool for Collaborative Construction of Large Biological Ontologies

Collaboration In Need (2)

Example 2: an animal trait ontology that involves multiple research groups across the world

Species modules: Swine, Cattle, Chicken, Horse

Each group works on an ontology module for a particular species (according to the group's best expertise)

Page 7: A Tool for Collaborative Construction of Large Biological Ontologies

Challenges

• Knowledge Integration

• Concurrency Management

• Consistency Maintenance

• Privilege Management

• History Maintenance

• Scalability

Page 8: A Tool for Collaborative Construction of Large Biological Ontologies

Solutions

1. Pipeline (very limited collaboration)

• Divide the ontology building process into sequential phases.

• Each phase is assigned to a particular contributor.

2. CVS (collaboration with high cost)

• Treat an ontology as a monolithic file/document.

• Use collaborative tools like CVS to build the ontology.

3. Modular Ontology (our approach)

• Build the ontology with fine-grained modules.

• Different contributors can concurrently edit different modules.

Page 9: A Tool for Collaborative Construction of Large Biological Ontologies

Outline

• Collaborative Ontology Building (COB) Desiderata

• Limitations of CVS-based Collaboration

• COB based on Modular Ontologies

• The COB Editor

Page 10: A Tool for Collaborative Construction of Large Biological Ontologies

CVS-based Ontology Building

[Workflow diagram with two roles]

User (submits change suggestions in natural language): Get SourceForge Account, Submit Change Request, Track the Request

Curator: Get GO CVS Account, Get SourceForge Account, Set Up CVS Access, Take a Change Request, Download Whole GO Flat File, Local Editing, Make Local Log File, Save GO Flat File, Version Control, Commit Whole New Ontology to CVS

Page 11: A Tool for Collaborative Construction of Large Biological Ontologies

Unprincipled Authorization and Organization

• No principled mechanism to ensure curator privilege assignments.

• No clear organizational division of the whole ontology into smaller manageable units.

Page 12: A Tool for Collaborative Construction of Large Biological Ontologies

Risk of Inconsistency

• No principled way to avoid unintended couplings and over-writing.

• The validity and consistency of the ontology depend heavily on curator discipline and good community communication (e.g., via email lists).

Page 13: A Tool for Collaborative Construction of Large Biological Ontologies

Lack of Partial Editing/Reuse

• A curator has to
  – download the entire ontology before editing,
  – and submit the entire modified ontology after editing.

• A user cannot download and reuse only a selected subset of the ontology.

Page 14: A Tool for Collaborative Construction of Large Biological Ontologies

Expensive History Maintenance

• Even a minor edit of the ontology causes the ontology file to be replicated in its entirety

• Tracing the change history of a term requires processing the entire ontology file for comparisons

Page 15: A Tool for Collaborative Construction of Large Biological Ontologies

Limited Participation

• Since all editing has global effect, it is difficult to
  – grant different privilege scopes to different types of users (e.g., core curators versus normal curators)
  – accept/deny/modify/revert local changes made by other curators

• The curator community has to be limited to a small number of trusted curators.

Page 16: A Tool for Collaborative Construction of Large Biological Ontologies

Outline

• Collaborative Ontology Building (COB) Desiderata

• Limitations of CVS-based Collaboration

• COB based on Modular Ontologies

• The COB Editor

Page 17: A Tool for Collaborative Construction of Large Biological Ontologies

Basic Strategy

• Localize the interactions among different parts of a large ontology.

• Build an ontology with fine-grained organizational structure.

• Allow group collaboration on different ontology modules.

Page 18: A Tool for Collaborative Construction of Large Biological Ontologies

Package-based Ontologies

• The whole ontology consists of a set of packages

• Each package represents a fragment of the whole ontology

• Each term has a "home package"

[Figure: the Animal Trait ontology divided into the packages General, Cattle, Pig, and Chicken; example terms with their home packages: Egg (Chicken), Reproduction (General)]
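The package structure on this slide can be sketched as a small data structure. This is only an illustration, not the COB Editor's implementation: the class and method names (`Package`, `PackagedOntology`, `add_term`, `home_package`) are assumptions made for this example, and the only rule it enforces is the one stated above, that each term has one home package.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Package:
    """A fragment of the whole ontology (hypothetical structure)."""
    name: str
    terms: set[str] = field(default_factory=set)


class PackagedOntology:
    """An ontology as a set of packages; each term has exactly one home package."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.packages: dict[str, Package] = {}
        self._home: dict[str, str] = {}  # term -> name of its home package

    def add_package(self, name: str) -> Package:
        # Return the existing package or create an empty one.
        return self.packages.setdefault(name, Package(name))

    def add_term(self, term: str, package: str) -> None:
        # Reject a second home package for the same term.
        if term in self._home:
            raise ValueError(f"{term!r} already has home package {self._home[term]!r}")
        self.add_package(package).terms.add(term)
        self._home[term] = package

    def home_package(self, term: str) -> str:
        return self._home[term]


ato = PackagedOntology("Animal Trait ontology")
for pkg in ("General", "Cattle", "Pig", "Chicken"):
    ato.add_package(pkg)
ato.add_term("Egg", "Chicken")
ato.add_term("Reproduction", "General")
print(ato.home_package("Egg"))
```

Keeping the term-to-home-package map separate from the package contents makes the "one home package per term" rule a single dictionary lookup.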

Page 19: A Tool for Collaborative Construction of Large Biological Ontologies

Package Nesting

• A nested package is a part of another package

• Could be used to represent the organizational structure of an ontology
  – Arrange knowledge
  – Enforce hierarchical management of knowledge

[Figure: nested packages in the Animal trait ontology: General, Pig, Pig Health]
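One way to picture nesting is a parent link on each package. The sketch below is an assumption-laden illustration rather than the tool's own model: the `Package` class, its `parent` field, and the choice that Pig Health is nested inside Pig are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Package:
    name: str
    parent: Optional["Package"] = None  # enclosing package, if any (assumed model)

    def ancestors(self) -> list["Package"]:
        """Enclosing packages, innermost first."""
        chain = []
        current = self.parent
        while current is not None:
            chain.append(current)
            current = current.parent
        return chain

    def is_nested_in(self, other: "Package") -> bool:
        """True if `other` encloses this package directly or indirectly."""
        return other in self.ancestors()


# Assumed nesting for illustration: Pig Health is a part of Pig.
pig = Package("Pig")
pig_health = Package("Pig Health", parent=pig)
print(pig_health.is_nested_in(pig))
```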

Page 20: A Tool for Collaborative Construction of Large Biological Ontologies

Division of Labor

• A package can be assigned to curators with the best knowledge of the relevant sub-domain.
  – e.g. Pig Health, Pig Reproduction

• The package hierarchy helps to manage interactions among experts with different degrees of expertise.
  – e.g. Pig, Pig Health

Page 21: A Tool for Collaborative Construction of Large Biological Ontologies

Partial Reuse

[Figure: two versions of the Animal Trait Ontology reused by a Pork ontology. Centralized: one ontology covering General, Cattle, Pig, and Chicken. Package-based: separate packages General, Pig, Cattle, and Chicken, connected to Pork by semantic importing. Legend: knowledge incorporated in the Pork ontology; knowledge not present in the Pork ontology.]
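The selective-loading side of partial reuse can be sketched as follows. This does not model the semantics of importing, only the idea that a reusing ontology pulls in the packages it names and nothing else. The package contents, the function `load_imported`, and the assumption that Pork imports General and Pig are all made up for illustration.

```python
# Hypothetical package contents; the term names are invented for this example.
PACKAGES = {
    "General": {"Reproduction", "Growth"},
    "Pig": {"Backfat thickness", "Litter size"},
    "Cattle": {"Milk yield"},
    "Chicken": {"Egg"},
}


def load_imported(imports: list[str]) -> dict[str, set[str]]:
    """Load only the packages named in `imports`; other packages are never read."""
    missing = [name for name in imports if name not in PACKAGES]
    if missing:
        raise KeyError(f"unknown packages: {missing}")
    return {name: set(PACKAGES[name]) for name in imports}


# Assumption: the Pork ontology imports only the General and Pig packages.
pork_view = load_imported(["General", "Pig"])
visible_terms = set().union(*pork_view.values())
print(sorted(visible_terms))
```

With a centralized ontology the same reuse would require loading all four fragments, including the Cattle and Chicken terms that Pork never uses.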

Page 22: A Tool for Collaborative Construction of Large Biological Ontologies

Scalability

• Reduction in communication overhead and computational time cost
  – Parsing
  – Transferring
  – Consistency check

• Reduction in memory requirements
  – Ontology can be partially loaded into memory

• Reduction in history tracking cost
  – Effect of changes is localized

Page 23: A Tool for Collaborative Construction of Large Biological Ontologies

Broadened Participation

• Open-community collaboration success witnessed by DMOZ and Wikipedia

• Package-based ontology management can
  – Control the scope of an editing action
  – Minimize the risk of vandalism

• Better tradeoff between broader participation and ontology quality
  – There are different levels of curators, e.g. ontology admins, pig experts, pig health experts.
  – An editing action can be approved or denied by a curator with higher privileges
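The approval rule on this slide could be tied to the package hierarchy roughly as below. The slide does not say how "higher privileges" are determined, so this sketch assumes one possible rule: a curator may review an edit if they curate a package that encloses the edited one. The tables `PARENT` and `CURATES` and the function names are hypothetical.

```python
# Hypothetical nesting: package -> enclosing package (None at the top).
PARENT = {"Animal Trait": None, "Pig": "Animal Trait", "Pig Health": "Pig"}

# Hypothetical assignments: curator -> the package they are responsible for.
CURATES = {"admin": "Animal Trait", "pig_expert": "Pig", "health_expert": "Pig Health"}


def enclosing(package: str) -> list[str]:
    """Packages strictly above `package` in the hierarchy, innermost first."""
    chain = []
    current = PARENT[package]
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def can_review(reviewer: str, edited_package: str) -> bool:
    """Assumed rule: reviewers must curate a package enclosing the edited one."""
    return CURATES[reviewer] in enclosing(edited_package)


print(can_review("pig_expert", "Pig Health"))
```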

Page 24: A Tool for Collaborative Construction of Large Biological Ontologies

Outline

• Collaborative Ontology Building (COB) Desiderata

• Limitations of CVS-based Collaboration

• COB based on Modular Ontologies

• The COB Editor

Page 25: A Tool for Collaborative Construction of Large Biological Ontologies

The COB Editor

[Screenshot: the COB Editor showing the Pig Package, Cattle Package, and Chicken Package]

[BIDM06 Paper] a.k.a [8]

Page 26: A Tool for Collaborative Construction of Large Biological Ontologies

Collaborative Ontology Building

Ontology modularity facilitates collaborative building

• Each package can be independently developed

• Different curators can concurrently edit the ontology on different packages

• Ontology can be only partially loaded

• Unwanted interactions are minimized by limiting term and axiom visibility

• Module access privileges can be controlled by the package hierarchy

Page 27: A Tool for Collaborative Construction of Large Biological Ontologies

Work with COB Editor

Download

• http://www.animalgenome.org/bioinfo/projects/ATO/

• http://sourceforge.net/projects/cob (source code)

Curator workflow:

1. Get Ontology Account

2. Check out a package

3. Create new package or Lock Package

4. Edit the Package

5. Commit the Package

6. (Auto) Server Change Log
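The lock, edit, and commit steps can be sketched with an in-memory stand-in for the ontology server. This is not the COB Editor's API: `PackageServer`, `lock`, `commit`, and the log format are hypothetical names chosen for this example, and the real tool stores packages on a database server.

```python
from __future__ import annotations


class PackageServer:
    """In-memory stand-in for the ontology server (hypothetical API)."""

    def __init__(self) -> None:
        self.packages: dict[str, set[str]] = {}
        self.locks: dict[str, str] = {}      # package -> curator holding the lock
        self.change_log: list[str] = []      # appended automatically on commit

    def lock(self, package: str, curator: str) -> set[str]:
        """Lock a package for editing, creating it if needed; returns a working copy."""
        holder = self.locks.get(package)
        if holder is not None and holder != curator:
            raise PermissionError(f"{package!r} is locked by {holder!r}")
        self.locks[package] = curator
        return set(self.packages.setdefault(package, set()))

    def commit(self, package: str, curator: str, terms: set[str]) -> None:
        """Store the edited package, log the change, and release the lock."""
        if self.locks.get(package) != curator:
            raise PermissionError(f"{curator!r} does not hold the lock on {package!r}")
        added = sorted(terms - self.packages[package])
        removed = sorted(self.packages[package] - terms)
        self.packages[package] = set(terms)
        self.change_log.append(f"{curator} committed {package}: +{added} -{removed}")
        del self.locks[package]


server = PackageServer()
working_copy = server.lock("Pig", "alice")   # check out and lock
working_copy.add("Litter size")              # edit locally
server.commit("Pig", "alice", working_copy)  # commit; the log entry is automatic
print(server.change_log)
```

Because the lock and the log entry are both per package, two curators working on different packages never block each other, and the history of a change stays local to the package it touched.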

Page 28: A Tool for Collaborative Construction of Large Biological Ontologies

More Features

• Support import/export from/to OWL and OBO format
  – can be used for Gene Ontology and others

• Ontology shared on a database server

• Allows multi-relational hierarchies
  – e.g. both is-a and part-of

• Visibility of a term can be controlled by scope limitation modifiers
  – e.g. public, private, protected
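The slide names the modifiers but does not define them. The sketch below assumes semantics by analogy with programming languages: public terms are visible everywhere, protected terms are visible from the home package and packages nested in it, and private terms are visible only from the home package. The nesting table and function names are hypothetical.

```python
from enum import Enum


class Scope(Enum):
    PUBLIC = "public"        # assumed: visible from every package
    PROTECTED = "protected"  # assumed: visible from the home package and packages nested in it
    PRIVATE = "private"      # assumed: visible only from the home package


# Hypothetical nesting: package -> enclosing package (None at the top level).
PARENT = {"General": None, "Pig": None, "Pig Health": "Pig", "Cattle": None}


def is_nested_in(package: str, ancestor: str) -> bool:
    """True if `ancestor` encloses `package` directly or indirectly."""
    current = PARENT[package]
    while current is not None:
        if current == ancestor:
            return True
        current = PARENT[current]
    return False


def is_visible(home: str, scope: Scope, viewer: str) -> bool:
    """Can a term whose home package is `home` be referenced from package `viewer`?"""
    if scope is Scope.PUBLIC or viewer == home:
        return True
    if scope is Scope.PROTECTED:
        return is_nested_in(viewer, home)
    return False


print(is_visible("Pig", Scope.PROTECTED, "Pig Health"))
```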

Page 29: A Tool for Collaborative Construction of Large Biological Ontologies

Conclusions

• Modular ontologies can improve collaborative ontology building in many aspects

• Package-based Ontology offers an "importing"-based ontology language.

• COB Editor provides the necessary tool to collaboratively build well-structured, large-scale, biomedical ontologies

Page 30: A Tool for Collaborative Construction of Large Biological Ontologies

Future Work

• Support of inference and consistency checking

• Accommodation and modularization of existing ontologies, e.g. GO, EC, SCOP

• Support of ontology mapping and ontology integration

• Support of more expressive ontologies, e.g. UMLS, SNOMED

Page 31: A Tool for Collaborative Construction of Large Biological Ontologies

Thanks!