Post on 11-Jan-2016
Principles for Building Biomedical Ontologies
Talk delivered by Jennifer Clark, GO Editorial Office
[Figure legend: part_of, is_a (Clark et al., 2005)]
Slides and content by:
Barry Smith
http://ifomis.de
•formal ontology
•information science
•special reference to the bio-medical domain.
Rama Balakrishnan, David Hill, Jennifer Clark.
http://www.geneontology.org
The Rules
1. Univocity
2. Positivity
3. Objectivity
4. Single Inheritance
5. Intelligibility of Definitions
6. Basis in Reality
classes: GO terms, types, kinds, universals
instances: annotated gene product attributes, tokens, individuals, particulars
1 Univocity:
Terms should have the same meanings on every occasion of use
[Three images, each labeled] = bud initiation
The Challenge of Univocity: People use the same words to describe different things
Bud initiation? How is a computer to distinguish?
= bud initiation sensu Metazoa
= bud initiation sensu Saccharomyces
= bud initiation sensu Viridiplantae
Univocity: GO adds “sensu” descriptors to discriminate among organisms
The Challenge of Univocity: People call the same thing by different names
Tactile sense, Taction, Tactition = ?
Univocity: GO uses 1 term and many characterized synonyms
perception of touch ; GO:0050975
Synonyms: Tactile sense, Taction, Tactition
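The one-term, many-synonyms idea can be sketched as a lookup table. This is only an illustration, not GO's actual data model: the `TERMS` layout and the `build_index` and `resolve` helpers are assumptions made for the example. Only the term name, ID, and synonyms come from the slide.

```python
# Hypothetical in-memory store: one primary term, many synonyms.
TERMS = {
    "GO:0050975": {
        "name": "perception of touch",
        "synonyms": ["tactile sense", "taction", "tactition"],
    },
}


def build_index(terms):
    """Map every name and synonym (lower-cased) to exactly one term ID."""
    index = {}
    for term_id, term in terms.items():
        for label in [term["name"], *term["synonyms"]]:
            key = label.lower()
            # Univocity: a label may not point at two different terms.
            if index.get(key, term_id) != term_id:
                raise ValueError(f"label {label!r} is ambiguous")
            index[key] = term_id
    return index


def resolve(label, index):
    """Return the term ID for a name or synonym, or None if it is unknown."""
    return index.get(label.strip().lower())


INDEX = build_index(TERMS)
```

Whatever word a user types, `resolve` lands on the same ID, and `build_index` refuses a label that would point at two terms.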
Univocity in part_of relation
‘is at times part of’: antlers part_of red deer
‘necessarily is_part’: Seed dormancy part_of seed development
‘necessarily has_part’: Plant embryo part_of seed
2 Positivity:
The complements of classes are not themselves classes.
Vertebrates
Image: http://www.cucco.org/CatPictures/Cat%20Nap.jpg
Vertebrates | non-vertebrates
Set of all things: Vertebrates | non-vertebrates
Image: http://www.digibarn.com/collections/systems/canon-cat/Image53.jpg
Set of all organisms: Vertebrates
Set of all organisms: Vertebrates | Invertebrates
Image: http://www.artalyst.com/files/userimages/user70/21088058-O.preview.jpg
membrane-bound organelle ; GO:0043227
vs. ‘not a membrane-bound organelle’
Non-membrane-bound organelles
A centrosome is not a membrane-bound organelle, but it may still be considered an organelle.
3 Objectivity:
The existence of classes is not dependent on our biological knowledge.
‘unlocalised’, ‘unknown’, ‘unclassified’ do not designate biological natural kinds.
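Both the Positivity rule and the Objectivity rule can be approximated by a crude name check. The sketch below is a heuristic for flagging terms for human review, not a GO tool. The two regular expressions and the `check_term_name` helper are assumptions made for the example, and a flagged name is not automatically wrong.

```python
import re

# Assumed patterns: names that read as complements of another class ...
NEGATIVE = re.compile(r"^(non-|not )", re.IGNORECASE)
# ... and names that describe our state of knowledge rather than biology.
PLACEHOLDER = re.compile(
    r"\b(unknown|unclassified|unlocali[sz]ed)\b", re.IGNORECASE
)


def check_term_name(name):
    """Return the rules a term name appears to break (possibly none)."""
    problems = []
    cleaned = name.strip()
    if NEGATIVE.search(cleaned):
        problems.append("positivity")
    if PLACEHOLDER.search(cleaned):
        problems.append("objectivity")
    return problems
```

A curator would still decide each case: the check only looks at the label, not at what the class contains.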
http://news.bbc.co.uk/1/hi/sci/tech/4501152.stm
Task: Annotate the molecular function of 10-4, a gene from Drosophila melanogaster
[Figure, shown in three builds: Molecular function ontology | Annotations]
Molecular function ontology: molecular function unknown is_a molecular function
Annotations: 10-4
4 Single Inheritance:
No class in a classification hierarchy should have more than one is_a parent on the immediate higher level
[Figure legend: part_of, is_a (Clark et al., 2005)]
Rule of Single Inheritance
no diamonds:
A is_a1 B
A is_a2 C
Problems with multiple inheritance
A is_a1 B
A is_a2 C
‘is_a’ no longer univocal
(univocal: having only one meaning)
Is_a diamond in GO Process
behavior
locomotory behavior is_a behavior
larval behavior is_a behavior
larval locomotory behavior is_a1 locomotory behavior
larval locomotory behavior is_a2 larval behavior
Labels on the two arms: behavior of a thing, descriptive behavior
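A check for the single-inheritance rule can be sketched in a few lines. The edge list below is read off the behavior diamond as an assumption about the diagram's layout, and `multiple_parents` is a hypothetical helper, not part of any GO tooling.

```python
from collections import defaultdict

# is_a edges as (child, parent), assumed from the behavior diamond.
IS_A = [
    ("locomotory behavior", "behavior"),
    ("larval behavior", "behavior"),
    ("larval locomotory behavior", "locomotory behavior"),
    ("larval locomotory behavior", "larval behavior"),
]


def multiple_parents(edges):
    """Return {child: sorted parents} for classes with more than one is_a parent."""
    parents = defaultdict(set)
    for child, parent in edges:
        parents[child].add(parent)
    return {
        child: sorted(found)
        for child, found in parents.items()
        if len(found) > 1
    }
```

On this edge list the only class reported is ‘larval locomotory behavior’, the bottom of the diamond.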
5 Intelligibility of Definitions:
The terms used in a definition should be simpler than the term to be defined
[Figure: cellular process; cell differentiation; cell fate specification; cell development. Relations shown: is_a, part_of]
cell differentiation
is_a children: osteoblast differentiation, neuron differentiation, keratinocyte differentiation, adipocyte differentiation, garland cell differentiation
‘X cell differentiation’ is_a cell differentiation
Essence = Genus + Differentiae
Genus: differentiation
Differentiae: a neuron (or x cell)
X cell differentiation: Differentiation of an x cell.
X cell differentiation: The process whereby a relatively unspecialized cell acquires specialized features of an x cell. [List characteristics of x cell.]
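The genus-plus-differentiae pattern lends itself to a template. The sketch below fills the slide's wording with a cell type. The `article` and `differentiation_definition` helpers are hypothetical, and the article choice is a naive vowel test that a real editor would still need to review.

```python
def article(noun_phrase):
    """Pick 'a' or 'an' with a naive first-letter test (an assumption)."""
    if not noun_phrase:
        raise ValueError("cell type must not be empty")
    return "an" if noun_phrase[0].lower() in "aeiou" else "a"


def differentiation_definition(cell_type, characteristics=""):
    """Build an 'X cell differentiation' definition from the slide's template.

    Genus: differentiation. Differentiae: the cell type, plus an optional
    free-text list of its characteristics.
    """
    text = (
        "The process whereby a relatively unspecialized cell "
        f"acquires specialized features of {article(cell_type)} {cell_type}."
    )
    if characteristics:
        text += " " + characteristics.strip()
    return text
```

Calling `differentiation_definition("neuron")` yields the template sentence with ‘a neuron’ in the differentiae slot.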
Process ontology                 Cell Ontology
cone cell fate commitment        retinal_cone_cell
keratinocyte differentiation     keratinocyte
adipocyte differentiation        fat_cell
dendritic cell activation        dendritic_cell
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron." [GO:mah]
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
[Term]
id: CL:0000540
name: neuron
def: "The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663]
xref_analog: FBbt:00005106
xref_analog: FBbt:00005146
is_a: CL:0000393 ! electrically responsive cell
is_a: CL:0000404 ! electrically signaling cell
relationship: develops_from CL:0000031 ! neuroblast
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
def: "The process whereby a relatively unspecialized cell acquires specialized features of a neuron. The basic cellular unit of nervous tissue. Each neuron consists of a body\, an axon\, and dendrites. Their purpose is to receive\, conduct\, and transmit impulses in the nervous system." [MESH:A.08.663, GO:mah]
is_a: GO:0030154 ! cell differentiation
intersection_of: is_a GO:0030154 ! cell differentiation
intersection_of: has_participant CL:0000540 ! neuron
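Stanzas like these are plain tag-value text, so their relations can be pulled out with very little code. The sketch below is a minimal reader for one stanza, not a full OBO parser. `parse_stanza` is a hypothetical helper, and it deliberately skips `def:` lines in the sample because quoted text would need proper handling.

```python
STANZA = """\
[Term]
id: GO:0030182
name: neuron differentiation
namespace: biological_process
is_a: GO:0030154 ! cell differentiation
relationship: part_of GO:0048699 ! neurogenesis
"""


def parse_stanza(text):
    """Parse one stanza into {tag: [values]}, dropping trailing '! name' comments."""
    record = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and the stanza header such as [Term].
        if not line or line.startswith("["):
            continue
        # Split on the first colon only, because IDs contain colons too.
        tag, _, value = line.partition(":")
        value = value.split(" ! ")[0].strip()
        record.setdefault(tag.strip(), []).append(value)
    return record


TERM = parse_stanza(STANZA)
```

Each tag maps to a list because tags such as `is_a` may legitimately repeat within a stanza.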
Other Ontologies that can be aligned with GO
Chemical ontologies: 3,4-dihydroxy-2-butanone-4-phosphate synthase activity
Anatomy ontologies: metanephros development
But Eventually…
Building Ontology
Improve
Collaborate and Learn
6 Basis in Reality:
When building or maintaining an ontology, always think carefully about how classes relate to instances in reality
Superman: strength, flight, x-ray vision, leaps over tall buildings in a single bound
Catwoman: strength, speed, agility, and ultra-keen senses of a cat.
http://www.uncleodiescollectibles.com/doesnotcompute/2004-10-11/Actor%20Christoper%20Reeve.jpg
http://home.austarnet.com.au/davekimble/catwoman.jpg
cartoon character super power ontology
Ontology (is_a):
super senses: x-ray vision, cat senses
super physical powers: super leaping, super strength
Annotations, by character name: Catwoman, Catwoman, Superman, Superman
Annotations, as instances of the classes: Catwoman’s cat senses, Catwoman’s super strength, Superman’s X-ray vision, Superman’s super leaping
Ontology (is_a):
molecular function
binding
tetrapyrrole binding: chlorophyll binding, heme binding
cofactor binding: coenzyme binding, quinone binding
Annotations, by gene product name: PSBI, PSBI
Annotations, as instances of the classes: PSBI’s chlorophyll binding function, PSBI’s quinone binding function
The Rules
1. Univocity: Terms should have the same meanings on every occasion of use
2. Positivity: Terms such as ‘non-mammal’ or ‘non-membrane’ do not designate genuine classes.
3. Objectivity: Terms such as ‘unknown’ or ‘unclassified’ or ‘unlocalized’ do not designate biological natural kinds.
4. Single Inheritance: No class in a classification hierarchy should have more than one is_a parent on the immediate higher level
5. Intelligibility of Definitions: The terms used in a definition should be simpler (more intelligible) than the term to be defined
6. Basis in Reality: When building or maintaining an ontology, always think carefully about how classes relate to instances in reality
END
spare slides follow.
How to define A is_a B
A is_a B =def.
1. A and B are names of universals (natural kinds, types) in reality
2. all instances of A are as a matter of biological science also instances of B
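Clause 2 of this definition can be illustrated with sets of instances. The sketch only tests a sample of known instances; it cannot establish that the inclusion holds "as a matter of biological science". The `is_a_holds` helper and the instance names are hypothetical.

```python
def is_a_holds(instances_of_a, instances_of_b):
    """Check clause 2 on a sample: every known instance of A is an instance of B."""
    return set(instances_of_a) <= set(instances_of_b)


# Hypothetical instance labels, invented for the example.
NEURONS = {"neuron_1", "neuron_2"}
CELLS = {"neuron_1", "neuron_2", "keratinocyte_1"}
```

Here `is_a_holds(NEURONS, CELLS)` is true and the reverse is false, which matches reading ‘neuron is_a cell’ but not ‘cell is_a neuron’.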
True path violation: What is it?
Part_of relationship: chromosome part_of nucleus
Is_a relationship: Mitochondrial chromosome is_a chromosome
True path violation: What is it?
Part_of relationship: Nuclear chromosome part_of nucleus
Is_a relationships: Nuclear chromosome is_a chromosome; Mitochondrial chromosome is_a chromosome
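The chromosome example can be checked mechanically: a class inherits every part_of statement made about its is_a ancestors, so a false inherited statement signals a true path violation. The sketch below assumes single inheritance and an acyclic hierarchy. The helper names and the `KNOWN_FALSE` set are hypothetical, and the edges are read off the two diagrams as an assumption.

```python
def inherited_part_of(term, is_a, part_of):
    """Collect every part_of target a term inherits by walking up is_a."""
    inherited = set()
    current = term
    while current is not None:
        if current in part_of:
            inherited.add(part_of[current])
        current = is_a.get(current)  # assumes one parent and no cycles
    return inherited


def true_path_violations(is_a, part_of, known_false):
    """List (term, whole) pairs that are inherited but known to be false."""
    violations = []
    for term in is_a:
        for whole in sorted(inherited_part_of(term, is_a, part_of)):
            if (term, whole) in known_false:
                violations.append((term, whole))
    return violations


# A mitochondrial chromosome is not part of the nucleus.
KNOWN_FALSE = {("mitochondrial chromosome", "nucleus")}

# First diagram: part_of is attached to the general class.
BROKEN_IS_A = {"mitochondrial chromosome": "chromosome"}
BROKEN_PART_OF = {"chromosome": "nucleus"}

# Second diagram: part_of is attached to the nuclear subclass only.
FIXED_IS_A = {
    "nuclear chromosome": "chromosome",
    "mitochondrial chromosome": "chromosome",
}
FIXED_PART_OF = {"nuclear chromosome": "nucleus"}
```

The first structure reports the mitochondrial chromosome as a violation; the second, with the part_of moved down to ‘nuclear chromosome’, reports none.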
The Importance of synonyms for utility: How do we represent the function of tRNA?
Molecular_function
Triplet_codon amino acid adaptor activity
GO Definition: Mediates the insertion of an amino acid at the correct point in the sequence of a nascent polypeptide chain during protein synthesis.
Synonym: tRNA
Main obstacle to integration
Current ontologies do not deal well with Time and Space and Instances (particulars)
Our definitions should link the terms in the ontology to instances in spatio-temporal reality
7 Distinguish Universals and Instances
Don’t forget instances when defining relations
part_of as a relation between classes versus part_of as a relation between instances
Between classes: nucleus part_of cell
Between instances: your heart part_of you