An Introduction to Anatomy Ontologies Phenotype RCN Feb 23, 2012
Melissa Haendel
Setting the stage
1. Who are we? What do we need? Why are we here?
2. What is an anatomy ontology?
3. What kinds of anatomy ontologies exist?
4. How are anatomy ontologies used?
5. Anatomical evidence
Who are we?
Domain experts: anatomists, comparative morphologists, developmental biologists, immunologists, neuroscientists, etc.
Ontologists: biologists-gone-informatics, computer scientists, and logicians
Engineers: our tool builders
Domain experts: want to query for gene expression and phenotypes across species
Ontologists: have to be able to interpret and represent domain knowledge computationally
Engineers: have to build tools that can consume ontologies and give the domain experts the right results
Therefore, we build ontologies that are intelligible to:
• Domain experts
• Machines
We want to enable:
• Comparison of structures across different organisms and scales
• Standardization of anatomical vocabulary within and between communities
• Integration of anatomical data across databases
• Queries across large amounts of data
• Automatic reasoning to infer related classes
• Error checking
• Annotation consistency
Ontologists, Engineers
Anatomical information retrieval from text-based resources
OMIM query                   # Records
“large bone”                 785
“enlarged bone”              156
“big bone”                   16
“huge bones”                 4
“massive bones”              28
“hyperplastic bones”         12
“hyperplastic bone”          40
“bone hyperplasia”           134
“increased bone growth”      612
Less than ideal.
Why build an anatomy ontology? A simple example
Number of genes annotated to each of the following brain parts in an ontology:
brain: 20
hindbrain (part_of brain): 15
rhombomere (part_of hindbrain): 10
Query for brain without the ontology: 20
Query for brain with the ontology: 45 (20 + 15 + 10)
Ontologies can facilitate grouping and retrieval of data
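The brain example can be sketched in a few lines of Python. This is only an illustration of the idea: the counts and the part_of links come from the slide, while the function names are made up here, and the sketch assumes (as the slide's arithmetic does) that the gene sets annotated to each part do not overlap.

```python
from collections import defaultdict

# Direct annotation counts from the slide (genes annotated to each term).
direct_counts = {"brain": 20, "hindbrain": 15, "rhombomere": 10}

# part_of links from the slide, stored as child -> parent.
part_of = {"hindbrain": "brain", "rhombomere": "hindbrain"}


def parts_closure(term, part_of):
    """Return the term plus everything that is transitively part_of it."""
    children = defaultdict(list)
    for child, parent in part_of.items():
        children[parent].append(child)
    found, stack = set(), [term]
    while stack:
        current = stack.pop()
        if current not in found:
            found.add(current)
            stack.extend(children[current])
    return found


def query(term, use_ontology):
    """Count annotations for a term, optionally including its parts."""
    terms = parts_closure(term, part_of) if use_ontology else {term}
    # Assumption: gene sets do not overlap, so counts can simply be summed.
    return sum(direct_counts.get(t, 0) for t in terms)


without_ontology = query("brain", use_ontology=False)
with_ontology = query("brain", use_ontology=True)
print(without_ontology, with_ontology)
```

A real system would collect the annotated gene identifiers and take their union rather than summing counts.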
There are many useful ways to classify parts of organisms:
• its parts and their arrangement
• its relation to other structures (what is it part of, connected to, adjacent to, overlapping?)
• its shape
• its function
• its developmental origins
• its species or clade
• its evolutionary history
Cajal 1915, “Accept the view that nothing in nature is useless, even from the human point of view.”
An ontology is a classification
appendage
  antenna
  wing
    forewing
    hindwing
Relationships record classifications too
leg
part_of some ‘thoracic segment’
wing
‘leg’ SubClassOf part_of some ‘thoracic segment’
Multiple inheritance is very hard to manage by hand
It is difficult to keep track of multiple classification chains to:
• ensure completeness;
• avoid redundancy;
• avoid incorrect inheritance of classification criteria from a distant superclass.
The knowledge in an ontology can make the reasons for classification explicit
Any sense organ that functions in the detection of smell is an olfactory sense organ
olfactory sense organ: sense organ that is capable_of some detection of smell
nose: sense organ; capable_of some detection of smell
Classifying
nose is_a olfactory sense organ (inferred)
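A toy version of this classification step, written as a sketch and not as a real description-logic reasoner. The nose and olfactory sense organ entries follow the slide; the "eye" entry and all function names are hypothetical additions for contrast.

```python
# Asserted facts: each class lists its is_a parents and its capable_of fillers.
asserted = {
    "nose": {"is_a": {"sense organ"}, "capable_of": {"detection of smell"}},
    # Hypothetical extra class, not on the slide, to show a non-match.
    "eye": {"is_a": {"sense organ"}, "capable_of": {"detection of light"}},
}

# Necessary-and-sufficient definitions: a genus plus required capabilities.
defined = {
    "olfactory sense organ": {
        "is_a": {"sense organ"},
        "capable_of": {"detection of smell"},
    },
}


def classify(asserted, defined):
    """For each asserted class, list the defined classes whose conditions it meets."""
    inferred = {}
    for name, facts in asserted.items():
        inferred[name] = sorted(
            defined_name
            for defined_name, cond in defined.items()
            if cond["is_a"] <= facts["is_a"]
            and cond["capable_of"] <= facts["capable_of"]
        )
    return inferred


inferred = classify(asserted, defined)
print(inferred)
```

The sketch only compares directly asserted parents; an OWL reasoner would also follow the full subclass hierarchy and property axioms.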
Compositionality and avoiding asserted multiple inheritance
We can logically define composed classes and create complex definitions from simpler ones
aka: building blocks, cross-products, logical definitions
Descriptions can be composed at any time:
• Ontology construction time (pre-composition)
• Annotation time (post-composition)
Formal necessary and sufficient definitions + a reasoner:
• Automatic (and therefore manageable) classification
• Requires subtype classification, so apart from the root term(s), no term should lack an is_a parent.
Let the reasoner do the work!
Example of a post-composed anatomical entity
Plasma membrane of spermatocyte
• Plasma membrane [GO CC]
• Spermatocyte [Cell Ontology]
a plasma membrane which is part_of a spermatocyte
Sources: plasma membrane (Gene Ontology), part_of (Basic Formal Ontology), spermatocyte (Cell Ontology)
Genus: plasma membrane. Differentia: part_of a spermatocyte
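A post-composed description of this kind can be held as a small genus-plus-differentia record. The sketch below is one possible shape for it; the class and method names are hypothetical, and only the labels and source ontologies come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str
    source: str  # ontology the term is taken from


@dataclass(frozen=True)
class PostComposed:
    genus: Term      # the kind of thing being described
    relation: str    # the relation used in the differentia
    filler: Term     # the term the relation points to

    def text_definition(self):
        """Render the genus-differentia definition as a sentence fragment."""
        return f"a {self.genus.label} which is {self.relation} a {self.filler.label}"

    def class_expression(self):
        """Render an expression in the style used elsewhere in the slides."""
        return f"'{self.genus.label}' and {self.relation} some '{self.filler.label}'"


plasma_membrane = Term("plasma membrane", "GO CC")
spermatocyte = Term("spermatocyte", "Cell Ontology")
composed = PostComposed(plasma_membrane, "part_of", spermatocyte)

print(composed.text_definition())
print(composed.class_expression())
```

Because the record is built from existing terms, it can be created at annotation time without adding a new class to either source ontology.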
Many perspectives, many ontologies
[Figure labels: chemical entities, proteins, cell anatomy, cells, tissues, gross anatomy, nervous system, neural crest, phenotypes, clinical disorders, processes, physiological processes, cellular processes, reactions, development, behavior, evolutionary characters]
What kinds of anatomy ontologies exist?
Species-centric and multi-species ontologies
• Mouse: MA (adult), EMAP / EMAPA (embryonic)
• Human: FMA (adult), EHDAA2 (CS1-CS20)
• Amphibian: AAO, XAO
• Fish: ZFA (zebrafish), MFO (medaka), TAO (teleosts)
• Nematode: WBbt (C. elegans)
• Arthropod: FBbt (Drosophila), TGMA (mosquito), HAO (Hymenoptera), Arthropod anatomy ontology
• Plant ontology
Species-neutral ontologies
• CARO (Common Anatomy Reference Ontology)
• Uberon (cross-species anatomy)
• vHOG (vertebrate homologous organs)
• CL (Cell Ontology)
• GO (Gene Ontology)
Phenotype ontologies
• MP: mammalian phenotype
• HP: human phenotype
• WB: worm phenotype
Species-centric ontologies
The Zebrafish Anatomy Ontology
Used to record gene expression and phenotypes at different stages of development
Ontologies built for one species will not work for others
http://fme.biostr.washington.edu:8080/FME/index.html
http://ccm.ucdavis.edu/bcancercd/22/mouse_figure.html
Multi-species anatomy ontologies
The Plant Ontology
• Seed plants (angiosperms and gymnosperms)
• Pteridophytes (ferns and lycopods)
• Bryophytes (mosses, hornworts, and liverworts)
• Algae
Bowman et al., Cell, 2007
The challenge is in representing diversity in anatomy, morphology, life cycles, and growth patterns
Example of complexity arising from multiple species-contexts
cell
  nucleate cell
  enucleate cell
erythrocyte: placing it under nucleate cell or enucleate cell is not applicable in all contexts
Example of complexity arising from multiple species-contexts
erythrocyte
  nucleate erythrocyte (also a nucleate cell): zebrafish nucleate erythrocyte, ZFA:0009256
  enucleate erythrocyte (also an enucleate cell): human erythrocyte, FMA:81100
Cell Ontology identifiers shown: CL:0000562, CL:0000232, CL:0000592
Species ontologies are attached at the appropriate level
Developmental Biology, Scott Gilbert, 6th ed.
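Using reasoners to detect errors
Fruit fly FBbt ‘tibia’ is_a UBERON: tibia
Human FMA ‘tibia’ is_a UBERON: tibia
UBERON: tibia is_a UBERON: bone
UBERON: bone only_in_taxon Vertebrata
FBbt ‘tibia’ part_of Drosophila melanogaster
FMA ‘tibia’ part_of Homo sapiens
Homo sapiens is_a Vertebrata
Drosophila melanogaster disjoint with Vertebrata
✗ FBbt ‘tibia’ is_a UBERON: tibia is flagged as an error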
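The tibia check can be mimicked with a small script. This is a hand-rolled sketch of a taxon-constraint test, not how an OWL reasoner works internally. The relations are the ones drawn on the slide; the keys such as "FBbt:tibia" are readable labels used here for illustration, not real term identifiers, and the function names are hypothetical.

```python
# Relations from the slide, keyed by readable labels (not real identifiers).
is_a = {
    "FBbt:tibia": "UBERON:tibia",
    "FMA:tibia": "UBERON:tibia",
    "UBERON:tibia": "UBERON:bone",
}
only_in_taxon = {"UBERON:bone": "Vertebrata"}
part_of_species = {
    "FBbt:tibia": "Drosophila melanogaster",
    "FMA:tibia": "Homo sapiens",
}
taxon_is_a = {"Homo sapiens": "Vertebrata"}
disjoint = {frozenset({"Drosophila melanogaster", "Vertebrata"})}


def ancestors(node, parent_of):
    """Return the node followed by its ancestors in a single-parent map."""
    chain = [node]
    while chain[-1] in parent_of:
        chain.append(parent_of[chain[-1]])
    return chain


def taxon_violations():
    """Find classes that inherit a taxon constraint their species cannot satisfy."""
    violations = []
    for cls, species in part_of_species.items():
        lineage = ancestors(species, taxon_is_a)
        for superclass in ancestors(cls, is_a):
            required = only_in_taxon.get(superclass)
            if required is None:
                continue
            # A violation: the species (or a taxon above it) is disjoint
            # with the taxon the inherited constraint requires.
            if any(frozenset({taxon, required}) in disjoint for taxon in lineage):
                violations.append((cls, superclass, required))
    return violations


violations = taxon_violations()
print(violations)
```

Only the fruit fly class is reported, because it inherits the Vertebrata constraint from bone through the shared tibia class.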
The Gene Ontology has an anatomy ontology
Look ma, no pons!
human
zebrafish
Phenotype ontologies also have inherent anatomy
Designed primarily for annotation of phenotypes within a single species
WBbt C. elegans phenotype
Representing different levels of granularity
GO processes: lateral line development, neuromast development, hair cell development, cilium development (relations between them marked "?")
Anatomy:
cilium part_of hair cell part_of neuromast
hair cell part_of neuromast
neuromast part_of lateral line
[Figure: the lung as represented in GO, MPO, MA, FMA, and EHDAA2]
Processes (GO): respiratory gaseous exchange, respiratory system process, multicellular organismal process
Phenotypes (MPO): abnormal lung morphology, abnormal respiratory system morphology, abnormal pulmonary acinus morphology, abnormal pulmonary alveolus morphology
Anatomical terms (MA, FMA, EHDAA2): lung, lung alveolus, alveolar sac, pulmonary acinus, lower respiratory tract, respiratory system, organ system, lobular organ, parenchymatous organ, solid organ, thoracic cavity organ, thoracic cavity, pleural sac, lung bud, respiratory primordium, pharyngeal region
Relations: is_a (SubClassOf), part_of, develops_from, surrounded_by
The problem: Data Silos
How to synchronize anatomy ontologies
Three approaches:
• Mapping
• Direct reconciliation
• Synchronization using imports/MIREOT
There are issues with mappings
Class A                              Class B               In BioPortal?   Useful?
FMA extensor retinaculum of wrist    MA retina             Yes             No
FMA portion of blood                 MA blood              No              Yes
ZFA macula                           MA macula             Yes             No
ZFA aortic arch                      MA arch of aorta      Yes             Dubious
ZFA hypophysis                       MA pituitary          No              Yes
FMA tibia                            FBbt tibia            Yes             No
FMA colon                            GAZ Colón, Panama     Yes             No
PATO male                            ChEBI maleate(2-)     Yes             No
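Several rows in the table look like what purely lexical matching would produce. The sketch below uses a deliberately naive, hypothetical matcher to show how such pairs can arise; it is an assumption for illustration and is not the method behind the BioPortal mappings.

```python
import re


def tokens(label):
    """Lower-case a label and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", label.lower())


def naive_match(label_a, label_b):
    """Hypothetical matcher: any token of one label is a prefix of a token of the other."""
    a, b = tokens(label_a), tokens(label_b)
    return any(x.startswith(y) or y.startswith(x) for x in a for y in b)


# Labels taken from the table above.
pairs = [
    ("extensor retinaculum of wrist", "retina"),  # matched, but not useful
    ("male", "maleate(2-)"),                      # matched, but not useful
    ("hypophysis", "pituitary"),                  # useful, but not matched
]
results = {pair: naive_match(*pair) for pair in pairs}
for pair, matched in results.items():
    print(pair, matched)
```

String similarity alone produces false positives and misses true synonyms, which is why the slides favor logical links that a reasoner can check.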
Reconciliation and linking between TAO and ZFA
Zebrafish terms are is_a subtypes of teleost terms
Zebrafish Anatomy Ontology term is_a Teleost Anatomy Ontology term
Logic implemented via xrefs: difficult to keep synchronized
The Common Anatomy Reference Ontology
CARO is a structural classification based on granularity
From the bottom up:
• Cell component
• Cell
• Portion of tissue
• Multi-tissue structure
From the top down:
• Organism subdivision
• Anatomical system
Acellular structures
Note: CARO is being updated to be more interoperable and to include logical definitions and functional differentia
Synchronization by import across ontologies
One can import a whole ontology or just portions of another ontology
MIREOT: Minimum Information to Reference an External Ontology Term
[Figure labels: CARO, VAO, present TAO, modularized ontology]
Uberon – a multi-species ontology for phenomics and evo-devo analyses
Uberon.org
[Figure: Uberon lung example]
Uberon classes: anatomical structure, organ, organ part, respiration organ, lung, lung bud, lung primordium, respiratory primordium, foregut, endoderm, endoderm of foregut, alveolus, alveolus of lung, alveolar sac, pulmonary acinus, swim bladder
Species-specific classes: FMA:lung, MA:lung, FMA: pulmonary alveolus, MA:lung alveolus, EHDAA:lung bud
Other ontologies: GO: respiratory gaseous exchange, NCBITaxon: Mammalia, NCBITaxon: Actinopterygii
Relations: is_a (SubClassOf), is_a (taxon equivalent), part_of, develops_from, capable_of, only_in_taxon
Uberon classes generalize species-specific ones, and connect to other ontologies via a variety of relations
OntoFox: a web server for MIREOTing
Good things:
• Based on the MIREOT principle
• Web-based data input and output
• Output OWL file can be directly imported into your ontology
• No programming needed
• Programmatically accessible
Improvements:
• Integration into ontology editing tools
• More customizable
http://ontofox.hegroup.org
Proposed model moving forward
• Maintain a series of ontologies at different taxonomic levels: eukaryote, plant, metazoan, vertebrate, mollusc, arthropod, insect, mammal, human, Drosophila
• Each ontology imports/MIREOTs the relevant subset of the ontology “above” it; this is recursive
• Subtypes are only introduced as needed
• Work together on commonalities at the appropriate level above your ontology
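The recursive import idea can be sketched as follows. The level names follow the slide; the term assigned to each level is an illustrative guess, the function names are hypothetical, and the sketch imports everything from the level above, whereas the proposal is to import only the relevant subset.

```python
# Each ontology names the ontology "above" it (levels from the slide).
imports_from = {
    "human": "mammal",
    "mammal": "vertebrate",
    "vertebrate": "metazoan",
    "drosophila": "insect",
    "insect": "arthropod",
    "arthropod": "metazoan",
    "metazoan": "eukaryote",
}

# Terms introduced at each level (illustrative assignments only).
own_terms = {
    "eukaryote": {"cell"},
    "metazoan": {"muscle tissue"},
    "vertebrate": {"limb"},
    "mammal": {"mammary gland"},
    "arthropod": {"antenna"},
    "drosophila": {"mushroom body"},
}


def import_chain(name):
    """List the ontologies that are reached by following imports upward."""
    chain = []
    while name in imports_from:
        name = imports_from[name]
        chain.append(name)
    return chain


def visible_terms(name):
    """Collect an ontology's own terms plus everything it imports, recursively."""
    terms = set(own_terms.get(name, ()))
    parent = imports_from.get(name)
    if parent is not None:
        terms |= visible_terms(parent)
    return terms


print(import_chain("human"))
print(sorted(visible_terms("human")))
```

Terms defined once at a shared level, such as muscle tissue, become available to every ontology below it without being redefined.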
[Figure: ontologies at different taxonomic levels, each importing from the level above]
Levels shown: caro / uberon / all, metazoa, vertebrata, arthropoda, mollusca, teleost, amphibia, mammalia, cephalopod, zebrafish, drosophila, mouse, human
Terms shown: cell, tissue, muscle tissue, nervous system, circulatory system, mesoderm, gut, gland, gonad, appendage, skeleton, skeletal tissue, bone, larva, shell, cuticle, mesonephros, limb, vertebra, vertebral column, parietal bone, tibia, fin, trachea, respiratory airway, antenna, weberian ossicle, tibiafibula, mammary gland, foot, tentacle, mantle, brachial lobe, mushroom body, neuron types XYZ, NO pons
Legend: import; cross-ontology link (sample)
Leveraging an integrated set of ontologies
Not all classification is useful
Be practical: Build ontologies for what you need and for what can be reused
About thirty years ago there was much talk that geologists ought only to observe and not theorise; and I well remember some one saying that at this rate a man might as well go into a gravel-pit and count the pebbles and describe the colours. C. Darwin
Ontologies can help reconcile annotation inconsistencies
Semantic Similarity of Phenotypes
Washington NL, Haendel MA, Mungall CJ, Ashburner M, Westerfield M, Lewis SE. "Linking Human Diseases to Animal Models Using Ontology-Based Phenotype Annotation." PLoS Biol 7(11): e1000247. doi:10.1371/journal.pbio.1000247
FMA+PATO MP ZFA+PATO FBbt+PATO
Querying for genes in similar structures across species
[Figure: panels A to E; taxa shown: Vertebrata (tetrapod limbs), Ascidians (ampullae), Arthropoda, Annelida (parapodia), Mollusca, Echinodermata (tube feet)]
Panganiban et al., PNAS, 1997
Distal-less orthologs participate in distal-proximal pattern formation and appendage morphogenesis
Examples: mouse limb, sea urchin tube feet, ascidian ampulla, polychaete parapodia
Anatomy ontologies in 2012
• Identify key points of integration between ontologies
• Modularize based on domain or taxon
• Import and reuse rather than cross-referencing or “aligning”
• Let the reasoner help do the work
• Work together to distribute work
Reproduced with permission, Jason Freeny, http://web.mac.com/moistproduction/flash/index.html
Anatomical evidence: what is it, and why do we care about it?
What is evidence?
[Figure: evidence chain for a phenotype annotation]
OBI:Specimen: Synaptolaemus cingulatus, AMNH 91095
material_processing: cleared and stained for cartilage and bone, giving an OBI:processed specimen
OBI:imaging assay: draw prepared specimen, giving an OBI:Image (drawing about anatomical entity)
Sidlauskas and Vari, Zoological Journal of the Linnean Society, 2008, 154, 70–210
OBI: Interpreting Data (phenotypic assessment), giving an OBI:Conclusion (textual entity)
Brian, 2008, maybe in Venezuela
Phenotype (character) annotation: S. cingulatus: mesethmoid narrow
ECO:000000X Imaging assay evidence
Relations shown: is_input, is_output
Anatomical evidence is cumulative and synergistic
Character states:
• Synaptolaemus cingulatus, AMNH 91095: mesethmoid narrow
• Schizodon fasciatus, INPA 21606: mesethmoid wide
• Caenotropus maculosus, USNM 231545: mesethmoid narrow
OBI: Interpreting Data (Brian, 2008)
Phylogeny construction using PAUP* 4.0 Beta 10
phylogeny (OBI:Conclusion)
ECO:0000071 morphological similarity evidence
ECO:0000080 phylogenetic evidence
Relations shown: is_input, is_output
The means to the end matters
Character states:
• Synaptolaemus cingulatus, AMNH 91095: mesethmoid
• Schizodon fasciatus, INPA 21606: mesethmoid wide
• Caenotropus maculosus, USNM 231545: mesethmoid narrow
OBI: Interpreting Data (Brian, 2008)
Phylogeny construction using PAUP* 4.0 Beta 10
phylogeny (OBI:Conclusion)
ECO:0000071 sequence similarity evidence
ECO:0000080 phylogenetic evidence
Relations shown: is_input, is_output
So what should one do about evidence?
• Keep in mind that as you record your phenotype data, the means by which you obtained it can matter later on
• Others may want to use your data, and they too will care
• You may find that how you know what you know depends on the means to the end
• You can work with ECO and OBI to get the terms you need for your work
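One way to act on these points is to store an evidence code with every phenotype annotation so that later users can filter on it. The sketch below is a minimal illustration: the specimens, character states, and ECO identifiers are written as they appear on the slides, but which evidence code is paired with which annotation is an assumption made here, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PhenotypeAnnotation:
    taxon: str
    specimen: str
    entity: str
    quality: str
    evidence: str  # ECO identifier, as written on the slides


# Assumption: the pairing of evidence codes with annotations is illustrative.
annotations = [
    PhenotypeAnnotation("Synaptolaemus cingulatus", "AMNH 91095",
                        "mesethmoid", "narrow", "ECO:0000071"),
    PhenotypeAnnotation("Schizodon fasciatus", "INPA 21606",
                        "mesethmoid", "wide", "ECO:0000071"),
    PhenotypeAnnotation("Caenotropus maculosus", "USNM 231545",
                        "mesethmoid", "narrow", "ECO:0000080"),
]


def supported_by(annotations, eco_id):
    """Return the annotations recorded with the given evidence code."""
    return [a for a in annotations if a.evidence == eco_id]


morphological = supported_by(annotations, "ECO:0000071")
phylogenetic = supported_by(annotations, "ECO:0000080")
print(len(morphological), [a.taxon for a in phylogenetic])
```

Keeping the evidence on each record lets someone reusing the data exclude, for example, annotations that were themselves inferred from a phylogeny.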
Acknowledgments
Jonathan Bard, Marcus Chibucos, Wasila Dahdul, Paula Mabee, Chris Mungall, David Osumi-Sutherland, Alan Ruttenberg, Erik Segerdell, Carlo Torniai, Matt Yoder, Jie Zheng, and numerous others
Larson, October 1987