Working in Real Time: Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Page 1: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Working in Real Time: Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Judith Blake, Ph.D.
Mouse Genome Informatics
The Jackson Laboratory
Bar Harbor, ME 04609

Page 2: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

24Mar2002

Mouse Genome Informatics

Objective: Facilitate the use of the mouse as a model for human biology by furthering our understanding of the relationship between genotype and phenotype.

Genotype, Expression, Phenotype

Mouse Genome Database Project (MGD)
• Genes and Gene Products
• Comparative Analysis
• Alleles and Phenotypes

Gene Expression DB Project (GXD)
• Embryonic gene expression
• Extensive experimental data

Mouse Genome Sequence Project (MGS)
• Connecting sequence & biology

Page 3: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

MGI Integration Efforts

Integrated experimental and consensus views

Mapping, molecular, alleles, expression, phenotypes
Gene to GO associations

Canonical gene and sequence
Collaborations with SWISS-PROT and LocusLink
Nomenclature standards, gene groupings

Curated mammalian orthologies used in collaborations with RatDB, NCBI and others

Index of primary literature

Share knowledge from mouse disease models with medical informatics resources

All data associations supported with evidence and citation

Page 4: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Common Issues for Model Organism Databases

Data Integration
• From Genotype to Phenotype
• Experimental and Consensus Views

Incorporation of large datasets
• Whole genome annotation pipelines
• Large scale mutagenesis projects

Computational vs. Literature-based data collection and evaluation

Data Mining…extraction of new knowledge

Page 5: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

24Mar2002 jblake-Manchester BioInform Wk

Challenges

Genotype
• Mouse and Human genome sequences
• Integrating genes/models with existing biological information
• Updates, emerging knowledge

Phenotype
• Mega-mutagenesis programs
• Phenome project / baselines
• Standard screens
• Integration of mutant information, targeted mutations, transgenes, expression arrays

Page 6: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Numbers (20 March 2002)

No. of References 70,874

No. of Genes 35,404

No. of Markers 54,834

Genes w/ NT Seq 31,386

Genes w/ AA Seq 12,875

Genes w/ Orthologs 7,051

Genes Mapped 19,058

Page 7: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Access to MGI resources

Strains and Polymorphisms
Genes and Markers
Sequences and Maps
Embryonic Expression
Mammalian Homology
mouse BLAST, molecular segments
Alleles and Phenotypes
References, AccID,

Page 8: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Enable Complex Queries

“Show me all genes with their human orthologs located between cM 5 and 7 on Chr. 3 whose gene products localize to the mitochondrial membrane and whose associated mutant phenotypes include ‘skeletal dysmorphology’”
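A query like the one quoted on this slide only works once map position, orthology, GO cellular component and phenotype classification are all attached to the same gene record. The sketch below illustrates that idea with an in-memory filter. The record layout, the field names and the four sample genes are assumptions made up for illustration; they are not the MGD schema or real data, and terms are matched as exact strings rather than through a vocabulary hierarchy.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class Gene:
    """One integrated gene record. Field names are hypothetical, not the MGD schema."""
    symbol: str
    chromosome: str
    cm: float                         # genetic map position in centimorgans
    human_ortholog: Optional[str]     # None when no ortholog is curated
    go_components: FrozenSet[str]     # cellular component annotations
    phenotype_terms: FrozenSet[str]   # phenotype classifications of mutant alleles


def complex_query(genes: List[Gene], chromosome: str, cm_min: float, cm_max: float,
                  component: str, phenotype: str) -> List[Gene]:
    """Keep only the genes that satisfy every criterion of the example question."""
    return [
        g for g in genes
        if g.human_ortholog is not None
        and g.chromosome == chromosome
        and cm_min <= g.cm <= cm_max
        and component in g.go_components
        and phenotype in g.phenotype_terms
    ]


# Made-up records, for illustration only.
MITO = frozenset({"mitochondrial membrane"})
SKEL = frozenset({"skeletal dysmorphology"})
genes = [
    Gene("GeneA", "3", 6.0, "GENEA", MITO, SKEL),         # meets every criterion
    Gene("GeneB", "3", 6.5, None, MITO, SKEL),            # no human ortholog
    Gene("GeneC", "3", 12.0, "GENEC", MITO, SKEL),        # outside the 5 to 7 cM window
    Gene("GeneD", "3", 5.5, "GENED", frozenset(), SKEL),  # no matching component
]

hits = complex_query(genes, "3", 5.0, 7.0,
                     "mitochondrial membrane", "skeletal dysmorphology")
print([g.symbol for g in hits])  # prints ['GeneA']
```

Each criterion maps to one controlled field, which is why shared vocabularies matter here: a free-text phenotype description could not be matched this way.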

Page 9: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


GO annotations

Gene detail page in MGD for the vitamin D receptor gene, Vdr

Page 10: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Orthologs of Vdr

Sets of Orthologs

Data associations supported by evidence and citation

Page 11: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Multiple Keyword Sets

Gene/Marker Type
Allele Type
Assay Type
• Expression
• Mapping
Molecular Mutation
Inheritance Mode
Nomenclature
Evidence Codes
Tissue
Cell Lines
Units
• Cytogenetic
• Molecular
ES Cell Line
Strain

Page 12: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Controlled Vocabularies for Describing Alleles

Allele Query Form

Page 13: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Structured Vocabularies and Ontologies

Anatomy
GO:
• Molecular function
• Biological process
• Cellular component
Phenotypes
Disease Models

Page 14: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Anatomical Dictionary Theiler stage 10 (7 dpc)

Stage 10
* embryo
• cavity
• proamniotic canal
• embryonic component
• ectoderm
• primitive ectoderm
• primitive streak
• node
• endoderm
• primitive endoderm
• mesoderm
• nervous system
• neural ectoderm
* extraembryonic tissue
• allantois
• amniotic fold
• anterior amniotic fold
• ectoderm
• mesoderm
• posterior amniotic fold
• ectoderm
• mesoderm
• cavity
• amniotic cavity
• ectoplacental cavity
• exocoelomic cavity
• proamniotic canal
• extraembryonic component
• yolk sac cavity
• ectoderm
• endoderm
• parietal endoderm
• visceral endoderm
• mesoderm
• primordial germ cells
• Reichert's membrane
• trophectoderm
• mural trophectoderm
• primary trophoblast giant cells
• polar trophectoderm
• cytotrophoblast
• ectoplacental cone
• syncytiotrophoblast
• yolk sac
• endoderm

http://genex.hgu.mrc.ac.uk/Databases/Anatomy/

Collaboration with MRC / Edinburgh 3D-Atlas project

Page 15: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Links between anatomical structures at successive stages of mouse development enable the analysis of differentiation pathways
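One way to picture these stage-to-stage links is as a directed graph whose nodes are (stage, structure) pairs. The sketch below is a minimal illustration under stated assumptions: the stage labels "stage A" to "stage C" are placeholders rather than real Theiler stage numbers, the `develops_into` table and the `link` and `pathways` helpers are hypothetical names, and the three structures are borrowed from the heart entries of the consolidated anatomical dictionary purely as sample data.

```python
from collections import defaultdict

# (stage, structure) -> list of (stage, structure) nodes it gives rise to.
# Stage labels are placeholders, not real Theiler stage numbers.
develops_into = defaultdict(list)


def link(earlier, later):
    """Record that a structure at one stage gives rise to a structure at a later stage."""
    develops_into[earlier].append(later)


def pathways(start):
    """Return every path from `start` to a structure with no recorded successor.

    Assumes links always point to a later stage, so the graph has no cycles.
    """
    successors = develops_into.get(start, [])
    if not successors:
        return [[start]]
    return [[start] + rest for nxt in successors for rest in pathways(nxt)]


link(("stage A", "cardiogenic plate"), ("stage B", "primitive heart tube"))
link(("stage B", "primitive heart tube"), ("stage C", "heart"))

for path in pathways(("stage A", "cardiogenic plate")):
    print(" -> ".join(f"{name} ({stage})" for stage, name in path))
```

Expression or phenotype data indexed to any node on such a path can then be followed forward or backward through development.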

Page 16: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Alternative anatomical hierarchies

- describe and view anatomy from different anatomical, physiological, and disease perspectives (not just ‘geographical location’, but systems (circulatory) that ‘span geography’)

- integrated analysis of expression and phenotype / disease data

Page 17: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Consolidated Anatomical Dictionary

| heart
| %cardiogenic plate
| %primitive heart tube
| | <myocardium
| | <endocardium
| | <cardiac jelly
| <aortic sinus
| <atrio-ventricular canal (ependymal canal)
| <atrio-ventricular cushion tissue (bulbar cushion, ependymal cushion tissue)
| <atrium
| | %primitive atrium
| | %common atrial chamber
| | | <common atrial chamber bulbous cordis
| | | <common atrial chamber, left part
| | | | <common atrial chamber, left part, cardiac muscle (myocardium)
| | | | <common atrial chamber, left part, endocardial lining
| | | | <common atrial chamber, left part, cardiac jelly
| | | <common atrial chamber, right part
| | | | <common atrial chamber, right part, cardiac muscle (myocardium)
| | | | <common atrial chamber, right part, endocardial lining
| | | | <common atrial chamber, right part, cardiac jelly
| | <left atrium
| | | <left atrium auricular region
| | | | <left atrium auricular region cardiac muscle (myocardium)
| | | | <left atrium auricular region endocardial lining
| | | <left atrium cardiac muscle (myocardium)
| | | <left atrium endocardial lining
| | <right atrium
| | | <right atrium auricular region
| | | | <right atrium auricular region cardiac muscle (myocardium)
| | | | <right atrium auricular region endocardial lining
| | | <right atrium cardiac muscle (myocardium)
| | | <right atrium endocardial lining
| | | <right atrium valve
| | | | %right atrium venous valve
| | <interatrial septum
| | | <foramen ovale
| | | <septum primum
| | | | <foramen primum (ostium primum)
| | | | <foramen secundum (ostium secundum)
| | | <septum secundum
| <endocardial tissue
| | <endocardial cushion tissue (bulbar cushion)
| | <bulboventricular groove
| | <bulbus cordis
| | | <bulbus cordis caudal half (myocardium)
| | | | <bulbus cordis caudal half cardiac muscle (myocardium)
| | | | <bulbus cordis caudal half endocardial lining
| | | | <bulbus cordis caudal half cardiac jelly
| | | <bulbus cordis rostral half (conotruncus)
| | | | <bulbus cordis rostral half cardiac muscle (myocardium)
| | | | <bulbus cordis rostral half endocardial lining
| | | | <bulbus cordis rostral half cardiac jelly
| <heart mesentery
| | <dorsal mesocardium (dorsal mesentery of heart)
| | | <dorsal mesocardium transverse pericardial sinus
| <outflow tract
| | <outflow tract aortic component
| | <outflow tract aortico-pulmonary spiral septum
| | | <outflow tract future ascending aorta
| | <outflow tract pulmonary component

94 lines

Page 18: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Page 19: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Biol. Process

Anatomy

Phenotype

Gene expression

Page 20: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Data integration depends on indexing to defined sets of objects.

Speaking the same language
• ‘Development’
• ‘Heart’

Comparisons between model organisms

Beyond mouse

From The Heart by Margaret Kirby in “Embryos, Genes and Birth Defects”. Edited by Peter Thorogood

Mouse Heart Development

Page 21: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


http://www.geneontology.org

Page 22: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Goals of the Consortium

Develop structured vocabularies (ontologies)
• Unique ID, Definition, Defined relationships

Annotate genes / gene products to vocabularies
• Evidence and citation

Support common data resource for integrated queries across multiple organisms
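The first two goals translate fairly directly into a data structure: a term carries a unique ID, a definition and typed links to its parents, and an annotation ties a gene product to a term only when evidence and a citation are supplied. The sketch below is one possible shape for that, not the Consortium's actual schema. The class and function names, the `EX:` identifiers, the placeholder definitions, the gene symbol and the `REF:0001` citation are all assumptions for illustration; `IDA` (Inferred from Direct Assay) is used as an example of a GO evidence code.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Term:
    """A vocabulary term: unique ID, name, definition, and defined relationships."""
    term_id: str
    name: str
    definition: str
    # Each entry is (relationship type, parent term ID), e.g. ("is_a", "EX:0001").
    parents: List[Tuple[str, str]] = field(default_factory=list)


@dataclass(frozen=True)
class Annotation:
    """A gene-to-term association, always carrying evidence and a citation."""
    gene_symbol: str
    term_id: str
    evidence_code: str
    citation: str


def annotate(terms: Dict[str, Term], gene_symbol: str, term_id: str,
             evidence_code: str, citation: str) -> Annotation:
    """Create an annotation, refusing unknown terms and unsupported assertions."""
    if term_id not in terms:
        raise KeyError(f"unknown term: {term_id}")
    if not evidence_code or not citation:
        raise ValueError("an annotation needs both evidence and a citation")
    return Annotation(gene_symbol, term_id, evidence_code, citation)


# Placeholder IDs and definitions, for illustration only.
terms = {
    "EX:0001": Term("EX:0001", "membrane", "Placeholder definition."),
    "EX:0002": Term("EX:0002", "mitochondrial membrane", "Placeholder definition.",
                    parents=[("is_a", "EX:0001")]),
}

ann = annotate(terms, "GeneA", "EX:0002", "IDA", "REF:0001")
print(ann)
```

Rejecting an annotation that lacks evidence or a citation at creation time is a design choice in this sketch; it mirrors the slide's requirement that every association be supported.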

Page 23: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Opens browser

Page 24: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Search returns children

Page 25: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Returns annotated terms

Page 26: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


First-Pass Phenotype Set

Page 27: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype



Query: genes with mutants classified with term ‘eye dysmorphology’
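When phenotype terms sit in a hierarchy, a query on a term is usually expected to return genes classified with that term or with any more specific term beneath it. The sketch below shows that expansion under stated assumptions: the `children` table, the helper names, the child term labelled "hypothetical" and the gene symbols are invented for illustration, and only ‘eye dysmorphology’ and ‘skeletal dysmorphology’ come from the slides.

```python
def descendants(term, children):
    """Collect every term reachable below `term` in the hierarchy."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for child in children.get(current, []):
            if child not in seen:   # guards against terms with more than one parent
                seen.add(child)
                stack.append(child)
    return seen


def genes_with_phenotype(term, children, mutant_annotations):
    """Return genes whose mutants are classified with `term` or any descendant term."""
    wanted = {term} | descendants(term, children)
    return sorted({gene for gene, t in mutant_annotations if t in wanted})


# Hypothetical hierarchy: parent term -> more specific terms.
children = {
    "dysmorphology": ["eye dysmorphology", "skeletal dysmorphology"],
    "eye dysmorphology": ["eye dysmorphology: child term (hypothetical)"],
}

# Made-up (gene, phenotype term) pairs.
mutant_annotations = [
    ("GeneA", "eye dysmorphology"),
    ("GeneB", "eye dysmorphology: child term (hypothetical)"),
    ("GeneC", "skeletal dysmorphology"),
]

print(genes_with_phenotype("eye dysmorphology", children, mutant_annotations))
# prints ['GeneA', 'GeneB']
```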

Page 28: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Page 29: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Genotype/Phenotype

Classification Term                  Ref  Allele Pair 1     Allele Pair 2  Background
Growth/weight abnormality postnatal  1    ApcTm1Rfc/Tm1Rfc                 B6;129F2
Survival: postnatal lethality        1    ApcTm1Rfc/Tm1Rfc                 B6;129F2
Reproductive system: dysmorphology   1    ApcTm1Rfc/Tm1Rfc                 B6;129F2

A genotype consists of zero, one or more allele pairs on a defined genetic background. The genetic background may be an inbred strain, or it may be unknown.
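The definition above can be written down as a small data model: a genotype holds any number of allele pairs (including none) plus a background that may be unknown, and each phenotype classification is attached to the whole genotype together with its reference. The sketch below is an illustration only. The class and field names are assumptions, the allele symbols are placeholders, and `B6;129F2` is simply the background string shown on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class AllelePair:
    allele_1: str
    allele_2: str

    @property
    def is_homozygous(self) -> bool:
        return self.allele_1 == self.allele_2


@dataclass(frozen=True)
class Genotype:
    """Zero, one or more allele pairs on a genetic background."""
    allele_pairs: Tuple[AllelePair, ...] = ()
    background: Optional[str] = None   # None stands for an unknown background


@dataclass(frozen=True)
class PhenotypeAnnotation:
    """A phenotype classification term attached to a genotype, with its reference."""
    genotype: Genotype
    classification_term: str
    reference: str


# Placeholder allele symbols; the background string is the one shown on the slide.
pair = AllelePair("GeneX-m1", "GeneX-m1")
mutant = Genotype(allele_pairs=(pair,), background="B6;129F2")
wild_type_unknown_background = Genotype()

annotations = [
    PhenotypeAnnotation(mutant, "Survival: postnatal lethality", "1"),
    PhenotypeAnnotation(mutant, "Reproductive system: dysmorphology", "1"),
]
print(len(annotations), pair.is_homozygous, wild_type_unknown_background.background)
```

Annotating the genotype rather than a single allele keeps the background in the record, so the same allele pair on a different strain is a different object.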

Page 30: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Some Definitions

Trait: measurable characteristic of individual or population
• Blood pressure, coat color, % body fat
• May be associated with anatomical structure, e.g., an immune response with its site of action

Phenotype: name for a group of traits, syndrome, condition
• e.g., type II diabetes, obesity, lymphocytic leukemia

Page 31: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

A phenotype can be characterized by many traits, and a trait can help characterize many phenotypes

Phenotype a Phenotype b Phenotype c

Trait 1 Trait 2 ….. Trait n

Leprdb-3J/Leprdb-3J
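The many-to-many relationship between phenotypes and traits can be stored in one direction and inverted on demand. The sketch below uses the slide's own placeholder labels (Phenotype a to c, Trait 1 to 3); which trait belongs to which phenotype is made up for illustration, and `invert` is a hypothetical helper name.

```python
from collections import defaultdict

# Made-up assignments using the slide's placeholder labels.
phenotype_to_traits = {
    "Phenotype a": {"Trait 1", "Trait 2"},
    "Phenotype b": {"Trait 2", "Trait 3"},
    "Phenotype c": {"Trait 1", "Trait 3"},
}


def invert(mapping):
    """Turn a key -> set-of-values mapping into a value -> set-of-keys mapping."""
    inverse = defaultdict(set)
    for key, values in mapping.items():
        for value in values:
            inverse[value].add(key)
    return dict(inverse)


trait_to_phenotypes = invert(phenotype_to_traits)
print(sorted(trait_to_phenotypes["Trait 2"]))  # prints ['Phenotype a', 'Phenotype b']
```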

Page 32: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Developing structured descriptors for traits

• Use existing and develop new controlled vocabularies that cover orthogonal concepts
• Combine terms from these vocabularies to describe traits
• Assign phenotype (disease) terms for nomenclature ease

Joel Richardson, Michael Ashburner, Martin Ringwald

Page 33: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Concept Examples

System: Immune system, cardiovascular system
Tissue: heart, lung, liver, eye, skin
Cell type: epithelial, fibroblast, myoblast, melanocyte
Age: E15, P25
Biol. Process: apoptosis, growth, cell differentiation, behavior
Metabolite: Glucose, Calcium
Qualifier: abnormal, absent, enlarged, increased, disrupted

DCS = dolichostenomelia = disproportionally long limbs, due to long bone overgrowth
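The concept axes listed above can be combined into a structured trait descriptor by picking at most one term per axis and checking each term against its own vocabulary. The sketch below is a minimal illustration of that idea, not the authors' implementation. The axis keys, the lowercase spellings and the `describe_trait` function are assumptions; the terms themselves are the examples from the slide.

```python
# One small controlled vocabulary per orthogonal concept (terms from the slide).
VOCABULARIES = {
    "system": {"immune system", "cardiovascular system"},
    "tissue": {"heart", "lung", "liver", "eye", "skin"},
    "cell_type": {"epithelial", "fibroblast", "myoblast", "melanocyte"},
    "age": {"E15", "P25"},
    "process": {"apoptosis", "growth", "cell differentiation", "behavior"},
    "metabolite": {"glucose", "calcium"},
    "qualifier": {"abnormal", "absent", "enlarged", "increased", "disrupted"},
}


def describe_trait(**slots):
    """Build a trait descriptor from one validated term per chosen axis."""
    for axis, value in slots.items():
        if axis not in VOCABULARIES:
            raise KeyError(f"unknown concept axis: {axis}")
        if value not in VOCABULARIES[axis]:
            raise ValueError(f"{value!r} is not a term in the {axis} vocabulary")
    # Sorting by axis name makes equal descriptors compare equal.
    return tuple(sorted(slots.items()))


enlarged_heart = describe_trait(tissue="heart", qualifier="enlarged", age="E15")
print(enlarged_heart)
# prints (('age', 'E15'), ('qualifier', 'enlarged'), ('tissue', 'heart'))
```

Because each axis is validated separately, new traits can be described without adding a new precomposed term for every combination.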

Page 34: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Relationships of Mouse Models to Human Diseases

• Mouse gene ortholog, same mutation
– Same phenotype
– Different phenotype

• Mouse gene ortholog, different or unknown mutations
– Same or different phenotypes

• Mouse phenotype same as human
– Mouse gene ortholog
– Another mouse gene
– Gene unknown

• Mouse phenotype similar
– Unknown genetic component
– Gene same or different

Page 35: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Relationship to human genes and disease

Page 36: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Page 37: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Goal: Query Mouse Data by Human Disease

Test Results

1676 disease listings in OMIM
• 382 have phenotype reports

3187 annotated mouse/human orthologs
– 958 correspond to OMIM entries
• 305 have phenotype reports

8535 listings in MESH disease tree
• 709 correspond to orthologs
• 237 have phenotype reports

Page 38: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Summary

Integration
• Requires both manual and computational approaches
• Attention to data modeling, object identity, data migration issues

Ontologies and standardized vocabularies
• Integral component of integration effort
• Essential for extracting knowledge

Parallel development
• ontology representations
• data acquisition and integration efforts

Page 39: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype

Acknowledgments - MGI

Carol Bult, Ben King, Richard Baldarelli

Dirck Bradt, Sridhar Ramachandran, Deborah Reed, Diane Dahman, Sophia Zhu, Donnie Qi, LongLong Yang

Pat Grant, Nancy Butler

Janan Eppig, Joel Richardson

Martin Ringwald, Jim Kadin

Lois Maltais, Louise McKenzie, Harold Drabkin, Tom Weigers, Jon Beal, Lori Corbani, Cathy Lutz, Cynthia Smith, Teresa Chu, Sharon Cousins, Donna Burkart, Ira Lu, Li Ni, Carroll Goldsmith, Moyha Lennon-Pierce, Antonio Planchart

David Hill, Dale Begley, Terry Hayamizu, Ingeborg McCright, Connie Smith

Matt, Mike, Leslie, Jeff, Prita, Jill, Diane, DebbieK, Dieter, Lucette, Janice,

www.informatics.jax.org

Page 40: Working in Real Time:  Building Ontologies While Annotating the Mouse from Genotype to Phenotype


Mouse Genome Informatics

http://www.informatics.jax.org

Gene Ontology

http://www.geneontology.org