Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes
![Page 1: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/1.jpg)
Ontologies and vocabularies supporting data integration:
emphasis on mouse phenotypes and disease models
Control: C3H/HeJ
Homozygous: Faslgld/Faslgld
The mouse generalized lymphoproliferative disease (gld) mutation in the FAS ligand (TNF superfamily, member 6) gene.
These mice model human Autoimmune Lymphoproliferative Syndrome (ALPS), type IB.
Janan T. Eppig
PATO Meeting, Dec. 2006
![Page 2: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/2.jpg)
The genetic tools for mouse provide an ideal platform for experimentation:
• genetic engineering techniques to specifically manipulate the genome
• sequenced genome
• inbred strains
• high resolution maps
• mammal: small, easy to breed and maintain, short lifespan
• similar to human genetically & physiologically
• human disease model
• ES cell lines
![Page 3: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/3.jpg)
Achondroplasia
• short domed skull
• short-limbed dwarfism
• malocclusion
• bulging abdomen as adults
• respiratory problems
• shortened lifespan
Homozygous achondroplasia mouse mutant and control
…facilitating the use of the mouse as a model for human biology by providing integrated access to data on the genetics, genomics, and biology of the laboratory mouse.
www.informatics.jax.org
![Page 4: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/4.jpg)
Objective
…make phenotype and disease model data robust and accessible to researchers and computational biologists
• semantically consistent search methods
• integrated access to all phenotypic variation sources (single-gene and genomic mutations, QTLs, strains)
• ability to query across sequence, orthology, expression, function, phenotype, disease
• data on human disease correlation
• access to mouse models from various approaches
  - Genetic
  - Phenotypic
  - Computational
![Page 5: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/5.jpg)
Existing Wealth of Mouse Phenotype Data in MGI
• >16,800 phenotypic alleles representing ≈6,830 unique genes
• >71,000 annotations associating MP terms to genotypes
• >6,550 phenotype records for 3,210 QTL
• >9,000 strains catalogued
![Page 6: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/6.jpg)
A few of the challenges
• alleles can produce pleiotropic phenotypic effects
• non-allelic mutations can produce indistinguishable phenotypes
• modifiers and epistasis can influence mutant phenotypes
• alleles of different genes can interact to produce unique phenotypes
• genetic background can greatly influence mutant phenotypes
• imprinted genes/alleles influence phenotype
• quantitative trait loci (QTLs) can contribute unequally to phenotypes
• genomic mutations can delete or disrupt multiple genes
• strains (“whole-genome”) have characteristic phenotypes
• complex genetically engineered and multiple mutation stocks are often developed for disease models
• environmental influences and age can dramatically affect phenotype
![Page 7: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/7.jpg)
Data Challenge
Mouse phenotype data from
• publications
• electronic submissions
• mutagenesis (ENU centers)
(≈ 300 new alleles; ≈ 700 publications per month on phenotypes)
New initiatives to knock out every gene in the mouse in the next 5 years…
Need for efficiency, accuracy, full description of complex observations, storage/analysis of individual and population data
![Page 8: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/8.jpg)
Making semantic sense
Controlled vocabularies/nomenclatures
• Strains
• Genes
• Alleles (phenotypic or variant)
• Classes of genetic markers
• Types of mutations
• Types of assays
• Developmental stages
• Tissues
• Clone libraries
• ES cell lines
….. organized as lists or simple hierarchies
![Page 9: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/9.jpg)
Clone Library Names
Inbred Strain Names
Gene Symbols
![Page 10: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/10.jpg)
Hbp1 (high mobility group box transcription factor 1) gene expression differences in KitW-e/KitW-e homozygotes vs wild-type
Assay
Gene nomenclature
Results
Specimen
![Page 11: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/11.jpg)
Semantics plus relationship data
Ontologies/structured vocabularies
• Gene Ontology (GO)
  • Molecular function
  • Biological process
  • Cellular component
• Mouse Anatomy (MA)
  • Embryonic
  • Adult
• Mammalian Phenotype (MP)
• Sequence Ontology (SO)
….. organized as directed acyclic graphs (DAGs)
![Page 12: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/12.jpg)
Navigating the views of phenotypes & disease
1. Gene Page
2. Phenotype Query
3. MP Ontology
4. Disease vocabulary
5. Sequence (GBrowse)
Summary: phenotype classes & human disease associated
Summary: genotype, MP term, & ref
Phenotype detail, including genotypes for mouse models of human diseases
Human/mouse disease relationships
![Page 13: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/13.jpg)
| Phenotype | Alleles | Genetic background | Observation |
| --- | --- | --- | --- |
| enlarged brain ventricles | L1camtm1Mtei/Y | 129/SvEv | none affected |
| | | C57BL/6J | high percentage affected |
| postnatal death | Gnastm1Kel-pat/Gnas+ | 129/Sv * C57BL/6J | most die by P2; all by P9 |
| | | 129/Sv * C57BL/6J * CD-1 | most die by P9; 10-20% survive past P21 |
| TMEV viral susceptibility | Cd8atm1Mak/Cd8atm1Mak | C57BL/6 | inflammation after infection resolves by 45 days; disease is absent by 10 mo. |
| | | PL/J | viral infection persists |
Genotype = allele combinations carried in the context of a specific genetic background (strain)
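The definition above can be read as a small data structure. The following is only a sketch of one way to model it; the class name `Genotype`, its fields, and the helper function are illustrative assumptions, not MGI's actual schema. The two observations are taken from the L1cam example on this slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Genotype:
    """Hypothetical model: allele combination plus genetic background."""
    alleles: Tuple[str, ...]  # allele pair as written on the slide
    background: str           # strain the alleles are carried on


# Observations keyed by genotype (values copied from the slide).
observations = {
    Genotype(("L1camtm1Mtei", "Y"), "129/SvEv"): "none affected",
    Genotype(("L1camtm1Mtei", "Y"), "C57BL/6J"): "high percentage affected",
}


def same_alleles_different_background(a: Genotype, b: Genotype) -> bool:
    """True when two genotypes differ only in their strain background."""
    return a.alleles == b.alleles and a.background != b.background
```

Because the background is part of the key, the same allele pair on two strains yields two distinct genotypes, which is what lets the differing outcomes be recorded separately.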
![Page 14: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/14.jpg)
Mammalian Phenotype Ontology
• Structured as a DAG
• Over 4,500 terms covering physiological systems, behavior, development and survival
• Available in browser and OBO formats from the MGI FTP and OBO sites
• Each term linked to all annotations to the term or its children
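As an illustration of what the OBO format mentioned above looks like to a program, here is a minimal sketch of reading `[Term]` stanzas. It handles only the `id`, `name`, and `is_a` tags, and the sample text uses made-up `EX:` identifiers rather than real MP terms; the function name `parse_obo_terms` is likewise an assumption for this example.

```python
def parse_obo_terms(text):
    """Return {term_id: {"name": str, "is_a": [parent_ids]}} from OBO text."""
    terms = {}
    current = None
    in_term = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("["):
            # A new stanza begins; only [Term] stanzas are collected.
            in_term = line == "[Term]"
            current = None
            continue
        if not in_term or not line or line.startswith("!"):
            continue
        tag, _, value = line.partition(":")
        value = value.strip()
        if tag == "id":
            current = {"name": "", "is_a": []}
            terms[value] = current
        elif current is not None and tag == "name":
            current["name"] = value
        elif current is not None and tag == "is_a":
            # Drop the trailing "! parent name" comment.
            current["is_a"].append(value.split("!")[0].strip())
    return terms


# Hypothetical sample; identifiers and names are invented for illustration.
SAMPLE = """\
format-version: 1.2

[Term]
id: EX:0000001
name: root phenotype

[Term]
id: EX:0000002
name: respiratory phenotype
is_a: EX:0000001 ! root phenotype

[Typedef]
id: part_of
name: part of
"""

terms = parse_obo_terms(SAMPLE)
```

The `is_a` lists are the edges of the DAG: each term names its parents, and a term may name more than one.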
![Page 15: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/15.jpg)
Summary Results
• Genotypes that are annotated to a term or children of the term
• References supporting annotation
• Links to allele detail pages for full mutant phenotype
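The "term or children of the term" behaviour can be sketched as a walk over the DAG. This is an assumed, simplified illustration: the hierarchy, the term names used as keys, and the genotype labels are all hypothetical, and the real MGI query is not shown in the slides.

```python
from collections import defaultdict

# Hypothetical child -> parents edges; a DAG term may have several parents.
IS_A = {
    "abnormal respiration": ["respiratory phenotype"],
    "respiratory distress": ["abnormal respiration"],
    "respiratory failure": ["abnormal respiration", "lethality"],
}

# Hypothetical direct annotations: term -> genotypes annotated to it.
ANNOTATIONS = {
    "abnormal respiration": ["genotype-1", "genotype-3"],
    "respiratory distress": ["genotype-2"],
    "respiratory failure": ["genotype-1"],
}


def descendants(term, parents_of):
    """Collect every term reachable below `term` in the DAG."""
    children_of = defaultdict(set)
    for child, parents in parents_of.items():
        for parent in parents:
            children_of[parent].add(child)
    seen, stack = set(), [term]
    while stack:
        node = stack.pop()
        for child in children_of[node]:
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


def genotypes_for(term, parents_of, annotations):
    """Genotypes annotated to `term` or to any of its descendants."""
    scope = {term} | descendants(term, parents_of)
    return sorted({g for t in scope for g in annotations.get(t, ())})
```

Querying a high-level term returns the union of everything annotated beneath it, and a genotype reached by more than one path is reported once because results are collected in a set.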
![Page 16: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/16.jpg)
Allele Detail Page
• full phenotype annotations (MP) for each genotype
• specific detail for MP terms
• each MP annotation referenced
• human diseases for which genotype is used as a model
![Page 17: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/17.jpg)
![Page 18: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/18.jpg)
Mouse model genotypes linked to phenotype details
Genes associated with phenotypes characteristic of a disease in human, mouse, or both
![Page 19: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/19.jpg)
osteopetrosis
Human-mouse disease relationships
• OMIM terms: 6,113
• Genotypes associated w/ OMIM: 1,847
• OMIM associated w/ genotypes: 720
to Human Disease and Mouse Model Page
![Page 20: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/20.jpg)
Vocabularies in MGI
DAGs
Definition
Synonyms
MP:1956
Strain: AEJ
Alleles: bd/bd
Genotype
Strain: C57BL/6
Alleles: Ppp1r3atm1Adpt/Ppp1r3atm1Adpt
Terms
…
Respiratory failure
Postnatal lethality
Dilated renal tubules
Growth retardation
Vocabulary
Note
…
J:65378 TAS
J:62648 IDA
J:65322 EE
Annotations
![Page 21: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/21.jpg)
Making Mammalian Phenotype Ontology Work
DAG
• accommodate bio-specific terms
• computationally useful
• human accessible
• practical for curation
• cross-reference to other ontologies
![Page 22: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/22.jpg)
Terms in MP

| MP term | Entity | Quality | Other Info |
| --- | --- | --- | --- |
| microphthalmia | eye | small size | |
| hydrocephaly | cerebro-spinal fluid | increased, excessive | |
| | brain | large size (dilated) | |
| | brain | | trauma observed |
| increased blood pressure | ? | increased | |
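The entity/quality breakdown of MP terms shown on this slide can be sketched as data. The structure below is an assumption made for illustration (the names `EQ`, `DECOMPOSITION`, and both helper functions are hypothetical); the example rows follow the slide, with `None` standing in for the "?" entity of increased blood pressure.

```python
from typing import NamedTuple, Optional


class EQ(NamedTuple):
    """One entity/quality pair; entity is None where the slide shows '?'."""
    entity: Optional[str]
    quality: str


# A term may decompose into more than one entity/quality pair.
DECOMPOSITION = {
    "microphthalmia": [EQ("eye", "small size")],
    "hydrocephaly": [
        EQ("cerebro-spinal fluid", "increased"),
        EQ("brain", "large size"),
    ],
    "increased blood pressure": [EQ(None, "increased")],
}


def terms_needing_entity(decomposition):
    """MP terms with at least one pair whose entity is still unresolved."""
    return sorted(
        term
        for term, pairs in decomposition.items()
        if any(pair.entity is None for pair in pairs)
    )


def terms_with_quality(decomposition, quality):
    """MP terms that use the given quality in any of their pairs."""
    return sorted(
        term
        for term, pairs in decomposition.items()
        if any(pair.quality == quality for pair in pairs)
    )
```

Keeping a list of pairs per term reflects the hydrocephaly case, where a single pre-composed term maps onto several candidate entity/quality descriptions.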
![Page 23: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/23.jpg)
Future MP Ontology Development
• New terms from ongoing curation process
• Collaborative community efforts
  • identify new terms
  • revise organization of existing terms within particular branches
• Recruit domain experts for systematic review
• Cross-ref and comparison to other relevant ontologies (GO, Anatomy, Cell Type, Mpath, etc.)
[Chart: MP Ontology Growth; number of terms (axis 0 to 4,500) by year, 2003 to 2006]
![Page 24: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/24.jpg)
Collaborators
…currently annotating with MP and contributing to MP development
• Rat Genome Database (RGD)
• Mouse Mutagenesis Centers
• Human (NCBI/dbSNP)
• Online Mendelian Inheritance in Animals (OMIA)
…under discussion
• Teratology Society
• Animal Traits
![Page 25: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/25.jpg)
Summary
• Structured vocabularies and ontologies support semantic integration for the MGI system and promote broader integration of mouse knowledge
• To meet community needs, practical implementations parallel formal ontological development
• MGI has implemented a generalized structure for vocabularies and ontologies
• The Mouse Genome Informatics group continues its strong interest and participation in community bio-ontology efforts
![Page 26: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/26.jpg)
Human FOXN1
forkhead box N1
T-CELL IMMUNODEFICIENCY, CONGENITAL ALOPECIA, AND NAIL DYSTROPHY
Frank J, et al. Nature 398, 473-474 (1999)
Mouse Foxn1
Homozygous “nude” mouse. One of eight known phenotypic mutations in mouse (6 spontaneous; 2 engineered) for the forkhead box N1 gene.
www.informatics.jax.org
![Page 27: Ontologies and vocabularies supporting data integration: emphasis on mouse phenotypes](https://reader036.fdocuments.net/reader036/viewer/2022070418/56815966550346895dc6a494/html5/thumbnails/27.jpg)