Interoperability of large scale image data sets
from different biological scales
BioMedBridges Annual General Meeting/Mid-term Review 11 March 2014, Florence
Jan Ellenberg (WP Leader) and Tanja Ninkovic
on behalf of use case partners Gabriella Rustici, Simon Jupp, Frauke Neff and Johan Lundin
Collaborating BMS RIs
Scientific problem
Cell imaging: ✓ Cellular phenotype ✓ Genetic information ✓ Molecular mechanism
Mouse imaging: ✓ Tissue phenotype ✓ Genetic information
Human imaging: ✓ Tissue phenotype
By linking these three types of data sets, we can better understand diseases and predict novel drug targets and biomarkers.
Genome
Make data interoperable
Predict disease gene/biomarker
Human Disease
Cell Gene knockdown
Matching phenotypes in cells and tissues
Cell line – gene knockdown
Human cancer tissue
State of the art: finding a match by chance
Prometaphase Metaphase Anaphase Graped micronucleus
To compare and integrate image data we need interoperable standards
Sample, Assay, Images
Different file formats: Zeiss LSM, Leica LIF, DeltaVision DV, Volocity MVD2, Olympus OIB, Olympus OIF, OME-TIFF, JPEG, HDF5, PNG, NDP
Different image metadata
No consistent phenotype annotation/ontology
Automated comparative analysis of image data sets was impossible
To compare and integrate image data we need interoperable standards
Different file formats
• Inventory of image file formats
• Defined standard tools for interconversion
Different image metadata
• Inventory of image metadata formats
• Defined standard tools for interconversion
No consistent phenotype annotation/ontology
• ?
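An inventory of image file formats could start from something as simple as the following Python sketch, which counts the files of each format in a data set by file extension. The format names come from the slides; the extensions mapped to them, and the function names `detect_format` and `inventory`, are assumptions made for illustration. NDP is left out because the slides do not say which extension it uses.

```python
from collections import Counter
from pathlib import Path

# Extension -> format name. The format names are taken from the slides;
# the extensions are the conventional ones and are an assumption here.
FORMAT_BY_EXTENSION = {
    ".lsm": "Zeiss LSM",
    ".lif": "Leica LIF",
    ".dv": "DeltaVision DV",
    ".mvd2": "Volocity MVD2",
    ".oib": "Olympus OIB",
    ".oif": "Olympus OIF",
    ".ome.tiff": "OME-TIFF",
    ".ome.tif": "OME-TIFF",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
    ".png": "PNG",
}


def detect_format(filename: str) -> str:
    """Return the format name for a file name, or 'unknown'."""
    name = Path(filename).name.lower()
    # Try longer extensions first so '.ome.tiff' is matched as a whole.
    for ext in sorted(FORMAT_BY_EXTENSION, key=len, reverse=True):
        if name.endswith(ext):
            return FORMAT_BY_EXTENSION[ext]
    return "unknown"


def inventory(filenames):
    """Count how many files of each format a data set contains."""
    return Counter(detect_format(f) for f in filenames)
```

Such a count only says which formats are present; the interconversion itself would be left to the standard tools the work package defined.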
What ontologies are already available?
Cultured human cells: Gene Ontology, Cell cycle ontology, Cell line ontology, Cell ontology, Cell culture ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Fission Yeast Phenotype Ontology, Human Phenotype Ontology
Mouse histology tissue samples: Gene Ontology (BP), Cell ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Mammalian pathology ontology (MPATH-Pathbase), Adult Mouse Anatomy Dictionary
Human histology tissue samples: Human Phenotype Ontology, Terminologia Histologica, Terminologia Embryologica, Human Developmental Anatomy, International Classification of Diseases (ICD), SNOMED CT, BRENDA Tissue Ontology
Existing ontologies are not enough
• Existing ontologies either lack coverage of cellular-scale phenotypes or are too incomplete to describe them
• No species-neutral ontology for cellular phenotypes
• Such an ontology is needed for data interoperability
-> WP6 developed the Cellular Microscopy Phenotype Ontology (CMPO)
Building CMPO
Source ontologies and the kinds of terms drawn from each:
• Gene Ontology, Biological Process: biological processes
• Gene Ontology, Cellular Component: cellular components
• Cell type ontology (CTO): cell types
• Phenotype and trait ontology (PATO): size, temporal quality, shapes, abnormal, absent
Building CMPO: composing a phenotype description
Cellular phenotypes: entities, processes and qualities
Entity (a bearer of some quality) + Quality (a characteristic of the entity)
Examples:
• Phenotype: “Large nucleus”
  Entity: nucleus (GO_000xxxx)
  Quality: large (PATO_000xxxx)
• Phenotype: “Cells stuck in metaphase due to metaphase arrest”
  Entity: mitotic metaphase (GO_0000089)
  Quality: arrested (PATO_0000297)
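The entity-plus-quality composition on this slide can be sketched as a small data structure. This is a minimal Python illustration, not CMPO's actual representation (the ontology itself is distributed in ontology formats); the class names `Term` and `Phenotype` and the `describe` method are hypothetical. The identifiers are copied from the metaphase arrest example on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term: identifier plus human-readable label."""
    term_id: str
    label: str


@dataclass(frozen=True)
class Phenotype:
    """A phenotype composed of an entity and the quality it bears."""
    entity: Term
    quality: Term

    def describe(self) -> str:
        return (f"{self.entity.label} ({self.entity.term_id}) "
                f"has quality {self.quality.label} ({self.quality.term_id})")


# Identifiers copied from the slide's metaphase arrest example.
metaphase_arrest = Phenotype(
    entity=Term("GO_0000089", "mitotic metaphase"),
    quality=Term("PATO_0000297", "arrested"),
)
```

Because both parts are references to existing ontology terms, two data sets that use the same entity and quality identifiers describe the same phenotype, whatever free text the original authors used.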
Cellular Microscopy Phenotype Ontology (CMPO)
• Species neutral ontology
• Relating to the whole cell, cellular components, cellular processes and cell populations
• Compatible with related ontology efforts (Fission Yeast Phenotype Ontology, Ascomycete Phenotype Ontology, Mammalian Phenotype Ontology) allowing for future cross species integration of phenotypic data
• Released in October 2013
• Can be browsed at: the Ontology Lookup Service [1], BioPortal [2] and GitHub [3]
[1] http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=CMPO
[2] http://bioportal.bioontology.org/ontologies/CMPO?p=classes
[3] https://github.com/EBISPOT/CMPO
Enabling standardised data generation
Phenotator: user-friendly ontology annotation of image data
Original phenotypic description
Ontology based annotations
http://wwwdev.ebi.ac.uk/fgpt/phenotator/
CMPO term: graped micronucleus CMPO_0000156
Integrate file formats Integrate metadata
Apply phenotype ontology
Predict disease gene/biomarkers
Human Disease
Cell Gene knockdown
Ontology building using Phenotator
Annotation tool
Ontology Terms
Build the Cellular Microscopy Phenotype Ontology (CMPO)
1. Distribute tool to consortium members for phenotype annotation
2. Workshop on ontology development with WP6 partners
3. One-to-one sessions with data producers
Collect phenotype-ontology mappings provided by the users (User 1, User 2, User 3, User 4)
Future plans
Phenotator: Automation
• Semi-automated mapping of cellular phenotypes to CMPO terms
List of phenotypes provided by the user
Zooma [1] mappings to CMPO
[1] http://www.ebi.ac.uk/fgpt/zooma
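To illustrate what semi-automated mapping of free-text phenotypes to CMPO terms might involve, here is a minimal Python sketch. It does not use Zooma or reproduce its method: plain string similarity from the standard library stands in for it, and the function name `suggest_terms` and the cutoff value are assumptions. The two CMPO terms are the ones that appear in these slides.

```python
from difflib import SequenceMatcher

# Two CMPO terms that appear on these slides; a real run would load the
# full ontology instead of this hand-written subset.
CMPO_LABELS = {
    "graped micronucleus": "CMPO_0000156",
    "polylobed nucleus": "CMPO_0000157",
}


def suggest_terms(description, labels=CMPO_LABELS, cutoff=0.8):
    """Return (term_id, label, score) candidates, best match first.

    The candidates are suggestions for a curator to confirm or reject,
    which is what makes the mapping semi-automated.
    """
    text = description.strip().lower()
    scored = []
    for label, term_id in labels.items():
        score = SequenceMatcher(None, text, label).ratio()
        if score >= cutoff:
            scored.append((term_id, label, round(score, 3)))
    return sorted(scored, key=lambda candidate: candidate[2], reverse=True)
```

A description that matches nothing above the cutoff returns an empty list, which would flag it for manual annotation or for a new term request.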
Future plans
CMPO: integration in existing applications
• Widgets [1] (in collaboration with WP4)
  • deployable in existing web applications
  • autocomplete search boxes
  • ontology terms are readily available in user-facing applications
• Integrating CMPO into FIMM’s Webmicroscope Portal [2] and EMBL’s CellBase
[1] http://www.ebi.ac.uk/Tools/biojs/registry/
[2] http://biomedbridges.webmicroscope.net/
Data producers can utilise ontologies for annotating their data sets already at the data production stage
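An autocomplete search box over ontology terms, as planned for the widgets, could rest on a prefix lookup like the one in this minimal Python sketch. It is not the widget code itself; the class name `TermAutocomplete` and its `complete` method are hypothetical, and the example terms are the two CMPO terms shown in these slides.

```python
from bisect import bisect_left


class TermAutocomplete:
    """Prefix search over ontology term labels."""

    def __init__(self, labels_to_ids):
        # Sort lower-cased labels once so each lookup is a binary search.
        self._entries = sorted(
            (label.lower(), term_id) for label, term_id in labels_to_ids.items()
        )
        self._keys = [label for label, _ in self._entries]

    def complete(self, prefix, limit=10):
        """Return up to `limit` (label, term_id) pairs starting with `prefix`."""
        prefix = prefix.strip().lower()
        if not prefix:
            return []
        # All labels sharing the prefix sit next to each other from here on.
        start = bisect_left(self._keys, prefix)
        results = []
        for label, term_id in self._entries[start:]:
            if not label.startswith(prefix) or len(results) >= limit:
                break
            results.append((label, term_id))
        return results
```

Returning the term identifier together with the label lets the annotation be stored as an ontology reference rather than as free text.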
Future plans
Scientific Use Case: Correlative analysis and biomarker prediction
Annotate cellular, mouse and human image datasets using CMPO
Correlative analysis of now interoperable cell and tissue image datasets to predict novel biomarker candidates
Novel candidate biomarker prediction, with a focus on cell cycle and cell division control genes
Cellular image datasets [1]
Mouse image datasets [2]
Human image datasets [3]
Data hosted by Webmicroscope
Data hosted by EBI Cellular Phenotype Database / EMBL CellBase
[1] Mitocheck, including genetic information; www.mitocheck.org
[2] Helmholtz’s mouse lines, cancer models by GMC, PREDECT, International Mouse Phenotyping Consortium
[3] Webmicroscope cancer tissue collection
Candidate biomarker genes from cellular tumor suppressor screens have been identified:
MLL3, PAPPA, SF3B1, PRPF8, CENPE, CIT, ASPM, ESPL1, DYNC1H1, ASCC3, KIF4A
Mouse and human tissue WP6 partners are looking for and/or generating the data relevant to these genes to be used for analysis.
Correlative analysis and biomarker prediction
[Figure: image panels for cell line ASPM knockdown, mouse ASPM mutant and mouse ASPM wild type; annotated phenotype: polylobed nucleus (CMPO_0000157)]
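Once data sets from different scales carry the same CMPO identifiers, finding matching phenotypes becomes a lookup rather than a chance discovery. The Python sketch below groups annotations by gene and CMPO term and keeps the pairs seen at more than one scale. The record layout, the function name `cross_scale_matches` and the `GENE_X` record are hypothetical; the ASPM records follow what the slide appears to show (polylobed nucleus in both the cell line knockdown and the mouse mutant).

```python
from collections import defaultdict


def cross_scale_matches(annotations, min_scales=2):
    """Find (gene, CMPO term) pairs observed at several biological scales.

    `annotations` is an iterable of (scale, gene, cmpo_id) tuples.
    Returns {(gene, cmpo_id): sorted list of scales}.
    """
    scales_by_key = defaultdict(set)
    for scale, gene, cmpo_id in annotations:
        scales_by_key[(gene.upper(), cmpo_id)].add(scale)
    return {
        key: sorted(scales)
        for key, scales in scales_by_key.items()
        if len(scales) >= min_scales
    }


annotations = [
    ("cell", "ASPM", "CMPO_0000157"),    # cell line, ASPM knockdown
    ("mouse", "ASPM", "CMPO_0000157"),   # mouse, ASPM mutant
    ("cell", "GENE_X", "CMPO_0000156"),  # hypothetical placeholder record
]
```

With these records, `cross_scale_matches(annotations)` keeps only the ASPM and CMPO_0000157 pair, since it is the only one annotated at both the cell and the mouse scale.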
Making large scale image data sets from different biological scales interoperable
Timeline (the end date of each period is the deadline for the deliverable)
01/2013: Start of WP6
12/2013: Identification of standards and ontologies used for cellular/mouse/human image data sets
-> Inventory of image file formats and ontologies
-> Defined future standards
03/2013 to 12/2014: Mapping of standards and ontologies between the different image reference data sets
-> Cellular Phenotype Ontology and annotation tool
01/2015 to 12/2015: Set of predicted biomarkers
-> Predict new biomarker genes
BMS RI partners
Infrafrontier: Frauke Neff, Philipp Gormanns
Elixir: Gabriella Rustici, Simon Jupp
BBMRI: Johan Lundin, Mikael Lundin
Euro-BioImaging: Jan Ellenberg, Jean-Karim Heriche, Tanja Ninkovic, Wolfgang Huber
Acknowledgments
• WP6 partners
• James Malone, Tony Burdett and Helen Parkinson, EMBL-EBI
• In particular, we wish to thank:
  • Anna Melidoni, Ruth Lovering and Jennifer Rohn (UCL)
  • Beate Neumann and Jean-Karim Heriche (EMBL)
  • Bob Van De Water (U. Leiden)
  • Bram Herpers (OcellO)
  • Claudia Lukas (U. Copenhagen)
  • Greg Pau (Genentech)
  • Sylvia Le Dévédec (LUMC)
  • Thomas Walter (Institut Curie)
  • Wies Roosmalen (U. Twente)
  • Zvi Kam (Weizmann Institute)
Thank you for your attention.