Interoperability of large scale image data sets from different biological scales

BioMedBridges Annual General Meeting/Mid-term Review 11 March 2014, Florence

Jan Ellenberg (WP Leader) and Tanja Ninkovic

on behalf of use case partners Gabriella Rustici, Simon Jupp, Frauke Neff and Johan Lundin

Collaborating BMS RIs

Scientific problem

Human imaging, cell imaging, mouse imaging

✓  Cellular phenotype ✓  Genetic information ✓  Molecular mechanism

✓  Tissue phenotype ✓  Genetic information

✓  Tissue phenotype

By linking these three different types of data sets, we can better understand diseases and predict novel drug targets and biomarkers.

Diagram: Genome; Make data interoperable; Predict disease gene/biomarker; Human Disease; Cell Gene knockdown

Matching phenotypes in cells and tissues

Cell line – gene knockdown

Human cancer tissue

State of the art: finding a match by chance

Image panels: prometaphase, metaphase, anaphase, graped micronucleus

To compare and integrate image data we need interoperable standards

Sample, assay, images

•  Different file formats: Zeiss LSM, Leica LIF, DeltaVision DV, Volocity MVD2, Olympus OIB, Olympus OIF, OME-TIFF, JPEG, HDF5, PNG, NDP
•  Different image metadata
•  No consistent phenotype annotation/ontology

Automated comparative analysis of image data sets was impossible

To compare and integrate image data we need interoperable standards

Different file formats

•  Inventory of image file formats
•  Defined standard tools for interconversion

Different image metadata

•  Inventory of image metadata formats
•  Defined standard tools for interconversion

No consistent phenotype annotation/ontology

•  ?
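As a rough illustration of what an inventory of image file formats could look like in practice, the sketch below counts files by format from their names. It is not the tooling used by WP6. The format names come from the slides, while the file extensions, the `format_of` and `inventory` helpers and the example file names are assumptions made for this example.

```python
from collections import Counter
from pathlib import Path

# Format names are taken from the slides; the extensions are assumptions
# about how such files are commonly named, not a verified registry.
KNOWN_FORMATS = {
    ".lsm": "Zeiss LSM",
    ".lif": "Leica LIF",
    ".dv": "DeltaVision DV",
    ".mvd2": "Volocity MVD2",
    ".oib": "Olympus OIB",
    ".oif": "Olympus OIF",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
    ".png": "PNG",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
}


def format_of(filename):
    """Guess the image format from the file name alone (hypothetical helper)."""
    name = filename.lower()
    # OME-TIFF uses a double extension, so check it before the single suffix.
    if name.endswith((".ome.tif", ".ome.tiff")):
        return "OME-TIFF"
    return KNOWN_FORMATS.get(Path(name).suffix, "unknown")


def inventory(filenames):
    """Count how many files fall into each format."""
    return Counter(format_of(f) for f in filenames)


# Hypothetical file names, for illustration only.
files = [
    "plate1/well_A01.lsm",
    "plate1/well_A02.LSM",
    "tissue/slide7.ome.tiff",
    "notes.txt",
]
for fmt, count in sorted(inventory(files).items()):
    print(fmt, count)
```

Guessing from extensions is only a first pass: a real inventory would need to inspect file contents, since extensions such as `.tif` are shared by several formats.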

What ontologies are already available?

•  Cultured human cells: Gene Ontology, Cell cycle ontology, Cell line ontology, Cell ontology, Cell culture ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Fission Yeast Phenotype Ontology, Human Phenotype Ontology
•  Mouse histology tissue samples: Gene Ontology (BP), Cell ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Mammalian pathology ontology (MPATH-Pathbase), Adult Mouse Anatomy Dictionary
•  Human histology tissue samples: Human Phenotype Ontology, Terminologia Histologica, Terminologia Embryologica, Human Developmental Anatomy, International Classification of Diseases (ICD), SNOMED CT, BRENDA Tissue Ontology

Existing ontologies are not enough

•  Existing ontologies either lack coverage or are too incomplete to describe cellular-scale phenotypes
•  There is no species-neutral ontology for cellular phenotypes
•  Such an ontology is needed for data interoperability

→  WP6 developed the Cellular Microscopy Phenotype Ontology (CMPO)

Building CMPO

•  Gene Ontology – Biological process: biological processes
•  Gene Ontology – Cellular Component: cellular component
•  Cell type ontology (CTO): cell types
•  Phenotype and trait ontology (PATO): size, temporal quality, shapes, abnormal, absent

Building CMPO
Composing a phenotype description

Cellular phenotypes: entities, processes and qualities

Entity (a bearer of some quality) + Quality (a characteristic of the entity)

Examples:

•  Phenotype: “Large nucleus”
   •  Entity: nucleus (GO_000xxxx)
   •  Quality: large (PATO_000xxxx)
•  Phenotype: “Cells stuck in metaphase due to metaphase arrest”
   •  Entity: mitotic metaphase (GO_0000089)
   •  Quality: arrested (PATO_0000297)
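The entity plus quality composition can be sketched as a small data structure. This is an illustrative model only, not how CMPO itself is encoded. The `Term` and `Phenotype` classes and the `describe` and `ids` methods are hypothetical names, and the identifiers are the ones given for the metaphase arrest example on the slide.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Term:
    """An ontology term: an identifier plus a human-readable label."""
    term_id: str
    label: str


@dataclass(frozen=True)
class Phenotype:
    """A phenotype composed of an entity and the quality it bears."""
    entity: Term
    quality: Term

    def describe(self) -> str:
        # Quality first, then entity, e.g. "arrested mitotic metaphase".
        return f"{self.quality.label} {self.entity.label}"

    def ids(self) -> Tuple[str, str]:
        # The pair of identifiers is what makes the annotation comparable
        # across data sets, independent of the free-text wording.
        return (self.entity.term_id, self.quality.term_id)


# Identifiers taken from the metaphase arrest example on the slide.
metaphase_arrest = Phenotype(
    entity=Term("GO_0000089", "mitotic metaphase"),
    quality=Term("PATO_0000297", "arrested"),
)
print(metaphase_arrest.describe())
print(metaphase_arrest.ids())
```

Two free-text descriptions that resolve to the same pair of identifiers can then be treated as the same phenotype, whatever wording the original data producer used.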

Cellular Microscopy Phenotype Ontology (CMPO)

•  Species-neutral ontology
•  Relating to the whole cell, cellular components, cellular processes and cell populations
•  Compatible with related ontology efforts (Fission Yeast Phenotype Ontology, Ascomycete Phenotype Ontology, Mammalian Phenotype Ontology), allowing for future cross-species integration of phenotypic data
•  Released in October 2013
•  Can be browsed at the Ontology Lookup Service [1], BioPortal [2] and GitHub [3]

[1] http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=CMPO
[2] http://bioportal.bioontology.org/ontologies/CMPO?p=classes
[3] https://github.com/EBISPOT/CMPO

Enabling standardised data generation
Phenotator: user-friendly ontology annotation of image data

http://wwwdev.ebi.ac.uk/fgpt/phenotator/

Original phenotypic description

Ontology based annotations

CMPO term: graped micronucleus CMPO_0000156

Integrate file formats

Integrate metadata

Apply phenotype ontology

Predict disease gene/biomarkers

Human Disease; Cell Gene knockdown

Ontology building using Phenotator

Annotation tool

Ontology Terms

Build the Cellular Microscopy Phenotype Ontology (CMPO)

1.  Distribute tool to consortium members for phenotype annotation
2.  Workshop on ontology development with WP6 partners
3.  One-to-one sessions with data producers

Collect phenotype-ontology mappings provided by the users (User 1, User 2, User 3, User 4)

Future plans
Phenotator: Automation

•  Semi-automated mapping of cellular phenotypes to CMPO terms

List of phenotypes provided by the user

Zooma [1] mappings to CMPO

[1] http://www.ebi.ac.uk/fgpt/zooma
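To give a feel for what semi-automated mapping of free-text phenotypes to CMPO terms involves, here is a minimal sketch. It does not call Zooma and does not reproduce its algorithm: plain string similarity from Python's standard `difflib` module is swapped in as a stand-in. The `suggest_term` function, the similarity cutoff and the example descriptions are assumptions, and the term table holds only the two CMPO terms that appear in these slides.

```python
import difflib

# Only the two CMPO terms shown in the slides; a real run would load the
# full ontology instead of this hand-written table.
CMPO_LABELS = {
    "graped micronucleus": "CMPO_0000156",
    "polylobed nucleus": "CMPO_0000157",
}


def suggest_term(description, cutoff=0.6):
    """Return (label, CMPO id, score) for the closest label, or None.

    The cutoff is an arbitrary assumption; suggestions below it are dropped
    so that a curator only reviews plausible candidates.
    """
    text = description.lower().strip()
    best = None
    for label, term_id in CMPO_LABELS.items():
        score = difflib.SequenceMatcher(None, text, label).ratio()
        if score >= cutoff and (best is None or score > best[2]):
            best = (label, term_id, score)
    return best


# Hypothetical user-supplied descriptions.
for phrase in ["Polylobed nuclei", "cell migration speed"]:
    print(phrase, "->", suggest_term(phrase))
```

The "semi" in semi-automated matters here: a suggestion like this would still need to be confirmed or rejected by the person who produced the data.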

Future plans
CMPO: integration in existing applications

•  Widgets [1] (in collaboration with WP4)
   •  deployable in existing web applications
   •  autocomplete search boxes
   •  ontology terms are readily available in user-facing applications
•  Integrating CMPO into FIMM’s Webmicroscope Portal [2] and EMBL’s CellBase

[1] http://www.ebi.ac.uk/Tools/biojs/registry/
[2] http://biomedbridges.webmicroscope.net/

Data producers can use ontologies to annotate their data sets already at the data production stage

Future plans
Scientific Use Case: Correlative analysis and biomarker prediction

Annotate cellular, mouse and human image datasets using CMPO

Correlative analysis of now interoperable cell and tissue image datasets to predict novel biomarker candidates

Novel candidate biomarker prediction: focus on cell cycle and cell division control genes

Cellular image datasets [1], mouse image datasets [2], human image datasets [3]

Data hosted by Webmicroscope and by EBI Cellular Phenotype Database / EMBL CellBase

[1] Mitocheck, including genetic information; www.mitocheck.org
[2] Helmholtz’s mouse lines, cancer models by GMC, PREDECT, International Mouse Phenotyping Consortium
[3] Webmicroscope cancer tissue collection

Correlative analysis and biomarker prediction

Candidate biomarker genes from cellular tumor suppressor screens have been identified:

•  MLL3
•  PAPPA
•  SF3B1
•  PRPF8
•  CENPE
•  CIT
•  ASPM
•  ESPL1
•  DYNC1H1
•  ASCC3
•  KIF4A

The WP6 partners working on mouse and human tissue are looking for and/or generating data relevant to these genes for use in the analysis.

Correlative analysis and biomarker prediction

Image panels: Cell line ASPM Knockdown, Mouse ASPM Mut, Mouse ASPM WT

CMPO term: polylobed nucleus CMPO_0000157
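Once cell and tissue data sets carry the same ontology terms, a first step of correlative analysis can be as simple as looking for genes annotated with an identical term in both. The sketch below illustrates that idea only and is not the WP6 analysis pipeline. The `index_by_gene` and `shared_phenotypes` helpers are hypothetical, `GENE_X` is a placeholder, and the assumption that both the ASPM knockdown cells and the ASPM mutant mouse are annotated with CMPO_0000157 is made here for illustration.

```python
from collections import defaultdict

# Toy annotations as (gene, CMPO term id) pairs. The ASPM / CMPO_0000157
# pairing follows the slide; GENE_X is a placeholder, not real data.
cell_knockdown = [("ASPM", "CMPO_0000157"), ("GENE_X", "CMPO_0000156")]
mouse_mutant = [("ASPM", "CMPO_0000157")]


def index_by_gene(annotations):
    """Group term ids by gene symbol."""
    by_gene = defaultdict(set)
    for gene, term_id in annotations:
        by_gene[gene].add(term_id)
    return by_gene


def shared_phenotypes(a, b):
    """Genes annotated with at least one identical term in both data sets."""
    index_a, index_b = index_by_gene(a), index_by_gene(b)
    shared = {}
    for gene in index_a.keys() & index_b.keys():
        common = index_a[gene] & index_b[gene]
        if common:
            shared[gene] = common
    return shared


print(shared_phenotypes(cell_knockdown, mouse_mutant))
```

This exact-match comparison only works because both data sets use the same identifiers, which is the point of annotating them with a shared ontology in the first place.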

Making large scale image data sets from different biological scales interoperable

Timeline (deadlines for the deliverables)

•  01/2013: Start of WP6
•  12/2013: Identification of standards and ontologies used for cellular/mouse/human image data sets
   -> Inventory of image file formats and ontologies
   -> Defined future standards
•  03/2013 to 12/2014: Mapping of standards and ontologies between the different image reference data sets
   -> Cellular Phenotype Ontology and annotation tool
•  01/2015 to 12/2015: Set of predicted biomarkers
   -> Predict new biomarker genes

BMS RI partners

•  Infrafrontier: Frauke Neff, Philipp Gormanns
•  Elixir: Gabriella Rustici, Simon Jupp
•  BBMRI: Johan Lundin, Mikael Lundin
•  Euro-BioImaging: Jan Ellenberg, Jean-Karim Heriche, Tanja Ninkovic, Wolfgang Huber

Acknowledgments

•  WP6 partners
•  James Malone, Tony Burdett and Helen Parkinson, EMBL-EBI
•  In particular, we wish to thank:
   •  Anna Melidoni, Ruth Lovering and Jennifer Rohn (UCL)
   •  Beate Neumann and Jean-Karim Heriche (EMBL)
   •  Bob Van De Water (U. Leiden)
   •  Bram Herpers (OcellO)
   •  Claudia Lukas (U. Copenhagen)
   •  Greg Pau (Genentech)
   •  Sylvia Le Dévédec (LUMC)
   •  Thomas Walter (Institut Curie)
   •  Wies Roosmalen (U. Twente)
   •  Zvi Kam (Weizmann Institute)

Thank you for your attention.
