Interoperability of large scale image data sets from different biological scales. BioMedBridges Annual General Meeting/Mid-term Review, 11 March 2014, Florence. Jan Ellenberg (WP Leader) and Tanja Ninkovic, on behalf of use case partners Gabriella Rustici, Simon Jupp, Frauke Neff and Johan Lundin.

Page 1: Interoperability of large scale image data sets from different biological scales

BioMedBridges Annual General Meeting/Mid-term Review, 11 March 2014, Florence

Jan Ellenberg (WP Leader) and Tanja Ninkovic

on behalf of use case partners Gabriella Rustici, Simon Jupp, Frauke Neff and Johan Lundin

Page 2: Collaborating BMS RIs

Page 3: Scientific problem

Cell imaging: ✓ Cellular phenotype ✓ Genetic information ✓ Molecular mechanism

Mouse imaging: ✓ Tissue phenotype ✓ Genetic information

Human imaging: ✓ Tissue phenotype

Genome

By linking these three different types of data sets, we can better understand diseases and predict novel drug targets and biomarkers

Page 4: Make data interoperable

Predict disease gene/biomarker

H: disease

Cell: gene knockdown

Page 5: Matching phenotypes in cells and tissues

Cell line – gene knockdown

Human cancer tissue

Prometaphase, Metaphase, Anaphase, Graped micronucleus

State of the art: finding a match by chance

Page 6: To compare and integrate image data we need interoperable standards

Sample, Assay, Images

Different file formats: Zeiss LSM, Leica LIF, DeltaVision DV, Volocity MVD2, Olympus OIB, Olympus OIF, OME-TIFF, JPEG, HDF5, PNG, NDP

Different image metadata

No consistent phenotype annotation/ontology

Automated comparative analysis of image data sets was impossible

Page 7: To compare and integrate image data we need interoperable standards

Images

Different file formats: Zeiss LSM, Leica LIF, DeltaVision DV, Volocity MVD2, Olympus OIB, Olympus OIF, OME-TIFF, JPEG, HDF5, PNG, NDP

•  Inventory of image file formats
•  Defined standard tools for interconversion

Different image metadata

•  Inventory of image metadata formats
•  Defined standard tools for interconversion

No consistent phenotype annotation/ontology: ?
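As an illustration of what an inventory of image file formats might look like in practice, the following minimal Python sketch maps file names to the formats named on this slide. The extension strings (for example ".ndpi" for NDP) and the names `FORMAT_INVENTORY`, `identify_format` and `inventory` are assumptions made for this example and are not taken from the WP6 inventory. The sketch only identifies formats; actual interconversion would be left to a dedicated conversion tool.

```python
from pathlib import Path
from typing import Dict, Iterable, List, Optional

# Formats named on the slide, keyed by file extension.
# The extensions are assumptions for illustration only.
FORMAT_INVENTORY: Dict[str, str] = {
    ".lsm": "Zeiss LSM",
    ".lif": "Leica LIF",
    ".dv": "DeltaVision DV",
    ".mvd2": "Volocity MVD2",
    ".oib": "Olympus OIB",
    ".oif": "Olympus OIF",
    ".ome.tiff": "OME-TIFF",
    ".ome.tif": "OME-TIFF",
    ".jpg": "JPEG",
    ".jpeg": "JPEG",
    ".h5": "HDF5",
    ".hdf5": "HDF5",
    ".png": "PNG",
    ".ndpi": "NDP",
}


def identify_format(filename: str) -> Optional[str]:
    """Return the format name for a file, or None if it is not in the inventory."""
    name = Path(filename).name.lower()
    # Check longer extensions first so that ".ome.tiff" is matched as a whole.
    for ext in sorted(FORMAT_INVENTORY, key=len, reverse=True):
        if name.endswith(ext):
            return FORMAT_INVENTORY[ext]
    return None


def inventory(filenames: Iterable[str]) -> Dict[Optional[str], List[str]]:
    """Group file names by detected format; unknown files are grouped under None."""
    groups: Dict[Optional[str], List[str]] = {}
    for filename in filenames:
        groups.setdefault(identify_format(filename), []).append(filename)
    return groups
```

Calling `inventory` on a directory listing gives a quick overview of which formats a data set contains and which files would need conversion to a common format.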

Page 8: What ontologies are already available?

Cultured human cells: Gene Ontology, Cell cycle ontology, Cell line ontology, Cell ontology, Cell culture ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Fission Yeast Phenotype Ontology, Human Phenotype Ontology

Mouse histology tissue samples: Gene Ontology (BP), Cell ontology, Phenotypic Quality Ontology, Mammalian Phenotype Ontology, Mammalian pathology ontology (MPATH-Pathbase), Adult Mouse Anatomy Dictionary

Human histology tissue samples: Human Phenotype Ontology, Terminologia Histologica, Terminologia Embryologica, Human Developmental Anatomy, International Classification of Diseases (ICD), SNOMED CT, BRENDA Tissue Ontology

Page 9: Existing ontologies are not enough

•  Existing ontologies either lack coverage of cellular-scale phenotypes or describe them incompletely
•  No species-neutral ontology for cellular phenotypes
•  Such an ontology is needed for data interoperability

-> WP6 developed the Cellular Microscopy Phenotype Ontology (CMPO)

Page 10: Building CMPO

Cellular phenotypes: entities, processes and qualities

Gene Ontology – Biological process: biological processes

Gene Ontology – Cellular Component: cellular component

Cell type ontology (CTO): cell types

Phenotype and trait ontology (PATO): size, shapes, temporal quality, abnormal, absent

Page 11: Building CMPO. Composing a phenotype description

Entity (a bearer of some quality) + Quality (characteristic of the entity)

Examples:

•  Phenotype: “Large nucleus”
   Entity: nucleus (GO_000xxxx)
   Quality: large (PATO_000xxxx)

•  Phenotype: “Cells stuck in metaphase due to metaphase arrest”
   Entity: mitotic metaphase (GO_0000089)
   Quality: arrested (PATO_0000297)
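The entity + quality composition on this slide can be sketched as a small data structure. This is a minimal illustration, not the way CMPO is actually encoded: the class names `Term` and `Phenotype` and the method `eq_statement` are hypothetical, and only the metaphase-arrest example is used because its identifiers are given in full on the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """An ontology term: identifier plus human-readable label."""
    ontology_id: str
    label: str


@dataclass(frozen=True)
class Phenotype:
    """A phenotype composed of an entity and the quality it bears."""
    description: str
    entity: Term
    quality: Term

    def eq_statement(self) -> str:
        # Render the composition as "entity (ID) + quality (ID)".
        return (
            f"{self.entity.label} ({self.entity.ontology_id}) + "
            f"{self.quality.label} ({self.quality.ontology_id})"
        )


# Example taken from the slide.
metaphase_arrest = Phenotype(
    description="Cells stuck in metaphase due to metaphase arrest",
    entity=Term("GO_0000089", "mitotic metaphase"),
    quality=Term("PATO_0000297", "arrested"),
)

print(metaphase_arrest.eq_statement())
```

Keeping the free-text description alongside the two terms preserves the original wording of the data producer while making the phenotype comparable across data sets.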

Page 12: Cellular Microscopy Phenotype Ontology (CMPO)

•  Species-neutral ontology
•  Relating to the whole cell, cellular components, cellular processes and cell populations
•  Compatible with related ontology efforts (Fission Yeast Phenotype Ontology, Ascomycete Phenotype Ontology, Mammalian Phenotype Ontology), allowing for future cross-species integration of phenotypic data
•  Released in October 2013
•  Can be browsed at the Ontology Lookup Service [1], BioPortal [2] and GitHub [3]

[1] http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=CMPO
[2] http://bioportal.bioontology.org/ontologies/CMPO?p=classes
[3] https://github.com/EBISPOT/CMPO

Page 13: Enabling standardised data generation

Phenotator: user-friendly ontology annotation of image data

Original phenotypic description

Ontology based annotations

http://wwwdev.ebi.ac.uk/fgpt/phenotator/

Page 14:

Integrate file formats, integrate metadata

Apply phenotype ontology

Predict disease gene/biomarkers

Cell: gene knockdown. CMPO term: graped micronucleus (CMPO_0000156)

H: disease. CMPO term: graped micronucleus (CMPO_0000156)

Page 15: Ontology building using Phenotator

1.  Distribute tool to consortium members for phenotype annotation
2.  Workshop on ontology development with WP6 partners
3.  One to one sessions with data producers

Annotation tool: User 1, User 2, User 3, User 4

Collect phenotype-ontology mappings provided by the users

Ontology terms

Build the Cellular Microscopy Phenotype Ontology (CMPO)

Page 16: Future plans

Phenotator: Automation

•  Semi-automated mapping of cellular phenotypes to CMPO terms

List of phenotypes provided by the user

Zooma [1] mappings to CMPO

[1] http://www.ebi.ac.uk/fgpt/zooma
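To make the idea of semi-automated mapping concrete, here is a minimal Python sketch that suggests CMPO terms for free-text phenotype descriptions. It uses plain string similarity from the standard library as a stand-in and does not reproduce how Zooma works. The function name `suggest_terms` and the cutoff value are assumptions; the two term labels and identifiers are the ones quoted in these slides.

```python
from difflib import SequenceMatcher

# The two CMPO terms quoted in the slides; a real lookup would load the full ontology.
CMPO_TERMS = {
    "CMPO_0000156": "graped micronucleus",
    "CMPO_0000157": "polylobed nucleus",
}


def suggest_terms(phenotype, terms=CMPO_TERMS, cutoff=0.7):
    """Return (term_id, label, score) tuples whose label resembles the phenotype text.

    Scores come from difflib's SequenceMatcher ratio (0.0 to 1.0); the cutoff of
    0.7 is an arbitrary choice for this sketch.
    """
    query = phenotype.strip().lower()
    suggestions = []
    for term_id, label in terms.items():
        score = SequenceMatcher(None, query, label).ratio()
        if score >= cutoff:
            suggestions.append((term_id, label, round(score, 2)))
    # Best match first, so a curator can accept or reject from the top.
    return sorted(suggestions, key=lambda item: item[2], reverse=True)


if __name__ == "__main__":
    for text in ["Polylobed nuclei", "graped micronucleus", "xyz"]:
        print(text, "->", suggest_terms(text))
```

The mapping stays semi-automated in this sketch: the function only proposes candidates, and a user would still confirm each suggestion before it is stored as an annotation.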

Page 17: Future plans

CMPO: integration in existing applications

•  Widgets [1] (in collaboration with WP4)
   •  deployable in existing web applications
   •  autocomplete search boxes
   •  ontology terms are readily available in user-facing applications
•  Integrating CMPO into FIMM’s Webmicroscope Portal [2] and EMBL’s CellBase

Data producers can utilise ontologies for annotating their data sets already at the data production stage

[1] http://www.ebi.ac.uk/Tools/biojs/registry/
[2] http://biomedbridges.webmicroscope.net/
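The autocomplete search boxes mentioned on this slide rest on prefix lookup over ontology term labels. The following Python sketch shows one simple way to do that with a sorted list and binary search. It is not the widget code itself; the class name `TermAutocomplete` and the small label list are assumptions for illustration.

```python
from bisect import bisect_left


class TermAutocomplete:
    """Prefix search over ontology term labels using a sorted list."""

    def __init__(self, labels):
        # Lower-case and sort once so every lookup can use binary search.
        self._labels = sorted(label.lower() for label in labels)

    def complete(self, prefix, limit=10):
        """Return up to `limit` labels starting with `prefix`, in sorted order."""
        prefix = prefix.lower()
        start = bisect_left(self._labels, prefix)
        matches = []
        for label in self._labels[start:]:
            # Labels are sorted, so the first non-match ends the run of matches.
            if not label.startswith(prefix) or len(matches) == limit:
                break
            matches.append(label)
        return matches


# Toy label list for the example; a deployed widget would load all CMPO labels.
autocomplete = TermAutocomplete(
    ["graped micronucleus", "polylobed nucleus", "large nucleus"]
)
print(autocomplete.complete("p"))
```

Offering term labels while the user types is what lets data producers pick ontology terms at the data production stage instead of mapping free text afterwards.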

Page 18: Future plans

Scientific Use Case: Correlative analysis and biomarker prediction

Annotate cellular, mouse and human image datasets using CMPO

Cellular image datasets [1]

Mouse image datasets [2]

Human image datasets [3]

Data hosted by Webmicroscope

Data hosted by EBI Cellular Phenotype Database/EMBL CellBase

Correlative analysis of now interoperable cell and tissue image datasets to predict novel biomarker candidates

Novel candidate biomarker prediction. Focus on cell cycle and cell division control genes

[1] Mitocheck, including genetic information; www.mitocheck.org
[2] Helmholtz’s mouse lines, cancer models by GMC, PREDECT, International Mouse Phenotyping Consortium
[3] Webmicroscope cancer tissue collection

Page 19: Correlative analysis and biomarker prediction

Candidate biomarker genes from cellular tumor suppressor screens have been identified:

•  MLL3 •  PAPPA •  SF3B1 •  PRPF8 •  CENPE •  CIT •  ASPM •  ESPL1 •  DYNC1H1 •  ASCC3 •  KIF4A

WP6 partners working on mouse and human tissue are looking for and/or generating data relevant to these genes for use in the analysis.

Page 20: Correlative analysis and biomarker prediction

Promising gene candidates from cellular screens

•  MLL3 •  PAPPA •  SF3B1 •  PRPF8 •  CENPE •  CIT •  ASPM •  ESPL1 •  DYNC1H1 •  ASCC3 •  KIF4A

WP6 partners working on mouse and human tissue are looking for and/or generating data relevant to these genes for use in the analysis.

Cell line ASPM knockdown, Mouse ASPM Mut, Mouse ASPM WT

Polylobed nucleus (CMPO_0000157)
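Once cell and tissue images carry the same CMPO identifiers, finding matching phenotypes across data sets reduces to a join on the gene and term. The Python sketch below illustrates this under stated assumptions: the function name `match_by_term`, the record layout and the toy records (with placeholder gene names `GENE_A` and `GENE_B`) are hypothetical and are not results from the project.

```python
from collections import defaultdict


def match_by_term(annotations):
    """Find (gene, CMPO term) pairs annotated in more than one data set.

    `annotations` is an iterable of (dataset, gene, cmpo_id) tuples.
    Returns a dict mapping (gene, cmpo_id) to the sorted list of data sets
    in which that pair occurs, keeping only pairs seen in at least two.
    """
    datasets_by_pair = defaultdict(set)
    for dataset, gene, cmpo_id in annotations:
        datasets_by_pair[(gene, cmpo_id)].add(dataset)
    return {
        pair: sorted(datasets)
        for pair, datasets in datasets_by_pair.items()
        if len(datasets) > 1
    }


# Toy records for illustration only; gene names are placeholders.
toy_annotations = [
    ("cell", "GENE_A", "CMPO_0000157"),
    ("mouse", "GENE_A", "CMPO_0000157"),
    ("human", "GENE_B", "CMPO_0000156"),
]

print(match_by_term(toy_annotations))
```

A pair that appears in both a cellular screen and a tissue collection is a candidate for closer inspection, which replaces finding a match by chance with a systematic query.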

Page 21: Making large scale image data sets from different biological scales interoperable

01/2013: Start of WP6

12/2013: Identification of standards and ontologies used for cellular/mouse/human image data sets
-> Inventory of image file formats and ontologies
-> Defined future standards

03/2013 to 12/2014: Mapping of standards and ontologies between the different image reference data sets
-> Cellular Phenotype Ontology and annotation tool

01/2015 to 12/2015: Set of predicted biomarkers
-> Predict new biomarker genes

Deadline for the deliverable

Page 22: BMS RI partners

Infrafrontier: Frauke Neff, Philipp Gormanns

Elixir: Gabriella Rustici, Simon Jupp

BBMRI: Johan Lundin, Mikael Lundin

Euro-BioImaging: Jan Ellenberg, Jean-Karim Heriche, Tanja Ninkovic, Wolfgang Huber

Page 23: Acknowledgments

•  WP6 partners
•  James Malone, Tony Burdett and Helen Parkinson, EMBL-EBI
•  In particular, we wish to thank:
   •  Anna Melidoni, Ruth Lovering and Jennifer Rohn (UCL)
   •  Beate Neumann and Jean-Karim Heriche (EMBL)
   •  Bob Van De Water (U. Leiden)
   •  Bram Herpers (OcellO)
   •  Claudia Lukas (U. Copenhagen)
   •  Greg Pau (Genentech)
   •  Sylvia Le Dévédec (LUMC)
   •  Thomas Walter (Institut Curie)
   •  Wies Roosmalen (U. Twente)
   •  Zvi Kam (Weizmann Institute)

Page 24: Thank you for your attention.