Summary of Data Dissemination Working Group...Abhishek Pratap Omics-4TB Serdar Turkasian Micheleen...

Summary of Data Dissemination Working Group 22Jan2016

Summary of Data Dissemination Working Group

22Jan2016

Overview

•  Review of Data Dissemination Working Group
   – Strategy for data dissemination

•  Testing of model and submission process
   – Systems Biology Centers test submissions
   – Current work converting IRD/ViPR to SysBio v2.0

DDWG Background and objective

•  DDWG started fall 2014

•  Tasked with developing a data dissemination strategy for all five systems biology centers

•  Key issues:
   – What types of data should be disseminated?
   – Where should the data go?
   – How should the metadata be represented?

Projected data types for dissemination

Original list order | Experiment type | Analyte | Methodology | Centers (FluDyNeMo / Flu-OMICS / MaHPIC / Omics-4TB / OMICS-LHV) | # of SysBio centers | Currently supported? | Data archives | Dissemination priority
1 | OMICS Type | mRNA (transcriptome) | microarray | No / No / No / Yes / Yes | 2 | Y | GEO & BRC | 1
2 | OMICS Type | miRNA | microarray | No / No / No / Yes / Yes | 2 | Y | GEO & BRC | 1
3 | OMICS Type | mRNA (transcriptome) | RNA-seq | Yes / Yes / Yes / Yes / Yes | 5 | N | | 1
4 | OMICS Type | miRNA | RNA-seq | Yes / No / Yes | 2 | N | | 1
5 | OMICS Type | microbial RNA (metatranscriptome) | RNA-seq | Yes / No / No / No | 1 | N | | 3
6 | OMICS Type | influenza metagenome | RNA-seq | Yes / No / No / No | 1 | N | | 3
7 | OMICS Type | bacterial 16S profiling | targeted sequencing | Yes / No / No / No | 1 | N | | 3
8 | OMICS Type | mRNA (transcriptome) | Microfluidic multiplex qRT-PCR | No / Yes / Yes / No | 2 | N | | 2
9 | OMICS Type | protein-DNA interactions | ChIP-seq | No / Yes / No / Yes / Yes | 3 | N | | 2
10 | OMICS Type | open chromatin | FAIRE-seq | No / No / No / No / Yes | 1 | N | | 2
11 | OMICS Type | DNA methylation | | No / Yes / No / No / Yes | 2 | N | | 2
12 | OMICS Type | protein (proteome) | mass spectrometry | No / Yes / Yes / Yes / Yes | 4 | Y | PeptideAtlas & BRC | 1
13 | OMICS Type | phosphoproteins (phosphoproteome) | mass spectrometry | No / Yes / Yes / Yes / Yes | 4 | N | | 1
14 | OMICS Type | metabolites (metabolome) | mass spectrometry | No / Yes / Yes / Yes / Yes | 4 | Y | MetaboLights & BRC | 1
15 | OMICS Type | lipids (lipidome) | mass spectrometry | No / Yes / Yes / Yes / Yes | 4 | Y | BRC | 1
16 | OMICS Type | protein-protein interactions | yeast two hybrid | No / No / No / No / No | 0 | N | | 4
17 | OMICS Type | protein-protein interactions | co-immunoprecipitation | No / Yes / No / No / Yes | 2 | N | | 2
18 | Phenotypic | Weight | | Yes / Yes / Yes / No / Yes | 4 | N | | 1
19 | Phenotypic | Body Temperature | | No / No / Yes / No / No | 1 | N | | 3
20 | Phenotypic | Virus Titers | plaque assay | Yes / Yes / No / No / Yes | 3 | N | | 2
21 | Phenotypic | Virus genomic RNA levels | qPCR | No / No / No / No / Yes | 1 | N | | 3
22 | Phenotypic | Virus mRNA levels | qPCR | No / No / No / No / Yes | 1 | N | | 3
23 | Phenotypic | Hematology (??) | CBC (manual & automated) | No / No / Yes / No / No | 1 | N | | 3
24 | Phenotypic | Lung Function (??) | | No / No / No? / No / No | 0 | N | | 4
25 | Phenotypic | Clinical Score | Direct Observation | Yes / No / No ? / No / Yes | 2 | N | | 2
26 | Phenotypic | tissue architecture | histology with H&E stain | Yes / Yes? / Yes ? / Yes / Yes | 5 | N | | 1
27 | Phenotypic | protein tissue expression | immunohistochemistry | Yes / No / Yes ? / No / Yes | 3 | N | | 2
28 | Phenotypic | serum antibody | ELISA | Yes / No / No / Yes / No | 2 | N | | 2
29 | Phenotypic | cellular cytotoxicity | CellTiter-Glo (Promega) | No / No / No / No / Yes | 1 | N | | 3
30 | Phenotypic | cytokine protein levels | cytokine bead arrays | Yes / No / Yes / Yes / Yes | 4 | N | | 1
31 | Phenotypic | cytokine protein levels | ELISA | Yes / No / Yes? / No / Yes | 3 | N | | 2
32 | Phenotypic | cytokine protein levels | Bioplex assay | Yes / No / No / No / Yes | 2 | N | | 2
33 | Phenotypic | cytokine protein secretion | ELISPOT | Yes / No / No / No | 1 | N | | 3
34 | Phenotypic | parasitemia | thin and thick smear slides | No / No / Yes / No / No | 1 | N | | 3
35 | Phenotypic | (MPSS) Macaque Physiological Scoring System [numeric value 0-16] | | No / No / Yes / No / No | 1 | N | | 3
36 | Phenotypic | serum chemical levels | iSTAT chem profile | No / No / Yes / No / No | 1 | N | | 3
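
The "# of SysBioCenters" value in each row appears to be a tally of the per-center "Yes" entries, with uncertain entries such as "Yes?" counted and "No ?" not counted. A minimal sketch of that tally follows; the function name is hypothetical and the counting rule is inferred from the rows above, not stated in the slides.

```python
def count_supporting_centers(flags):
    """Count the centers whose entry starts with "Yes".

    Assumption (inferred from the table, not stated in the slides):
    uncertain entries such as "Yes?" or "Yes ?" count as support,
    while "No", "No?" and "No ?" do not.
    """
    return sum(1 for flag in flags if flag.strip().lower().startswith("yes"))


# Row 1 (mRNA, microarray): two centers report "Yes".
row_1 = ["No", "No", "No", "Yes", "Yes"]
# Row 26 (tissue architecture): uncertain "Yes" entries are still counted.
row_26 = ["Yes", "Yes?", "Yes ?", "Yes", "Yes"]
# Row 25 (clinical score): the uncertain "No ?" is not counted.
row_25 = ["Yes", "No", "No ?", "No", "Yes"]

print(count_supporting_centers(row_1))
print(count_supporting_centers(row_26))
print(count_supporting_centers(row_25))
```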

Leveraging public archives to store raw and processed data

•  Primary “omics” type data and unstructured metadata to public archives
   – GEO / SRA / ArrayExpress
   – PeptideAtlas / MetaboLights / MassIVE

•  Derived “omics” data and structured metadata to BRCs

•  Phenotypic data
   – If no archive exists, BRC will accept data

•  Where possible, SysBio metadata standards should be used
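
The routing rules on this slide can be sketched as a small decision function. This is an illustration only: the names `DataKind` and `destination` are hypothetical, and sending phenotypic data to a public archive when one exists is an assumption read into the "If no archive exists" bullet.

```python
from enum import Enum


class DataKind(Enum):
    PRIMARY_OMICS = "primary omics"   # raw data plus unstructured metadata
    DERIVED_OMICS = "derived omics"   # derived data plus structured metadata
    PHENOTYPIC = "phenotypic"


def destination(kind, archive_exists=False):
    """Return where a dataset would go under the strategy on this slide."""
    if kind is DataKind.PRIMARY_OMICS:
        return "public archive"
    if kind is DataKind.DERIVED_OMICS:
        return "BRC"
    # Phenotypic data: the BRC accepts it when no archive exists.
    # Assumption: an existing archive takes precedence otherwise.
    return "public archive" if archive_exists else "BRC"


print(destination(DataKind.PRIMARY_OMICS))
print(destination(DataKind.DERIVED_OMICS))
print(destination(DataKind.PHENOTYPIC, archive_exists=False))
```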

Derived data from SBCs to respective Bioinformatics Resource Centers (BRCs)

Flu-Omics

Derived data in the form of biosets

– Biosets are interpreted results of interest from an experiment

– Biosets can be provided directly by the SBCs to the BRCs, or the BRCs may choose to generate them from processed data

– Bioset example: genes/proteins that are differentially expressed in a
   •  comparison of human mock infected and influenza infected cells at 7 hours post-infection (HPI)
   •  comparison of influenza infected wild-type mice and CXCR3 KO mice after 2 days of infection
   •  comparison of H5N1 infected wild-type mice to H1N1 infected wild-type mice
   •  comparison of H5N1 at 5 MOI to H5N1 at 1 MOI in human cells
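
As a rough illustration of how a bioset could be derived from processed data, the sketch below selects differentially expressed genes from one comparison. The `Measurement` type, the gene names, and the cutoffs (absolute log2 fold change of at least 1, adjusted p-value of at most 0.05) are hypothetical; the slides do not specify thresholds or a format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    gene: str
    log2_fold_change: float   # infected vs. mock, for example
    adjusted_p_value: float


def build_bioset(measurements, min_abs_log2_fc=1.0, max_adj_p=0.05):
    """Return the set of genes passing both (assumed) cutoffs."""
    return {
        m.gene
        for m in measurements
        if abs(m.log2_fold_change) >= min_abs_log2_fc
        and m.adjusted_p_value <= max_adj_p
    }


# Made-up values for a single comparison; gene names are placeholders.
comparison = [
    Measurement("GENE_A", 2.3, 0.001),    # up-regulated, significant
    Measurement("GENE_B", -1.7, 0.010),   # down-regulated, significant
    Measurement("GENE_C", 0.4, 0.002),    # significant but small change
    Measurement("GENE_D", 3.1, 0.200),    # large change but not significant
]

print(sorted(build_bioset(comparison)))
```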

Metadata representation

•  Enhancements of SysBio v1.0 in SysBio v2.0
   – Added experimental time line using a “Reference Time Zero (T0)” to support multiple treatment, multiple sampling and complex study designs
   – Added “Analysis Workflows” and “Data Processing Events” to capture data transformation and relationships between data
   – Added “Disease” and “Disease Course Stage” objects to explicitly capture disease manifestation (previously associated with viral agent)
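
One way to picture the “Reference Time Zero (T0)” idea is to record every treatment and sampling event as an offset from a single reference point, so that designs with several treatments and samplings share one time axis. The sketch below is an assumption about how such a timeline could be modeled; the class and field names are hypothetical and are not taken from the SysBio v2.0 standard.

```python
from dataclasses import dataclass, field
from datetime import timedelta
from typing import List


@dataclass(frozen=True)
class TimelineEvent:
    kind: str                  # "treatment" or "sampling"
    label: str
    offset_from_t0: timedelta  # negative values fall before T0


@dataclass
class StudyTimeline:
    t0_definition: str         # what T0 refers to in this study
    events: List[TimelineEvent] = field(default_factory=list)

    def add_event(self, kind, label, hours_from_t0):
        offset = timedelta(hours=hours_from_t0)
        self.events.append(TimelineEvent(kind, label, offset))

    def in_order(self):
        """Return events sorted by their offset from T0."""
        return sorted(self.events, key=lambda e: e.offset_from_t0)


timeline = StudyTimeline(t0_definition="first infection")
timeline.add_event("sampling", "follow-up sample", 48)
timeline.add_event("treatment", "infection", 0)
timeline.add_event("sampling", "baseline sample", -24)
timeline.add_event("treatment", "second treatment", 24)

for event in timeline.in_order():
    print(event.offset_from_t0, event.kind, event.label)
```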

Data model and submission process testing

Getting started

•  One-on-one calls between the Systems Biology Centers and BRCs identified use cases for an initial test of the metadata standard and submission process

•  Testing results and potential issues to be presented later by individual centers

•  Conversion of IRD/ViPR data from previous contracts from SysBio v1.0 to SysBio v2.0 is underway

IRD/ViPR update

•  Have begun implementing data model based on SysBio v2.0 at IRD/ViPR

•  Converting data from previous SBC contracts

•  Preparing loading and validation submission infrastructure

•  Updates to UI pending

IRD/ViPR data model

Study/Experiment Assay Data Analysis

Conclusion

•  SysBio v2.0 adopted in summer 2015
   – Testing of new data types may require revisions

•  Submissions to begin in 2016

•  Areas still under consideration
   •  Controlled vocabulary
   •  Data formatting
   •  Data archive selection
      – Unified approach?
   •  Stable & unique entity identifiers (post-translational modifications, metabolites, etc.)

Acknowledgement

EuPathDB: Brian Brunk, Omar Harb, Jessica Kissinger

MaHPIC: Jessica Kissinger, Mary Galinkski, Suman Pakala, Mustafa Veysi Nural, Regina C Joice

Omics-LHV: Michelle Craft, Kelly Stratton, Katrina Waters, Amie Eisfeld, Miron Livny, Allison Thompson

Data Dissemination Working Group

NIAID: Vivian Dugan, Alison Yao, Megan Hoffmann, Eric Choi

ViPR/IRD: Richard Scheuermann, Brian Aevermann

Flu-Omics: Sumit Chandra, Lars Pache, Crystal Herndon, Andre Gatarano

Flu-DyNeMo: Elodie Ghedin, Lauren Lashua, Alan Twaddle, Abhishek Pratap

Omics-4TB: Serdar Turkasian, Micheleen Harris

PATRIC: Rebecca Will, Tom Brettin, Rebecca Wattam, Maulik Shukla