M Reich - GenomeSpace

Post on 10-May-2015


Presentation at BOSC2012 by M Reich - GenomeSpace


Michael Reich

Broad Institute of Harvard and MIT

July 14, 2012

Gene mutation causing Leigh Syndrome French Canadian Type (LSFC) and 8 other mitochondrial diseases

Integrate: candidate genomic region, mitochondrial proteomic data, and cancer expression compendium

Authors: Mootha et al. 2003, Calvo et al. 2006

Subtle repression of OXPHOS genes in diabetic muscle: role for mitochondrial dysregulation in diabetes pathology; new computational approach, Gene Set Enrichment Analysis

Integrate: Gene sets/pathways & processes with expression profiles

Authors: Mootha et al. 2003, Subramanian & Tamayo et al. 2005

IKBKE as a new breast cancer oncogene

Integrate: RNAi screens, transformation of activated kinases, and copy number from SNP arrays of cell lines

Authors: Boehm et al. 2007

~3000 novel, large non-coding RNAs with functions in development, the immune response and cancer

Integrate: Genome sequences from 21 mammals, epigenomic maps, and expression profiles

Authors: Guttman, Rinn et al. 2009

Discovery of 3 new genes involved in Glioblastoma Multiforme (NF1, ERBB2, PIK3R1); confirmation of TP53, PTEN, EGFR, RB1, PIK3CA

Integrate: DNA sequence, copy number, methylation aberrations and expression profiles in 206 glioblastomas

Authors: TCGA Research Network 2008

Characterization of disease subtypes and improved risk stratification for medulloblastoma patients

Integrate: Copy number, expression, clinical data for 96 medulloblastoma patients

Authors: Tamayo et al., Cho et al. 2011

Need: Insights through integrative studies

Translational Research Example

GenePattern Cytoscape IGV/UCSC Genomica

[Figure: translational research example workflow spanning GenePattern, Cytoscape, IGV, Genomica, CMAP, and the UCSC Browser]

Data types shown: Network, Compendium, Expression, Alterations, sequence (atcgcgtttattcgataaggatcgcgttttttcgataagg)

Step labels (numbered 1 to 8 and i to vi on the slide): Idea; Differentially Expressed Genes; GSEA test enrichment; Show network; Expand +1 (include neighbors); Show Chromosome; Add Transcription Factor track from UCSC; Looks close to p53 site; Test for similarity of p53 and gene location; Load compendium; Show module map; Extract module; Learn p53 site/score on promoter; Added to GenePattern; Pathway activation; Arrests G2/M; Conclusion

File formats exchanged: ODF, GMT, HTML, SIF, GCT, GXP, NA gene list, GXC, GRP, GFF, GXA

Legend: Analysis step; Analysis conclusion; Within tool; Across tools

12 steps, 6 tools, 7 transitions
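The summary figures on this slide (steps, tools, transitions) can be derived mechanically from an ordered list of steps. The sketch below shows one way to count them. The step-to-tool assignments in `toy_workflow` are assumptions made for illustration and do not reproduce the slide's full 12-step diagram; `summarize` is a hypothetical helper name.

```python
def summarize(workflow):
    """Count steps, distinct tools, and cross-tool transitions.

    `workflow` is an ordered list of (step_label, tool_name) pairs.
    A transition is counted whenever two consecutive steps use different tools.
    """
    tools = [tool for _, tool in workflow]
    transitions = sum(1 for a, b in zip(tools, tools[1:]) if a != b)
    return {
        "steps": len(workflow),
        "tools": len(set(tools)),
        "transitions": transitions,
    }


# Toy excerpt only: the tool assigned to each step is an assumption.
toy_workflow = [
    ("Differentially Expressed Genes", "GenePattern"),
    ("GSEA test enrichment", "GenePattern"),
    ("Show network", "Cytoscape"),
    ("Expand +1 (include neighbors)", "Cytoscape"),
    ("Show Chromosome", "IGV"),
]

summary = summarize(toy_workflow)
print(summary)
```

Each counted transition is a point where, without a connection layer, the user would have to export, convert, and re-import data by hand.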


The Need

• A lightweight “connection layer” for a wide variety of integrative genomic analyses

o Support for all types of resource: Web-based, desktop, etc.

o Automatic conversion of data formats between tools

o Easy access to data from any location

o Any tool that joins is automatically connected to the community of tools

o Ease of entry into the environment
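To make "automatic conversion of data formats between tools" concrete, here is a minimal sketch of a converter registry that chains converters when no direct one exists. It is not GenomeSpace's implementation: the registry `CONVERTERS`, the function names, and the `"txt"` target label are all hypothetical. The two converters assume the usual layouts of GRP (one gene per line, `#` comments) and GMT (tab-separated set name, description, then genes).

```python
from collections import deque


def grp_to_gmt(text, set_name="gene_set"):
    """Turn a GRP gene list into a single-line GMT gene set."""
    stripped = [ln.strip() for ln in text.splitlines()]
    genes = [s for s in stripped if s and not s.startswith("#")]
    return "\t".join([set_name, "na"] + genes) + "\n"


def gmt_to_gene_list(text):
    """Flatten every gene set in a GMT file into a sorted, de-duplicated list."""
    genes = set()
    for ln in text.splitlines():
        fields = ln.split("\t")
        genes.update(g for g in fields[2:] if g)  # skip name and description
    return "\n".join(sorted(genes)) + "\n"


# Hypothetical registry: (source format, target format) -> converter function.
CONVERTERS = {
    ("grp", "gmt"): grp_to_gmt,
    ("gmt", "txt"): gmt_to_gene_list,
}


def find_chain(src, dst, converters=CONVERTERS):
    """Breadth-first search for the shortest converter chain from src to dst."""
    queue = deque([(src, [])])
    seen = {src}
    while queue:
        fmt, chain = queue.popleft()
        if fmt == dst:
            return chain
        for (a, b), fn in converters.items():
            if a == fmt and b not in seen:
                seen.add(b)
                queue.append((b, chain + [fn]))
    return None


def convert(text, src, dst):
    """Apply the converter chain, or fail if the formats are not connected."""
    chain = find_chain(src, dst)
    if chain is None:
        raise ValueError(f"no conversion path from {src} to {dst}")
    for fn in chain:
        text = fn(text)
    return text


result = convert("# example list\nTP53\nEGFR\n", "grp", "txt")
print(result)
```

With a registry like this, a tool that contributes one converter to or from a format already in the graph becomes reachable from every other connected format, which is one way to read "any tool that joins is automatically connected".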

Cloud-based storage

API connectivity layer

3 Driving Biological Projects

lincRNAs

Cancer stem cells

Patient Stratification

6 Seed Tools

Cytoscape Galaxy

GenePattern Genomica

IGV UCSC Browser

New tools New Biological Projects

Online community to share diverse computational tools

www.genomespace.org

GenomeSpace Principles

• Aimed at non-programming users

• Support interoperability through automatic cross-tool data transfer

• Requires minimal changes to tools

GS Enabled Tools

Integrative Genomics Viewer, Cytoscape, Galaxy, GenePattern

GenomeSpace Components

Authentication and Authorization

GenomeSpace Server

Data Manager

Analysis and Tool Manager

GenomeSpace Project Data


geWorkbench

External Data Sources & Tools

Seed Tools

Cytoscape Galaxy GenePattern

Genomica IGV UCSC Genome Browser

New Tools

InSilico DB (University of Brussels)

Cistrome (Dana-Farber Cancer Institute)

ArrayExpress (EBI)

geWorkbench (Columbia University)

Reactome (Ontario Institute for Cancer Research)

Recently added

In development

Using GenomeSpace

Cloud-based filesystem

Tools and Data Sources

Actions

GenomeSpace Actions

GenomeSpace Tool Enablement: IGV

GenomeSpace Tool Enablement: GenePattern


GenomeSpace Data Source Integration: InSilico DB

Other collaborating projects

• Taverna/MyExperiment (University of Manchester)

• National Center for Biomedical Ontology (Stanford University)

DBP3: Studying the regulatory control of human hematopoiesis

DBP3: Studying the regulatory control of human hematopoiesis – Overview

[Figure: DBP3 workflow overview, divided into five linked parts]

Part 1: Data pre-processing and quality control

Part 2: Basic analysis

Part 3: Studying the transcriptional program

Part 4: cis-regulatory site analysis

Part 5: Finding new transcription factor regulators

Tools shown: Genomica, GenePattern, IGV, Cytoscape, Galaxy

Legend: Analysis step (# steps); Analysis conclusion; Currently integrated; Not yet integrated; Analysis section; Manual step; Optional choices

[Figure: Part 5 in detail, Steps 1 to 4]

Step 1: Create transcription factor dataset in Genomica and save to GenomeSpace

Step 2: Send transcription factor datasets into GenePattern

Step 2: Perform differential expression analysis in GenePattern

Step 3: Send differentially expressed genes to Genomica

Step 3: Perform module network analysis in Genomica

Step 4: Visualize regulators with known SNPs and linkage regions
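As an illustration of the kind of computation behind Step 2 (differential expression), the sketch below ranks genes by a signal-to-noise statistic, (mean_A − mean_B) / (sd_A + sd_B). The slides do not say which statistic or GenePattern module was used, so this choice is an assumption; the gene names, sample names, values, and the `rank_genes` helper are invented for the example.

```python
from statistics import mean, stdev


def signal_to_noise(a, b):
    """(mean_a - mean_b) / (sd_a + sd_b); returns 0.0 if both groups are constant."""
    denom = stdev(a) + stdev(b)
    return (mean(a) - mean(b)) / denom if denom else 0.0


def rank_genes(expression, class_a, class_b):
    """Score each gene and return (genes ordered by |score| descending, scores)."""
    scores = {}
    for gene, values in expression.items():
        a = [values[s] for s in class_a]
        b = [values[s] for s in class_b]
        scores[gene] = signal_to_noise(a, b)
    ranked = sorted(scores, key=lambda g: abs(scores[g]), reverse=True)
    return ranked, scores


# Invented toy data: two samples per class (stdev needs at least two values).
expression = {
    "GENE_A": {"s1": 10.0, "s2": 12.0, "s3": 2.0, "s4": 4.0},
    "GENE_B": {"s1": 5.0, "s2": 6.0, "s3": 5.0, "s4": 6.0},
}
ranked, scores = rank_genes(expression, ["s1", "s2"], ["s3", "s4"])

# The top of the ranking, written one gene per line, is the sort of gene list
# that Step 3 hands to the next tool.
gene_list = "\n".join(ranked[:1]) + "\n"
print(gene_list)
```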


Deployment Architecture

[Figure: deployment on Amazon. Labeled components: Identity Service (OpenID); Data Manager (DM); Analysis Task Manager (ATM); SimpleDB; S3 file transfers; Provenance; External Data Sources (e.g., ArrayExpress); GS Clients. Two components are marked "Clustered", and the connections are labeled REST.]

Deployment Architecture and APIs

[Figure: the same architecture with client tools attached: GS UI, IGV, Genomica, GenePattern, Galaxy, Cytoscape, UCSC. Connections are labeled REST or CDK.]

Join the GenomeSpace community

• Researchers with biological projects

• Developers

– Add your tools

– Contribute format converters

– Build new infrastructure

• Data portals and repositories

– Link your resources

Acknowledgements

Broad Institute

Ted Liefeld

Helga Thorvaldsdottir

Jim Robinson

Marco Ocana

Eliot Polk

Jill Mesirov, PI

GenomeSpace Collaborators

Cytoscape: Trey Ideker Lab, UCSD

Galaxy: Anton Nekrutenko Lab, Penn State University

Genomica: Eran Segal Lab, Weizmann Institute

UCSC Browser Team

GenePattern Team

IGV Team

Driving Biological Projects

Howard Chang Lab – Stanford University

Aviv Regev Lab – Broad Institute

Funding

gs-help@broadinstitute.org

www.genomespace.org