© 2005 by Genomatix Software GmbH Genomatix Microarray Evaluation for Gene Regulation Analysis Dr. Martin Seifert Genomatix Software GmbH Landsberger Strasse 6, D-80339 München http://www.genomatix.de

Page 2

The general goal in microarray analysis

Biological functionality is not directly evident from microarrays

Classification / Diagnostics

Metabolic pathways

Regulatory networks

Disease mechanisms

Microarrays today

?

Cell

Microarray experiment

Page 3

How to reach the general goal in microarray analysis?

Methods for microarray data analysis

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis (genome annotation and promoter analysis)

Genomatix knowledge transfer approach

Page 4

Statistical analysis; clustering

What is the biological functionality behind the chip data?

PDGF stimulation of fibroblasts (Demoulin et al. JBC 279, No. 34, 2004; 35392–35402)

Microarray experiment

Evaluation of the role of PDGF in fibroblasts

A real life example

Chip data Cluster Genomatix

Evaluation of chip clusters

PDGF

Page 5

Genomatix Technology

Linking genomic sequence analysis and literature mining

Automatic evaluation of gene relationships

Promoter source for functional promoter analysis

Analysis of promoter sequences / database scans

Page 6

Analysis strategy

Workflow of the project

1 Find statistical clusters

2 Project statistical clusters onto biology and categorize results by z-scoring (BiblioSphere)

3 Analyze functional groups for co-regulation (ElDorado & GEMS) and find additional potentially co-regulated genes (ModelInspector)

4 Carry out additional statistical analysis

5 Merge results into biological context

Page 7

Step 1: Statistical Analysis

Methods for microarray data analysis

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis

Page 8

Cluster Analysis

Statistically analyzed microarray data

Time points: 1, 4, 10, 24 hours of PDGF induction

Significance Analysis of Microarrays (SAM; FDR: 4.3%)

105 of 9928 gene spots are significantly up-regulated (Chip: Hver1.2.1)
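A quick arithmetic check on the SAM numbers of this slide: with 105 significant spots and an estimated FDR of 4.3%, roughly 4 to 5 of the calls would be expected to be false positives. The sketch below only restates the slide's figures; the variable names are illustrative.

```python
significant_spots = 105   # up-regulated spots reported by SAM
total_spots = 9928        # gene spots on the Hver1.2.1 chip
fdr = 0.043               # false discovery rate stated on the slide

# Expected number of false positives among the significant calls.
expected_false_positives = significant_spots * fdr

# Share of all spots that passed the significance filter.
fraction_significant = significant_spots / total_spots

print(f"expected false positives: {expected_false_positives:.1f}")
print(f"fraction of spots called significant: {fraction_significant:.2%}")
```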

Page 9

Workflow

2 Project statistical clusters onto biology and categorize results by z-scoring (BiblioSphere)

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis

Page 10

Characterisation of experimental cluster with BiblioSphere

Gene Cluster

Cluster contains 107 genes

Too many genes for biologically meaningful co-regulation

Strategy: knowledge-driven sub-clustering

Find functional correlations

BiblioSphere: Large Cluster Query

Functional correlations are retrieved by categorization

Page 11

Knowledge-driven sub-clustering

Ontology based functional ranking: Genomatix z-scoring

highest z-score

Page 12

Knowledge-driven sub-clustering

Ontology based functional ranking: Genomatix z-scoring

retrieval of genes overrepresented in the GO-category sterol biosynthesis
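The slides do not spell out how the Genomatix z-score is computed. A common way to score over-representation of a category in a gene cluster is (observed - expected) / standard deviation under a hypergeometric null. The sketch below assumes that null. The function name, the category size and the background size are placeholders for illustration; only the 6 hits and the cluster size of 107 come from the slides.

```python
import math

def overrepresentation_z(hits, cluster_size, category_size, background_size):
    """z-score for observing `hits` category genes in a cluster.

    Assumes a hypergeometric null (an assumption of this sketch,
    not a statement about the BiblioSphere implementation).
    """
    p = category_size / background_size
    expected = cluster_size * p
    # Hypergeometric variance, including the finite-population correction.
    variance = (cluster_size * p * (1 - p)
                * (background_size - cluster_size) / (background_size - 1))
    return (hits - expected) / math.sqrt(variance)

# 6 of the 107 cluster genes fall into "sterol biosynthesis" (from the slides).
# category_size and background_size are placeholders, NOT real annotation counts.
z = overrepresentation_z(hits=6, cluster_size=107,
                         category_size=30, background_size=10000)
print(f"z = {z:.2f}")
```

A category is ranked higher the further its observed count lies above the count expected by chance, which is why a small category with six hits can reach the top of the list.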

Page 13

Gene group analysis

BiblioSphere subgroup analysis: connecting TFs

Re-enter the six overrepresented genes into BiblioSphere

Page 14

Knowledge-driven sub-clustering

Towards regulatory networks: connecting TFs

Co-citation of HMGCS1, HMGCR, SC4MOL, DHCR7 with SREBF1

BiblioSphere on sentence level; at least 4 co-citations with input genes

Prediction of SREBF1 (EBOX) binding sites in the promoters of HMGCS1, HMGCR and DHCR7

ElDorado

Page 15

SREBP1 (=SREBF1) expression is experimentally confirmed

Experimental verification

Page 16

Workflow

3 Analyze functional groups for co-regulation (Gene2Promoter & GEMS) and find additional potentially co-regulated genes (ModelInspector)

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis

Page 17

Sequence analysis

Promoter analysis by GEMS based on ElDorado data

Results from literature analysis are used to guide sequence analysis

Literature analysis Promoter analysis

GEMS

ElDorado + Gene2Promoter

Page 18

human

mouse

rat

Comparative genomics of promoters -> phylogenetic conservation

Comparative analysis of promoters within one species -> co-regulation

Sequence analysis

Analysis strategies: Inter-genomic and intra-genomic

107 genes

6 genes sterol synthesis

DHCR24 DHCR7 EBP HMGCR HMGCS1 SC4MOL

Page 19

Intra-genomic approach

Extraction of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL

ElDorado + Gene2Promoter

Analysis of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL with FrameWorker

GEMS

Comparative promoter analysis (intra-genomic co-regulation)

Frameworks underlie functional conservation of promoters

Page 20

Regulatory genome annotation

Promoter resource ElDorado / Gene2Promoter

ElDorado

Alternative promoters/transcripts

Interconnected to: BiblioSphere GEMS

Regulatory SNPs

Regulatory regions

promoter

Promoter modules

Page 21

Regulatory genome annotation

Promoter retrieval ElDorado / Gene2Promoter

Page 22

Regulatory genome annotation

Promoter retrieval ElDorado / Gene2Promoter

Page 23

Regulatory genome annotation

Promoter retrieval ElDorado / Gene2Promoter

Page 24

Analysis of promoter organization

Promoter analysis with FrameWorker

Page 25

EBOX ECAT ZBPF

Genes sharing framework: DHCR7, EBP, HMGCS1

EBOX (SREBF1) frameworks are found in a subset of the genes

Analysis of promoter organization

Frameworks are conserved in order and distance of TFBSs
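To make "conserved in order and distance" concrete, a framework can be pictured as an ordered list of TF families plus an allowed distance range between neighbours. The sketch below checks whether a promoter's annotated binding sites contain such a pattern. It is a simplified illustration, not the FrameWorker or ModelInspector algorithm: strand orientation and matrix similarity are ignored, and all positions and distance ranges are made up.

```python
from typing import List, Tuple

Site = Tuple[str, int]  # (TF family name, position in the promoter in bp)

def matches_framework(sites: List[Site],
                      families: List[str],
                      gaps: List[Tuple[int, int]]) -> bool:
    """True if `sites` contains `families` in the given order, with each
    consecutive pair separated by a distance inside the matching gap range."""
    sites = sorted(sites, key=lambda s: s[1])

    def extend(idx: int, last_pos: int) -> bool:
        # All framework elements placed: the promoter matches.
        if idx == len(families):
            return True
        lo, hi = gaps[idx - 1]
        for fam, pos in sites:
            if fam == families[idx] and lo <= pos - last_pos <= hi:
                if extend(idx + 1, pos):
                    return True
        return False

    return any(extend(1, pos) for fam, pos in sites if fam == families[0])

# Hypothetical framework definition and promoters (positions are invented).
framework = ["EBOX", "ECAT", "ZBPF"]
gaps = [(10, 30), (20, 60)]  # allowed bp between EBOX-ECAT and ECAT-ZBPF

promoter_a = [("EBOX", 100), ("ECAT", 120), ("ZBPF", 160)]  # right order and spacing
promoter_b = [("ECAT", 100), ("EBOX", 120), ("ZBPF", 160)]  # same sites, wrong order

print(matches_framework(promoter_a, framework, gaps))  # True
print(matches_framework(promoter_b, framework, gaps))  # False
```

The second promoter carries the same three site types but fails, which is the point of a framework: the mere presence of the binding sites is not enough.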

Page 26

Analysis of promoter organization

EBOX (SREBF1) frameworks are found in a subset of the genes

EBOX ECAT ZBPF

Page 27

Beyond the microarray

ModelInspector search

EBOX ECAT ZBPF framework

Genomatix Human promoter database GPD

Page 28

ModelInspector results

Results of database search

Framework: EBOX-ECAT-ZBPF

# of hits in human promoters: 10

Steroid biosynthesis: 3

z-score: 13.55

Highly selective model

No additional genes for steroid metabolism found so far

The selectivity is reduced by modifying the model: the distance variability is increased (application of FastM)

Page 29

modification of the model with FastM

Model modification

distance variability is increased to 5-100 bp

Page 30

additional ModelInspector search

Beyond the microarray

EBOX ECAT ZBPF

framework with modified distance variability

Genomatix Human promoter database GPD

Page 31

ModelInspector results

Results of database search

Framework: EBOX-ECAT-ZBPF

# of hits in human promoters: 389

Four categories related to “steroid metabolism”: 7

z-score: 4.43 - 6.35

Additional genes found related to steroid metabolism: LSS, MVK, SC5DL, SREBF2

LSS and MVK are present on the chip and up-regulated, but not statistically significant

SC5DL is not present on the microarray

Possibility to re-evaluate statistical results

Page 32

Re-analysis of promoter organization

Additional framework analysis

All sterol-metabolism-related genes identified by microarray analysis and ModelInspector are included: HMGCS1, MVK, SC5DL, DHCR7, EBP, SREBF2, LSS, HMGCR, SC4MOL, DHCR24

An additional framework consisting of three TFBSs is found

ECAT EGRF ZBPF

It matches 8 of the 10 input genes: HMGCS1, DHCR7, HMGCR, EBP, LSS, MVK, SC5DL, SREBF2

Page 33

Second framework is searched in human promoters by ModelInspector

Is the framework also part of other human promoters?

ECAT EGRF ZBPF

Genomatix Human promoter database GPD

Several frameworks may be important for sterol-related pathways/networks

Matches may overlap with first framework but are basically distinct

Beyond the microarray

Page 34

ModelInspector results

Results of second database search

Framework: EBOX-ECAT-ZBPF

# of hits in human promoters: 961

Four categories related to “steroid metabolism”: 16

z-score: 4.36 - 6.25

CYP46A1, FDPS, HMGCR, HSD17B8, OPRS1, SREBF1!, STARD5

SREBF1/2 are potential regulators of the previous framework!

SREBF1/2 may be mediators between the two frameworks identified so far

Page 35

Workflow

4 Carry out additional statistical analysis

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis

Page 36

Relaxed statistical approach

Clustering by profile of the initially selected 105 genes

Expression cluster is extended by Pavlidis Template Matching (PTM)

Cluster of 105 significantly regulated genes is taken as template

The threshold p-value is 0.1

Cluster is extended to 798 genes (including all 105 initial genes)

Relaxed statistics require cross-validation by second evidence

Initial profile

Profile cluster
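Template matching of this kind correlates every gene's expression profile with a template profile and keeps the genes whose correlation p-value is below the threshold. The sketch below assumes one value per time point (1, 4, 10, 24 h), which the slides do not state. For exactly four data points the two-sided p-value of Pearson's r is 1 - |r|, so a threshold of 0.1 corresponds to r >= 0.9. Selecting only positive correlations is also an assumption, and all profile values are invented.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def ptm_select(profiles, template, p_threshold=0.1):
    """Return genes that correlate positively with the template at p <= threshold.

    Uses the closed form p = 1 - |r|, which holds only for n = 4 data points
    (under the null, r is then uniformly distributed on [-1, 1]).
    """
    assert len(template) == 4, "closed-form p-value holds only for n = 4"
    selected = []
    for gene, profile in profiles.items():
        r = pearson(profile, template)
        p = 1.0 - abs(r)
        if r > 0 and p <= p_threshold:
            selected.append(gene)
    return selected

# Hypothetical log-ratios at 1, 4, 10 and 24 h; not values from the study.
template = [0.1, 0.8, 1.5, 1.2]       # e.g. mean profile of the initial cluster
profiles = {
    "gene_A": [0.0, 0.9, 1.6, 1.1],   # follows the template closely
    "gene_B": [1.0, 0.2, 0.1, 0.9],   # different shape
}
print(ptm_select(profiles, template))  # ['gene_A']
```

With so few data points a p-value of 0.1 is a loose filter, which is consistent with the slide's remark that the extended cluster needs a second line of evidence.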

Page 37

Workflow

5 Merge results into biological context

Statistical analysis

Cellular processes

Literature analysis

Sequence analysis

Page 38

Merging profile and database searches

Comparison of ModelInspector results with profile cluster

52 genes share a common framework and are co-expressed

8 genes belong to the GO-category "steroid biosynthesis": DHCR24, DHCR7, EBP, HMGCR, HMGCS1, LSS, MVK, SC4MOL

Eight genes associated with steroid metabolism are supported by three lines of evidence:

1. Common up-regulation

2. Common framework

3. Common functional class (GO-annotation)
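The merge step amounts to intersecting gene sets from the three lines of evidence. The sketch below uses toy subsets: the eight gene names and SC5DL come from the slides, while GENE_X and GENE_Y are placeholders, because the full lists (52 and 798 genes) are not given.

```python
steroid_genes = {"DHCR24", "DHCR7", "EBP", "HMGCR",
                 "HMGCS1", "LSS", "MVK", "SC4MOL"}

# Toy subsets for illustration only; GENE_X and GENE_Y are placeholders.
framework_hits = steroid_genes | {"SC5DL", "GENE_X"}      # promoters matching a framework
profile_cluster = steroid_genes | {"GENE_X", "GENE_Y"}    # co-expressed genes from PTM
go_steroid_biosynthesis = steroid_genes | {"SC5DL"}       # functional class (GO)

# Evidence 1 and 2: co-expressed genes that also share a framework.
co_expressed_with_framework = framework_hits & profile_cluster

# Evidence 3: restrict to the functional class.
supported_by_all_three = co_expressed_with_framework & go_steroid_biosynthesis

print(sorted(supported_by_all_three))
```

SC5DL drops out at the first intersection because it is not on the microarray and therefore cannot appear in the profile cluster.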

Page 39

Sterol biosynthesis and regulatory networks

ECAT EGRF ZBPF

EBOX ECAT ZBPF

Acetyl-CoA + Acetoacetyl-CoA

HMG-CoA

Mevalonate

Lanosterol

Cholesterol

Page 40

Confirmation of results by GNF tissue profiles

ECAT EGRF ZBPF

Example: profile of HMGCS1

Find correlates with cut-off 0.6

Page 41

Sterol biosynthesis and regulatory networks

ECAT EGRF ZBPF

EBOX ECAT ZBPF

GNF profile

Page 42

Additional gene group: Tubulins

Time points: 1, 4, 10, 24

CDEF EGRF MAZF

Page 43

Sterol biosynthesis / cell structure proteins and regulatory networks

ECAT EGRF ZBPF

EBOX ECAT ZBPF

CDEF EGRF MAZF

Page 44

Conclusions

Evaluation of microarray data

No individual method can reveal networks and pathway mechanisms

An alternating combinatorial approach can achieve this

Several independent functional groups may be derived from one chip

However, the final focus is usually on a few genes (30 or fewer)

All of this is possible based on available tools

Genomatix technology elucidates the biology behind the chip data!

Page 45

Let’s have a break…