© 2005 by Genomatix Software GmbH
Genomatix
Microarray Evaluation for
Gene Regulation Analysis
Dr. Martin Seifert
Genomatix Software GmbH
Landsberger Strasse 6, D-80339 München
http://www.genomatix.de
The general goal in microarray analysis
Biological functionality is not directly evident from microarrays
Classification / Diagnostics
Metabolic pathways
Regulatory networks
Disease mechanisms
Microarrays today
?
Cell
Microarray experiment
How to reach the general goal in microarray analysis?
Methods for microarray data analysis
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis (Genome annotation and promoter analysis)
Genomatix knowledge transfer approach
Statistical analysis; clustering
What is the biological functionality behind the chip data?
PDGF stimulation of fibroblasts (Demoulin et al. JBC 279, No. 34, 2004; 35392–35402)
Microarray experiment
Evaluation of the role of PDGF in fibroblasts
A real-life example
Chip data Cluster Genomatix
Evaluation of chip clusters
PDGF
Genomatix Technology
Linking genomic sequence analysis and literature mining
Automatic evaluation of gene relationships
Promoter source for functional promoter analysis
Analysis of promoter sequences / database scans
Genomatix Analysis strategy
1 Find statistical clusters
2 Project statistical clusters onto biology and categorize results by z-scoring (BiblioSphere)
3 Analyze functional groups for co-regulation (ElDorado & GEMS) and find additional potentially co-regulated genes (ModelInspector)
4 Carry out additional statistical analysis
5 Merge results into biological context
Workflow of the project
Analysis Strategy
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis
Step 1: Statistical Analysis
Methods for microarray data analysis
Genomatix Cluster Analysis
Significance Analysis of Microarrays (SAM; FDR: 4.3%)
105 of 9928 gene spots are significantly up-regulated (Chip: Hver1.2.1)
Time points: 1, 4, 10, and 24 hours of PDGF induction
Statistically analyzed microarray data
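To put the SAM numbers in perspective, the short calculation below is a sketch that uses only the three figures from the slide. It assumes the usual reading of the FDR as the expected share of false positives among the called spots; how SAM estimated the 4.3% is not shown in the slides.

```python
# Figures from the slide; reading FDR as "expected share of false
# positives among the called spots" is an assumption of this sketch.
total_spots = 9928
called_spots = 105
fdr = 0.043

fraction_called = called_spots / total_spots
expected_false_positives = called_spots * fdr

print(f"{fraction_called:.2%} of all spots are called significant")
print(f"about {expected_false_positives:.1f} of them are expected to be false positives")
```

So roughly one spot in a hundred is called, and about four or five of the 105 calls may be spurious.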
2 Project statistical clusters onto biology and categorization of results by z-scoring (BiblioSphere)
Workflow
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis
Cluster contains 107 genes
Too many genes for biologically meaningful co-regulation
Strategy: knowledge-driven sub-clustering
Find functional correlations
Gene Cluster
BiblioSphere: Large Cluster Query
Functional correlations are retrieved by categorization
Characterisation of experimental cluster with BiblioSphere
Knowledge-driven sub-clustering
Ontology-based functional ranking: Genomatix z-scoring
highest z-score
Retrieval of the genes from the overrepresented GO category "sterol biosynthesis"
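The slides do not give the formula behind the Genomatix z-score. As a rough illustration of the idea, the sketch below scores over-representation of a GO category in a cluster against a background, using the mean and variance of the hypergeometric distribution. The function name and the background numbers are assumptions made for this example; only the 6 of 107 genes come from the slides.

```python
from math import sqrt

def overrepresentation_z(k, n, K, N):
    """z-score for observing k category genes in a cluster of n genes,
    given K category genes among N background genes (hypothetical helper)."""
    p = K / N
    expected = n * p
    # Variance of the hypergeometric distribution (sampling without replacement).
    variance = n * p * (1 - p) * (N - n) / (N - 1)
    return (k - expected) / sqrt(variance)

# 6 of 107 cluster genes fall into the category (from the slides);
# 30 category genes among 20000 background genes is a made-up background.
z = overrepresentation_z(k=6, n=107, K=30, N=20000)
print(f"z = {z:.2f}")
```

A category whose observed count equals its expected count scores 0; the larger the z-score, the less likely the enrichment is due to chance.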
BiblioSphere subgroup analysis: connecting TFs
Re-enter the six overrepresented genes into BiblioSphere
Gene group analysis
Towards regulatory networks: connecting TFs
Knowledge-driven sub-clustering
Co-citation of HMGCS1, HMGCR, SC4MOL, and DHCR7 with SREBF1
BiblioSphere at sentence level; at least 4 co-citations with input genes
Prediction of SREBF1 (EBOX) binding sites in the promoters of HMGCS1, HMGCR, and DHCR7
ElDorado
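The sentence-level co-citation filter used on this slide can be sketched as follows. This is not BiblioSphere's implementation: the tokenizer, the function name, and the counting rule (one co-citation per sentence that names the candidate together with at least one input gene) are assumptions, and the sentences are invented toy text, not literature quotes.

```python
import re
from collections import Counter

def cocitation_counts(sentences, input_genes, candidates):
    """Count, per candidate, the sentences that also name an input gene."""
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[A-Za-z0-9]+", sentence.upper()))
        if tokens & input_genes:
            for candidate in candidates & tokens:
                counts[candidate] += 1
    return counts

# Invented toy sentences for illustration only.
sentences = [
    "SREBF1 and HMGCR appear in the same sentence.",
    "HMGCS1 is mentioned together with SREBF1.",
    "DHCR7 is mentioned together with SREBF1.",
    "SC4MOL is mentioned together with SREBF1.",
    "MYC and HMGCR appear in the same sentence.",
    "SREBF1 is mentioned alone here.",
]
input_genes = {"HMGCS1", "HMGCR", "SC4MOL", "DHCR7"}
candidates = {"SREBF1", "MYC"}

counts = cocitation_counts(sentences, input_genes, candidates)
connected = sorted(tf for tf, n in counts.items() if n >= 4)
print(connected)
```

With the threshold of 4 from the slide, only the candidate that is co-cited with the input genes in at least four sentences is kept.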
SREBP1 (=SREBF1) expression is experimentally confirmed
Experimental verification
3 Analyze functional groups for co-regulation (Gene2Promoter & GEMS) and find additional potentially co-regulated genes (ModelInspector)
Genomics subtitle
Workflow
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis
Genomatix Sequence analysis
Promoter analysis by GEMS based on ElDorado data
Results from literature analysis are used to guide sequence analysis
Literature analysis Promoter analysis
GEMS
ElDorado + Gene2Promoter
human
mouse
rat
Comparative genomics of promoters -> phylogenetic conservation
Comparative analysis of promoters within one species -> co-regulation
Sequence analysis
Analysis strategies: Inter-genomic and intra-genomic
107 genes
6 genes sterol synthesis
DHCR24 DHCR7 EBP HMGCR HMGCS1 SC4MOL
Genomatix Intra-genomic approach
Extraction of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL
ElDorado + Gene2Promoter
Analysis of the promoters of DHCR24, DHCR7, EBP, HMGCR, HMGCS1, and SC4MOL with FrameWorker
GEMS
Comparative promoter analysis (intra-genomic co-regulation)
Frameworks underlie functional conservation of promoters
Genomatix Regulatory genome annotation
Promoter resource ElDorado / Gene2Promoter
ElDorado
Alternative promoters/transcripts
Interconnected to: BiblioSphere GEMS
Regulatory SNPs
Regulatory regions
promoter
Promoter modules
Genomatix Regulatory genome annotation
Promoter retrieval ElDorado / Gene2Promoter
Genomatix Analysis of promoter organization
Promoter analysis with FrameWorker
EBOX ECAT ZBPF
Genes sharing framework: DHCR7, EBP, HMGCS1
EBOX (SREBF1) frameworks are found in a subset of the genes
Analysis of promoter organization
Frameworks are conserved in order and distance of TFBSs
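What "conserved in order and distance" means can be made concrete with a small matcher. The sketch below is not FrameWorker or ModelInspector: the data layout, the function name, and all positions and distance ranges are assumptions for illustration, and strand orientation is ignored.

```python
def matches_framework(sites, families, distance_ranges):
    """Check whether a promoter contains the TFBS families in the given order.

    sites: list of (family, position) pairs found in one promoter.
    families: ordered family names, e.g. ["EBOX", "ECAT", "ZBPF"].
    distance_ranges: (min_bp, max_bp) allowed between consecutive elements.
    """
    def extend(index, last_pos):
        if index == len(families):
            return True
        low, high = distance_ranges[index - 1]
        return any(
            fam == families[index]
            and low <= pos - last_pos <= high
            and extend(index + 1, pos)
            for fam, pos in sites
        )

    return any(fam == families[0] and extend(1, pos) for fam, pos in sites)

framework = ["EBOX", "ECAT", "ZBPF"]
strict = [(10, 30), (20, 40)]     # hypothetical distance ranges in bp
relaxed = [(5, 100), (5, 100)]    # wider ranges, cf. the 5-100 bp used later with FastM

promoter_a = [("EBOX", 100), ("ECAT", 120), ("ZBPF", 150)]  # right order and spacing
promoter_b = [("ECAT", 100), ("EBOX", 120), ("ZBPF", 150)]  # wrong order
promoter_c = [("EBOX", 100), ("ECAT", 190), ("ZBPF", 220)]  # right order, wide spacing

print(matches_framework(promoter_a, framework, strict))
print(matches_framework(promoter_b, framework, strict))
print(matches_framework(promoter_c, framework, strict))
print(matches_framework(promoter_c, framework, relaxed))
```

Tight distance ranges make the model selective; widening them lets more promoters match, which is the trade-off exploited when the model is modified later in the talk.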
ModelInspector search
Beyond the microarray
EBOX ECAT ZBPF
framework
Genomatix Human promoter database GPD
Framework | # of hits in human promoters | steroid biosynthesis | z-score
EBOX-ECAT-ZBPF | 10 | 3 | 13.55
ModelInspector results
Results of database search
Highly selective model
No additional genes for steroid metabolism found so far...
The selectivity is reduced by modifying the model: the distance variability is increased (application of FastM)
modification of the model with FastM
Model modification
distance variability is increased to 5-100 bp
additional ModelInspector search
Beyond the microarray
EBOX ECAT ZBPF
framework with modified distance variability
Genomatix Human promoter database GPD
ModelInspector results
Results of database search
Additional genes found that are related to steroid metabolism: LSS, MVK, SC5DL, SREBF2
Possibility to re-evaluate statistical results
Framework | # of hits in human promoters | four categories related to "steroid metabolism" | z-score
EBOX-ECAT-ZBPF | 389 | 7 | 4.43 - 6.35
LSS and MVK are present on the chip and up-regulated, but not statistically significant
SC5DL is not present on the microarray
Additional framework analysis
All sterol-metabolism-related genes identified by microarray analysis and ModelInspector are included: HMGCS1, MVK, SC5DL, DHCR7, EBP, SREBF2, LSS, HMGCR, SC4MOL, DHCR24
ECAT EGRF ZBPF
Re-analysis of promoter organization
An additional framework consisting of three TFBSs is found
It matches 8 of the 10 input genes: HMGCS1, DHCR7, HMGCR, EBP, LSS, MVK, SC5DL, SREBF2
Second framework is searched in human promoters by ModelInspector
Is the framework also part of other human promoters?
ECAT EGRF ZBPF
Genomatix Human promoter database GPD
Several frameworks may be important for sterol-related pathways/networks
Matches may overlap with first framework but are basically distinct
Beyond the microarray
CYP46A1, FDPS, HMGCR, HSD17B8, OPRS1, SREBF1!, STARD5
ModelInspector results
Results of second database search
SREBF1/2 are potential regulators of the previous framework!
SREBF1/2 may be mediators between the two frameworks identified so far
Framework | # of hits in human promoters | four categories related to "steroid metabolism" | z-score
EBOX-ECAT-ZBPF | 961 | 16 | 4.36 - 6.25
4 Carry out additional statistical analysis
Workflow
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis
Expression cluster is extended by Pavlidis Template Matching (PTM)
Cluster of 105 significantly regulated genes is taken as template
The threshold p-value is 0.1
Cluster is extended to 798 genes (including all 105 initial genes)
Relaxed statistics require cross-validation by a second line of evidence
Clustering by profile of the initially selected 105 genes
Relaxed statistical approach
Initial profile
Profile cluster
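The idea behind template matching can be sketched in a few lines: correlate every gene's time profile with a template profile and keep the genes whose correlation p-value stays below the threshold. The sketch below is not the PTM implementation used for the slides. It assumes a one-sided exact permutation p-value as a stand-in for the parametric p-value of the Pearson correlation, and the template and gene profiles are toy values over four time points.

```python
from itertools import permutations
from math import sqrt

def pearson(x, y):
    """Pearson correlation; returns 0.0 for a constant profile."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y) if sd_x and sd_y else 0.0

def permutation_p(template, profile):
    """Share of time-point orderings that correlate at least as well as observed."""
    observed = pearson(template, profile)
    shuffled = list(permutations(profile))
    hits = sum(pearson(template, s) >= observed - 1e-12 for s in shuffled)
    return hits / len(shuffled)

def template_match(template, profiles, threshold=0.1):
    """Names of genes whose profile matches the template below the threshold."""
    return sorted(
        gene for gene, profile in profiles.items()
        if permutation_p(template, profile) <= threshold
    )

# Toy template and profiles over four time points (e.g. 1, 4, 10, 24 h).
template = [0.0, 1.0, 2.0, 3.0]
profiles = {
    "gene_a": [0.0, 2.0, 4.0, 6.0],   # same shape as the template
    "gene_b": [1.0, 0.0, 1.0, 0.0],   # unrelated shape
    "gene_c": [3.0, 2.0, 1.0, 0.0],   # anti-correlated
}
print(template_match(template, profiles))
```

In practice the template would be derived from the 105 significant genes, for example as their mean profile. Enumerating all orderings is only feasible for short time series such as the four points used here.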
5 Merge results into biological context
Workflow
Statistical analysis
Cellular processes
Literature analysis
Sequence analysis
Comparison of ModelInspector results with profile cluster
52 genes share a common framework and are co-expressed
8 genes belong to the GO category "steroid biosynthesis": DHCR24, DHCR7, EBP, HMGCR, HMGCS1, LSS, MVK, SC4MOL
Eight genes associated with steroid metabolism are supported by three lines of evidence:
1. Common up-regulation
2. Common framework
3. Common functional class (GO annotation)
Merging profile and database searches
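Merging the three lines of evidence amounts to intersecting gene sets. The sketch below uses the eight gene names from the slide; the extra set members (SC5DL as a framework hit outside the profile cluster, and the `GENE_X` / `GENE_Y` placeholders) are assumptions added only to show that the intersection filters something out.

```python
steroid_genes = {"DHCR24", "DHCR7", "EBP", "HMGCR", "HMGCS1", "LSS", "MVK", "SC4MOL"}

# Illustrative inputs; memberships beyond the eight genes are placeholders.
framework_hits = steroid_genes | {"SC5DL", "GENE_X"}    # ModelInspector matches
profile_cluster = steroid_genes | {"GENE_X", "GENE_Y"}  # co-expressed genes (PTM)
go_steroid = steroid_genes | {"SC5DL"}                  # GO "steroid biosynthesis"

# Evidence 1 + 2: co-expressed and sharing a framework.
coexpressed_with_framework = framework_hits & profile_cluster
# Evidence 1 + 2 + 3: additionally in the same functional class.
supported = coexpressed_with_framework & go_steroid

print(sorted(supported))
```

A gene that lacks any one of the three lines of evidence, such as a framework hit that is not on the chip, drops out of the final set.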
Sterol biosynthesis and regulatory networks
ECAT EGRF ZBPF
EBOX ECAT ZBPF
Acetyl-CoA + Acetoacetyl-CoA -> HMG-CoA -> Mevalonate -> Lanosterol -> Cholesterol
Confirmation of results by GNF tissue profiles
ECAT EGRF ZBPF
Example: profile of HMGCS1
Find correlates with cut-off 0.6
Sterol biosynthesis and regulatory networks
ECAT EGRF ZBPF
EBOX ECAT ZBPF
GNF profile
Additional gene group: Tubulins
Expression profiles at 1, 4, 10, and 24 hours
CDEF EGRF MAZF
Sterol biosynthesis / cell structure proteins and regulatory networks
ECAT EGRF ZBPF
EBOX ECAT ZBPF
CDEF EGRF MAZF
Conclusions
Evaluation of microarray data
No individual method can reveal networks and pathway mechanisms
An alternating combinatorial approach can achieve this
Several independent functional groups may be derived from one chip
However, the final focus is usually on a few genes (usually 30 or fewer)
All of this is possible based on available tools
Genomatix technology elucidates the biology behind the chip data!
Let’s have a break…