Identification of Circadian Clock Genes by datamining Microarray data
Introduction Promoter Analysis Identifying genes: Methods Identifying genes: Results
The Identification of Circadian Clock Genes By Data Mining Microarray Data
Atreyi Banerjee and Martin Hunt
The University of Leicester
June 27, 2008
Outline
• Introduction
• How to find circadian clock genes
• Promoter Analysis
What is circadian rhythm?
• Circadian: circa (about) + dies (a day)
• A circadian rhythm is the self-sustained cycle with an approximately 24-hour period that controls rest/activity, time awareness, photosynthesis, etc.
• Common among eukaryotes (Neurospora, Drosophila, mammals)
• Reserved for living organisms (daily traffic congestion is not a circadian rhythm)
• Circannual: 1-year period (e.g. migration)
Circadian rhythm properties
• Circadian rhythm properties are conserved across the plant and animal kingdoms
• Basic properties of a circadian rhythm: an endogenous free-running period of about 24 hours; synchronization (entrainment) by external stimuli such as light; a period that is unchanged with temperature
• Advantage: learn from studying simple organisms (Drosophila, Neurospora, mouse)
• Mechanisms are similar but the genes are different
• The main cycling genes: PER, TIM, CLK, CYC, BMAL
Drosophila
• Affymetrix gene chip (Drosgenome 1) assay
• Identifying circadian genes
• Clustering and heatmap
• Promoter analysis
Drosophila circadian oscillator
Circadian clock control in Drosophila
ADD REFERENCE
Experiments
• Drosophila were entrained in a 12:12 hour light-dark (LD) cycle
• They were then left in complete darkness and analysed every 4 hours
• The final dataset included replicates of 4 chips at CT0, CT4, CT8, CT12, CT16 and CT20
Promoter analysis
• Aim: to detect genes having the same regulatory mechanism
• Extracting the 5’ untranslated region of the genes
• Finding the over-represented motifs in the sequences
• Finding the cis-regulatory modules (combinations of binding sites) in sets of co-expressed or co-regulated genes
• Getting the putative transcription factor binding sites (TFBS)
• Functional analysis
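As a rough illustration of the over-representation step, the sketch below counts how many sequences in a cluster contain a motif and compares that count with a background set using a hypergeometric upper-tail probability. The function names and the toy sequences are hypothetical, and this is not the statistic that TOUCAN itself computes. The E-box CACGTG is used only as a familiar example of a clock-related motif.

```python
import math


def count_with_motif(sequences, motif):
    """Number of sequences that contain the motif at least once (case-insensitive)."""
    motif = motif.upper()
    return sum(1 for seq in sequences if motif in seq.upper())


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n sequences are drawn from N, of which K contain the motif."""
    upper = min(n, K)
    hits = sum(math.comb(K, i) * math.comb(N - K, n - i)
               for i in range(k, upper + 1))
    return hits / math.comb(N, n)


if __name__ == "__main__":
    # Toy sequences invented for illustration; real input would be promoter
    # or 5' UTR sequences for a cluster and for a background gene set.
    cluster = ["TTCACGTGAA", "GGCACGTGCC", "ATATATATAT"]
    background = cluster + ["ACGTACGTAC", "TTTTGGGGCC", "CACGTGTTTT", "GGGGGGGGGG"]
    motif = "CACGTG"
    k = count_with_motif(cluster, motif)
    K = count_with_motif(background, motif)
    p = hypergeom_upper_tail(k, len(cluster), K, len(background))
    print(f"{k} of {len(cluster)} cluster sequences contain {motif}; p = {p:.3f}")
```

CACGTG is its own reverse complement, so scanning one strand is enough here; for a non-palindromic motif both strands would need to be checked.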
Effects of clock mutations on enhancers regulating circadian gene expression
Stempfl, T. et al. Genetics 2002;160:571-593
TOUCAN software
• An interactive Java display
• Maps genes onto the sequence set space
• Flexibility to use any identifier (Affy ID, EMBL, RefSeq, etc.)
• Performs statistical tests for finding regulatory sequences, selecting parts of sequences, and finding CpG islands in metazoan genomes
Predict instances of known motifs with MotifScanner
The significant motifs found in each cluster
Predict cis-regulatory modules with MotifSampler
The co-expression of Dorsal 2 and Myf showing
The cis-regulatory modules in each cluster
The cis-regulatory module in genes listed with p-values
Genscan output of cluster 1
List of unknown TFBS found in each cluster
de novo discovery of unknown TFBS
• The MotifSampler tool in TOUCAN was used to find unknown motifs, which could be novel transcription factor binding sites
• The 5’ UTR sequences were also extracted from Ensembl BioMart
• The over-represented TFBS were extracted from MATCH and OTFBS
• Dorsal 2 and Myf were over-represented modules
• ARNT, which is also found in cycle, an important clock gene, was located
• Genscan predicted genes in each cluster
Identifying circadian genes: an outline
Microarray experiment
↓
Data (spreadsheet)
↓
Process data in R
↓
Data analysis in R
↓
List of circadian genes
Identifying circadian genes: an outline
Four methods considered, all of which were implemented in R:
GeneCycle based
• The Fisher Method (Wichert et al. 2004)
• The Robust Method (Ahdesmaki et al. 2005)
“Sine wave” based
• The M&R Method (McDonald & Rosbash 2001)
• The Sine Method
The Fisher Method
Implemented by the R package GeneCycle, based on Fourier methods and Fisher’s g test
Time Series:
CT0 = 1.2
CT4 = 4.9
CT8 = 9.5
CT12 = 0.4
CT16 = 1.5
CT20 = −42
→ Fisher’s g test → p-value = 0.3213
Repeat this process for each time series
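To make the idea concrete, here is a minimal sketch of Fisher's g test for a single time series: compute the periodogram, take the largest ordinate as a fraction of the total, and evaluate the exact tail probability under a Gaussian white-noise null. This is not the GeneCycle code. The function names are hypothetical, the choice of Fourier frequencies k = 1 .. floor((n-1)/2) is an assumption, and the toy series is invented, so the sketch makes no attempt to reproduce the p-value shown on the slide.

```python
import cmath
import math


def periodogram(series):
    """Squared DFT magnitudes at Fourier frequencies k = 1 .. floor((n-1)/2)."""
    n = len(series)
    q = (n - 1) // 2
    powers = []
    for k in range(1, q + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(series))
        powers.append(abs(coeff) ** 2)
    return powers


def fisher_g_test(series):
    """Return (g, p_value); the null hypothesis is Gaussian white noise."""
    powers = periodogram(series)
    q = len(powers)
    total = sum(powers)
    if q == 0 or total == 0.0:
        raise ValueError("series is too short or has no variation")
    # g is the share of the total periodogram held by its largest ordinate.
    g = min(max(powers) / total, 1.0)
    # Exact tail probability: sum_{j=1}^{min(q, floor(1/g))}
    #   (-1)^(j-1) * C(q, j) * (1 - j*g)^(q-1)
    upper = min(q, int(1.0 / g))
    p_value = sum((-1) ** (j - 1) * math.comb(q, j) * (1.0 - j * g) ** (q - 1)
                  for j in range(1, upper + 1))
    return g, min(max(p_value, 0.0), 1.0)


if __name__ == "__main__":
    # Invented expression values for CT0, CT4, ..., CT20.
    toy_series = [1.0, 2.0, 0.5, 3.0, 1.5, 2.5]
    g, p = fisher_g_test(toy_series)
    print(f"g = {g:.3f}, p-value = {p:.3f}")
```

With only six time points there are just two usable Fourier frequencies under this convention, so g is at least 0.5 and the p-value reduces to 2(1 - g). Short series therefore give the test little power.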
The Fisher Method: FDR
Oops! We’ve carried out over 6000 tests, so some small p-values are expected by chance alone.
The solution: false discovery rate (FDR) control, implemented by the R package fdrtool
Definition
The FDR is the proportion of false positives we expect to find among the results we call significant
0.011, 0.021, 0.042, 0.045, 0.056, 0.065, 0.066, . . .
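For intuition about what FDR control does to a list of p-values, the sketch below applies the Benjamini-Hochberg adjustment. This is a simpler, swapped-in stand-in: fdrtool uses its own estimation approach, which is not reproduced here. The function name and the example p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


if __name__ == "__main__":
    # Invented p-values, one per gene.
    toy_p = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(toy_p, benjamini_hochberg(toy_p)):
        print(f"p = {p:.3f}  adjusted = {q:.3f}")
```

Calling a gene circadian when its adjusted value falls below, say, 0.05 aims to keep the expected share of false positives in the resulting list at or below 5%.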
The Robust Method
Also implemented by the R package GeneCycle
The M&R Method
The Sine Method
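The slide gives only the method's name, so the following is an assumption about what a sine-wave approach could look like rather than the authors' implementation: a least-squares fit of a 24-hour sinusoid to one gene's time series, reporting amplitude, peak time and the fraction of variance explained. The function name is hypothetical, and the shortcut formulas assume equally spaced samples covering whole periods (as with CT0 to CT20 every 4 hours), which makes the cosine and sine regressors orthogonal.

```python
import math

PERIOD_HOURS = 24.0


def fit_sine(times, values, period=PERIOD_HOURS):
    """Fit mean + amplitude * cos(w * (t - peak_time)) by least squares.

    Returns (amplitude, peak_time_hours, r_squared). Assumes equally spaced
    samples spanning whole periods, with at least three samples per period.
    """
    n = len(values)
    mean = sum(values) / n
    w = 2.0 * math.pi / period
    # Projections onto the cosine and sine regressors.
    a = 2.0 / n * sum((v - mean) * math.cos(w * t) for t, v in zip(times, values))
    b = 2.0 / n * sum((v - mean) * math.sin(w * t) for t, v in zip(times, values))
    amplitude = math.hypot(a, b)
    peak_time = (math.atan2(b, a) / w) % period
    total_ss = sum((v - mean) ** 2 for v in values)
    # Variance explained by the fitted sinusoid.
    r_squared = (n / 2.0) * amplitude ** 2 / total_ss if total_ss > 0 else 0.0
    return amplitude, peak_time, r_squared


if __name__ == "__main__":
    times = [0, 4, 8, 12, 16, 20]            # circadian time in hours
    values = [1.0, 2.0, 0.5, 3.0, 1.5, 2.5]  # invented expression values
    amp, peak, r2 = fit_sine(times, values)
    print(f"amplitude = {amp:.2f}, peak at CT{peak:.1f}, R^2 = {r2:.2f}")
```

A gene could then be flagged as rhythmic when the variance explained exceeds some threshold; how to choose that threshold, for example by permuting the time points, is left open here.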
Heatmap: The Fisher Method
heatmap of Fisher method
Heatmap: The Robust Method
The Numbers
How many genes are in common between methods, etc.
Fisher vs Sine Methods
What’s so different about them?
Conclusions
• Why only use sine waves as a model?
• Is FDR really better than multiple testing?
• Why use GeneCycle?
Conclusions
• All methods find some circadian clock genes
• . . . and some false positives
• Best approach: use many methods
• There is always a new, better method around the corner . . .