Chris J. Sullivan, Ph.D. Department of Biological Sciences

Chris J. Sullivan, Ph.D. Department of Biological Sciences. Microarrays: Gene Expression Data to Biological Insight


Transcript of Chris J. Sullivan, Ph.D. Department of Biological Sciences

Page 1: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Chris J. Sullivan, Ph.D.
Department of Biological Sciences

Microarrays: Gene Expression Data to Biological Insight

Page 2: Chris J. Sullivan, Ph.D. Department of Biological Sciences

The Scientific Method

What comes before a testable hypothesis and experiments?

Observation

Microarrays - Global Gene Expression

Hypothesis Generation

Page 3: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Microarrays: tools for gene expression

A microarray is a solid support (such as a membrane or glass microscope slide) on which DNA of known sequence is deposited in a grid-like array.

RNA is isolated from matched samples of interest. The RNA is typically converted to cDNA, labeled with fluorescence (or radioactivity), then hybridized to microarrays in order to measure the expression levels of thousands of genes.

Page 4: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Advantages of microarray experiments

Fast: Data on 20,000-50,000 genes in days

Comprehensive: Entire genome represented on 1-2 chip(s)

Flexible: Countless organisms available; custom arrays can be made to represent genes of interest

Easy: You can submit RNA samples to a core facility for analysis

Cheap?: Chip set representing 47,000 genes for $350; robotic spotter/scanner costs $100,000. In-house is much cheaper, but time consuming.

Page 5: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Observation

Microarrays - Global Gene Expression

Hypothesis Generation

Generate hypotheses about the mechanisms underlying observed phenotypes (disease)

Ability to uncover unanticipated connections

Page 6: Chris J. Sullivan, Ph.D. Department of Biological Sciences

What can you do with information about the expression of tens of thousands of genes?

Examples?

• Breast cancer samples look the same in tissue appearance, so why do patients differ in survival?

• Genes involved in biological processes

• Genes involved in disease pathogenesis

• Pathways for drug targets; pathways targeted by drugs!

Page 7: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Disadvantages of microarray experiments

Cost: Many researchers can't afford to do appropriate controls, replicates

RNA significance: Do mRNA levels reflect protein expression?

Quality control*: Cross hybridization; imperfections on arrays leading to error

Difficulty of data analysis: statistics to evaluate in-house; repeatability by others?

*This is less of an issue as the technology matures and becomes more commonplace: use of commercial arrays

Page 8: Chris J. Sullivan, Ph.D. Department of Biological Sciences

GeneChip is a brand of microarray made by Affymetrix

A microarray is a tool to rapidly evaluate gene expression (mRNA level) for tens of thousands of genes in a sample

Rat GeneChip RAE 230A has over 15,000 genes and transcripts represented on the array

1.3cm x 1.3cm

Page 9: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 10: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 11: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 12: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 13: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 14: Chris J. Sullivan, Ph.D. Department of Biological Sciences

[Figure: array images for Control Sample #1 and Diabetic Sample #1; intensity scale from Low to High]

Page 15: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 1: Experimental design

[1] Biological samples: technical vs. biological replicates (technical: repetition of the same samples; biological: use of multiple biological sources)

[2] RNA extraction, conversion, labeling, hybridization

[3] Microarray platform (dual color or single color)

Pooling of samples and mRNAX

Page 16: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Dual color (two samples on one microarray)

Stages: Sample acquisition -> Data acquisition -> Data analysis -> Data confirmation (validation) -> Biological insight

Lab steps: RNA: purify, label; Microarray: hybridize, wash, image

Page 17: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Dual color: two samples, one microarray

15,000 gene cDNA microarray

WT ischemic tissue -> mRNA -> cDNA

KO ischemic tissue -> mRNA -> cDNA

green = mRNAs unique to WT ischemic tissue (+ FGF2)
red = mRNAs unique to KO ischemic tissue (- FGF2)
yellow = mRNAs present in both conditions
black = mRNAs absent from both conditions
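One way to read the color legend on this slide is as a two-channel comparison at each spot. The sketch below is a minimal illustration, assuming each spot yields one green and one red intensity. The function name, the background level of 100, and the 2-fold threshold are hypothetical choices, not values from the slides.

```python
import math


def spot_color(green, red, background=100.0, fold=2.0):
    """Classify one spot from its green and red channel intensities.

    background and fold are illustrative assumptions, not slide values.
    Returns (color, log2(red/green)); the ratio is None for black spots.
    """
    if green <= background and red <= background:
        # Neither sample produced signal above background.
        return "black", None
    # Clamp to 1.0 so a zero intensity cannot break the logarithm.
    log_ratio = math.log2(max(red, 1.0) / max(green, 1.0))
    threshold = math.log2(fold)
    if log_ratio >= threshold:
        return "red", log_ratio      # mostly the red-labeled sample
    if log_ratio <= -threshold:
        return "green", log_ratio    # mostly the green-labeled sample
    return "yellow", log_ratio       # similar signal in both samples


if __name__ == "__main__":
    # Hypothetical intensities for four spots: (green, red).
    for green, red in [(5000, 200), (200, 5000), (3000, 3200), (50, 40)]:
        print(green, red, spot_color(green, red)[0])
```

Working with the log2 ratio rather than the raw ratio keeps up- and down-regulation symmetric around zero.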

Page 18: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Single color (one sample on one microarray)

Stages: Sample acquisition -> Data acquisition -> Data analysis -> Data confirmation (validation) -> Biological insight

Lab steps: RNA: purify, label; Microarray: hybridize, wash, image

Page 19: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 2: RNA and sample preparation

For Affymetrix chips, you need total RNA (about 2-10 µg)

Confirm purity by running an agarose gel

Measure A260/A280 to confirm purity, quantity

“Garbage in = Garbage out” RNA quality is key!
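The absorbance check can be sketched as a small calculation. This is a minimal example, assuming the common convention that one A260 unit corresponds to roughly 40 µg/mL of RNA and that an A260/A280 ratio of about 1.8 to 2.1 indicates reasonably pure RNA. The function name, the purity window, and the sample readings are assumptions for illustration; only the 2 µg minimum comes from the slide.

```python
def rna_summary(a260, a280, dilution_factor=1.0, volume_ul=50.0):
    """Estimate RNA concentration, yield and purity from absorbance readings.

    Assumes 1 A260 unit ~ 40 ug/mL RNA (a common convention, not a slide value).
    """
    conc_ug_per_ml = a260 * 40.0 * dilution_factor
    total_ug = conc_ug_per_ml * volume_ul / 1000.0  # uL -> mL
    ratio = a260 / a280
    return {
        "conc_ug_per_ml": conc_ug_per_ml,
        "total_ug": total_ug,
        "a260_a280": ratio,
        # Purity window is an assumed rule of thumb.
        "looks_pure": 1.8 <= ratio <= 2.1,
        # The slide asks for about 2-10 ug of total RNA per chip.
        "enough_for_chip": total_ug >= 2.0,
    }


if __name__ == "__main__":
    # Hypothetical reading: 1:10 dilution, 50 uL of sample.
    print(rna_summary(a260=0.5, a280=0.25, dilution_factor=10, volume_ul=50))
```

A good ratio does not prove the RNA is intact, which is why the slide pairs this reading with a gel.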

Page 20: Chris J. Sullivan, Ph.D. Department of Biological Sciences

[Figure: RNA electropherogram, fluorescence vs. time (seconds), with the 18S and 28S peaks labeled; gel image alongside]

Baseline is relatively flat

Page 21: Chris J. Sullivan, Ph.D. Department of Biological Sciences

[Figure: RNA electropherograms (fluorescence vs. time in seconds, 18S and 28S peaks labeled) and gel lanes 1-5 for tissue and cells, arranged from Most Degraded to Most Intact]

Page 22: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 3: hybridization to DNA arrays

The array consists of cDNA or oligonucleotides

Oligonucleotides can be deposited by photolithography

The sample is converted to cRNA or cDNA

Hybridization for hours or overnight: the sample binds to complementary sequences on the microarray
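Binding to complementary sequences can be illustrated with a short sketch. It assumes a DNA alphabet (A, C, G, T) for both probe and target and only considers perfect matches, which is a simplification: labeled cRNA contains U rather than T, and real hybridization tolerates some mismatches. The function names and sequences are hypothetical.

```python
# Translation table for Watson-Crick base pairing (DNA alphabet assumed).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the strand that would pair with seq, read 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def can_hybridize(probe, target):
    """True if the target contains a perfect complement of the probe."""
    return reverse_complement(probe) in target.upper()


if __name__ == "__main__":
    probe = "ATGC"  # hypothetical probe; real probes are much longer
    print(reverse_complement(probe))
    print(can_hybridize(probe, "TTGCATTT"))
    print(can_hybridize(probe, "GGGGGGGG"))
```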

Page 23: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Steps for Microarray Experiment

Single color (one sample per microarray)

Total RNA -> processing, amplification and labeling of RNA samples -> cRNA

Page 24: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 4: Image analysis

mRNA expression levels are quantitated

Fluorescence intensity is measured with a scanner, or radioactivity with a phosphorimager

Page 25: Chris J. Sullivan, Ph.D. Department of Biological Sciences

[Figure: array images for Control Sample #1 and Diabetic Sample #1; intensity scale from Low to High]

Page 26: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 5: Data analysis

• What genes were expressed? (Present call)

• Differential gene expression? (ANOVA analysis)

• What are the relative differences in expression? (Ratio analysis)

• What are the criteria for statistical significance?

• Are there meaningful patterns in the data (such as groups)?
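The ANOVA and ratio questions above can be sketched for a single gene as follows. This is a minimal illustration using only the Python standard library: it computes the fold change between group means and the one-way ANOVA F statistic, and leaves the conversion of F to a p-value to a statistics package. The function names and the intensity values are hypothetical, not data from the study.

```python
from statistics import mean


def fold_change(control, treated):
    """Ratio of mean expression, treated over control."""
    return mean(treated) / mean(control)


def anova_f(*groups):
    """One-way ANOVA F statistic: between-group over within-group variance."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    k, n = len(groups), len(all_values)
    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = mean(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in g)
    # Mean squares use k - 1 and n - k degrees of freedom.
    return (ss_between / (k - 1)) / (ss_within / (n - k))


if __name__ == "__main__":
    # Hypothetical intensities, five biological replicates per group.
    control = [10, 12, 11, 9, 13]
    diabetic = [20, 22, 21, 19, 23]
    print("fold change:", fold_change(control, diabetic))
    print("F statistic:", anova_f(control, diabetic))
```

With two groups, the F statistic equals the square of the equal-variance t statistic, so ANOVA and a t-test answer the same question here.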

Page 27: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Microarray data analysis

preprocessing: global normalization, local normalization, scatter plots

inferential statistics: t-tests, ANOVA, ratio

exploratory statistics: clustering
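Global normalization can be sketched as rescaling every array to a common average intensity so that chips are comparable. The example below is one simple variant (mean scaling), offered as an assumption about what the slide has in mind; the function name, chip names, and intensities are hypothetical.

```python
from statistics import mean


def global_normalize(arrays, target=None):
    """Scale each array so its mean intensity equals a shared target.

    arrays maps a chip name to a list of probe intensities.
    If no target is given, the average of the chip means is used.
    """
    chip_means = {name: mean(values) for name, values in arrays.items()}
    if target is None:
        target = mean(list(chip_means.values()))
    return {
        name: [v * target / chip_means[name] for v in values]
        for name, values in arrays.items()
    }


if __name__ == "__main__":
    # Hypothetical chips: chip_b was scanned twice as bright overall.
    raw = {"chip_a": [100, 200, 300], "chip_b": [200, 400, 600]}
    print(global_normalize(raw))
```

After scaling, a difference between chips reflects a change in one gene relative to the rest, not an overall brightness difference.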

Page 28: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Rattus norvegicus Ceruloplasmin (ferroxidase) (Cp), mRNA.

[Figure: bar chart of average expression intensity (n=5, biological replicates), Control vs. Diabetic; y-axis from 0 to 2000]

ANOVA analysis, P = 0.00000566
Ratio analysis, fold change 4.3, upregulated in Diabetic group

Page 29: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Quantified Gene Expression

Differentially Expressed Genes (based on p-value and fold change)

Biological Interpretation (list of 529 "significant" genes)

• BLAST ESTs
• Gene Ontology
• Pathways (KEGG)
• Literature Mining (PubMatrix)
• Clustering, grouping

Page 30: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Unsupervised hierarchical clustering using expression values for ALL of the ~22,000 transcripts on the HG-U133A_2 GeneChip.

Clustering: Unique Expression Profiles

Molecular Phenotyping

Page 31: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Two-dimensional hierarchical clustering using complete linkage and Pearson correlation, restricted to genes with a comparison p-value cutoff of 0.01 between at least two groups.

Identifying Genes Selectively Expressed in a Group
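Complete linkage with a Pearson correlation metric can be sketched in a few lines. This is a minimal, one-dimensional illustration (clustering genes only), assuming distance is defined as 1 minus the Pearson correlation; the function names and expression profiles are hypothetical, and a real analysis would use a dedicated clustering package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def distance(x, y):
    """Correlation distance: 0 for identical shape, 2 for opposite shape."""
    return 1.0 - pearson(x, y)


def complete_link(cluster_a, cluster_b, profiles):
    """Complete linkage: the largest distance between any pair of members."""
    return max(distance(profiles[i], profiles[j])
               for i in cluster_a for j in cluster_b)


def cluster(profiles, n_clusters):
    """Merge the two closest clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        pairs = [(complete_link(clusters[i], clusters[j], profiles), i, j)
                 for i in range(len(clusters))
                 for j in range(i + 1, len(clusters))]
        _, i, j = min(pairs)
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # j > i, so index i is unaffected
    return clusters


if __name__ == "__main__":
    # Hypothetical profiles across four samples.
    profiles = {"g1": [1, 2, 3, 4], "g2": [2, 4, 6, 8],
                "g3": [4, 3, 2, 1], "g4": [8, 6, 4, 2]}
    print(cluster(profiles, 2))
```

Because the metric is correlation-based, g1 and g2 group together despite differing in magnitude: they share the same shape across samples.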

Page 32: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Matrix of genes versus samples

Metric (define distance)

Supervised, unsupervised analyses

• Clustering trees (hierarchical, k-means)
• Self-organizing maps
• Principal components analysis

Page 33: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 6: Confirmation and Validation

The differential up- or down-regulation of specific genes can be measured using independent assays such as

-- Northern blots (does anybody do these???)

-- Polymerase chain reaction (real-time RT-PCR)

-- In situ hybridization

-- Western blot

-- Immunohistochemistry
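For real-time RT-PCR, one widely used way to express a fold change is the 2^(-ddCt) calculation. The slides do not say which method was used, so the sketch below is an assumption: it presumes roughly 100% amplification efficiency and a stable reference gene, and the function name and Ct values are hypothetical.

```python
def ddct_fold_change(ct_target_treated, ct_ref_treated,
                     ct_target_control, ct_ref_control):
    """Relative expression of a target gene, treated vs. control.

    Each Ct is the cycle at which fluorescence crosses the threshold.
    Assumes the amount of product doubles every cycle.
    """
    # Normalize the target gene to the reference gene within each sample.
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    # A lower Ct means more starting template, hence the negative exponent.
    return 2.0 ** -(delta_treated - delta_control)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses 3 cycles earlier when treated.
    print(ddct_fold_change(23.0, 18.0, 26.0, 18.0))
```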

Page 34: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Stage 7: Microarray databases

There are two main repositories:

Gene Expression Omnibus (GEO) at NCBI

ArrayExpress at the European Bioinformatics Institute (EBI)

Page 35: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 36: Chris J. Sullivan, Ph.D. Department of Biological Sciences
Page 37: Chris J. Sullivan, Ph.D. Department of Biological Sciences

http://www.dnachip.org

Page 38: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Microarray Analysis of Diabetes-Induced Erectile Dysfunction in the Rat

Page 39: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Experimental Design

Control Group (n=5)

Diabetic Group (n=5): STZ. A single injection of streptozotocin causes loss of insulin-producing beta cells in the pancreas

12 weeks of diabetes

Physiology to confirm ED

Tissue harvest for gene expression (microarrays)

Page 40: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Steps for Microarray Experiment

Single color (one sample per microarray)

Total RNA -> processing, amplification and labeling of RNA samples -> cRNA

Page 41: Chris J. Sullivan, Ph.D. Department of Biological Sciences

(continued)

Labeled RNA sample (cRNA) -> into GeneChip (microarray): hybridization -> scanning and imaging the GeneChip -> quantification of gene expression for each chip

Page 42: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Making Meaning of Array Data

Data filtered using a p-value cutoff of 0.01 and at least 1.5 fold change in expression

622 genes differentially expressed, Control vs. Diabetic
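The filter on this slide can be sketched as a simple predicate. The example assumes the cutoff is inclusive (p at or below 0.01) and that a 1.5 fold change counts in either direction, up or down; both are assumptions, as is the function name. The first test case uses the ceruloplasmin values reported elsewhere in the deck; the others are hypothetical.

```python
def is_significant(p_value, fold_change, p_cutoff=0.01, min_fold=1.5):
    """Apply a p-value cutoff together with a minimum fold change.

    fold_change is treated / control and must be positive.
    A value of 0.5 counts as a 2-fold decrease.
    """
    magnitude = fold_change if fold_change >= 1.0 else 1.0 / fold_change
    return p_value <= p_cutoff and magnitude >= min_fold


if __name__ == "__main__":
    candidates = [
        ("Cp", 0.00000566, 4.3),      # values reported in this deck
        ("gene_down", 0.005, 0.5),    # hypothetical: 2-fold down
        ("gene_small", 0.005, 1.2),   # hypothetical: change too small
        ("gene_noisy", 0.2, 3.0),     # hypothetical: not significant
    ]
    kept = [name for name, p, fc in candidates if is_significant(p, fc)]
    print(kept)
```

Requiring both criteria removes genes with large but unreliable changes as well as genes with reliable but tiny changes.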

Page 43: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Quantified Gene Expression

Differentially Expressed Genes (based on p-value and fold change)

Biological Interpretation (list of 529 "significant" genes)

• BLAST ESTs
• Gene Ontology
• Pathways (KEGG)
• Literature Mining (PubMatrix)

Page 44: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Literature Mining with PubMatrix

529 differentially expressed genes, Control vs. Diabetic

Gene names or symbols: presenilin-2; prostatic steroid binding protein C1; prostatic steroid binding protein 1; protease, serine, 11; phosphoribosyl pyrophosphate synthetase 1; protein kinase C-eta; protein kinase C, alpha; protein phosphatase 1, regulatory (inhibitor) subunit 1A; putative protein phosphatase 1 nuclear targeting subunit; pleiomorphic adenoma gene-like 1; phospholipase A2, group 5; phospholipase A2, group IIA (platelets, synovial fluid); protein kinase inhibitor, alpha; protein kinase inhibitor, alpha; phosphoglycerate mutase 2; profilin II; period homolog 2; pyruvate dehydrogenase kinase 4; phosphodiesterase 4A; programmed cell death 6 interacting protein; phosphorylase B kinase alpha subunit; phosphorylase B kinase alpha subunit; PAK-interacting exchange factor beta; pregnancy-induced growth inhibitor; O linked N-acetylglucosamine transferase; ornithine decarboxylase 1; NTE-related protein; NAD(P)H dehydrogenase, quinone 1; nerve growth factor, gamma; myosin, heavy polypeptide 9; myosin, heavy polypeptide 8, skeletal muscle, perinatal; MYB binding protein 1a; mitochondrial ribosomal protein S18A; matrix metalloproteinase 3; membrane metallo endopeptidase; malonyl-CoA decarboxylase; MARCKS-like protein; MIRO2 protein; MIPP65 protein; microsomal glutathione S-transferase 1; monocarboxylate transporter; methionine adenosyltransferase I, alpha; mannose-binding protein associated serine protease-1; mitogen-activated protein kinase 6; mitogen-activated protein kinase 12; mitogen-activated protein kinase kinase 6; mal, T-cell differentiation protein 2; MAD homolog 3 (Drosophila); LRP16 protein; leukemia/lymphoma related factor; lipoprotein lipase; lysyl oxidase; lysyl oxidase; sperm membrane protein (YWK-II); hypothetical protein; hypothetical protein LK44

Various search terms of interest: Diabetes; Endothelial; Smooth Muscle; Vascular; Aorta; Blood vessel; vasodilation; Endothelial Dysfunction

http://pubmatrix.grc.nia.nih.gov

Automated online search tool to query 100 search terms by 10 modifier terms in the PubMed database (National Library of Medicine)
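The pairwise idea behind this kind of search can be sketched as building one query per (gene, modifier) combination. The example below only constructs query strings and performs no network access. It is not PubMatrix's actual implementation: the function name and the quoted "AND" query form are assumptions, while the 100 by 10 limits come from the slide.

```python
from itertools import product


def build_queries(search_terms, modifier_terms, max_search=100, max_modifiers=10):
    """Return a dict mapping (search term, modifier term) to a query string.

    The limits mirror the 100 by 10 matrix described on the slide.
    """
    if len(search_terms) > max_search:
        raise ValueError("too many search terms")
    if len(modifier_terms) > max_modifiers:
        raise ValueError("too many modifier terms")
    return {
        (s, m): f'"{s}" AND "{m}"'
        for s, m in product(search_terms, modifier_terms)
    }


if __name__ == "__main__":
    genes = ["lysyl oxidase", "profilin II"]
    modifiers = ["Diabetes", "Aorta"]
    for pair, query in build_queries(genes, modifiers).items():
        print(pair, "->", query)
```

Counting the hits returned for each cell of this matrix shows at a glance which genes already have literature tying them to a term of interest.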

Page 45: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Rattus norvegicus Ceruloplasmin (ferroxidase) (Cp), mRNA.

[Figure: bar chart of average expression intensity (n=5, biological replicates), Control vs. Diabetic; y-axis from 0 to 2000]

ANOVA analysis, P = 0.00000566
Ratio analysis, fold change 4.3, upregulated in Diabetic group

Page 46: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Ceruloplasmin splice variants upregulated in diabetes

Array: 4.3 fold, 1.9 fold
PCR: 16 fold, 2.4 fold

ABI systems real time PCR

Page 47: Chris J. Sullivan, Ph.D. Department of Biological Sciences

What about humans with diabetes? Is Cp upregulated?

Cp expression based on PCR using human erectile tissue from diabetic patients versus healthy brain-dead organ donors

2 fold upregulation

Page 48: Chris J. Sullivan, Ph.D. Department of Biological Sciences

Hypothesis: Ceruloplasmin contributes to the pathogenesis of diabetic ED: vascular dysfunction

Wildtype mice (Cp +/+) vs. ceruloplasmin knockout mice (Cp -/-)

Give mice diabetes

Prediction: Lack of ceruloplasmin will be protective (reduced or no diabetic ED in knockout mice)