Genetic Regulatory Networks and microarrays Julin Maloof Plant Biology


Central Dogma

[Figure: central dogma diagram, with the final product labeled (protein)]

How many genes?

Image credits: wikimedia.org/Botaurus stellaris; wikimedia.org/Alberto Salguero; wikimedia.org/Luc Viatour

Questions

• 5,000-50,000 genes

– What does each gene do?

– How do they interact? (what kind of network do they form)

• Traditional approaches: gene-by-gene

– not practical for large genomes

– 30-75% of genes have no obvious effect when mutated

• Why no obvious effect?

– most genes do nothing?

– lab vs. real world?

– robust networks?

Gene regulatory networks

• Can think of genes as being linked through regulatory networks

[Diagram: transcription factor 1, transcription factor 2, transcription factor 3, RNAse, downstream effectors]

• How can this network be discerned?

– One way is by looking at gene expression

Microarrays: genome-wide transcript levels

• tens of thousands of genes

• would like to be able to examine all at once

• microarrays make this possible

• take advantage of specificity of DNA hybridization to have a unique “probe” for each gene

• use photo-lithography for manufacturing

The Dimensions of a GeneChip

• Wafer: 5″ × 5″

• Chip: 1.28 cm × 1.28 cm, up to ~6,500,000 features / chip

• Feature: 5 µm × 5 µm, millions of identical probes / feature

http://www.affymetrix.com/corporate/outreach/lesson_plan/educator_resources.affx

Hybridization of matching DNA or RNA sequences

• Complementary DNA:DNA or DNA:RNA sequences hybridize

CTAAGAGC
GATTCTCG

Base pairs: C:G  T:A  A:T  A:T  G:C  A:T  G:C  C:G
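The pairing rule on this slide can be checked mechanically. Below is a minimal Python sketch, assuming the two strands are written so that they pair position by position, as on the slide. The names `complement` and `hybridizes` are illustrative and do not come from the slides.

```python
# Watson-Crick pairing for DNA:DNA hybridization.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}

def complement(seq):
    """Return the strand that pairs position by position with seq."""
    return "".join(PAIRS[base] for base in seq)

def hybridizes(probe, target):
    """True if every position of target pairs with the probe."""
    return len(probe) == len(target) and complement(probe) == target

print(complement("CTAAGAGC"))              # GATTCTCG
print(hybridizes("CTAAGAGC", "GATTCTCG"))  # True
print(hybridizes("CTAAGAGC", "GATTCTCA"))  # False (one mismatch)
```

Real hybridization tolerates some mismatches depending on conditions; the all-or-nothing test here is a simplification.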


Hybridization

[Figure labels: Probe Array; Tagged RNA Target; Hybridized Array; Fluorescent Stain; Detect]

Details of a gene chip


Hybridization of RNA to probes on chip


Specificity of hybridization


Specificity of hybridization


Fluorescence indicates amount of RNA in sample

Expression array after scanning


How can microarrays help us build GRNs?

• Co-expression or Relevance Network

– measure gene expression across multiple samples

• after perturbation

• time course

• different individuals

• mutants

– Create correlation matrix

– Edges connect genes with correlation > threshold

Good Review: Markowetz, F. & Spang, R. Inferring cellular networks--a review. BMC Bioinformatics 8 Suppl 6, S5 (2007).
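The recipe above (measure expression across samples, build a correlation matrix, keep edges above a threshold) can be sketched in a few lines of Python with NumPy. The data are synthetic and the threshold of 0.8 is an arbitrary assumption. The function name `coexpression_edges` is illustrative.

```python
import numpy as np

def coexpression_edges(expr, genes, threshold=0.8):
    """expr: samples x genes array. Returns the correlation matrix and the
    gene pairs whose correlation exceeds the threshold."""
    corr = np.corrcoef(expr, rowvar=False)   # genes x genes
    edges = []
    for i in range(len(genes)):
        for j in range(i + 1, len(genes)):
            if corr[i, j] > threshold:
                edges.append((genes[i], genes[j]))
    return corr, edges

# Synthetic data (assumption): B tracks A closely, C is unrelated noise.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = a + 0.2 * rng.normal(size=200)
c = rng.normal(size=200)
corr, edges = coexpression_edges(np.column_stack([a, b, c]), ["A", "B", "C"])
print(edges)
```

Using the absolute value of the correlation instead would also link strongly anti-correlated genes. Which is appropriate depends on the question being asked.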

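co-expression network

[Figure: lower-triangular correlation matrix for genes A, B, C, D, E and the resulting network with nodes A, B, C, D, E. Row values as extracted: B .9; C .2 .8; D .6 .7 .7; E .7 .3 .3 .1]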

co-expression network: limitations

• Which connections are direct?

[Network diagram: nodes A, B, C, D, E]

Alternative: Graphical Gaussian Models (GGMs)

• Make edges based on partial correlations

• correlations between each gene pair are conditioned on all other genes in network

• challenges:

– number of genes >> samples

– sparse regression, other

[Network diagram: nodes A, B, C, D, E]

1. Schäfer, J. & Strimmer, K. An empirical Bayes approach to inferring large-scale gene association networks. Bioinformatics 21, 754-64 (2005).

2. Opgen-Rhein, R. & Strimmer, K. From correlation to causation networks: a simple approximate learning algorithm and its application to high-dimensional plant gene expression data. BMC Syst Biol 1, 37 (2007).
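To illustrate why partial correlations help separate direct from indirect links, here is a minimal Python sketch on a synthetic chain A -> B -> C. It computes partial correlations from the inverse covariance matrix, which only works when samples outnumber genes. It does not implement the shrinkage estimator of the cited papers, and the function name `partial_correlations` is illustrative.

```python
import numpy as np

def partial_correlations(expr):
    """expr: samples x genes. Partial correlation of each gene pair,
    conditioned on all other genes, from the inverse covariance matrix."""
    precision = np.linalg.pinv(np.cov(expr, rowvar=False))
    d = np.sqrt(np.diag(precision))
    pcor = -precision / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

# Synthetic chain A -> B -> C (assumption): A and C are linked only through B.
rng = np.random.default_rng(1)
a = rng.normal(size=2000)
b = a + 0.5 * rng.normal(size=2000)
c = b + 0.5 * rng.normal(size=2000)
expr = np.column_stack([a, b, c])

corr = np.corrcoef(expr, rowvar=False)
pcor = partial_correlations(expr)
print("corr(A, C):        ", round(corr[0, 2], 2))
print("partial corr(A, C):", round(pcor[0, 2], 2))
```

In this toy case the plain correlation between A and C is high, while their partial correlation is close to zero, so a GGM would drop the A to C edge and keep A to B and B to C.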

Adding direction

• It is important to have directional edges

• one approach is to use time-series data

• Examine the (partial) correlation between

– each gene at t=i and the matrix of gene expression at t=i-1, i-2…

– use a vector autoregressive (VAR) process

• If genes B and C at t=2 are correlated with gene A at t=1, then we can infer direction

[Network diagram: nodes A, B, C, D, E]

1. Opgen-Rhein, R. & Strimmer, K. Learning causal networks from systems biology time course data: an effective model selection procedure for the vector autoregressive process. BMC Bioinformatics 8 Suppl 2, S3 (2007).

2. Jie Peng, UC Davis.
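The time-lag idea can be sketched with a first-order VAR model fitted by ordinary least squares. This is a simplified illustration on a long synthetic time course, not the model selection procedure of the cited paper. Real time courses are short and need regularization. The function name `fit_var1` and all coefficients are assumptions.

```python
import numpy as np

def fit_var1(series):
    """series: timepoints x genes. Least-squares fit of x[t] = x[t-1] @ coef.
    coef[i, j] estimates the effect of gene i at t-1 on gene j at t."""
    coef, *_ = np.linalg.lstsq(series[:-1], series[1:], rcond=None)
    return coef

# Synthetic time course (assumption): gene A drives gene B with a one-step lag.
rng = np.random.default_rng(2)
n_steps = 1000
x = np.zeros((n_steps, 2))            # column 0 = A, column 1 = B
for t in range(1, n_steps):
    x[t, 0] = 0.5 * x[t - 1, 0] + rng.normal()
    x[t, 1] = 0.8 * x[t - 1, 0] + rng.normal()

coef = fit_var1(x)
print(np.round(coef, 2))
```

A large `coef[0, 1]` together with a near-zero `coef[1, 0]` supports the direction A -> B rather than B -> A.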

Adding direction

• An alternative approach is experimental:

– determine which promoters are being bound by transcription factors

www.chiponchip.org

Adding direction

• If gene A is a transcription factor and binds to the promoters of B and C, then we can add direction to the edges

[Network diagram: nodes A, B, C, D, E]
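As a small illustration of how binding data orient edges, the Python sketch below takes undirected edges and a set of (transcription factor, bound promoter) pairs. Both the edge list and the binding set are hypothetical. Only the gene labels are borrowed from the slide.

```python
# Undirected co-expression edges (hypothetical, using the slide's gene labels).
undirected = [("A", "B"), ("A", "C"), ("B", "D"), ("C", "E")]

# Hypothetical binding results: (transcription factor, gene whose promoter is bound).
bound = {("A", "B"), ("A", "C")}

def orient(edges, binding):
    """Turn an undirected edge into regulator -> target when binding supports it."""
    directed, unresolved = [], []
    for u, v in edges:
        if (u, v) in binding:
            directed.append((u, v))
        elif (v, u) in binding:
            directed.append((v, u))
        else:
            unresolved.append((u, v))
    return directed, unresolved

directed, unresolved = orient(undirected, bound)
print(directed)    # [('A', 'B'), ('A', 'C')]
print(unresolved)  # [('B', 'D'), ('C', 'E')]
```

Edges without binding evidence stay undirected and need another source of direction, such as time-series or genetic data.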

Adding direction--using genetics

• If the genotype at gene A correlates with expression of gene B, then we can add direction.

• an extension of “genetical genomics”

[Network diagram: nodes A, B, C, D, E]

Genotype at A:    red   blue  red   red   blue  red
Expression of B:  100   600   225   150   500   200

Review: Li J, Burmeister M. Genetical genomics: combining genetics with gene expression analysis. Hum Mol Genet. 2005 Oct 15;14 Spec No. 2:R163-9
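The slide's example can be worked through directly: group the expression of gene B by the genotype at gene A and compare the group means. This Python sketch uses the six values shown on the slide, paired in the order they appear. The function name is illustrative, and a real analysis would add a statistical test.

```python
# Values from the slide: genotype at gene A and expression of gene B per individual.
genotype_a = ["red", "blue", "red", "red", "blue", "red"]
expression_b = [100, 600, 225, 150, 500, 200]

def mean_expression_by_genotype(genotypes, expression):
    """Average expression within each genotype class."""
    groups = {}
    for g, e in zip(genotypes, expression):
        groups.setdefault(g, []).append(e)
    return {g: sum(values) / len(values) for g, values in groups.items()}

means = mean_expression_by_genotype(genotype_a, expression_b)
print(means)  # {'red': 168.75, 'blue': 550.0}
```

Because the genotype at A is fixed before expression is measured, a difference like this points from A to B rather than the reverse.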

Example: Flower Development

Flower Development

• How well do we understand the GRN controlling flowers?

• How robust is the GRN to perturbation?

Flower development model

• Base topology on prior data

• Develop set of logical rules for interactions

Flower development model

• Model reproduces experimental gene expression profiles

• Model is robust to starting conditions

• Model is robust to many single parameter changes

• Model predicts novel interactions which can be experimentally tested.
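A logical-rule model of this kind can be explored by enumerating every starting state and recording where each one ends up. The Python sketch below does this for a toy three-gene Boolean network with synchronous updates. The genes and rules are hypothetical and are not those of the flower development model.

```python
from itertools import product

# Hypothetical logical rules for a toy three-gene network (not the flower model).
RULES = {
    "A": lambda s: not s["B"],   # B represses A
    "B": lambda s: not s["A"],   # A represses B
    "C": lambda s: s["A"],       # A activates C
}
GENES = tuple(RULES)

def step(state):
    """Apply all rules at once (synchronous update)."""
    s = dict(zip(GENES, state))
    return tuple(bool(RULES[g](s)) for g in GENES)

def attractor(state):
    """Iterate until a state repeats; return the set of states in the cycle."""
    seen = []
    while state not in seen:
        seen.append(state)
        state = step(state)
    return frozenset(seen[seen.index(state):])

def all_attractors():
    """Map each attractor to the list of starting states that reach it."""
    basins = {}
    for start in product([False, True], repeat=len(GENES)):
        basins.setdefault(attractor(start), []).append(start)
    return basins

basins = all_attractors()
for states, starts in basins.items():
    print(sorted(states), "reached from", len(starts), "starting states")
```

Robustness to starting conditions can then be judged by how many of the starting states fall into the attractors that match observed expression profiles.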

Example: Circadian rhythms

• Most organisms display circadian rhythms

– controlled by biological clocks with ~24 hour periodicity.

• How are biological clocks constructed?

rhythmic gene expression

Simple clock model

• does not fit data

clock model 2

• For each edge include ordinary differential equations (ODEs) that describe reaction rates

• estimate ODE parameters by solving for best fit to experimental data.

• run simulation to compare model to reality
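The three steps above (write ODEs, fit parameters, simulate and compare) can be sketched as follows. The model is a hypothetical one-loop negative feedback circuit integrated with a simple Euler scheme, the "experimental" data are synthetic, and the fit is a grid search over a single decay rate. None of the equations or parameter values come from the clock models in the slides.

```python
import numpy as np

def simulate(decay, k=2.0, hill=4, dt=0.05, n_steps=400):
    """Euler integration of a hypothetical one-loop clock:
    dm/dt = k / (1 + p**hill) - decay * m   (protein p represses its own mRNA m)
    dp/dt = m - decay * p
    Returns the mRNA trajectory."""
    m, p = 1.0, 0.0
    trace = np.empty(n_steps)
    for i in range(n_steps):
        trace[i] = m
        dm = k / (1.0 + p**hill) - decay * m
        dp = m - decay * p
        m, p = m + dt * dm, p + dt * dp
    return trace

# Stand-in for experimental data (assumption): the model at decay = 0.5 plus noise.
rng = np.random.default_rng(3)
observed = simulate(0.5) + 0.01 * rng.normal(size=400)

# Estimate the parameter by least squares over a grid of candidate values.
candidates = np.linspace(0.2, 1.0, 9)
errors = [np.sum((simulate(d) - observed) ** 2) for d in candidates]
best = float(candidates[int(np.argmin(errors))])
print("best-fit decay rate:", round(best, 2))
```

With real data, a poor best-fit error is itself informative: it suggests the network topology is wrong, not only the parameter values.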

single loop model does not fit data

[Figure: simulated (dashed) and experimental (solid) data]

• rhythms persist in lhy mutant

two-loop model comes closer

Light = Food

Animals can walk to better foraging

Plants must grow to better foraging

As a consequence, plant development is extensively regulated by the environment

Growth regulated by shade

“sun” vs. “shade” (foraging)

Maloof lab, unpublished

Plant growth is regulated by light

light dark

photo courtesy of Xing-Wang Deng

A complex network regulates growth

[Diagram: network of light, the clock, brassinosteroid, ethylene, auxin, and GA regulating growth]

Nozue, K., Maloof, J.N. (2006) Diurnal regulation of plant growth. Plant Cell Environ 29, 396-408.

Light regulated growth

How is growth controlled?

• Hypothesis: transcriptional regulation is involved in gating of dark-induced elongation.

• Experiment: Expression profiling to find genes whose expression correlates with growth.

Nozue K, Covington MF, Duek PD, Lorrain S, Fankhauser C, Harmer SL, Maloof JN. (2007) Rhythmic growth explained by coincidence between internal and external cues. Nature 448, 358-361.

838 genes are up-regulated in growth phase

Top 10

– cycling Dof-factor 3 (CDF3)

– hydrolase

– gigantea (GI)

– salt-tolerance protein (STO)

– phytochrome-interacting factor 4 (PIF4)

– phytochrome-interacting factor 5 (PIF5)

– expressed protein

– dentin

– GH3-3

– expressed protein

Nozue K, Covington MF, Duek PD, Lorrain S, Fankhauser C, Harmer SL, Maloof JN. (2007) Rhythmic growth explained by coincidence between internal and external cues. Nature 448, 358-361.

visualization of network

[Figure: network with PIL6 and PIF4 labeled; UPG in green, UPNG in magenta]

Nozue and Maloof, unpublished

PIF4/5 regulated growth network

[Figure: PIF4/5 UP vs. PIF4/5 DOWN; labeled groups: growth hormone enriched, cell wall]

Nozue and Maloof, unpublished