Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance in Plants. Ruth Grene Alscher and Lenwood S. Heath, Virginia Tech, December 14, 2001.


Page 1: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance in Plants

Ruth Grene Alscher

Lenwood S. Heath

Virginia Tech

December 14, 2001

Page 2: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Overview

• Organization of our group

• About environmental stress and reactive oxygen species (ROS)

• Plant responses to ROS

• Analysis of responses to stress on a chip: microarray technology

• Expresso: management system for microarrays

– Managing expression experiments

– Analyzing expression data

– Reaching conclusions

• Where do we go from here?

Page 3: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Boris Chevone

Ron Sederoff (NCSU)

Dawei Chen

Ruth Alscher, Lenny Heath, Naren Ramakrishnan, Keying Ye

Len van Zyl (NCSU)

Carol Loopstra (Texas A and M)

Jonathan Watkinson

Margaret Ellis

Logan Hanks

Senior Collaborators

Students: VT

Cecilia Vasquez

Page 4: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Iterative strategy for detection of stress-mediated effects on gene expression using microarrays and CS expertise

[Diagram with steps numbered 1 to 4; box labels: Detection of stress-mediated gene expression effects on microarrays; Computational tools to infer interaction among genes, pathways; Genetic Regulatory Networks; Test inferences with varying conditions and genotypes; Revised / New Tools and Experiments]

Page 5: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Expresso

Page 6: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Plant Response to Stress

• Plants adapt to changing environmental conditions through global cellular responses involving successive changes in, and interactions among, expression patterns of numerous genes.

• Our group studies these changes through a combination of bioinformatics and genomic techniques.

Page 7: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Long Term Goals

• Biological: To identify molecular stress resistance mechanisms in tree and crop species.

• Bioinformatic: To support iterative experimentation in plant genomics, capture and analyze experimental data, integrate biological information from diverse sources, and close the experimental loop.

Page 8: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

The Paradox of Aerobiosis

• Oxygen is essential, but toxic.

• Aerobic cells face constant danger from reactive oxygen species (ROS).

• ROS can act as mutagens, cause lipid peroxidation, and denature proteins.

Page 9: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

ROS Arise as a Result of Exposure to:

• Ozone

• Sulfur dioxide

• High light

• Herbicides

• Extremes of temperature

• Salinity

• Drought

Page 10: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Free Radicals

Page 11: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Responses to Environmental Signals

Page 12: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Redox Regulation of Cellular Systems

Membrane Receptors

Environmental Stress

Metabolite Defense

Protein kinases; phosphatases

Transcription factors

Gene Expression

Defense, Repair, Apoptosis

Prooxidants (ROS)

Antioxidants

Page 13: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Scenarios for Effects of Abiotic Stress on Gene Expression in Plants

Page 14: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Drought Stress Responses in Loblolly Pine: Questions to be Addressed

• Can a hierarchy of drought stress resistance mechanisms be identified?

• Can a clear distinction be made between rapidly responding and long term adaptational mechanisms?

• Can particular subgroups within gene families be associated with drought tolerance?

Page 15: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Hypotheses

• There is a group of genes whose expression confers resistance to drought stress.

• Based on previous work, increased expression of defense genes is co-regulated and is correlated with resistance to oxidative stress. Failure to cope is correlated with little or no defense gene activation. Candidate resistance genes follow this pattern of expression.

• A common core of defense genes exists, which responds to several different stresses.

Page 16: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Components of Stress Study

Pine Drought Stress Experiments

Expresso Prototype

Design and Print Microarrays

Select Pine cDNAs: 384 (1999), 2400 (2001)

Design Functional Hierarchy

Capture Spot Intensities

Integrate and Analyze

Inductive Logic Programming (ILP)

Page 17: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Imposition of Successive Cycles of Mild or Severe Drought Stress on 1-year-old Loblolly Pine Seedlings

[Figure: two panels plotting water potential (bars; axis ticks 0, -2, -10, -15) against days. Panel "Cycles of Mild Drought Stress": four dry-down and recovery cycles, with RNA Harvests I to IV. Panel "Cycles of Severe Drought Stress": Cycles I to III, each a dry-down followed by recovery, with RNA Harvests I to III. Water is withheld during each dry-down and given during each recovery. A legend symbol marks PS (photosynthesis).]

Page 18: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Categories within Protective and Protected Processes

[Diagram: functional hierarchy; node labels: Plant Growth Regulation; Environmental Change; Gene Expression; Signal Transduction; Protective Processes; Protected Processes; ROS and Stress; Cell Wall Related; Phenylpropanoid Pathway; Development; Metabolism; Chloroplast Associated; Carbon Metabolism; Respiration and Nucleic Acids; Mitochondrion; Cells; Tissues; Cytoskeleton; Secretion; Trafficking; Nucleus; Protease-associated]

Page 19: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Categories within “Protective Processes”

[Diagram: functional hierarchy under Protective Processes; node labels: Stress; Cell Wall Related; Phenylpropanoid Pathway; Abiotic; Biotic; Antioxidant Processes; Drought; Heat; Non-Plant; Xenobiotics; NADPH/Ascorbate/Glutathione Scavenging Pathway; Cytosolic ascorbate peroxidase; Dehydrins, Aquaporins; Heat shock proteins (Chaperones); superoxide dismutase-Fe; superoxide dismutase-Cu-Zn; glutathione reductase; Sucrose Metabolism; Cellulose; Arabinogalactan proteins; Hemicellulose; Pectins; Xylose; Other Cell Wall Proteins; isoflavone reductases; phenylalanine ammonia-lyases; S-adenosylmethionine decarboxylases; glycine hydroxymethyltransferases; Lignin Biosynthesis; CCoAOMTs; 4-coumarate-CoA ligases; cinnamyl-alcohol dehydrogenase; Chaperones; “Isoflavone Reductases”; GSTs; Extensins and proline rich proteins]

Page 20: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Hypotheses versus Results – 1999 Experiment

• Hypothesis: Among the genes responding positively to mild stress, there exists a population of genes whose expression is negative or unchanged under severe stress. These are candidate stress resistance genes.

• Result: Genes in 69 categories (e.g., HSP70s and HSP100s, aquaporins, but not HSP80s) responded positively to mild stress. The effect of severe stress was not detectable or negative.

Page 21: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Hypotheses versus Results – 1999 Experiment

Genes associated with other stresses responded to drought stress

– Isoflavone reductase homologs and GSTs responded positively to mild drought stress.

– These categories were previously documented to respond to biotic stress and xenobiotics, respectively.

– However, both isoflavone reductase homologs and GSTs also responded positively to severe drought stress. Thus, they do not fall into the category of candidate stress resistance genes.

Page 22: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Candidate Categories

• Include

– Aquaporins

– Dehydrins

– Heat shock proteins/chaperones

• Exclude

– Isoflavone reductases

Page 23: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Flow of a Microarray Experiment

Hypotheses

Select cDNAs

PCR

Test of Hypotheses

Extract RNA

Replication and Randomization

Reverse Transcription and Fluorescent Labeling

Robotic Printing

Hybridization

Identify Spots

Intensities

Statistics

Clustering

Data Mining, ILP

Page 24: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Design of Microarrays I: Randomization

• Selected 384 archived ESTs

• Organized into four 96-well microtitre source plates after PCR

• Pipetted into 8 sets of four randomized microtitre plates

• Each set is a different randomized arrangement of the 384 ESTs
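
The randomization step on this slide can be sketched in Python. This is a minimal illustration under stated assumptions, not the group's actual plate-layout procedure: the EST identifiers, the function name `randomized_plate_sets`, and the fixed seed are all hypothetical.

```python
import random

N_ESTS = 384          # archived ESTs selected for the array
WELLS_PER_PLATE = 96  # wells in one microtitre plate
N_SETS = 8            # independently randomized sets of four plates


def randomized_plate_sets(est_ids, n_sets=N_SETS, wells=WELLS_PER_PLATE, seed=0):
    """Return n_sets layouts; each layout is a list of plates (lists of EST ids)."""
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the layout reproducible
    layouts = []
    for _ in range(n_sets):
        order = list(est_ids)
        rng.shuffle(order)  # a fresh random arrangement of all ESTs
        # Cut the shuffled order into consecutive plates of `wells` ESTs each.
        plates = [order[i:i + wells] for i in range(0, len(order), wells)]
        layouts.append(plates)
    return layouts


est_ids = [f"EST{i:03d}" for i in range(N_ESTS)]  # hypothetical identifiers
layouts = randomized_plate_sets(est_ids)
print(len(layouts), "sets of", len(layouts[0]), "plates")
```

Each set contains every EST exactly once, so any positional bias on a plate is spread across different ESTs from set to set.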

Page 25: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Design of Microarrays II: Replication

• Printed type A microarrays from first four sets (16 plates); printed type B microarrays from second four sets

• Each array type has four replicates of each EST, randomly placed

• Each comparison was performed with four different hybridizations, with dyes reversed in two

• Total of 16 replicates of each EST in each comparison
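
The replicate count and the effect of reversing the dyes can be checked with a short Python sketch. It assumes that in an unswapped hybridization the treated sample is labeled with Cy5, that a base-2 logarithm is used, and that the last two hybridizations are the swapped ones; the names `hybridizations` and `oriented_log_ratio` are hypothetical.

```python
import math

REPLICATE_SPOTS_PER_ARRAY = 4  # four randomly placed copies of each EST per array type

# Hypothetical design table: which two of the four hybridizations are
# dye-swapped is an assumption made for illustration.
hybridizations = [
    {"name": "hyb1", "dye_swapped": False},
    {"name": "hyb2", "dye_swapped": False},
    {"name": "hyb3", "dye_swapped": True},
    {"name": "hyb4", "dye_swapped": True},
]


def oriented_log_ratio(cy5, cy3, dye_swapped):
    """Return log2(treated / control) regardless of which dye carried the treated sample."""
    log_ratio = math.log2(cy5 / cy3)
    # In a dye-swapped hybridization the treated sample is on Cy3, so flip the sign.
    return -log_ratio if dye_swapped else log_ratio


replicates_per_comparison = REPLICATE_SPOTS_PER_ARRAY * len(hybridizations)
print(replicates_per_comparison)  # 4 spots x 4 hybridizations
```

Orienting every ratio as treated over control before pooling lets the 16 replicates be compared directly, while dye-specific bias tends to cancel between the swapped and unswapped pairs.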

Page 26: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Spot and Clone Analysis

• Image Analysis: gridding, spot identification, intensity and background calculation, normalization

• Statistics:

– Fold or ratio estimation

– Combining replicates

• Higher-level Analysis:

– Clustering methods

– Inductive logic programming (ILP)

Page 27: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Spot Identification and Intensity Analysis

• Microarray Suite: Manual grid; extract intensities for each spot; compute ratios; compute calibrated ratios

• Spot Statistics:

– Each calibrated ratio is the uncalibrated ratio divided by the mean of all the uncalibrated ratios, so the mean of the calibrated ratios is 1.0

– Our tools use the logarithm of each calibrated ratio

– Positive: expression increase

– Negative: expression decrease

– Zero: no change in expression
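
The calibration and log transform described on this slide can be written out in a few lines of Python. This is a sketch, not Microarray Suite's implementation: the function names are hypothetical, the input ratios are invented, and the base-2 logarithm is an assumption (the slide does not state the base).

```python
import math


def calibrate(ratios):
    """Divide each uncalibrated ratio by the mean of all uncalibrated ratios."""
    mean_ratio = sum(ratios) / len(ratios)
    return [r / mean_ratio for r in ratios]


def log_calibrated(ratios):
    """Log of each calibrated ratio: >0 increase, <0 decrease, 0 no change."""
    return [math.log2(r) for r in calibrate(ratios)]


raw = [0.5, 1.0, 1.5, 3.0]  # invented uncalibrated ratios
cal = calibrate(raw)
logs = log_calibrated(raw)
print(sum(cal) / len(cal))  # mean of the calibrated ratios
print(logs)
```

Because every ratio is divided by the same mean, calibration shifts all log ratios by a constant and leaves their differences unchanged.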

Page 28: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Analysis of Expression Data

• The multiple (typically 16) log calibrated ratios for a replicated clone do NOT follow a normal distribution.

• Distribution is spread relatively evenly over a large range.

• Statistical analysis based on mean and standard deviation will be overly pessimistic in identifying clones that are up- or down-expressed.

• From the observation of an even spread of the log ratios, we assume that a clone whose expression does not differ between the two members of a probe pair will show a distribution centered at a mean log ratio of 0.0.

Page 29: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Computational Methods: Alternate Assumptions

• Our more general assumption avoids the trap of having to classify the response of each SPOT; rather, we classify the response of an EST as one of

– Up-regulated

– Down-regulated

– No clear change

• Response CLASSIFICATION rather than QUANTIFICATION allows us to develop unified relationships among genes and among treatments.

• Provides sufficient results for the use of inductive logic programming (ILP).
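
One way to turn an EST's replicated log ratios into these three classes is a sign test, which uses only the direction of each replicate and makes no normality assumption. The slides do not state the rule the group actually used, so the Python sketch below substitutes a plain sign test as an illustrative stand-in; the function names and the 0.05 threshold are assumptions.

```python
from math import comb


def upper_tail(k, n):
    """P(X >= k) for X ~ Binomial(n, 1/2)."""
    return sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n


def classify(log_ratios, alpha=0.05):
    """Classify one EST from the signs of its replicate log ratios."""
    nonzero = [x for x in log_ratios if x != 0]  # ties carry no direction
    n = len(nonzero)
    if n == 0:
        return "no clear change"
    positives = sum(1 for x in nonzero if x > 0)
    negatives = n - positives
    # If expression is unchanged, positive and negative signs are equally likely.
    if upper_tail(positives, n) < alpha:
        return "up-regulated"
    if upper_tail(negatives, n) < alpha:
        return "down-regulated"
    return "no clear change"


print(classify([0.4] * 14 + [-0.2] * 2))  # 14 of 16 replicates positive
print(classify([0.3] * 9 + [-0.3] * 7))   # signs nearly balanced
```

The output is a categorical label per EST and comparison, which is the form of fact that a rule learner such as ILP consumes.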

Page 30: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Data Mining: Inductive Logic Programming

• ILP is a data mining algorithm expressly designed for inferring relationships.

• By expressing relationships as rules, it provides new information and resultant testable hypotheses.

• ILP groups related data and chooses in favor of relationships having short descriptions.

• ILP can also flexibly incorporate a priori biological knowledge (e.g., categories and alternate classifications).

Page 31: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Rule Inference in ILP

• Infers rules relating gene expression levels to categories, both within a probe pair and across probe pairs, without explicit direction

• Example Rule: [Rule 142] [Pos cover = 69 Neg cover = 3]

level(A,moist_vs_severe,not positive) :- level(A,moist_vs_mild,positive).

• Interpretation: “If the moist versus mild stress comparison was positive for some clone named A, it was negative or unchanged in the moist versus severe comparison for A, with a confidence of 95.8%.”
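
The 95.8% confidence quoted for Rule 142 is consistent with positive cover divided by total cover, 69 / (69 + 3). The Python sketch below shows how cover and confidence could be counted for such a rule. The clone records are synthetic, built only to reproduce the reported cover counts (the 40/29 split between "negative" and "unchanged" is invented), and the function names are hypothetical.

```python
def rule_cover(clones, body, head):
    """Count clones that satisfy the rule body, split by whether the head also holds."""
    covered = [c for c in clones if body(c)]
    pos = sum(1 for c in covered if head(c))
    return pos, len(covered) - pos


def confidence(pos, neg):
    """Fraction of covered clones for which the rule's conclusion is true."""
    return pos / (pos + neg)


def body(clone):  # level(A, moist_vs_mild, positive)
    return clone["moist_vs_mild"] == "positive"


def head(clone):  # level(A, moist_vs_severe, not positive)
    return clone["moist_vs_severe"] != "positive"


# Synthetic records: 69 clones obey the rule, 3 violate it, 10 are not covered.
clones = (
    [{"moist_vs_mild": "positive", "moist_vs_severe": "negative"}] * 40
    + [{"moist_vs_mild": "positive", "moist_vs_severe": "unchanged"}] * 29
    + [{"moist_vs_mild": "positive", "moist_vs_severe": "positive"}] * 3
    + [{"moist_vs_mild": "unchanged", "moist_vs_severe": "unchanged"}] * 10
)

pos, neg = rule_cover(clones, body, head)
print(pos, neg, f"{confidence(pos, neg):.1%}")
```

Clones that do not satisfy the rule body count toward neither cover, which is why the ten "unchanged" records leave the confidence untouched.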

Page 32: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

ILP subsumes two forms of reasoning

• Unsupervised learning

– “Find clusters of genes that have similar/consistent expression patterns”

• Supervised learning

– “Find a relationship between a priori functional categories and gene expression”

• Hybrid reasoning: Information Integration

– “Is there a relationship between genes in a given functional category and genes in a particular expression cluster?”

– ILP mines this information in a single step

Page 33: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

NSF-Supported Work of 2001: Expresso Progress to Date

Margaret Ellis and Logan Hanks (computer science graduate students):

• MEL: Semistructured data model for experiment capture

• Parsing: Automatic parser generators to drive archival storage

• Database: Loading and cataloging MEL data in a Postgres RDBMS

• Pipeline: Linkages to data analysis and data mining software

Page 34: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Imposition of Successive Cycles of Mild or Severe Drought Stress on 1-year-old Loblolly Pine Seedlings

[Figure: two panels plotting water potential (bars; axis ticks 0, -2, -10, -15) against days. Panel "Cycles of Mild Drought Stress": four dry-down and recovery cycles, with RNA Harvests I to IV. Panel "Cycles of Severe Drought Stress": Cycles I to III, each a dry-down followed by recovery, with RNA Harvests I to III. Water is withheld during each dry-down and given during each recovery. A legend symbol marks PS (photosynthesis).]

Page 35: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Final Harvest; Control versus Mild Stress; 2001

[Figure: Cy3 TIFF image and Cy5 TIFF image of the microarray, annotated "Replication" and "Differential Expression"]

Page 36: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Final Harvest; Control versus Mild Stress; 2001

Cy5 to Cy3 ratios. Final harvest after four drought cycles. RNA harvested 24 hours after final watering.

Cy5 = treated; Cy3 = control.

Aquaporins responded positively. HSP80s were unaffected (same as in 1999 results).

Page 37: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Drought Stress Responses in Loblolly Pine: Questions to be Addressed

• Can a hierarchy of drought stress resistance mechanisms be identified?

• Can a clear distinction be made between rapidly responding and long term adaptational mechanisms?

• Can particular subgroups within gene families be associated with drought tolerance?

Page 38: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Proposed Project: 2002-2005

Plant Biology (with co-PIs: Ron Sederoff, NCSU; Carol Loopstra, TAMU)

• An investigation of drought stress responses in loblolly pine in a variety of provenances.

• Quantitative RT-PCR to confirm and expand results obtained with microarrays.

• In situ hybridization to stressed and unstressed cell and tissue types.

Page 39: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Proposed Project: 2002-2005

Sources of cDNAs for 2002-2005 arrays

• NCSU ESTs selected on the basis of function.

• Stressed cDNA libraries from roots and stems of drought-tolerant families from East Texas and Lost Pines, and from the Atlantic Coastal Plain (humid conditions).

• Homologs of drought-responsive Arabidopsis genes.

Page 40: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Drought Stress Responses in Loblolly Pine: Future Bioinformatics Goals

• Support incorporation of biological information in the form of functional hierarchies and gene families.

• Close the computational and experimental loop to support iterative experimental regimes.

• Integrate information from multiple experiments involving multiple provenances, drought stresses, and EST sets.

Page 41: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Gene Discovery in the Arabidopsis Transcriptome

[Flow diagram; box labels: Drought Stress (short and long term); Hybridize to Arabidopsis Transcriptome; Scanning, Image Processing; Data Capture; Postgres Database; Database Queries; Statistical Analysis and Clustering; Data Mining, ILP; Possible Identification of Novel Drought Responsive Genes in Arabidopsis]

Page 42: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Identification of Drought Responsive Genes and Pathways Across Provenances in Loblolly Pine

[Flow diagram; box labels: Arabidopsis Drought Responsive Genes; Select Pine cDNAs Via Contigs; Robotic Replication and Printing; Drought Stress Experiments on NC, TX Pine; Hybridization; Scanning, Image Processing; Data Capture; Postgres Database; Database Queries; Statistical Analysis and Clustering; Data Mining, ILP; Identification of Drought Responsive Pine Genes; Close The Loop]

Page 43: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Proposed Project: 2002-2005

Bioinformatics I (Alscher, Heath, Ramakrishnan)

• Constraint-based selection of cDNAs, including intelligent use of contigs.

• Assignment of pine ESTs to subgroups within protein families (ProDom, Pfam).

• Extend information integration in ILP to include Mendel classification of gene families.

• Integrating data across provenances and known degrees of drought tolerance.

Page 44: Functional Genomics and Bioinformatics Applied to Understanding Oxidative Stress Resistance

Proposed Project: 2002-2005

Bioinformatics II (Ramakrishnan, Heath)

• Specialize ILP for particular biological information sources.

• Automatic tuning of ILP parameters.

• Pushing data mining functionality into the database.

• Interleaving and iteration of query, data analysis, and data mining operations.