Bias, Variance, and Fit for Three Measures of Expression: AvDiff, Li & Wong’s, and AvLog(PM-BG)

Rafael A. Irizarry

Department of Biostatistics, JHU

(joint work with Bridget Hobbs and Terry Speed, Walter & Eliza Hall Institute of Medical Research)

Summary

• Summarize the expression level of a probe set by Average Log2 (PM-BG)

• PMs need to be normalized

• Background makes no use of probe-specific MM

• Evaluate and compare to AvDiff and the Li & Wong algorithm through bias, variance and model fit

• Use Gene Logic spike-in and dilution study

• All three expression measures performed well

• AvLog(PM-BG) is arguably the best of the three

SD vs. Avg of Defective Probes

Normalization at Probe Level

Expression after Normalization

Background Distribution

Average Log2(PM-BG)

• Normalize probe level data

• Compute BG = background mean by estimating the mode of the MM distribution

• Subtract BG from each PM

• If PM-BG < 0 use minimum of positives divided by 2

• Take log2 and average over the probes in the probe set
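The steps on this slide can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the probe-level data are taken to be already normalized, the mode of the MM distribution is approximated with a histogram on the log2 scale (the slides do not say how the mode is estimated), and the "minimum of positives" is taken within the probe set. The function names `estimate_background` and `avlog_pm_bg` are hypothetical.

```python
import numpy as np

def estimate_background(mm, bins=100):
    """Approximate BG as the mode of the MM intensities.

    Assumption: the mode is read off a histogram of log2(MM);
    the slides do not specify the estimator.
    """
    log_mm = np.log2(np.asarray(mm, dtype=float))
    counts, edges = np.histogram(log_mm, bins=bins)
    k = int(np.argmax(counts))
    # Midpoint of the fullest bin, mapped back to the intensity scale.
    return float(2.0 ** (0.5 * (edges[k] + edges[k + 1])))

def avlog_pm_bg(pm, bg):
    """Average log2(PM - BG) for one probe set on one chip."""
    diff = np.asarray(pm, dtype=float) - bg
    positive = diff[diff > 0]
    if positive.size == 0:
        raise ValueError("no PM value exceeds the background")
    # Non-positive differences are replaced by half the smallest positive one
    # (assumption: the minimum is taken within this probe set).
    floor = positive.min() / 2.0
    diff = np.where(diff > 0, diff, floor)
    return float(np.mean(np.log2(diff)))
```

For example, `avlog_pm_bg([110, 120, 90, 180], 100)` replaces the one negative difference (-10) with 5, half of the smallest positive difference, before taking logs and averaging.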

Spike-In Experiments

• Add concentrations (0.5 pM – 100 pM) of 11 foreign species cRNAs to hybridization mixture

• Set A: 11 control cRNAs were spiked in, all at the same concentration, which varied across chips.

• Set B: 11 control cRNAs were spiked in, all at different concentrations, which varied across chips. The concentrations were arranged in 12x12 cyclic Latin square (with 3 replicates)
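A cyclic Latin square of the kind mentioned for Set B can be generated by shifting one list of concentration levels by one position per row. The sketch below is illustrative only: the twelve levels are the distinct values that appear in the Spike-In B table of this talk, and how rows and columns map onto chips and cRNAs, as well as the order of the levels, are assumptions rather than the documented design.

```python
def cyclic_latin_square(levels):
    """Return an n x n square in which row r is `levels` shifted left by r."""
    n = len(levels)
    return [[levels[(row + col) % n] for col in range(n)] for row in range(n)]

# Assumed levels (pM): the twelve distinct values listed in the Spike-In B table.
levels = [0.5, 1.0, 1.5, 2.0, 3.0, 5.0, 12.5, 25.0, 35.7, 50.0, 75.0, 100.0]
square = cyclic_latin_square(levels)

for row in square:
    print(" ".join(f"{c:5.1f}" for c in row))
```

Each level occurs exactly once in every row and every column, which is the defining property of a Latin square.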

Why Remove Background?

Probe Level Data (12 chips)

What Did We Learn?

• Don’t subtract or divide by MM

• Probe effect is additive on log scale

• Take logs
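One way to write down "probe effect is additive on log scale" is the following model. The symbols are assumptions introduced here for illustration and do not appear on the slides: i indexes chips, j = 1, ..., J indexes the probes of one probe set, θ_i is the log expression level on chip i, φ_j is the probe effect, and ε_ij is an error term.

```latex
\log_2(\mathrm{PM}_{ij} - \mathrm{BG}_i) = \theta_i + \phi_j + \varepsilon_{ij}

\frac{1}{J}\sum_{j=1}^{J} \log_2(\mathrm{PM}_{ij} - \mathrm{BG}_i)
  = \theta_i + \bar{\phi} + \bar{\varepsilon}_i
```

Under this model the average over probes shifts every chip by the same constant, the mean probe effect, so differences between chips estimate differences in θ without any probe-specific correction.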

Expression Level

Spike-In B

Gene      Conc 1 (pM)   Conc 2 (pM)   Rank
BioB-5    100           0.5           1
BioB-3    0.5           25.0          2
BioC-5    2.0           75.0          3
BioB-M    1.0           35.7          4
BioDn-3   1.5           50.0          5
DapX-3    35.7          3.0           6
CreX-3    50.0          5.0           7
CreX-5    12.5          2.0           8
BioC-3    25.0          100           9
DapX-5    5.0           1.5           10
DapX-M    3.0           1.0           11

Later we consider 24 different combinations of concentrations
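The true log ratios implied by the Spike-In B table can be computed directly from the two concentration columns. The sketch below uses only the values in the table. It also checks a reading of the Rank column that the slides do not state explicitly, so treat it as an assumption: that genes are ranked by the magnitude of their fold change between the two conditions.

```python
import math

# (gene, conc 1, conc 2, rank) as listed in the Spike-In B table.
spike_in_b = [
    ("BioB-5", 100.0, 0.5, 1),
    ("BioB-3", 0.5, 25.0, 2),
    ("BioC-5", 2.0, 75.0, 3),
    ("BioB-M", 1.0, 35.7, 4),
    ("BioDn-3", 1.5, 50.0, 5),
    ("DapX-3", 35.7, 3.0, 6),
    ("CreX-3", 50.0, 5.0, 7),
    ("CreX-5", 12.5, 2.0, 8),
    ("BioC-3", 25.0, 100.0, 9),
    ("DapX-5", 5.0, 1.5, 10),
    ("DapX-M", 3.0, 1.0, 11),
]

def log2_ratio(conc1, conc2):
    """True log2 fold change from condition 1 to condition 2."""
    return math.log2(conc2 / conc1)

ratios = {gene: log2_ratio(c1, c2) for gene, c1, c2, _ in spike_in_b}

# Sort by absolute log ratio, largest first, and compare with the Rank column.
by_magnitude = sorted(spike_in_b, key=lambda r: abs(log2_ratio(r[1], r[2])), reverse=True)
ranks_match = [r[3] for r in by_magnitude] == list(range(1, 12))

for gene, value in ratios.items():
    print(f"{gene:8s} {value:+.2f}")
print("Rank column matches |log ratio| ordering:", ranks_match)
```

An observed log ratio from any of the three expression measures can then be plotted against these true values to assess bias.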

Differential Expression

Observed vs True Ratio

Dilution Experiment

• cRNA hybridized to human chip (HGU_95) in range of proportions and dilutions

• Dilution series begins at 1.25 μg cRNA per GeneChip array, and rises through 2.5, 5.0, 7.5, 10.0, to 20.0 μg per array. 5 replicate chips were used at each dilution

• Normalize just within each set of 5 replicates

• For each probe set compute expression, average and SD over replicates, and fit a line to log expression vs. log concentration

• Regression line should have slope 1 and high R²
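The regression check can be sketched as an ordinary least-squares fit of log expression on log cRNA amount. The cRNA amounts below are the ones listed for the dilution series; the expression values are synthetic and constructed to lie exactly on a line of slope 1, purely to show what an ideal measure would return. The function name `slope_and_r2` is hypothetical.

```python
import numpy as np

def slope_and_r2(amounts, log2_expression):
    """Fit log2 expression vs. log2 amount; return (slope, R squared)."""
    x = np.log2(np.asarray(amounts, dtype=float))
    y = np.asarray(log2_expression, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    ss_res = np.sum((y - fitted) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(slope), float(1.0 - ss_res / ss_tot)

# cRNA amounts per array from the dilution series.
amounts = [1.25, 2.5, 5.0, 7.5, 10.0, 20.0]

# Synthetic, idealized log2 expression: proportional to the amount hybridized.
ideal = np.log2(amounts) + 7.0

slope, r2 = slope_and_r2(amounts, ideal)
print(f"slope = {slope:.3f}, R^2 = {r2:.3f}")
```

With real data, the per-probe-set averages over the 5 replicates would take the place of `ideal`, and slopes below 1 would indicate that the measure compresses true differences.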

Dilution Experiment Data

Expression and SD

Slope Estimates and R²

Model check

• Compute observed SD of 5 replicate expression estimates

• Compute RMS of 5 nominal SDs

• Compare by taking the log ratio

• Closeness of observed and nominal SD taken as a measure of goodness of fit of the model
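The comparison described above amounts to a short calculation per probe set. In this sketch the sample SD uses the n-1 denominator and the log ratio is taken base 2; both are assumptions, since the slides do not specify either. The function name and the example numbers are hypothetical.

```python
import numpy as np

def log_ratio_observed_vs_nominal(estimates, nominal_sds):
    """Log2 ratio of the observed SD to the RMS of the model-based SDs."""
    # Observed SD of the replicate expression estimates (n-1 denominator).
    observed = np.std(np.asarray(estimates, dtype=float), ddof=1)
    # Root mean square of the nominal (model-based) SDs.
    nominal = np.sqrt(np.mean(np.square(np.asarray(nominal_sds, dtype=float))))
    return float(np.log2(observed / nominal))

# Hypothetical numbers for one probe set across 5 replicate chips.
estimates = [8.1, 8.3, 7.9, 8.2, 8.0]
nominal_sds = [0.15, 0.17, 0.16, 0.14, 0.18]
print(log_ratio_observed_vs_nominal(estimates, nominal_sds))
```

A value near 0 means the model's nominal SD agrees with the variability actually seen across replicates; positive values mean the model understates the variability.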

Observed vs. Model SE

Conclusion

• Take logs

• PMs need to be normalized

• Using global background improves on use of probe-specific MM

• Gene Logic spike-in and dilution study show all three expression measures performed very well

• AvLog(PM-BG) is arguably the best in terms of bias, variance and model fit

• Future: better BG; robust/resistant summaries

Acknowledgements

• Gene Brown’s group at Wyeth/Genetics Institute, and Uwe Scherf’s Genomics Research & Development Group at Gene Logic, for generating the spike-in and dilution data

• Gene Logic for permission to use these data

• Francois Collin (Gene Logic)

• Ben Bolstad (UC Berkeley)

• Magnus Åstrand (Astra Zeneca Mölndal)
