Lecture 13: Multiple linear regression
Transcript of Lecture 13: Multiple linear regression
2001
Bio 4118 Applied Biostatistics
Université d’Ottawa / University of Ottawa
Lecture 13: Multiple linear regression

- When and why we use it
- The general multiple regression model
- Hypothesis testing in multiple regression
- The problem of multicollinearity
- Multiple regression procedures
- Polynomial regression
- Power analysis in multiple regression
Some GLM procedures

Procedure                       Dependent variable   Independent variable(s)
Simple regression               1 continuous         1 continuous
Single-classification ANOVA     1 continuous         1 categorical*
Multiple-classification ANOVA   1 continuous         2 or more categorical*
ANCOVA                          1 continuous         at least 1 categorical*, at least 1 continuous
Multiple regression             1 continuous         2 or more continuous

*either categorical or treated as a categorical variable
When do we use multiple regression?

- to model the relationship between a continuous dependent (Y) variable and several continuous independent (X1, X2, …) variables
- e.g. the relationship between lake primary production, phosphorus concentration and zooplankton abundance

[Figure: log primary production as a function of log [P] and log [Zoo]]
The multiple regression model: general form

The general model is:

Y_i = α + Σ(j = 1 to k) β_j X_ij + ε_i

which defines a k-dimensional plane, where α = intercept, β_j = partial regression coefficient of Y on X_j, X_ij is the value of the ith observation of independent variable X_j, and ε_i is the residual of the ith observation.

[Figure: regression plane of Y on X1 and X2, with observed values Y and predicted values Ŷ]
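As a minimal sketch, the general model can be fit by ordinary least squares. The data here are simulated (not the lecture's lake data), with k = 2 predictors and true values α = 1.0, β1 = 2.0, β2 = −1.5:

```python
import numpy as np

# Simulated data for the model Y = alpha + b1*X1 + b2*X2 + error.
rng = np.random.default_rng(0)
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 1.0 + 2.0 * X1 - 1.5 * X2 + rng.normal(scale=0.5, size=n)

# Design matrix: a column of ones (intercept) plus the predictors.
D = np.column_stack([np.ones(n), X1, X2])

# Least-squares estimates of (alpha, beta1, beta2).
alpha, b1, b2 = np.linalg.lstsq(D, Y, rcond=None)[0]
```

The estimates should land close to the true coefficients used to generate the data.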
What is the partial regression coefficient anyway?

β_j is the rate of change in Y per unit change in X_j with all other variables held constant; this is not the slope of the regression of Y on X_j pooled over all other variables!

[Figure: partial regression lines of Y on X1 at fixed values of X2 (X2 = −3, −1, 1, 3), contrasted with the simple (pooled) regression]
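The contrast between the partial and the pooled slope is easy to simulate. A sketch with hypothetical data (true model Y = 1·X1 + 2·X2, with X2 deliberately correlated with X1):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
X1 = rng.normal(size=n)
X2 = X1 + rng.normal(scale=0.5, size=n)        # X2 correlated with X1
Y = 1.0 * X1 + 2.0 * X2 + rng.normal(scale=0.1, size=n)

# Pooled (simple) slope of Y on X1 alone.
pooled = np.polyfit(X1, Y, 1)[0]

# Partial coefficient of X1 from the full model Y ~ X1 + X2.
D = np.column_stack([np.ones(n), X1, X2])
partial = np.linalg.lstsq(D, Y, rcond=None)[0][1]
```

The partial coefficient recovers the true value 1.0, while the pooled slope is inflated (close to 3 here) because X1 carries X2's effect along with it.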
The effect of scale

- Two independent variables on different scales will have different slopes, even if the proportional change in Y is the same.
- So, if we want to measure the relative strength of the influence of each variable on Y, we must eliminate the effect of different scales.

[Figure: Y versus X_j on two scales; β_j = 2 over the range 1 to 2 versus β_j = .02 over the range 100 to 200]
The multiple regression model: standardized form

Since β_j depends on the size of X_j, to examine the relative effect of each independent variable we must standardize the regression coefficients by first transforming all variables:

Y_i* = (Y_i − Ȳ) / s_Y,   X_ij* = (X_ij − X̄_j) / s_Xj

and fitting the regression model based on the transformed variables:

Y_i* = Σ(j = 1 to k) β_j* X_ij* + ε_i*,   β_j* = β_j (s_Xj / s_Y)

The standardized coefficients β_j* estimate the relative strength of the influence of variable X_j on Y.
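A sketch with hypothetical data, where X2 is on a scale 100 times larger than X1 but the two variables have equally strong effects; standardizing before fitting puts the coefficients on common footing:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
X1 = rng.normal(size=n)                        # sd = 1
X2 = rng.normal(scale=100.0, size=n)           # sd = 100
# Raw slopes differ 100-fold (2.0 vs 0.02), but the effects are equal.
Y = 2.0 * X1 + 0.02 * X2 + rng.normal(scale=0.5, size=n)

def zscore(v):
    return (v - v.mean()) / v.std(ddof=1)

# Fit on standardized variables; after centring, no intercept is needed.
Xs = np.column_stack([zscore(X1), zscore(X2)])
b_star = np.linalg.lstsq(Xs, zscore(Y), rcond=None)[0]
```

Despite the 100-fold difference in raw slopes, the two standardized coefficients come out nearly equal.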
Regression coefficients: summary
Partial regression coefficient: equals the slope of the regression of Y on Xj when all other independent variables are held constant.
Standardized partial regression coefficient: the rate of change of Y in standard deviation units per one standard deviation of Xj with all other independent variables held constant.
Assumptions

- independence of residuals
- homoscedasticity of residuals
- linearity (Y on all X)
- no error on independent variables
- normality of residuals
Hypothesis testing in simple linear regression: partitioning the total sums of squares

Total SS = Model (Explained) SS + Unexplained (Error) SS:

Σ(i = 1 to N) (Y_i − Ȳ)² = Σ(i = 1 to N) (Ŷ_i − Ȳ)² + Σ(i = 1 to N) (Y_i − Ŷ_i)²
Hypothesis testing in multiple regression I: partitioning the total sums of squares

Partition the total sums of squares into model and residual SS:

SS_total = Σ(i = 1 to N) (Y_i − Ȳ)²
SS_model = Σ(i = 1 to N) (Ŷ_i − Ȳ)²
SS_error = Σ(i = 1 to N) (Y_i − Ŷ_i)²

[Figure: regression plane of Y on X1 and X2, showing total SS, model SS and residual SS]
Hypothesis testing I: partitioning the total sums of squares

MS_model = Σ(i = 1 to N) (Ŷ_i − Ȳ)² / 1
MS_error = Σ(i = 1 to N) (Y_i − Ŷ_i)² / (N − 2)

- So MS_model = s²_Y and MS_error = 0 if observed = expected for all i.
- Calculate F = MS_model / MS_error and compare with the F distribution with 1 and N − 2 df.
- H0: F = 1
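The partition and the F ratio can be computed directly. A sketch with simulated data and k = 2 predictors (so the overall test has k and N − k − 1 df, the multiple-regression generalization of the 1 and N − 2 df above):

```python
import numpy as np

rng = np.random.default_rng(3)
n, k = 40, 2
X = rng.normal(size=(n, k))
Y = 1.0 + X @ np.array([1.5, -1.0]) + rng.normal(scale=0.5, size=n)

D = np.column_stack([np.ones(n), X])            # design matrix
Yhat = D @ np.linalg.lstsq(D, Y, rcond=None)[0]

ss_total = np.sum((Y - Y.mean()) ** 2)
ss_model = np.sum((Yhat - Y.mean()) ** 2)
ss_error = np.sum((Y - Yhat) ** 2)

# Overall F for H0: all partial regression coefficients are zero.
F = (ss_model / k) / (ss_error / (n - k - 1))
```

The identity SS_total = SS_model + SS_error holds exactly (up to floating-point error) whenever the model includes an intercept.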
Hypothesis testing II: testing individual partial regression coefficients

Test each hypothesis H0: β_j = 0 by a t-test:

t = b_j / s_bj

Note: these are 2-tailed hypotheses!

[Figure: Y versus X2 with X1 fixed (X1 = 2, 3): H02: β2 = 0 accepted; Y versus X1 with X2 fixed (X2 = 1, 2): H01: β1 = 0 rejected]
Multicollinearity

Independent variables are correlated, and therefore not independent: evaluate by looking at the covariance or correlation matrix.

[Table: covariance matrix of X1, X2, X3 — variances σ² on the diagonal, covariances σ_jk off the diagonal]
[Figure: X1 collinear with X2; X3 independent of X2]
Multicollinearity: problems

- If two independent variables X1 and X2 are uncorrelated, then the model sums of squares for a linear model with both included equals the sum of the SS_model for each considered separately:

SS_model(X1, X2) = SS_model(X1) + SS_model(X2), if r²(X1, X2) = 0

- But if they are correlated, the former will be less than the latter.
- So the real question is: given a model with X1 included, how much does SS_model increase when X2 is also included (or vice versa)?
Multicollinearity: consequences

- inflated standard errors for regression coefficients
- sensitivity of parameter estimates to small changes in data
- But estimates of partial regression coefficients remain unbiased.
- One or more independent variables may not appear in the final regression model not because they do not covary with Y, but because they covary with another X.
Detecting multicollinearity

- high R² but few or no significant t-tests for individual independent variables
- high pairwise correlations between X's
- high partial correlations among regressors (independent variables are a linear combination of others)
- eigenvalues, condition index, tolerance and variance inflation factors
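The last two diagnostics are easy to compute by hand: the tolerance of X_j is 1 minus the R² of X_j regressed on all the other X's, and the variance inflation factor (VIF) is its reciprocal. A sketch with hypothetical data in which X2 is built to be collinear with X1:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 100
X1 = rng.normal(size=n)
X2 = X1 + rng.normal(scale=0.3, size=n)   # strongly collinear with X1
X3 = rng.normal(size=n)                   # independent of the others
X = np.column_stack([X1, X2, X3])

def tolerance(X, j):
    # Regress X_j on the remaining columns; tolerance = 1 - R^2.
    others = np.delete(X, j, axis=1)
    D = np.column_stack([np.ones(len(X)), others])
    resid = X[:, j] - D @ np.linalg.lstsq(D, X[:, j], rcond=None)[0]
    r2 = 1 - resid.var() / X[:, j].var()
    return 1 - r2

tol = [tolerance(X, j) for j in range(3)]
vif = [1 / t for t in tol]
```

Here the collinear pair X1, X2 shows low tolerance (high VIF), while the independent X3 has tolerance near 1 and VIF near 1.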
Quantifying the effect of multicollinearity

- Eigenvectors: a set of "lines" ℓ1, ℓ2, …, ℓk in a k-dimensional space which are orthogonal to each other.
- Eigenvalue λ_j: the magnitude (length) of the corresponding eigenvector.

[Figure: eigenvectors ℓ1 and ℓ2 of the scatter of X1 and X2, with eigenvalues λ1 and λ2]
Quantifying the effect of multicollinearity

- Eigenvalues: if all k eigenvalues are approximately equal, multicollinearity is low.
- Condition index: √(λ_largest / λ_smallest); near 1 indicates low multicollinearity.
- Tolerance: 1 − the proportion of variance in each independent variable accounted for by all other independent variables; near 1 indicates low multicollinearity.

[Figure: low correlation of X1 and X2, λ1 = λ2; high correlation, λ1 >> λ2]
Remedial measures

- Get more data to reduce correlations.
- Drop some variables.
- Use principal component or ridge regression, which yield biased estimates but with smaller standard errors.
Multiple regression: the general idea

- Evaluate the significance of a variable by fitting two models: one with the term in, the other with it removed.
- Test for the change in model fit (ΔMF, e.g. ΔR²) associated with removal of the term in question.
- Unfortunately, ΔMF may depend on what other variables are in the model if there is multicollinearity!

[Diagram: Model A (X1 in) versus Model B (X1 out); delete X1 if ΔMF is small, retain X1 if ΔMF is large]
Fitting multiple regression models

- Goal: find the "best" model, given the available data.
- Problem 1: what is "best"?
  - highest R²?
  - lowest RMS?
  - highest R² but containing only individually significant independent variables?
  - maximizes R² with the minimum number of independent variables?
Selection of independent variables (cont'd)

- Problem 2: even if "best" is defined, by what method do we find it?
- Possibilities:
  - compute all possible models (2^k − 1 of them) and choose the best one.
  - use some procedure for winnowing down the set of possible models.
Strategy I: computing all possible models

- Compute all possible models and choose the "best" one.
- cons: time-consuming; leaves the definition of "best" to the researcher
- pros: if the "best" model is defined, you will find it!

[Diagram: all subsets of {X1, X2, X3}: {X1}, {X2}, {X3}, {X1, X2}, {X1, X3}, {X2, X3}, {X1, X2, X3}]
Strategy II: forward selection

- Start with the variable that has the highest (significant) R², i.e. the highest partial correlation coefficient r.
- Add the others one at a time until there is no further significant increase in R², with the β_js recomputed at each step.
- Problem: once X_j is included, it stays in even if it contributes little to SS_model once other variables are included.

[Diagram: r2 > r1 > r3, so start with {X2}; add X1 if the increase R²(X1, X2) − R²(X2) is significant; continue toward {X1, X2, X3} until no significant increase, giving the final model]
Forward selection: order of entry

- Begin with the variable with the highest partial correlation coefficient.
- The next entry is the variable that gives the largest increase in overall R², by an F-test of the significance of the increase, above some specified F-to-enter (below a specified p-to-enter) value.

[Diagram: four-variable example with r2 > r1 > r3 > r4 and p to enter = .05; p[F(X2)] = .001, so X2 enters first; candidate second entries p[F(X2, X1)] = .002, p[F(X2, X3)] = .04, p[F(X2, X4)] = .55; X4 eliminated]
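The procedure can be sketched in code. Note the entry criterion below is a plain ΔR² threshold (`min_gain`, an illustrative stand-in for the F-to-enter test described above), and the data are hypothetical:

```python
import numpy as np

def r_squared(cols, y):
    # R^2 of y regressed on the given columns (plus an intercept).
    D = np.column_stack([np.ones(len(y))] + cols)
    yhat = D @ np.linalg.lstsq(D, y, rcond=None)[0]
    return 1 - np.sum((y - yhat) ** 2) / np.sum((y - y.mean()) ** 2)

def forward_select(X, y, min_gain=0.02):
    chosen, remaining, best_r2 = [], list(range(X.shape[1])), 0.0
    while remaining:
        # Gain in R^2 from adding each remaining candidate.
        gains = {j: r_squared([X[:, i] for i in chosen + [j]], y) - best_r2
                 for j in remaining}
        j = max(gains, key=gains.get)
        if gains[j] < min_gain:          # no further meaningful increase
            break
        chosen.append(j)
        remaining.remove(j)
        best_r2 += gains[j]
    return chosen, best_r2

# Hypothetical data: y depends on columns 0 and 2 only.
rng = np.random.default_rng(5)
X = rng.normal(size=(80, 4))
y = 2.0 * X[:, 0] - 1.0 * X[:, 2] + rng.normal(scale=0.5, size=80)
sel, r2 = forward_select(X, y)
```

With these data the procedure picks up the two informative columns and stops before admitting the noise variables.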
Strategy III: backward selection

- Start with all variables.
- Drop variables whose removal does not significantly reduce R², one at a time, starting with the one with the lowest partial correlation coefficient.
- But once X_j is dropped, it stays out even if it explains a significant amount of the remaining variability once other variables are excluded.

[Diagram: r2 < r1 < r3, so start with {X1, X2, X3} and drop X2 first if R²(X1, X2, X3) − R²(X1, X3) is non-significant; then drop X1 if R²(X1, X3) − R²(X3) is non-significant, giving the final model]
Backward selection: order of removal

- Begin with the variable with the smallest partial correlation coefficient.
- The next removal is the variable whose removal gives the smallest decrease in overall R², by an F-test of the significance of the decrease, below some specified F-to-remove (above a specified p-to-remove) value.

[Diagram: four-variable example with r2 > r1 > r3 > r4 and p to remove = .10; p[F(X2, X1, X3)] = .44, so X4 is removed (X2, X3, X1 still in); p[F(X2, X1)] = .25, so X3 is removed (X1, X2 still in); p[F(X2, X3)] = .001 and p[F(X1, X3)] = .009]
Strategy IV: stepwise selection

- Once a variable is included (removed), the set of remaining variables is scanned for other variables that should now be deleted (included), including those added (removed) at earlier stages.
- To avoid infinite loops, we usually set p to enter > p to remove.

[Diagram: four-variable example with r2 > r1 > r4 > r3, p to enter = .10 and p to remove = .05; entry and removal p-values include p[F(X2)] = .001, p[F(X2, X1)] = .002, p[F(X2, X3)] = .09, p[F(X2, X4)] = .03, p[F(X1, X2, X4)] = .02, p[F(X1, X2, X3)] = .19]
Example

log of herptile species richness (logherp) as a function of log wetland area (logarea), percentage of land within 1 km covered in forest (cpfor2), and density of hard-surface roads within 1 km (thtden)
Example (all variables)

DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.740   SQUARED MULTIPLE R: 0.547
ADJUSTED SQUARED MULTIPLE R: 0.490   STANDARD ERROR OF ESTIMATE: 0.162

VARIABLE   COEFF.   SE     STD COEF.   TOL.    T       P
CONSTANT    0.285   0.191   0.000      .        1.488  0.150
LOGAREA     0.228   0.058   0.551      0.978    3.964  0.001
CPFOR2      0.001   0.001   0.123      0.744    0.774  0.447
THTDEN     -0.036   0.016  -0.365      0.732   -2.276  0.032
Example (cont'd)

ANALYSIS OF VARIANCE

SOURCE       SS      DF   MS      F-RATIO   P
REGRESSION   0.760   3    0.253   9.662     0.000
RESIDUAL     0.629   24   0.026
Example: forward stepwise

DEPENDENT VARIABLE LOGHERP
MINIMUM TOLERANCE FOR ENTRY INTO MODEL = .010000
FORWARD STEPWISE WITH ALPHA-TO-ENTER = .10 AND ALPHA-TO-REMOVE = .05

STEP # 0   R = .000   RSQUARE = .000

VARIABLE       COEFF.   SE   STD COEF.   TOL.     F        'P'
IN
 1 CONSTANT
OUT            PART. CORR
 2 LOGAREA      0.596   .    .           .1E+01   14.321   0.001
 3 CPFOR2       0.305   .    .           .1E+01    2.662   0.115
 4 THTDEN      -0.496   .    .           .1E+01    8.502   0.007
Forward stepwise (cont'd)

STEP # 1   R = .596   RSQUARE = .355   TERM ENTERED: LOGAREA

VARIABLE       COEFF.   SE      STD COEF.   TOL.     F        'P'
IN
 1 CONSTANT
 2 LOGAREA      0.247   0.065   0.596       .1E+01   14.321   0.001
OUT            PART. CORR
 3 CPFOR2       0.382   .       .           0.99      4.273   0.049
 4 THTDEN      -0.529   .       .           0.98      9.725   0.005
Forward stepwise (cont'd)

STEP # 2   R = .732   RSQUARE = .536   TERM ENTERED: THTDEN

VARIABLE       COEFF.   SE      STD COEF.   TOL.      F        'P'
IN
 1 CONSTANT
 2 LOGAREA      0.225   0.057   0.542       0.98      15.581   0.001
 4 THTDEN      -0.042   0.013  -0.428       0.98       9.725   0.005
OUT            PART. CORR
 3 CPFOR2       0.156   .       .           0.74380    0.599   0.447
Forward stepwise: final model

FORWARD STEPWISE: P TO INCLUDE = .15
DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.732   SQUARED MULTIPLE R: 0.536
ADJUSTED SQUARED MULTIPLE R: 0.490   STANDARD ERROR OF ESTIMATE: 0.161

VARIABLE   COEFF.   SE     STD COEF.   TOL.    T       P
CONSTANT    0.376   0.149   0.000      .        2.521  0.018
LOGAREA     0.225   0.057   0.542      0.984    3.947  0.001
THTDEN     -0.042   0.013  -0.428      0.984   -3.118  0.005
Example: backward stepwise (final model)

BACKWARD STEPWISE: P TO REMOVE = .15
DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.732   SQUARED MULTIPLE R: 0.536
ADJUSTED SQUARED MULTIPLE R: 0.499   STANDARD ERROR OF ESTIMATE: 0.161

VARIABLE   COEFF.   SE     STD COEF.   TOL.    T       P
CONSTANT    0.376   0.149   0.000      .        2.521  0.018
LOGAREA     0.225   0.057   0.542      0.984    3.947  0.001
THTDEN     -0.042   0.013  -0.428      0.984   -3.118  0.005
Example: subset model

DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.670   SQUARED MULTIPLE R: 0.449
ADJUSTED SQUARED MULTIPLE R: 0.405   STANDARD ERROR OF ESTIMATE: 0.175

VARIABLE   COEFF.   SE     STD COEF.   TOL.    T       P
CONSTANT    0.027   0.167   0.000      .        0.162  0.872
LOGAREA     0.248   0.062   0.597      1.000    4.022  0.000
CPFOR2      0.003   0.001   0.307      1.000    2.067  0.049
What if the relationship between Y and one or more X's is nonlinear?

- Option 1: transform the data.
- Option 2: use nonlinear regression.
- Option 3: use polynomial regression.
The polynomial regression model

In polynomial regression, the regression model includes terms of increasingly higher powers of the independent variable:

Y_i = α + Σ(j = 1 to k) β_j X_i^j + ε_i

[Figure: black fly biomass (mg DM/m²) versus current velocity (cm/s), fitted with a linear model and a 2nd-order polynomial model]
The polynomial regression model: procedure

- Fit a simple linear model.
- Fit a model with a quadratic term; test for the increase in SS_model.
- Continue with higher orders (cubic, quartic, etc.) until there is no further significant increase in SS_model.
- Include terms of order up to the power of (number of points of inflexion plus 1).

[Figure: black fly biomass (mg DM/m²) versus current velocity (cm/s), linear model versus 2nd-order polynomial model]
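The procedure can be sketched by fitting polynomials of increasing order and comparing SS_model at each step. The data here are simulated with a genuinely quadratic relationship (not the black fly data):

```python
import numpy as np

rng = np.random.default_rng(6)
x = np.linspace(0, 10, 60)
y = 2.0 + 1.5 * x - 0.12 * x**2 + rng.normal(scale=0.3, size=x.size)

def ss_model(order):
    # Model SS for a polynomial fit of the given order.
    coef = np.polyfit(x, y, order)
    yhat = np.polyval(coef, x)
    return np.sum((yhat - y.mean()) ** 2)

gain_quadratic = ss_model(2) - ss_model(1)   # large: quadratic needed
gain_cubic = ss_model(3) - ss_model(2)       # small: stop at order 2
```

The jump from linear to quadratic is large, while the jump from quadratic to cubic is negligible, so the procedure stops at order 2.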
Polynomial regression: caveats

- The biological significance of the higher-order terms in a polynomial regression (if any) is generally not known.
- By definition, polynomial terms are strongly correlated; hence standard errors will be large (precision is low), and they increase with the order of the term.
- Extrapolation of polynomial models is always nonsense.

[Figure: Y = X1 − X1², with the fitted curve turning downward outside the range of the data]
Power analysis in GLM (including MR)

- In any GLM, hypotheses are tested by means of an F-test.
- Remember: the appropriate SS_error and df_error depend on the type of analysis and the hypothesis under investigation.
- Knowing F, we can compute R², the proportion of the total variance in Y explained by the factor (source) under consideration:

F = MS_factor / MS_error = (SS_factor / df_factor) / (SS_error / df_error)

R² = (df_factor · F) / (df_factor · F + df_error)
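The R² relation can be checked against the lecture's own output: the all-variables model reported F = 9.662 with 3 and 24 df and squared multiple R = 0.547.

```python
# Recover R^2 from a reported F statistic and its degrees of freedom,
# using R^2 = df_factor*F / (df_factor*F + df_error).
def r2_from_f(f, df_factor, df_error):
    return (df_factor * f) / (df_factor * f + df_error)

# Overall test from the all-variables model above.
r2 = r2_from_f(9.662, 3, 24)   # close to the reported 0.547
```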
Partial and total R²

- The total R² (R²Y•B) is the proportion of variance in Y accounted for (explained by) a set of independent variables B.
- The partial R² (R²Y•A,B − R²Y•A) is the proportion of variance in Y accounted for by B when the variance accounted for by another set A is removed.

[Venn diagram: proportion of variance accounted for by both A and B (R²Y•A,B); by A only (R²Y•A, a total R²); by B independent of A (R²Y•A,B − R²Y•A, the partial R²)]
Partial and total R²

The total R² (R²Y•B) for set B equals the partial R² (R²Y•A,B − R²Y•A) with respect to set B if either (1) the total R² for A (R²Y•A) is zero, or (2) A and B are independent (in which case R²Y•A,B = R²Y•A + R²Y•B).

[Venn diagram: the proportion of variance accounted for by B (R²Y•B, total R²) equals the proportion independent of A (R²Y•A,B − R²Y•A, partial R²) only when A and B do not overlap]
Partial and total R² in multiple regression

Suppose we have three independent variables X1, X2 and X3, with A = {X1} and B = {X2, X3}. Then:

total R² for A:  R²Y•A = R²Y•X1
total R² for B:  R²Y•B = R²Y•X2,X3
combined:        R²Y•A,B = R²Y•X1,X2,X3
partial R²:      R²Y•A,B − R²Y•A = R²Y•X1,X2,X3 − R²Y•X1

[Figure: log primary production as a function of log [P] and log [Zoo]]
Defining effect size in multiple regression

- The effect size, denoted f², is given by the ratio of the factor (source) R²factor and the appropriate error R²error:

f² = R²factor / R²error

- Note: both R²factor and R²error depend on the null hypothesis under investigation.
Defining effect size in multiple regression: case 1

- Case 1: a set B of variables {X1, X2, …} is related to Y, and the total R² (R²Y•B) is determined. The error variance proportion is then 1 − R²Y•B.
- H0: R²Y•B = 0
- Example: effect of wetland area, surrounding forest cover, and surrounding road densities on herptile species richness in southeastern Ontario wetlands; B = {LOGAREA, CPFOR2, THTDEN}

f² = R²factor / R²error
DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.740   SQUARED MULTIPLE R: 0.547
ADJUSTED SQUARED MULTIPLE R: 0.490   STANDARD ERROR OF ESTIMATE: 0.162

VARIABLE   COEFF.   SE     STD COEF.   TOL.    T       P
CONSTANT    0.285   0.191   0.000      .        1.488  0.150
LOGAREA     0.228   0.058   0.551      0.978    3.964  0.001
CPFOR2      0.001   0.001   0.123      0.744    0.774  0.447
THTDEN     -0.036   0.016  -0.365      0.732   -2.276  0.032

f² = R²factor / R²error = .547 / (1 − .547) = 1.21
Defining effect size in multiple regression: case 2

Case 2: the proportion of variance of Y due to B over and above that due to A is determined (R²Y•A,B - R²Y•A). The error variance proportion is then 1 - R²Y•A,B.

H0: R²Y•A,B - R²Y•A = 0

Example: herptile richness in southeastern Ontario wetlands: B = {THTDEN}, A = {LOGAREA, CPFOR2}, AB = {LOGAREA, CPFOR2, THTDEN}

f² = R²factor / R²error
Reduced model (A = {LOGAREA, CPFOR2}):

DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.670   SQUARED MULTIPLE R: 0.449
ADJUSTED SQUARED MULTIPLE R: 0.405   STANDARD ERROR OF ESTIMATE: 0.175

VARIABLE   COEFF.   SE      STD COEF.   TOL.    T       P
CONSTANT    0.027   0.167    0.000      .       0.162   0.872
LOGAREA     0.248   0.062    0.597      1.000   4.022   0.000
CPFOR2      0.003   0.001    0.307      1.000   2.067   0.049

Full model (AB = {LOGAREA, CPFOR2, THTDEN}):

DEP VAR: LOGHERP   N: 28   MULTIPLE R: 0.740   SQUARED MULTIPLE R: 0.547
ADJUSTED SQUARED MULTIPLE R: 0.490   STANDARD ERROR OF ESTIMATE: 0.162

VARIABLE   COEFF.   SE      STD COEF.   TOL.    T       P
CONSTANT    0.285   0.191    0.000      .       1.488   0.150
LOGAREA     0.228   0.058    0.551      0.978   3.964   0.001
CPFOR2      0.001   0.001    0.123      0.744   0.774   0.447
THTDEN     -0.036   0.016   -0.365      0.732  -2.276   0.032
Defining effect size in multiple regression: case 2

The proportion of variance of LOGHERP due to THTDEN (B) over and above that due to LOGAREA and CPFOR2 (A) is R²Y•A,B - R²Y•A = .098. The error variance proportion is then 1 - R²Y•A,B = 1 - .547. So the effect size for variable THTDEN is 0.216:

f² = (R²Y•{LOGAREA,CPFOR2,THTDEN} - R²Y•{LOGAREA,CPFOR2}) / (1 - R²Y•{LOGAREA,CPFOR2,THTDEN})
   = (.547 - .449) / (1 - .547) = .216
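Again the arithmetic is easy to verify (a sketch using the two R² values reported above):

```python
# Case 2: variance due to B = {THTDEN} over and above A = {LOGAREA, CPFOR2}.
r2_ab = 0.547              # full model R2 (A and B together)
r2_a = 0.449               # reduced model R2 (A only)
f2 = (r2_ab - r2_a) / (1 - r2_ab)
print(round(f2, 3))        # 0.216, as on the slide
```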
Determining power

Once f² has been determined, either a priori (as an alternate hypothesis) or a posteriori (the observed effect size), calculate the non-centrality parameter λ.

Knowing λ and the factor (source) (ν1) and error (ν2) degrees of freedom, we can determine power from the appropriate tables for a given α.

[Figure: power (1 - β) curves as a function of the non-centrality parameter, plotted for α = .05 and α = .01 and for decreasing error df ν2.]

λ = f²(ν1 + ν2 + 1)
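In code, the non-centrality calculation is one line (hypothetical numbers for illustration; power itself would still be read from non-central F tables or computed with statistical software):

```python
# Non-centrality parameter lambda = f2 * (nu1 + nu2 + 1), where nu1 and
# nu2 are the source and error degrees of freedom (Cohen's formulation).
def noncentrality(f2, nu1, nu2):
    return f2 * (nu1 + nu2 + 1)

# hypothetical illustration: f2 = 0.15 with 2 source and 20 error df
lam = noncentrality(0.15, 2, 20)
print(round(lam, 2))       # 3.45
```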
Example: herptile richness in southeastern Ontario wetlands

Sample of 28 wetlands; 3 variables (LOGAREA, CPFOR2, THTDEN). Dependent variable is log10 of the number of herptile species.

What is the probability of detecting a true effect size for CPFOR2 equal to the estimated effect size, once the effects of LOGAREA and THTDEN have been controlled for, given α = 0.05?

Variable       t      p
LOGAREA (1)    3.96   0.001
THTDEN (2)    -2.28   0.032
CPFOR2 (3)     0.774  0.447
R²{1,2,3}      0.547
R²{1,2}        0.536
Example: herptile richness in southeastern Ontario wetlands

Sample effect size f² for CPFOR2, once the effects of LOGAREA and THTDEN have been controlled for, = .024.
Source (CPFOR2) df = ν1 = 1
Error df = ν2 = 28 - 1 - 1 - 1 = 25

f² = (R²{1,2,3} - R²{1,2}) / (1 - R²{1,2,3}) = (.547 - .536) / (1 - .547) = .024

λ = f²(ν1 + ν2 + 1) = .024(1 + 25 + 1) = .648

Power (1 - β) is then read from the appropriate tables, given ν1, ν2 and α.
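Putting the whole example together (a sketch; note the slide rounds f² to .024 before computing λ):

```python
# Effect size and non-centrality for the CPFOR2 test, using the slide's values.
r2_full, r2_reduced = 0.547, 0.536   # R2{1,2,3} and R2{1,2}
f2 = round((r2_full - r2_reduced) / (1 - r2_full), 3)   # 0.024
nu1, nu2 = 1, 25                     # source and error df from the slide
lam = f2 * (nu1 + nu2 + 1)           # about 0.648; power then comes from tables
print(f2, round(lam, 3))
```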