Regression with 2 IVs

Generalization of Regression from 1 to 2 Independent Variables

Questions

• Write a raw score regression equation with 2 IVs in it.

• What is the difference in interpretation of b weights in simple regression vs. multiple regression?

• What happens to b weights if we add new variables to the regression equation that are highly correlated with ones already in the equation?

• Why do we report beta weights (standardized b weights)?

More Questions

• Write a regression equation with beta weights in it.

• How is it possible to have a significant R-square and non-significant b weights?

• What are the three factors that influence the standard error of the b weight?

• Describe R-square in two different ways, that is, using two distinct formulas. Explain the formulas.

Equations

$Y = a + bX + e$ 1 IV. Define terms.

$Y = a + b_1X_1 + b_2X_2 + \dots + b_kX_k + e$ Multiple IVs.

One score, 1 intercept; 1 error; many slopes.

$Y' = a + b_1X_1 + b_2X_2 + \dots + b_kX_k$ Predicted value.

Recall slope and intercept for 1 IV:

$$b = \frac{\sum xy}{\sum x^2} \qquad a = \bar{Y} - b\bar{X}$$

$\sum xy = \sum (X - \bar{X})(Y - \bar{Y})$ Sum of cross-products.

Equations (2)

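$$b_1 = \frac{(\sum x_2^2)(\sum x_1 y) - (\sum x_1 x_2)(\sum x_2 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$$

$$b_2 = \frac{(\sum x_1^2)(\sum x_2 y) - (\sum x_1 x_2)(\sum x_1 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2}$$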

Note: b weights use SSx1, SSx2, and all 3 cross products.

$$a = \bar{Y} - b_1\bar{X}_1 - b_2\bar{X}_2$$

Unlike slopes, the intercept is a simple extension of the 1 IV case.

Numerical Example

Chevy mechanics; mechanical aptitude & conscientiousness. Find sums.

Y = job performance (Job Perf), X1 = mechanical aptitude (Mech Apt), X2 = conscientiousness.

    Y      X1      X2     X1*Y    X2*Y    X1*X2
    1      40      25       40      25     1000
    2      45      20       90      40      900
    1      38      30       38      30     1140
    3      58      38      174     114     2204
  ...
   65    1038     655     3513    2219    34510   Sum
   20      20      20       20      20       20   N
 3.25    51.9   32.75   175.65  110.95   1725.5   M
 1.25    7.58    5.24    84.33   54.73   474.60   SD
29.75  1091.8  521.75                             USS

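$$\sum xy = \sum XY - \frac{(\sum X)(\sum Y)}{N}$$

$$\sum x_1 x_2 = \sum X_1 X_2 - \frac{(\sum X_1)(\sum X_2)}{N}$$

$$\sum x_1 y = 3513 - \frac{(1038)(65)}{20} = 139.5$$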

Note. Only some of the data are shown. Size problem in Powerpoint.
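The deviation cross-products can be checked directly from the column sums in the data table. The following is a minimal sketch in Python; the function name `deviation_cross_product` and the variable names are illustrative assumptions, not part of the original slides.

```python
# Raw sums taken from the "Sum" row of the data table (N = 20 cases).
n = 20
sum_y, sum_x1, sum_x2 = 65, 1038, 655
sum_x1y, sum_x2y, sum_x1x2 = 3513, 2219, 34510


def deviation_cross_product(sum_ab, sum_a, sum_b, n):
    """Sum of deviation cross-products from raw sums: sum(AB) - sum(A)*sum(B)/N."""
    return sum_ab - sum_a * sum_b / n


sp_x1y = deviation_cross_product(sum_x1y, sum_x1, sum_y, n)
sp_x2y = deviation_cross_product(sum_x2y, sum_x2, sum_y, n)
sp_x1x2 = deviation_cross_product(sum_x1x2, sum_x1, sum_x2, n)

print(sp_x1y, sp_x2y, sp_x1x2)
```

These three values are the off-diagonal entries of the SSCP matrix used in the next step.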

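SSCP Matrix

SSCP means sums of squares and cross-products.

        Y        X1        X2
Y     29.75
X1   139.5    1091.8
X2    90.25    515.5    521.75

The diagonal holds sums of squares; the off-diagonal entries are cross-products, e.g., $\sum x_2 y = 90.25$ and $\sum x_1 x_2 = 515.5$.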

Find Estimates

             Y        X1 (MA)   X2 (Consc)
Y (Perf)   29.75
X1        139.5     1091.8
X2         90.25     515.5     521.75

$$b_1 = \frac{(\sum x_2^2)(\sum x_1 y) - (\sum x_1 x_2)(\sum x_2 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2} = \frac{(521.75)(139.5) - (515.5)(90.25)}{(1091.8)(521.75) - (515.5)(515.5)}$$

$$b_1 = \frac{(72784.13) - (46523.88)}{(569646.7) - (265740.3)} = \frac{26260.25}{303906.4} = .086409 \approx .09$$

$$b_2 = \frac{(\sum x_1^2)(\sum x_2 y) - (\sum x_1 x_2)(\sum x_1 y)}{(\sum x_1^2)(\sum x_2^2) - (\sum x_1 x_2)^2} = \frac{(1091.8)(90.25) - (515.5)(139.5)}{(1091.8)(521.75) - (515.5)(515.5)}$$

$$b_2 = \frac{26622.7}{303906.4} = .087602 \approx .09$$

$$a = \bar{Y} - b_1\bar{X}_1 - b_2\bar{X}_2$$

$$a = 3.25 - .086409(51.9) - .087602(32.75) = -4.10$$

$$Y' = -4.10 + .09X_1 + .09X_2$$

Predicted job performance as a function of test scores.
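The estimates above can be reproduced from the SSCP entries and the means. This is a minimal sketch in Python; the variable names and the `predict` helper are illustrative assumptions, and only the numbers come from the slides.

```python
# Deviation sums of squares and cross-products from the SSCP matrix.
ss_x1, ss_x2 = 1091.8, 521.75
sp_x1y, sp_x2y, sp_x1x2 = 139.5, 90.25, 515.5
mean_y, mean_x1, mean_x2 = 3.25, 51.9, 32.75

# Both slopes share the same denominator.
denom = ss_x1 * ss_x2 - sp_x1x2 ** 2
b1 = (ss_x2 * sp_x1y - sp_x1x2 * sp_x2y) / denom
b2 = (ss_x1 * sp_x2y - sp_x1x2 * sp_x1y) / denom

# The intercept forces the plane through the point of means.
a = mean_y - b1 * mean_x1 - b2 * mean_x2


def predict(x1, x2):
    """Predicted job performance Y' for given test scores (hypothetical helper)."""
    return a + b1 * x1 + b2 * x2


print(round(b1, 6), round(b2, 6), round(a, 2))
print(round(predict(45, 20), 2))  # case with X1 = 45, X2 = 20
```

Carrying the unrounded slopes into `predict` matters: the rounded equation with .09 for both weights gives noticeably different predicted values.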

Scatterplots

[Figure: scatterplots. Axis labels: Mechanical Aptitude, Conscientiousness.]

Scatterplot 2

Scatterplot of Predicted vs. Actual Values

Scatterplot 3

3D Plot of Regression

Predicted Y is a plane.

R2

Y is a linear function of the Xs plus error.

$Y = a + b_1X_1 + b_2X_2 + \dots + b_kX_k + e$

      Y      X1      X2      Y'    Resid
      2      45      20    1.54     0.46
      1      38      30    1.81    -0.81
      3      50      30    2.84     0.16
      2      48      28    2.50    -0.50
      3      55      30    3.28    -0.28
      3      53      34    3.45    -0.45
      4      55      36    3.80     0.20
      4      58      32    3.71     0.29
      3      40      34    2.33     0.67
      5      55      38    3.98     1.02
      3      48      28    2.50     0.50
      3      45      30    2.41     0.59
      2      55      36    3.80    -1.80
      4      60      34    4.06    -0.06
      5      60      38    4.41     0.59
      5      60      42    4.76     0.24
      5      65      38    4.84     0.16
      4      50      34    3.19     0.80
      3      58      38    4.24    -1.24
M   3.25    51.9   32.75   3.25     0
V   1.57   57.46   27.46   1.05     0.52
USS 29.83                  19.95    9.42

Use capital R for multiple regression.

$R^2$ is the proportion of variance in Y due to regression.

$$R^2 = \frac{1.05}{1.57} \sim .61$$

$$R = r_{YY'} = .78; \quad R^2 = .61$$

Note: N=19; lost 1.

Correlations Among Data

         Y     X1     X2    Pred   Resid
X1      .73     1
X2      .68    .64     1
Pred    .78    .94    .87     1
Resid   .62     0      0      0      1

Excel Example

Grab from the web under Lecture, Excel Example.

Review

• Write a raw score regression equation with 2 IVs in it. Describe terms.

• Describe a concrete example where you would use multiple regression to analyze the data.

• What does R2 mean in multiple regression?

• For your concrete example, what would an R2 of .15 mean?

• With 1 IV, the IV and the predicted values correlate 1.0. Not so with 2 or more IVs. Why?

Significance Test for R2

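$$F = \frac{R^2 / k}{(1 - R^2) / (N - k - 1)}$$

When the null is true, the result is distributed as F with k and (N-k-1) df.

In our example, $R^2$ = .61, k = 2 and N = 20.

$$F = \frac{.61/2}{(1 - .61)/(20 - 2 - 1)} = \frac{.305}{.39/17} = 13.29$$

F(α=.05,2,17)=3.59. Because 13.29 exceeds 3.59, $R^2$ is significant at α = .05.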
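The F ratio for the example can be checked with a few lines of Python. This is a minimal sketch; the function name `f_for_r2` is an assumption, and the critical value is simply the tabled value quoted in the slides rather than something computed here.

```python
def f_for_r2(r2, k, n):
    """F statistic for H0: population R-square = 0, with k and n - k - 1 df."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


f_obs = f_for_r2(0.61, k=2, n=20)
f_crit = 3.59  # tabled F(alpha = .05; 2, 17) as quoted in the text

print(round(f_obs, 2), f_obs > f_crit)
```

A statistics library could supply the critical value or an exact p value, but the comparison against the tabled value is all the slide requires.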

The Problem of Variable Importance

With 1 IV, the correlation provides a simple index of the ‘importance’ of that variable. Both r and r2 are good indices of importance with 1 IV.

With multiple IVs, total R-square will be the sum of the individual IV r2 values, if and only if the IVs are mutually uncorrelated, that is, they correlate to some degree with Y, but not with each other.

When multiple IVs are correlated, there are many different statistical indices of the ‘importance’ of the IVs, and they do not agree with one another. There is no simple answer to questions about the importance of correlated IVs. Rather there are many reasonable answers depending on what you mean by importance.

Venn Diagrams {easy but not always right}

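[Figure: Venn diagrams of Y, X1, and X2, with regions labeled UY:X1, UY:X2, Shared Y, and Shared X. The overlap of X1 with Y is the $r^2$ for X1, Y.]

Fig 1. IVs uncorrelated.

R      Y     X1    X2
X1    .50     1
X2    .60    .00    1

$R^2 = .5^2 + .6^2 = .61$

Fig 2. IVs correlated.

R      Y     X1    X2
X1    .40     1
X2    .50    .30    1

$R^2 = .32$

$R^2 \neq .16 + .25 = .41$. What to do with shared Y?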
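The two Venn-diagram cases can be verified numerically. The sketch below uses the standard two-predictor formula for R-square from correlations; the function name `r2_two_predictors` is an illustrative assumption.

```python
def r2_two_predictors(r_y1, r_y2, r_12):
    """R-square for Y regressed on two predictors, from the three correlations."""
    return (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)


# Uncorrelated IVs: r_y1 = .50, r_y2 = .60, r_12 = .00
r2_uncorrelated = r2_two_predictors(0.50, 0.60, 0.00)

# Correlated IVs: r_y1 = .40, r_y2 = .50, r_12 = .30
r2_correlated = r2_two_predictors(0.40, 0.50, 0.30)
sum_of_squared_r = 0.40 ** 2 + 0.50 ** 2  # overstates R-square when IVs overlap

print(round(r2_uncorrelated, 2), round(r2_correlated, 2), round(sum_of_squared_r, 2))
```

With uncorrelated IVs the squared correlations add up to R-square; with correlated IVs the shared part of Y would be counted twice if they were simply summed.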

More Venn Diagrams

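[Figure: Venn diagrams of the desired state and the typical state.]

In a regression problem, we want to predict Y from X as well as possible (maximize $R^2$). To do so, we want X variables that are correlated with Y but not with each other. Such variables are hard to find; cognitive ability tests, for example, tend to correlate with one another.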

Raw & Standardized Regression Weights

• Each X has a raw score slope, b.

• Slope tells expected change in Y if X changes 1 unit*.

• Large b weights should indicate important variables, but b depends on the units (and so the variance) of X.

• A b for height in feet would be 12 times larger than the b for height in inches.

• If we standardize X and Y, all units of X are the same.

• Relative size of b now meaningful.

*strictly speaking, holding other X variables constant.
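To see how the raw slope depends on the units of X while the standardized slope does not, here is a minimal sketch in Python. The height and weight values are hypothetical, invented only for illustration, and the helper names `slope` and `beta` are assumptions.

```python
import math
import statistics


def slope(x, y):
    """Least-squares slope of y on x (1 IV), computed from deviation scores."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sp = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    ss = sum((xi - mx) ** 2 for xi in x)
    return sp / ss


def beta(x, y):
    """Standardized slope: b rescaled by SD(x) / SD(y)."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)


# Hypothetical data, invented for illustration only.
height_in = [60, 64, 66, 70, 72, 75]
weight_lb = [115, 130, 150, 160, 185, 200]
height_ft = [h / 12 for h in height_in]  # same heights expressed in feet

b_in = slope(height_in, weight_lb)
b_ft = slope(height_ft, weight_lb)

print(b_ft / b_in)  # ratio of the two raw slopes
print(beta(height_in, weight_lb), beta(height_ft, weight_lb))
```

The raw slopes differ by the unit conversion factor, while the two standardized weights agree, which is why the relative size of standardized weights can be compared across predictors.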

Computing Standardized Regression Weights

$z'_y = \beta z_x$ Standardized regression weight, aka beta weight. Poor choice of names & symbols.

With 1 IV, $\beta = r$, so $z'_y = r_{xy} z_x$.

If you have a correlation matrix, you can calculate beta weights (standardized regression weights).

Y x1 x2

x1 0.73 1

x2 0.68 0.64 1

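$$\beta_1 = \frac{r_{y1} - r_{y2} r_{12}}{1 - r_{12}^2} \qquad \beta_2 = \frac{r_{y2} - r_{y1} r_{12}}{1 - r_{12}^2}$$

What is $r_{12}$? What impact?

$$\beta_1 = \frac{.73 - .68(.64)}{1 - .64^2} = .50$$

$$\beta_2 = \frac{.68 - .73(.64)}{1 - .64^2} = .36$$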
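The beta weights can be computed directly from the three correlations, which also makes the impact of the predictor intercorrelation easy to explore. This is a minimal sketch; the function name `beta_weights` is an assumption.

```python
def beta_weights(r_y1, r_y2, r_12):
    """Standardized regression weights for two predictors, from correlations."""
    denom = 1 - r_12 ** 2
    beta1 = (r_y1 - r_y2 * r_12) / denom
    beta2 = (r_y2 - r_y1 * r_12) / denom
    return beta1, beta2


# Correlations from the matrix above: r_y1 = .73, r_y2 = .68, r_12 = .64
beta1, beta2 = beta_weights(0.73, 0.68, 0.64)

# Same validities, but with uncorrelated predictors (r_12 = 0)
beta1_uncorr, beta2_uncorr = beta_weights(0.73, 0.68, 0.0)

print(round(beta1, 2), round(beta2, 2))
print(beta1_uncorr, beta2_uncorr)
```

When the predictors are uncorrelated, each beta weight equals the simple correlation; with a predictor intercorrelation of .64, both weights shrink well below their correlations.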

Calculating R2

$$R^2 = \frac{SS_{reg}}{SS_Y}$$

$$R^2 = r_{y1}^2 + r_{y2}^2 \quad (\text{only if } r_{12} = 0)$$

Sum of squared simple (zero order) r.

$$R^2 = \beta_1 r_{y1} + \beta_2 r_{y2}$$

Product of standardized regression weight and r.

This is really interesting because the sum of products will add up to $R^2$ and because r, β, and the product of the two are all reasonable indices of the importance of the IV.

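Calculating R2 (2)

       Y      x1     x2
x1    0.73     1
x2    0.68    0.64    1

$$R^2 = \beta_1 r_{y1} + \beta_2 r_{y2} = .50(.73) + .36(.68) = .365 + .24 = .61$$

Ratio of X1 to X2 on each index (r, β, and the product of the two):

$$\frac{.73}{.68} = 1.07 \qquad \frac{.50}{.36} = 1.39 \qquad \frac{.365}{.24} = 1.52$$

$$R^2 = \frac{r_{y1}^2 + r_{y2}^2 - 2 r_{y1} r_{y2} r_{12}}{1 - r_{12}^2}$$

$$\beta_1 = \frac{.73 - .68(.64)}{1 - .64^2} = .50 \qquad \beta_2 = \frac{.68 - .73(.64)}{1 - .64^2} = .36$$

$$R^2 = \frac{.73^2 + .68^2 - 2(.73)(.68)(.64)}{1 - .64^2} = .61$$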
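Both routes to R-square can be checked against each other. The sketch below computes R-square as the sum of beta-times-r products and again from the correlations alone; the variable names are illustrative assumptions.

```python
# Correlations from the matrix above.
r_y1, r_y2, r_12 = 0.73, 0.68, 0.64

# Standardized regression weights.
denom = 1 - r_12 ** 2
beta1 = (r_y1 - r_y2 * r_12) / denom
beta2 = (r_y2 - r_y1 * r_12) / denom

# Route 1: sum of (beta weight x simple correlation) products.
r2_products = beta1 * r_y1 + beta2 * r_y2

# Route 2: directly from the three correlations.
r2_direct = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / denom

print(round(r2_products, 2), round(r2_direct, 2))
```

The two routes are algebraically identical, so they agree to floating-point precision; small differences in hand calculations come from rounding the beta weights first.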

Review

• What is the problem with correlated independent variables if we want to maximize variance accounted for in the criterion?

• Why do we report beta weights (standardized b weights)?

• Describe R-square in two different ways, that is, using two distinct formulas. Explain the formulas.

Tests of Regression Coefficients (b Weights)

Each slope tells the expected change in Y when X changes 1 unit, holding all other X variables constant. Consider Venn diagrams. Standard errors of b weights with 2 IVs:

$$S_{b_1} = \sqrt{\frac{S^2_{y.12}}{\sum x_1^2 \,(1 - r_{12}^2)}} \qquad S_{b_2} = \sqrt{\frac{S^2_{y.12}}{\sum x_2^2 \,(1 - r_{12}^2)}}$$

Where $S^2_{y.12}$ is the variance of estimate (variance of residuals), the first term in the denominator is the sum of squares for X1 or X2, and $r^2_{12}$ is the squared correlation between predictors.

$$S^2_{y.12} = \frac{SS_{res}}{N - k - 1}$$

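Tests of b Weights (2)

$SS_{res} = 9.42$; $S^2_{y.12} = \frac{SS_{res}}{N - k - 1} = .59$

$\sum x_1^2 = 1091.8$; $\sum x_2^2 = 521.75$

$$S_{b_1} = \sqrt{\frac{.59}{1091.8\,(1 - .64^2)}} = .03 \qquad S_{b_2} = \sqrt{\frac{.59}{521.75\,(1 - .64^2)}} = .04$$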

For significance of the b weight, compute a t:

$$t = \frac{b}{S_b} \qquad t_{.05,17} = 2.11$$

$$t = \frac{.09}{.04} = 2.25$$

Degrees of freedom for each t are N-k-1.
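The standard errors can be reproduced from the quantities given on the slide. This is a minimal sketch; the function names `se_b` and `t_ratio` are assumptions, the variance of estimate (.59) is taken as given rather than recomputed, and the t ratio uses the rounded values shown in the text.

```python
import math

s2_res = 0.59                    # variance of estimate, as given in the text
ss_x1, ss_x2 = 1091.8, 521.75    # sums of squares for X1 and X2
r_12 = 0.64                      # correlation between the two predictors


def se_b(s2_res, ss_x, r_12):
    """Standard error of a b weight with 2 IVs."""
    return math.sqrt(s2_res / (ss_x * (1 - r_12 ** 2)))


def t_ratio(b, se):
    """t statistic for a b weight; df = N - k - 1."""
    return b / se


se_b1 = se_b(s2_res, ss_x1, r_12)
se_b2 = se_b(s2_res, ss_x2, r_12)
print(round(se_b1, 2), round(se_b2, 2))

# Rounded values as shown in the text (b = .09, S_b = .04).
print(round(t_ratio(0.09, 0.04), 2))
```

The three factors that drive the standard error are visible in `se_b`: the residual variance, the sum of squares of the predictor, and the correlation between predictors. In practice one would carry unrounded values through to the t ratio, since rounding b and its standard error first can shift t noticeably.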

Tests of R2 vs Tests of b

• Slopes (b) tell about the relation between Y and the unique part of X. R2 tells about proportion of variance in Y accounted for by set of predictors all together.

• Correlations among X variables increase the standard errors of b weights but not R2.

• Possible to get significant R2, but no or few significant b weights (see Venn diagrams).

• Possible but unlikely to have significant b but not significant R2. Look to R2 first. If it is n.s., avoid interpreting b weights.

Review

• How is it possible to have a significant R-square and non-significant b weights?

• Write a regression equation with beta weights in it. Describe terms.

Testing Incremental R2

You can start regression with a set of one or more variables and then add predictors 1 or more at a time. When you add predictors, R2 will never go down. It usually goes up, and you can test whether the increment in R2 is significant or else if likely due to chance.

$$F = \frac{(R_L^2 - R_S^2)/(k_L - k_S)}{(1 - R_L^2)/(N - k_L - 1)}$$

$R_L^2$ = R-square for the larger model

$R_S^2$ = R-square for the smaller model

$k_L$ = number of predictors in the larger model

$k_S$ = number of predictors in the smaller model

Examples of Testing Increments

Suppose we start with 1 variable and R-square is .52. We add a second variable and R-square increases to .67. We have 20 people. Then

$$F = \frac{(.67 - .52)/(2 - 1)}{(1 - .67)/(20 - 2 - 1)} = \frac{.15}{.33/17} = 7.73$$

$F_{(.05,1,17)} = 4.45$; p < .05

Suppose we start with 3 IVs and R-square is .25. We add 2 more IVs in a block and R-square climbs to .35. We have 100 people. Then:

$$F = \frac{(.35 - .25)/(5 - 3)}{(1 - .35)/(100 - 5 - 1)} = \frac{.05}{.65/94} = 7.23$$

$F_{(.05,2,94)} = 3.09$; p < .05
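Both worked examples can be checked with one small function. This is a minimal sketch; the function name `f_increment` and its parameter names are assumptions chosen to mirror the definitions above.

```python
def f_increment(r2_large, r2_small, k_large, k_small, n):
    """F test for the increment in R-square when predictors are added.

    Numerator df = k_large - k_small; denominator df = n - k_large - 1.
    """
    numerator = (r2_large - r2_small) / (k_large - k_small)
    denominator = (1 - r2_large) / (n - k_large - 1)
    return numerator / denominator


# Example 1: 1 IV -> 2 IVs, R-square .52 -> .67, N = 20
f1 = f_increment(0.67, 0.52, k_large=2, k_small=1, n=20)

# Example 2: 3 IVs -> 5 IVs, R-square .25 -> .35, N = 100
f2 = f_increment(0.35, 0.25, k_large=5, k_small=3, n=100)

print(round(f1, 2), round(f2, 2))
```

Each result is then compared with the tabled critical value for its own degrees of freedom (4.45 and 3.09 in the examples).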

Another Look at Importance

• In regression problems, the most commonly used indices of importance are the correlation, r, and the increment to R-square when the variable of interest is considered last. The second is sometimes called a last-in R-square change. The last-in increment corresponds to the Type III sums of squares and is closely related to the b weight.

• The correlation tells about the importance of the variable ignoring all other predictors.

• The last-in increment tells about the importance of the variable as a unique contributor to the prediction of Y, above and beyond all other predictors in the model.

• You can assign shared variance in Y to specific X by adding variables to equations in order, but then the importance is sort of arbitrary and under your influence.

• “Importance” is not well defined statistically when IVs are correlated. Doesn’t include mediated models (path analysis).
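The last-in R-square change for each predictor in the two-IV example can be computed from the correlation matrix used earlier (.73, .68, .64). This is a minimal sketch; the variable names are illustrative assumptions.

```python
# Correlations from the two-IV example: Y with X1, Y with X2, X1 with X2.
r_y1, r_y2, r_12 = 0.73, 0.68, 0.64

# R-square for the full two-predictor model.
r2_full = (r_y1 ** 2 + r_y2 ** 2 - 2 * r_y1 * r_y2 * r_12) / (1 - r_12 ** 2)

# Last-in increment: full R-square minus R-square with the other IV alone.
last_in_x1 = r2_full - r_y2 ** 2  # X1 added after X2
last_in_x2 = r2_full - r_y1 ** 2  # X2 added after X1

# Variance in Y that the two predictors share and neither can claim uniquely.
shared = r2_full - last_in_x1 - last_in_x2

print(round(last_in_x1, 3), round(last_in_x2, 3), round(shared, 3))
```

Each unique contribution is much smaller than the corresponding squared correlation, and most of the explained variance falls in the shared portion, which is why the indices of importance disagree when IVs are correlated.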

Review

• Find data on website – Labs, then 2IV example

• Find r, beta, r*beta

• Describe importance
