Multiple Independent Variables
![Page 1: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/1.jpg)
Multiple Independent Variables
POLS 300
Butz
![Page 2: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/2.jpg)
Multivariate Analysis
• Problem with bivariate analysis in nonexperimental designs:
– Spuriousness and Causality
• Need for techniques that allow the researcher to control for other independent variables
![Page 3: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/3.jpg)
Multivariate Analysis
• Employed to see how large sets of variables are interrelated.
• The idea is that if we find a relationship between X and Y after accounting for other variables (W and Z), we may be able to make a “causal inference”.
![Page 4: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/4.jpg)
Multivariate Analysis
• We know that X and Y may both be caused by Z, which produces a spurious relationship.
• Multivariate Analysis allows us to include other variables and test whether there is still a relationship between X and Y.
![Page 5: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/5.jpg)
Multivariate Analysis
• Must ask whether a third variable (and maybe others) is the “true” cause of both the IV and DV
• Experimental analyses “prove” causation, but only in a Laboratory Setting…must use Multivariate Statistical Analyses in the “real world”
• Need to “Control” or “hold constant” other variables to isolate the effect of IV on DV!
![Page 6: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/6.jpg)
Controlling for Other Independent Variables
• Multivariate Crosstabulation – evaluate bivariate relationship within subsets of sample defined by different categories of third variable (“control by grouping”)
• At what level(s) of measurement would we use Multivariate Crosstabulation??
![Page 7: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/7.jpg)
Multivariate Crosstabulation
• Control by grouping: group the observations according to their values on the third variable, then observe the original relationship within each of these groups.
• P. 407/506 – Spending Attitudes and Voting…controlling for Income! – spurious
• Occupational Status and Voter Turnout
• P. 411/510…control for “education”!
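Control by grouping can be sketched in a few lines of Python. The counts below are hypothetical (they are not the textbook's data) and were chosen so that the attitude-vote relationship disappears once income is held constant; the category labels and the `vote_rate` helper are likewise illustrative assumptions.

```python
# Hypothetical counts, invented for illustration only:
# (income group, spending attitude) -> (respondents, number voting Republican)
counts = {
    ("low", "favor spending"): (80, 16),
    ("low", "oppose spending"): (20, 4),
    ("high", "favor spending"): (20, 16),
    ("high", "oppose spending"): (80, 64),
}


def vote_rate(attitude, income=None):
    """Share voting Republican for one attitude, optionally within one income group."""
    respondents = 0
    republican = 0
    for (inc, att), (n, n_rep) in counts.items():
        if att == attitude and (income is None or inc == income):
            respondents += n
            republican += n_rep
    return republican / respondents


# Bivariate table: attitude appears to matter.
print("All respondents:",
      vote_rate("favor spending"), vote_rate("oppose spending"))

# Same relationship within each category of the control variable.
for income in ("low", "high"):
    print(f"{income} income:",
          vote_rate("favor spending", income),
          vote_rate("oppose spending", income))
```

In this made-up sample the two attitude groups differ overall, but within each income group their vote rates are identical, which is the pattern a spurious relationship produces.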
![Page 8: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/8.jpg)
Quick Review: Regression
• In general, the goal of linear regression is to find the line that best predicts Y from X.
• Linear regression does this by estimating the line that minimizes the sum of the squared errors from the line.
• That is, it minimizes the squared vertical distances of the data points from the line.
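As a minimal sketch of the least-squares idea, the Python below fits a line to a small made-up data set and computes the sum of squared errors. The data values and the function names `fit_line` and `sum_squared_errors` are assumptions for illustration, not part of the course material.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope = co-variation of X and Y divided by variation in X.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def sum_squared_errors(xs, ys, a, b):
    """Sum of squared vertical distances between the points and the line."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


# Made-up data for illustration only.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b = fit_line(xs, ys)
best = sum_squared_errors(xs, ys, a, b)
# Any other line, such as one with a slightly steeper slope, has a larger error sum.
other = sum_squared_errors(xs, ys, a, b + 0.1)
print(a, b, best, other)
```

Changing the intercept or slope away from the fitted values always increases the sum of squared errors, which is what "best predicts Y from X" means here.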
![Page 9: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/9.jpg)
Regression vs. Correlation
• The purpose of regression analysis is to determine exactly what the line is (i.e. to estimate the equation for the line)
• The regression line represents predicted values of Y based on the values of X
![Page 10: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/10.jpg)
Equation for a Line (Perfect Linear Relationship)
Yi = a + BXi
a = Intercept, or Constant = The value of Y when X = 0
B = Slope coefficient = The change (+ or ‑) in Y given a one unit change in X
![Page 11: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/11.jpg)
Slope
• Yi = a + BXi
• B = Slope coefficient
• If B is positive, then you have a positive relationship. If it is negative, you have a negative relationship.
• The larger the value of B the more steep the slope of the line…Greater (more dramatic) change in Y for a unit change in X
• General Interpretation: For one unit change in X, we expect a B change in Y on average.
![Page 12: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/12.jpg)
Calculating the Regression Equation For “Threat Hypothesis”
• The estimated regression equation is:
E(welfare benefit1995) = 422.7879 + [(-6.292) * %black(1995)]
Number of obs = 50
F( 1, 64) = 76.651
Prob < = 0.001
R-squared = 0.3361

| welfare1995 | Coef. | Std. Err. | t | P< | [95% Conf. Interval] |
|---|---|---|---|---|---|
| Black1995 (b) | -6.29211 | .771099 | -8.162 | 0.001 | -8.1173 to -4.0746 |
| _cons (a) | 422.7879 | 12.63348 | 25.551 | 0.001 | 317.90407 to 336.6615 |
![Page 13: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/13.jpg)
Regression Example: “Threat Hypothesis”
• To generate a predicted value for various % of AA in 1995, we could simply plug in the appropriate X values and solve for Y.
10% E(welfare benefit1995) = 422.7879 + [(-6.292) * 10] = $359.87
20% E(welfare benefit1995) = 422.7879 + [(-6.292) * 20] = $296.95
30% E(welfare benefit1995) = 422.7879 + [(-6.292) * 30] = $234.03
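The plug-in calculation can be checked with a short Python sketch. It uses the intercept and slope from the estimated equation; the function name `predicted_benefit` is an illustrative assumption.

```python
INTERCEPT = 422.7879  # a, from the estimated regression equation
SLOPE = -6.292        # b, change in benefit per one-point increase in %black


def predicted_benefit(pct_black):
    """Predicted 1995 welfare benefit (dollars) for a given %black value."""
    return INTERCEPT + SLOPE * pct_black


for pct in (10, 20, 30):
    print(f"{pct}%: ${predicted_benefit(pct):.2f}")
```

Each 10-point increase in X lowers the prediction by 10 × 6.292 = $62.92, which is the slope interpretation applied to a 10-unit change.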
![Page 14: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/14.jpg)
Regression Analysis and Statistical Significance
• Testing for statistical significance for the slope
– The p-value - probability of observing a sample slope value (Beta Coefficient) at least as far from zero as the one we are observing in our sample IF THE NULL HYPOTHESIS IS TRUE
– P-values closer to 0 indicate stronger evidence against the null hypothesis (P < .05 is usually the threshold for statistical significance)
– Based on t-value…(Beta/S.E.) = t
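A minimal sketch of the t-value calculation, using the slope and standard error reported for the “Threat Hypothesis” regression. The cutoff of about 2.01 is an assumption: it is the approximate two-tailed .05 critical value for 48 degrees of freedom, which presumes 50 observations and two estimated parameters (intercept and slope).

```python
def t_value(coef, std_err):
    """t = Beta / S.E."""
    return coef / std_err


# Slope and standard error from the Threat Hypothesis output.
t = t_value(-6.29211, 0.771099)

# Assumption: about 2.01 is the two-tailed .05 critical value for 48 df.
CRITICAL_T = 2.01
significant = abs(t) > CRITICAL_T

print(f"t = {t:.2f}, significant at .05: {significant}")
```

The further the t-value is from zero, the smaller the p-value, so a slope many standard errors away from zero is unlikely to have arisen if the true slope were zero.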
![Page 15: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/15.jpg)
Multiple Regression
• At what level(s) of measurement would we employ multiple regression???
• Interval and Ratio DVs
• Now working with a new model:
• Yi = a + b1X1i + b2X2i + … + bkXki + ei
![Page 16: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/16.jpg)
Multiple Regression
• Yi = a + b1X1i + b2X2i + … + bkXki + ei
• b1 through bk are “Partial” slope coefficients.
• a is the Y-Intercept.
• e is the Error Term.
![Page 17: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/17.jpg)
Slope Coefficients
• Slope coefficients are now Partial Slope Coefficients, although we still refer to them generally as slope coefficients. They have a new interpretation:
• “The expected change in Y given a one‑unit change in X1, holding all other variables constant”
![Page 18: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/18.jpg)
Multiple Regression
• By “holding constant” all other X’s, we are therefore “controlling for” all other X’s, and thus isolating the “independent effect” of the variable of interest.
• “Holding Constant” – group observations according to levels of X2, X3, etc.…then look at impact of X1 on Y!
• This is what Multiple Regression is doing in practice!!!
• Make everyone “equal” in terms of “control” variable then examine the impact of X1 on Y!
![Page 19: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/19.jpg)
“Holding Constant” other IVs
• Income (Y) as a function of Education (X1) and Seniority (X2)
• Look at relationship between Seniority and Income WITHIN different levels of Education!!! “Holding Education Constant”
• Look at relationship between Education and Income WITHIN different levels of Seniority!!! “Holding Seniority Constant”
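One way to see what “holding constant” does numerically is to strip out the part of Education and Income that Seniority predicts, then relate what is left over. The sketch below does this in plain Python. The data are hypothetical, generated from an assumed equation (Income = 5000 + 400·Education + 300·Seniority, with no error term), and the function names are illustrative. This residual-based approach is a standard way to obtain a partial slope; it is shown as an illustration, not as the procedure any particular software package uses.

```python
def simple_ols(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
    return mean_y - b * mean_x, b


def residuals(xs, ys):
    """Part of ys that is NOT predicted by xs."""
    a, b = simple_ols(xs, ys)
    return [y - (a + b * x) for x, y in zip(xs, ys)]


def partial_slope(x1, x2, y):
    """Slope of y on x1, holding x2 constant."""
    x1_left_over = residuals(x2, x1)  # education not explained by seniority
    y_left_over = residuals(x2, y)    # income not explained by seniority
    _, b1 = simple_ols(x1_left_over, y_left_over)
    return b1


# Hypothetical data: more educated workers tend to have less seniority.
education = [8, 10, 12, 12, 14, 16, 16, 18]
seniority = [20, 3, 15, 2, 10, 1, 8, 4]
income = [5000 + 400 * e + 300 * s for e, s in zip(education, seniority)]

_, bivariate = simple_ols(education, income)
controlled = partial_slope(education, seniority, income)
print(f"bivariate slope: {bivariate:.1f}, partial slope: {controlled:.1f}")
```

In this made-up sample the bivariate slope understates the effect of Education because Education and Seniority are negatively related; the partial slope recovers the assumed 400.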
![Page 20: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/20.jpg)
The Intercept
Yi = a + b1X1i + b2X2i + … + bkXki + ei
Y-Intercept (Constant) value…(a)…is now the expected value of Y when ALL the Independent Variables are set to 0.
![Page 21: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/21.jpg)
Testing for Statistical Significance
• Proceeds as before – a p-value (the probability of observing a slope at least this far from zero if the null hypothesis is true) is generated for each sample slope coefficient
• Based on “t-value” (Beta/ S.E.)
• And Degrees of Freedom!
![Page 22: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/22.jpg)
Fit of the Regression
• R-squared value – the proportion of variation in the dependent variable explained by ALL of the independent variables combined
• (TSS – ResSS) / TSS… “Explained Variation in DV divided by Total Variation in DV”
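The R-squared formula can be sketched directly in Python. The observed and predicted values below are made up for illustration, and the function name `r_squared` is an assumption.

```python
def r_squared(actual, predicted):
    """(TSS - ResSS) / TSS: share of variation in Y explained by the model."""
    mean_y = sum(actual) / len(actual)
    tss = sum((y - mean_y) ** 2 for y in actual)                  # total variation
    res_ss = sum((y - p) ** 2 for y, p in zip(actual, predicted))  # unexplained
    return (tss - res_ss) / tss


# Made-up observed values and the predictions from a fitted line.
actual = [2, 4, 5, 4, 5]
predicted = [2.8, 3.4, 4.0, 4.6, 5.2]

r2 = r_squared(actual, predicted)
print(f"R-squared = {r2:.2f}")
```

Here TSS is 6 and ResSS is 2.4, so the model explains (6 − 2.4) / 6 = 60% of the variation in Y.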
![Page 23: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/23.jpg)
R-square
• R-square ranges from 0 to 1.
• 0 is no relationship…IVs explain none of the variance in the DV.
• 1 is a perfect relationship…IVs explain 100% of the variance in the DV.
![Page 24: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/24.jpg)
R-square
• Doesn’t tell us WHY the dependent variable varies, and doesn’t explain the results….This is why we need Theory!!!
• Simply a measure of how well your model fits the dependent variable.
• How well are the Xs predicting Y!
• How much variation in Y is explained by Xs!
![Page 25: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/25.jpg)
Multiple Regression
• Y= Income in dollars
• X1= Education in years
• X2= Seniority in years
• Y= a + b1(education) + b2(Seniority) + e
![Page 26: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/26.jpg)
Example
• Y= 5666 + 432X1 + 281X2 + e
• Both Coefficients are statistically significant at the P < .05 Level…
• Because the Beta for Education is positive, the expected change in Income (Y) given a one-unit increase in Education is +$432, holding seniority in years constant.
![Page 27: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/27.jpg)
Predicted Values
• Let’s predict income for someone with 10 years of education and 5 years of seniority.
• Y= 5666+432X1+281X2+e
• = 5666+432(10)+281(5)
• = 5666+ 4320+1405
• Predicted value of Y for this case is $11,391.
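The same prediction can be reproduced with a small Python function built from the estimated equation; the function name `predicted_income` is an illustrative assumption. The error term e is left out because a prediction uses only the estimated intercept and slopes.

```python
def predicted_income(education_years, seniority_years):
    """Predicted income in dollars from Y = 5666 + 432*X1 + 281*X2."""
    return 5666 + 432 * education_years + 281 * seniority_years


base = predicted_income(10, 5)
# One more year of education, seniority held constant at 5 years.
one_more_year = predicted_income(11, 5)

print(base)                  # 11391
print(one_more_year - base)  # 432
```

The second line of output shows the partial slope at work: raising education by one year while keeping seniority fixed changes the prediction by exactly the education coefficient.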
![Page 28: Multiple Independent Variables](https://reader036.fdocuments.net/reader036/viewer/2022070402/5681386a550346895da01b4c/html5/thumbnails/28.jpg)
R-squared
• R-squared for this model is .56.
• Education and Seniority explain 56% of the variation in income.