THE MULTIPLE REGRESSION MODEL
Date posted: 22-Dec-2015
MULTIPLE REGRESSION
• In a multiple regression we are trying to evaluate the cumulative effects that changes to more than one independent variable (x1, x2, x3, etc.) will have on a dependent variable (y).
Transformations to a Linear Model
• Multiple regression can be used to evaluate models like:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_1^2 + \beta_4 x_1 x_2 + \beta_5 (x_1/x_2) + \beta_6 \log x_1 + \varepsilon$
– Define
• $x_3 = x_1^2$
• $x_4 = x_1 x_2$
• $x_5 = x_1/x_2$
• $x_6 = \log x_1$
• Then the model becomes:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \beta_4 x_4 + \beta_5 x_5 + \beta_6 x_6 + \varepsilon$
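The substitutions defined above can be sketched in a few lines of Python. This is only an illustration: the function name `transform` is hypothetical, and the slide does not say which logarithm is meant, so the natural log is assumed here.

```python
import math

def transform(x1, x2):
    """Map the raw inputs (x1, x2) to the six inputs x1..x6 of the linear model."""
    return [
        x1,
        x2,
        x1 ** 2,        # x3 = x1^2
        x1 * x2,        # x4 = x1 * x2
        x1 / x2,        # x5 = x1 / x2 (requires x2 != 0)
        math.log(x1),   # x6 = log x1 (natural log assumed; requires x1 > 0)
    ]

# One observation with made-up values x1 = 2, x2 = 4
row = transform(2.0, 4.0)
```

Once every observation has been transformed this way, the model is linear in x1 through x6 and can be fit like any other multiple regression.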
GENERAL FORM OF A MULTIPLE REGRESSION MODEL
Since we can make substitutions similar to those just described, the general multiple regression model can be expressed as:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + \dots + \beta_k x_k + \varepsilon$
THE REGRESSION APPROACH
• Hypothesize a form of the model
• Determine the best estimates for the β's
• Assumptions about ε
• Testing the strength of the model
• Using the model for prediction/estimation
Example
• It is felt that the price of a house in Laguna Hills is a function of its square footage, its lot size, and its age.
• A sample of 38 recent sales in Laguna Hills is taken.
STEP 1: Hypothesizing a form of the model
• One variable -- scatterplot
– If it looks curved, hypothesize a higher order model and make transformations to a linear model
• More than one variable
– Simply HYPOTHESIZE: make a best judgment as to the form of the model
– Make appropriate substitutions of variables so that the model is linear
STEP 2: Determining the Best Estimates for the β's
• Involves complicated matrix operations but still uses the method of least squares.
• Use computer (EXCEL) only
• But the best values for the β's are still those that minimize the sum of the squared errors between the actual values of y and the predicted values for y, i.e., they minimize SSE.
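For readers curious about what the computer is doing, here is a minimal sketch of a least squares fit in plain Python. It solves the normal equations (X'X)b = X'y by Gauss-Jordan elimination, which is one standard way to minimize SSE; the source does not say how Excel does it internally. The function names and the data are made up for illustration.

```python
def fit_least_squares(rows, y):
    """Return [b0, b1, ..., bk] minimizing SSE, via the normal equations."""
    X = [[1.0] + [float(v) for v in r] for r in rows]  # column of 1s for b0
    p = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = []
    for i in range(p):
        row = [sum(x[i] * x[j] for x in X) for j in range(p)]
        row.append(sum(x[i] * yi for x, yi in zip(X, y)))
        A.append(row)
    # Gauss-Jordan elimination with partial pivoting
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("the x columns are linearly dependent")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(p):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * c for a, c in zip(A[r], A[col])]
    return [A[i][p] / A[i][i] for i in range(p)]

def sse(rows, y, b):
    """Sum of squared errors between actual y and predicted y."""
    total = 0.0
    for r, yi in zip(rows, y):
        predicted = b[0] + sum(bj * xj for bj, xj in zip(b[1:], r))
        total += (yi - predicted) ** 2
    return total

# Invented data: five observations of (x1, x2) and y
rows = [(1, 2), (2, 1), (3, 4), (4, 3), (5, 5)]
y = [9.1, 7.9, 19.2, 17.8, 26.0]

b = fit_least_squares(rows, y)
best_sse = sse(rows, y, b)
```

Nudging any coefficient in `b` up or down and recomputing `sse` gives a larger value, which is what "best estimates" means here.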
Using Excel to Get the b’s
Go to TOOLS/DATA ANALYSIS/REGRESSION
Note: B1:D39 must be a contiguous range
STEP 3: Assumptions for ε
For any given set of the x's:
– ε has a normal distribution
– E(ε) = 0
Also:
– Errors are independent
– σ (the standard deviation of ε) does not vary between different values of the x's
Since there is more than one x, we say "x's", not just "x".
That's the only difference from regression with a single x.
STEP 4: Assessing the Strength of the Model
• Question 1: Can we conclude that at least one of the independent variables (x's) is useful in predicting y?
• Question 2: If yes, which of the independent variables (x's) are useful in predicting y?
• Question 3: What proportion of the overall variation in y is due to the changes in the x's?
These are addressed in another module.
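As a preview of Question 3 only, the proportion of variation explained is commonly measured as 1 - SSE/SST, where SST is the total squared variation of y about its mean. The sketch below uses made-up values, and the function name `r_squared` is hypothetical; the full treatment belongs to the other module.

```python
def r_squared(y, y_hat):
    """Proportion of the variation in y accounted for by the fitted values."""
    y_bar = sum(y) / len(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))   # unexplained variation
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    return 1.0 - sse / sst

# Invented actual and fitted values for four observations
y_actual = [3.0, 5.0, 7.0, 9.0]
y_fitted = [3.5, 4.5, 7.5, 8.5]
proportion = r_squared(y_actual, y_fitted)
```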
STEP 5: Use the Model for Prediction/Estimation
• The predicted value of y is found by substituting the x values into the regression equation.
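Substituting into the regression equation amounts to a single weighted sum. In the sketch below the coefficients are invented purely for illustration (they are not results from the Laguna Hills sample), and the name `predict` is hypothetical.

```python
def predict(b, x):
    """Return b0 + b1*x1 + ... + bk*xk for one set of x values."""
    if len(b) != len(x) + 1:
        raise ValueError("need one coefficient per x, plus the intercept b0")
    return b[0] + sum(bj * xj for bj, xj in zip(b[1:], x))

# Made-up coefficients: intercept, square footage, lot size, age
b = [50000.0, 100.0, 5.0, -1000.0]

# A hypothetical house: 2000 sq. ft., 8000 sq. ft. lot, 10 years old
house = [2000.0, 8000.0, 10.0]
predicted_price = predict(b, house)
```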
Prediction/Confidence Intervals
• These are possible, but not easily done with EXCEL
• Other stat packages (MINITAB, SPSS, SAS) perform these calculations.
Important Excel Note -- Inputting a Contiguous Range for the X's
• Suppose in this example we wished to regress Price on only Sq. Feet (column B) and Age (column D).
– These are not next to each other
– They must be next to each other for the regression module in Excel to work
• Highlight the data in column D and click "CUT"
• Click cell C1, which is where you want the data to begin, with the right mouse key
• Click INSERT CUT CELLS
1. Highlight cells D1:D39.
2. With right mouse key click Cut
3. Place cursor on cell C1.
4. With right mouse key click Insert Cut Cells.
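Outside Excel, the same selection of non-adjacent columns can be made without moving any data. The sketch below assumes a worksheet layout with Price in column A and Lot Size in column C (the source names only columns B and D), and the rows are invented.

```python
# Hypothetical rows mirroring the worksheet layout:
# A = price, B = square feet, C = lot size, D = age.
# The numbers are invented for illustration only.
table = [
    (350000, 2000, 8000, 10),
    (420000, 2600, 9500, 4),
    (298000, 1700, 7200, 22),
]

COLUMN_INDEX = {"A": 0, "B": 1, "C": 2, "D": 3}

def select_columns(rows, letters):
    """Return a new table holding only the named columns, in the order given."""
    idx = [COLUMN_INDEX[c] for c in letters]
    return [tuple(row[i] for i in idx) for row in rows]

y = [row[COLUMN_INDEX["A"]] for row in table]   # dependent variable: price
X = select_columns(table, ["B", "D"])           # Sq. Feet and Age, now side by side
```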
Review
• Multiple regression is used when
– y is a function of more than one x
– y includes terms of x raised to a power
• This can be converted to a linear term
• Excel (or another stat package) is used to calculate the best estimates of the β's
• The assumptions about the error term ε are the same, with σ constant for all values of all the x's