
Regression Examples


Gas Mileage 1993

SOURCES:

• Consumer Reports: The 1993 Cars - Annual Auto Issue (April 1993), Yonkers, NY: Consumers Union.

• PACE New Car & Truck 1993 Buying Guide (1993), Milwaukee, WI: Pace Publications Inc.

• Specifications are given for 93 new car models for the 1993 year.


Gas Mileage vs Weight

• Several measures are available, such as price, mpg ratings, engine size, cylinders, weight, horsepower, etc.

• We consider the relationship between weight and highway mpg.

• Since more fuel is needed to move more weight, an increase in weight should result in a decrease in mpg.


Pearson Correlation Coefficients, N = 93
Prob > |r| under H0: Rho=0

               mpg      weight
mpg        1.00000    -0.81066
                        <.0001
weight    -0.81066     1.00000
            <.0001

SAS Output
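The p-value under the correlation comes from a t test of H0: Rho=0. As a rough check, the sketch below recomputes that t statistic from the reported r and N using the standard formula t = r*sqrt(n-2)/sqrt(1-r^2). The function name is an illustrative choice, not part of the SAS output.

```python
import math

# Values copied from the correlation output above.
r = -0.81066   # Pearson correlation between mpg and weight
n = 93         # number of car models


def correlation_t_statistic(r: float, n: int) -> float:
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)


t_stat = correlation_t_statistic(r, n)
print(f"t = {t_stat:.2f} on {n - 2} degrees of freedom")
```

In a one-predictor regression this statistic should agree, up to rounding, with the t value SAS reports for the slope of weight.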


The REG Procedure
Model: MODEL1
Dependent Variable: mpg

Number of Observations Read    93
Number of Observations Used    93

Analysis of Variance

                             Sum of          Mean
Source             DF       Squares        Square    F Value    Pr > F
Model               1    1718.69528    1718.69528     174.43    <.0001
Error              91     896.61655       9.85293
Corrected Total    92    2615.31183

Root MSE          3.13894    R-Square    0.6572
Dependent Mean   29.08602    Adj R-Sq    0.6534
Coeff Var        10.79191

Parameter Estimates

                 Parameter      Standard
Variable    DF    Estimate         Error    t Value    Pr > |t|
Intercept    1    51.60137       1.73555      29.73      <.0001
weight       1    -0.00733    0.00055477     -13.21      <.0001
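The summary figures in this output are tied together by simple arithmetic. The sketch below recomputes them from the sums of squares as a consistency check, using only numbers printed in the tables above. The variable names are illustrative.

```python
import math

# Figures copied from the Analysis of Variance table above.
ss_model, ss_error, ss_total = 1718.69528, 896.61655, 2615.31183
df_model, df_error = 1, 91

ms_model = ss_model / df_model      # mean square for the model
ms_error = ss_error / df_error      # mean square error
r_square = ss_model / ss_total      # proportion of variation explained
f_value = ms_model / ms_error       # overall F statistic
root_mse = math.sqrt(ms_error)      # residual standard deviation

# t value for the slope: estimate / standard error.
# The estimate is printed rounded, so expect small rounding differences.
slope_t = -0.00733 / 0.00055477

print(f"R-Square = {r_square:.4f}")
print(f"F Value  = {f_value:.2f}")
print(f"Root MSE = {root_mse:.5f}")
print(f"slope t  = {slope_t:.2f}, sqrt(F) = {math.sqrt(f_value):.2f}")
```

With a single predictor, the F value is the square of the slope's t value, which is why the two tests give the same p-value.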


Interpretation

• From the graph it is evident that there is a fairly strong negative correlation between weight and mpg.

• The correlation output tells us that r = -.81.

• The regression output, under Parameter Estimates, tells us that the equation for the least squares line (best fit line) is mpg = 51.6 - .0073*weight.

• Also, there are standard errors for the estimates that can be used to build confidence intervals, and there are t-values and p-values for a test of the hypothesis Ho: Parameter = 0 (vs. not = 0).

• Since the p-value for weight is less than .0001, we can reject Ho and conclude that there is a linear relationship between weight and mpg (weight contributes information about mpg, or helps to predict mpg).
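To make the use of the equation and the standard errors concrete, here is a minimal sketch that predicts highway mpg for one weight and builds an approximate 95% confidence interval for the slope. Several things are assumptions rather than facts from the slides: the 3000 example weight, the unit (pounds is assumed, since the slides do not state it), and the critical value 1.986, which is an approximate t-table value for 91 degrees of freedom.

```python
# Estimates and standard error copied from the Parameter Estimates table.
INTERCEPT = 51.60137
SLOPE = -0.00733
SLOPE_SE = 0.00055477

# Assumption: approximate two-sided 95% critical value of t with 91 df,
# taken from a standard t table (it does not appear in the SAS output).
T_CRIT_95 = 1.986


def predict_mpg(weight: float) -> float:
    """Predicted highway mpg from the fitted least squares line."""
    return INTERCEPT + SLOPE * weight


def slope_confidence_interval() -> tuple[float, float]:
    """Approximate 95% confidence interval: estimate +/- t * standard error."""
    margin = T_CRIT_95 * SLOPE_SE
    return SLOPE - margin, SLOPE + margin


example_weight = 3000.0  # hypothetical car weight, unit assumed to be pounds
low, high = slope_confidence_interval()
print(f"predicted mpg at weight {example_weight:.0f}: {predict_mpg(example_weight):.1f}")
print(f"approximate 95% CI for slope: ({low:.5f}, {high:.5f})")
```

The interval lies entirely below zero, which is consistent with rejecting Ho for the slope.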


Gas Mileage vs Engine Size

• Generally speaking, larger engines burn more fuel (but is this due to weight?).

• We can check the relationship between liters (engine displacement) and mpg.

• We expect a negative relationship, since a larger engine would tend to decrease mpg.


Pearson Correlation Coefficients, N = 93
Prob > |r| under H0: Rho=0

               mpg      liters
mpg        1.00000    -0.62679
                        <.0001
liters    -0.62679     1.00000
            <.0001


The REG Procedure
Model: MODEL1
Dependent Variable: mpg

Number of Observations Read    93
Number of Observations Used    93

Analysis of Variance

                             Sum of          Mean
Source             DF       Squares        Square    F Value    Pr > F
Model               1    1027.48125    1027.48125      58.89    <.0001
Error              91    1587.83058      17.44869
Corrected Total    92    2615.31183

Root MSE          4.17716    R-Square    0.3929
Dependent Mean   29.08602    Adj R-Sq    0.3862
Coeff Var        14.36141

Parameter Estimates

                 Parameter      Standard
Variable    DF    Estimate         Error    t Value    Pr > |t|
Intercept    1    37.68023       1.20080      31.38      <.0001
liters       1    -3.22153       0.41981      -7.67      <.0001


Interpretation

• From the first graph it is evident that there is a negative relationship between engine size and mpg.

• However, the pattern is not purely linear. It seems to be some kind of curve. Thus we should not expect a linear analysis to tell the whole story.

• The correlation output tells us that r = -.63.

• The regression output tells us that the equation for the least squares line is mpg = 37.68 - 3.22*liters.

• The p-value for liters is less than .0001, so we conclude that mpg has a linear relationship with engine size.


More Advanced Interpretation

• Which is a better predictor of mpg, engine size or weight? We can use the R-square value to determine that. For weight it is .6572, and for liters it is .3929.

• R-square is the proportion of the variation in y that is accounted for by x, so a larger value corresponds to a better predictor. Thus weight is a better predictor than engine size.

• The graphs show the best fit line and a best fit parabola (quadratic equation). The latter is provided for comparison purposes only.

• Note that the quadratic equation, even though it fits the points better, may not be a better model, because it shows mpg rising as engine sizes get very large. This does not make sense.
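The concern about the parabola can be illustrated with a short sketch. A quadratic a + b*x + c*x^2 with c > 0 has its minimum at x = -b/(2c), and predictions rise for any engine size beyond that point. The coefficients below are hypothetical, chosen only to show the shape. They are not the fitted values from the 1993 data, which the slides do not report.

```python
def quadratic_mpg(liters: float, a: float, b: float, c: float) -> float:
    """Predicted mpg from a quadratic model a + b*liters + c*liters**2."""
    return a + b * liters + c * liters ** 2


def turning_point(b: float, c: float) -> float:
    """Engine size at which the parabola stops falling and starts rising."""
    if c == 0:
        raise ValueError("c must be nonzero for a parabola")
    return -b / (2.0 * c)


# Hypothetical coefficients for illustration only (NOT fitted to the data).
A, B, C = 50.0, -9.0, 1.0

vertex = turning_point(B, C)
print(f"turning point at {vertex} liters")
for size in (2.0, vertex, 6.0):
    print(f"{size} liters -> {quadratic_mpg(size, A, B, C):.2f} mpg")
```

Past the turning point the model predicts better mileage for bigger engines, which is the behavior the slide calls implausible.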


Brief Look at Multiple Regression

• Now you might think, what if we wanted to use both weight and engine size to predict mpg?

• This idea is called multiple regression, and it involves making an equation with two or more “x” variables to predict y.

• The next regression output shows this.

• Compare the R-square and p-values to previous results.


The REG Procedure
Model: MODEL1
Dependent Variable: mpg

Number of Observations Read    93
Number of Observations Used    93

Analysis of Variance

                             Sum of          Mean
Source             DF       Squares        Square    F Value    Pr > F
Model               2    1749.76356     874.88178      90.97    <.0001
Error              90     865.54826       9.61720
Corrected Total    92    2615.31183

Root MSE          3.10116    R-Square    0.6690
Dependent Mean   29.08602    Adj R-Sq    0.6617
Coeff Var        10.66203

Parameter Estimates

                 Parameter      Standard
Variable    DF    Estimate         Error    t Value    Pr > |t|
Intercept    1    53.59100       2.04095      26.26      <.0001
liters       1     1.04777       0.58295       1.80      0.0756
weight       1    -0.00888       0.00103      -8.67      <.0001


Interpretation

• The least squares equation is mpg = 53.6 + 1.05*liters - .00888*weight.

• The R-square for weight alone is .6572. In the new model, it is .6690. It has gone up, but not much. This means that adding engine size to the equation does not improve the prediction of mpg very much.

• The p-value for weight is still very small, but the p-value for liters is now much larger (.0756). Using alpha = .05, we would not reject the hypothesis that the coefficient of liters is zero, which means that once weight is in the model we are not able to detect an additional contribution of engine size to mpg.
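The comparison between the weight-only model and the two-variable model can be checked numerically. The sketch below uses the error sums of squares from the two outputs to compute a partial F statistic for adding liters, and recomputes adjusted R-square for both models. The function and variable names are illustrative. The formulas are the standard ones, not something stated in the slides.

```python
n = 93                        # number of observations
ss_total = 2615.31183         # corrected total sum of squares
sse_weight_only = 896.61655   # error SS, model with weight only
sse_both = 865.54826          # error SS, model with liters and weight
df_error_both = 90            # error df in the two-variable model

# Partial F for adding one variable: drop in error SS / MSE of larger model.
mse_both = sse_both / df_error_both
partial_f = (sse_weight_only - sse_both) / 1 / mse_both

# t value for liters from the printed estimate and standard error.
t_liters = 1.04777 / 0.58295


def adjusted_r_square(sse: float, ss_total: float, n: int, n_predictors: int) -> float:
    """Adjusted R-square: penalizes R-square for the number of predictors."""
    return 1.0 - (sse / (n - n_predictors - 1)) / (ss_total / (n - 1))


adj_weight = adjusted_r_square(sse_weight_only, ss_total, n, 1)
adj_both = adjusted_r_square(sse_both, ss_total, n, 2)

print(f"partial F = {partial_f:.2f}, t^2 for liters = {t_liters ** 2:.2f}")
print(f"Adj R-Sq: weight only = {adj_weight:.4f}, both = {adj_both:.4f}")
```

The partial F equals the square of the t value for liters, so both express the same weak evidence. Adjusted R-square rises only slightly when liters is added.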