SADC Course in Statistics Simple Linear Regression (Session 02)
SADC Course in Statistics
Simple Linear Regression
(Session 02)
Learning Objectives
At the end of this session, you will be able to
• understand the meaning of a simple linear regression model, its aims and terminology
• determine the best fitting line describing the relationship between a quantitative response (y) and a quantitative explanatory variable (x)
• interpret the unknown parameters of the regression line
An illustrative example
The data on the next slide show the average number of cigarettes smoked per adult in 1930 and the death rate per million in 1952 for sixteen countries.
The question of interest is whether there is a relationship between the death rate (y) and level of smoking (x). Here both y and x are quantitative measurements.
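The Data

| Country | Cig. smoked (x) | Death rate (y) |
|---|---|---|
| England and Wales | 1378 | 461 |
| Finland | 1662 | 433 |
| Austria | 960 | 380 |
| Netherlands | 632 | 276 |
| Belgium | 1066 | 254 |
| Switzerland | 706 | 236 |
| New Zealand | 478 | 216 |
| U.S.A. | 1296 | 202 |
| Denmark | 465 | 179 |
| Australia | 504 | 177 |
| Canada | 760 | 176 |
| France | 585 | 140 |
| Italy | 455 | 110 |
| Sweden | 388 | 89 |
| Norway | 359 | 77 |
| Japan | 723 | 40 |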
Start by plotting - shows pattern
- a straight line relationship seems plausible here.
[Figure: scatter plot of death rate (y), axis 0 to 500, against cigarettes smoked (x), axis 0 to 2000]
Recall reasons for modelling
• To determine which of (often) several factors explain variability in the key response of interest;
• To summarise the relationship(s);
• For predictive purposes, e.g. predicting y for given x’s, or identifying x’s that optimise y in some way.
Note: Presence of an association between variables does not necessarily imply causation.
Describing the Regression Model
Describe variation in response (here death rate) in terms of its relationship with the explanatory variable (here number of cigarettes smoked).
Model: data = pattern + residual
– the pattern can be described as a + bx, if a straight-line relationship seems reasonable
– the residual is unexplained variation, assumed to be random.
Simple Linear Regression Model
If there is only one explanatory variable, we have a Simple Linear Regression Model.
Here data = pattern + residual becomes:
$y = \alpha + \beta x + \varepsilon$
where $\alpha + \beta x$ = pattern and $\varepsilon$ = residual.
• $\alpha$ is called the intercept
• $\beta$ is called the slope
• the $\varepsilon$'s represent the departures of the observed values from the true line.
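To make the decomposition data = pattern + residual concrete, here is a minimal Python sketch that generates artificial data from a straight-line model. The function name `simulate` and the parameter values (alpha = 30, beta = 0.25, sigma = 80) are hypothetical and chosen only for illustration. Drawing the residual from a normal distribution with constant spread is the assumption stated on the slide about parameters and assumptions.

```python
import random


def simulate(xs, alpha, beta, sigma, seed=0):
    """Generate rows (x, pattern, residual, y) with y = alpha + beta*x + residual.

    The residual is drawn from Normal(0, sigma^2), the same spread for every x.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    rows = []
    for x in xs:
        pattern = alpha + beta * x          # the straight-line part
        residual = rng.gauss(0.0, sigma)    # unexplained random variation
        rows.append((x, pattern, residual, pattern + residual))
    return rows


# Hypothetical parameter values, for illustration only.
rows = simulate(xs=[400, 800, 1200, 1600], alpha=30.0, beta=0.25, sigma=80.0)
for x, pattern, residual, y in rows:
    print(f"x={x:5d}  pattern={pattern:7.1f}  residual={residual:7.1f}  y={y:7.1f}")
```

Each printed y is the sum of its pattern and residual columns. In practice only x and y are observed, and the parameters have to be estimated from them.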
A Diagrammatic Representation
[Figure: observed points (×) scattered about a straight line, plotted on x and y axes]
Parameters of Model & Assumptions
• $\alpha$ and $\beta$ are the unknown parameters in the model. They are estimated from the data.
• The random error, $\varepsilon$, is assumed to have a
– normal distribution
– with constant variance (whatever the value of x)
We shall return to these assumptions later.
Results of model fitting
-------------------------------------------------------------
deathrate |   Coef.  Std.Err.     t   P>|t|   [95% Conf.Int.]
----------+--------------------------------------------------
Cigars    |   .2410    .0544   4.43   0.001    .1245    .3577
Const.    |   28.31    46.92   0.60   0.556   -72.34   128.95
-------------------------------------------------------------
These are estimates of the coefficients of the regression equation, since the data are a sample. Their precision is quantified by the standard errors.
Estimated equation is: y = 28.31 + 0.241 * x
Note: The t and P>|t| columns will be discussed in the next session.
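As a rough check on the Coef. and Std.Err. columns, the sketch below recomputes them in plain Python from the sixteen data points. The standard-error formulas used here (residual variance on n - 2 degrees of freedom, then s/√Sxx for the slope and s·√(1/n + x̄²/Sxx) for the intercept) are the usual least-squares formulas. They are not given on the slides, so treat this as an illustration, not as the course's own method.

```python
import math

# Data from the slide: cigarettes smoked (x) and death rate (y), 16 countries.
x = [1378, 1662, 960, 632, 1066, 706, 478, 1296,
     465, 504, 760, 585, 455, 388, 359, 723]
y = [461, 433, 380, 276, 254, 236, 216, 202,
     179, 177, 176, 140, 110, 89, 77, 40]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))

# Least-squares estimates of slope and intercept.
slope = sxy / sxx
intercept = y_bar - slope * x_bar

# Residual standard deviation on n - 2 degrees of freedom.
rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
s = math.sqrt(rss / (n - 2))

# Standard errors of the two estimates.
se_slope = s / math.sqrt(sxx)
se_intercept = s * math.sqrt(1 / n + x_bar ** 2 / sxx)

print(f"slope     = {slope:.4f}  (s.e. {se_slope:.4f})")
print(f"intercept = {intercept:.2f}  (s.e. {se_intercept:.2f})")
```

The printed values should agree closely with the table above, up to rounding.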
The fitted line
[Figure: death rate (y), axis 0 to 500, against cigarettes smoked (x), axis 0 to 2000, showing the observed points and the fitted values]
Interpreting model parameters
• Slope (regression coefficient): If cigarettes smoked increases by 1 unit per year, the death rate is estimated to increase by 0.24 units on average. In other words, if cigarettes smoked increases by 100 units, the estimated death rate increases by about 24 units.
• The intercept of 28.31 only has meaning if the range of x values (cigarettes smoked) under study includes the value of zero. Here the observed x values run from 359 to 1662, so the estimated death rate of 28.3 at zero cigarettes smoked is an extrapolation and should be interpreted with caution.
Predictions from the line
The model equation can also be used to predict y at a given value of x.
Thus from y = 28.31 + 0.241 x, the predicted death rate ($\hat{y}$) in a country where the number of cigarettes smoked is x = 1000 is given by
$\hat{y} = 28.31 + 0.241(1000) = 269.3$
Note: Predictions will be discussed in greater detail in Session 9.
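The prediction step can be sketched in a few lines of Python, using the rounded coefficients from the results slide. The function name `predict_death_rate` is hypothetical. The range check, which flags x values outside those observed in the sixteen countries, is an addition for illustration and not part of the slides.

```python
# Fitted coefficients as reported on the results slide (rounded).
INTERCEPT = 28.31
SLOPE = 0.241

# Smallest and largest x (cigarettes smoked) in the sixteen-country data.
X_MIN, X_MAX = 359, 1662


def predict_death_rate(x):
    """Return (predicted death rate, whether x lies inside the observed range)."""
    y_hat = INTERCEPT + SLOPE * x
    inside_range = X_MIN <= x <= X_MAX
    return y_hat, inside_range


for x in (1000, 0):
    y_hat, inside = predict_death_rate(x)
    note = "" if inside else "  (extrapolation beyond the observed x values)"
    print(f"x = {x:4d}: predicted death rate = {y_hat:.1f}{note}")
```

Flagging out-of-range x values is a reminder that the fitted line is only supported by the data between the smallest and largest observed x.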
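Computation of model estimates (for reference only)
$$\hat{y}_i = \hat{\alpha} + \hat{\beta} x_i, \qquad \hat{\alpha} = \bar{y} - \hat{\beta}\,\bar{x}$$
$$\hat{\beta} = \frac{\sum x_i y_i - (\sum x_i)(\sum y_i)/n}{\sum x_i^2 - (\sum x_i)^2/n} = \frac{S_{xy}}{S_{xx}}$$
Note: Can also write
$$\frac{S_{xy}}{S_{xx}} = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}$$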
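The two ways of writing Sxy and Sxx can be compared numerically. The following sketch, in plain Python with variable names chosen here for illustration, evaluates both forms on the sixteen-country data and then forms the slope and intercept estimates from them.

```python
# Data from the slide: cigarettes smoked (x) and death rate (y), 16 countries.
x = [1378, 1662, 960, 632, 1066, 706, 478, 1296,
     465, 504, 760, 585, 455, 388, 359, 723]
y = [461, 433, 380, 276, 254, 236, 216, 202,
     179, 177, 176, 140, 110, 89, 77, 40]
n = len(x)

# Form 1: raw sums of products and squares.
sum_x, sum_y = sum(x), sum(y)
sxy_raw = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
sxx_raw = sum(xi ** 2 for xi in x) - sum_x ** 2 / n

# Form 2: sums of deviations from the means.
x_bar, y_bar = sum_x / n, sum_y / n
sxy_centred = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx_centred = sum((xi - x_bar) ** 2 for xi in x)

# Slope and intercept estimates.
beta_hat = sxy_raw / sxx_raw
alpha_hat = y_bar - beta_hat * x_bar

print(f"Sxy: raw = {sxy_raw:.3f}, centred = {sxy_centred:.3f}")
print(f"Sxx: raw = {sxx_raw:.3f}, centred = {sxx_centred:.3f}")
print(f"beta_hat = {beta_hat:.4f}, alpha_hat = {alpha_hat:.2f}")
```

Both forms give the same Sxy and Sxx up to floating-point rounding. The centred form is generally the safer choice numerically when the x values are large relative to their spread.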
Practical work follows to ensure learning objectives are achieved…