Statistics 202
Uploaded by mandeep-singh

Correlation & Regression

Correlation
Finding the relationship between two quantitative variables without being able to infer causal relationships
Correlation is a statistical technique used to determine the degree to which two variables are related

Scatter diagram
• Rectangular coordinates
• Two quantitative variables
• One variable is called independent (X) and the second is called dependent (Y)
• Points are not joined
• No frequency table

Example
[Figure: Scatter diagram of weight and systolic blood pressure. x-axis: Wt (kg), 60 to 120; y-axis: systolic blood pressure, 80 to 220.]

Scatter plots
The pattern of data is indicative of the type of relationship between your two variables:
• positive relationship
• negative relationship
• no relationship

Positive relationship
[Figure: scatter plot of height in cm (y-axis, 0 to 18) against age in weeks (x-axis, 0 to 90).]

Negative relationship
[Figure: scatter plot of reliability against age of car.]

No relation

Correlation Coefficient
Statistic showing the degree of relation between two variables

Simple correlation coefficient (r)
It is also called Pearson's correlation or product moment correlation coefficient.
It measures the nature and strength of the association between two variables of the quantitative type.

The sign of r denotes the nature of the association, while the value (magnitude) of r denotes the strength of the association.

If the sign is +ve, the relation is direct (an increase in one variable is associated with an increase in the other variable, and a decrease in one variable is associated with a decrease in the other variable).
If the sign is -ve, the relationship is inverse or indirect (an increase in one variable is associated with a decrease in the other).

The value of r ranges between -1 and +1. The value of r denotes the strength of the association, as illustrated by the following diagram.

| r | Strength | Direction |
|---|---|---|
| -1 | perfect correlation | indirect |
| -1 to -0.75 | strong | indirect |
| -0.75 to -0.25 | intermediate | indirect |
| -0.25 to 0 | weak | indirect |
| 0 | no relation | |
| 0 to 0.25 | weak | direct |
| 0.25 to 0.75 | intermediate | direct |
| 0.75 to 1 | strong | direct |
| 1 | perfect correlation | direct |

If r = 0, there is no association or correlation between the two variables.
If 0 < r < 0.25: weak correlation.
If 0.25 ≤ r < 0.75: intermediate correlation.
If 0.75 ≤ r < 1: strong correlation.
If r = 1: perfect correlation.
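
The cut-offs above can be sketched as a small lookup. This is only an illustration: the function name `describe_correlation` is made up here, and grading negative values by their magnitude is an assumption taken from the symmetric diagram rather than from the listed rules, which only cover positive r.

```python
def describe_correlation(r: float) -> str:
    """Label r using the 0.25 and 0.75 cut-offs listed on the slide."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("r must lie between -1 and +1")
    size = abs(r)  # assumption: a negative r is graded by its magnitude
    if size == 0:
        return "no correlation"
    if size == 1:
        strength = "perfect"
    elif size >= 0.75:
        strength = "strong"
    elif size >= 0.25:
        strength = "intermediate"
    else:
        strength = "weak"
    # The sign gives the nature of the association
    direction = "direct" if r > 0 else "indirect"
    return f"{strength} {direction} correlation"


print(describe_correlation(0.76))
print(describe_correlation(-0.94))
```

The two calls use the coefficients from the worked examples in this deck, and both come out as strong correlations, one direct and one indirect.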

How to compute the simple correlation coefficient (r)

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$$

Example: A sample of 6 children was selected, and data about their age in years and weight in kilograms were recorded as shown in the following table. It is required to find the correlation between age and weight.

| Serial no. | Age (years) | Weight (kg) |
|---|---|---|
| 1 | 7 | 12 |
| 2 | 6 | 8 |
| 3 | 8 | 12 |
| 4 | 5 | 10 |
| 5 | 6 | 11 |
| 6 | 9 | 13 |

These two variables are of the quantitative type. One variable (age) is called the independent variable and is denoted X; the other (weight) is called the dependent variable and is denoted Y. To find the relation between age and weight, compute the simple correlation coefficient using the following formula:

$$r = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)\left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}$$

| Serial no. | Age (years) (x) | Weight (kg) (y) | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 7 | 12 | 84 | 49 | 144 |
| 2 | 6 | 8 | 48 | 36 | 64 |
| 3 | 8 | 12 | 96 | 64 | 144 |
| 4 | 5 | 10 | 50 | 25 | 100 |
| 5 | 6 | 11 | 66 | 36 | 121 |
| 6 | 9 | 13 | 117 | 81 | 169 |
| Total | ∑x = 41 | ∑y = 66 | ∑xy = 461 | ∑x² = 291 | ∑y² = 742 |

$$r = \frac{461 - \frac{41 \times 66}{6}}{\sqrt{\left(291 - \frac{(41)^2}{6}\right)\left(742 - \frac{(66)^2}{6}\right)}} = \frac{10}{\sqrt{(10.83)(16)}}$$

r = 0.76: strong direct correlation
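
As a check on the arithmetic, here is a minimal sketch of the same computational formula in Python, applied to the six children's ages and weights from the table. The function name `pearson_r` is chosen here for illustration and is not part of the slides.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Simple (Pearson) correlation coefficient via the sums-based formula."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equally long lists with at least 2 values")
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = sum_xy - sum_x * sum_y / n
    # Zero here means one variable is constant, so r is undefined
    denominator = sqrt((sum_x2 - sum_x**2 / n) * (sum_y2 - sum_y**2 / n))
    return numerator / denominator


age = [7, 6, 8, 5, 6, 9]          # x, years
weight = [12, 8, 12, 10, 11, 13]  # y, kg

r = pearson_r(age, weight)
print(round(r, 2))  # 0.76
```

The intermediate sums match the totals row of the table (41, 66, 461, 291, 742), which is a quick way to spot a data-entry slip before trusting r.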

EXAMPLE: Relationship between Anxiety and Test Scores

| Anxiety (X) | Test score (Y) | X² | Y² | XY |
|---|---|---|---|---|
| 10 | 2 | 100 | 4 | 20 |
| 8 | 3 | 64 | 9 | 24 |
| 2 | 9 | 4 | 81 | 18 |
| 1 | 7 | 1 | 49 | 7 |
| 5 | 6 | 25 | 36 | 30 |
| 6 | 5 | 36 | 25 | 30 |
| ∑X = 32 | ∑Y = 32 | ∑X² = 230 | ∑Y² = 204 | ∑XY = 129 |

Calculating Correlation Coefficient

$$r = \frac{(6)(129) - (32)(32)}{\sqrt{\left[6(230) - 32^2\right]\left[6(204) - 32^2\right]}} = \frac{774 - 1024}{\sqrt{(356)(200)}} = -0.94$$

r = -0.94
Indirect strong correlation
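
The calculation above uses a version of the formula with every term multiplied through by n, which looks different from the form given earlier. A short sketch of why the two agree: multiplying the numerator by n and the denominator by n (written as the square root of n times n) leaves r unchanged.

```latex
r = \frac{\sum xy - \frac{\sum x \sum y}{n}}
         {\sqrt{\left(\sum x^2 - \frac{(\sum x)^2}{n}\right)
                \left(\sum y^2 - \frac{(\sum y)^2}{n}\right)}}
  = \frac{n\sum xy - \sum x \sum y}
         {\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]
                \left[n\sum y^2 - \left(\sum y\right)^2\right]}}
```

With n = 6, ∑xy = 129 and ∑x = ∑y = 32, the right-hand numerator is 6(129) − (32)(32) = −250, as in the worked example.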

Exercise

Regression Analyses
Regression: technique concerned with predicting some variables by knowing others
The process of predicting variable Y using variable X

Regression
• Uses a variable (x) to predict some outcome variable (y)
• Tells you how values in y change as a function of changes in values of x

Correlation and Regression
Correlation describes the strength of a linear relationship between two variables
Linear means “straight line”
Regression tells us how to draw the straight line described by the correlation

Regression
• Calculates the “best-fit” line for a certain set of data
• The regression line makes the sum of the squares of the residuals smaller than for any other line
• Regression minimizes residuals
[Figure: scatter diagram with best-fit line. x-axis: Wt (kg), 60 to 120; y-axis ticks: 80 to 220.]

By using the least squares method (a procedure that minimizes the sum of the squared vertical deviations of the plotted points from a straight line) we are able to construct a best-fitting straight line through the scatter diagram points and then formulate a regression equation in the form of:

$$\hat{y} = a + bX$$

$$\hat{y} = \bar{y} + b(x - \bar{x})$$

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}}$$
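
The slope formula and the "smaller than for any other line" property can be tried out together. The sketch below reuses the anxiety and test-score data from the earlier correlation example; the function names and the sizes of the nudges are illustrative choices, not something taken from the slides.

```python
def least_squares_line(xs, ys):
    """Return (a, b) for the fitted line y = a + b*x using the sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)
    a = sum_y / n - b * sum_x / n  # intercept: mean(y) - b * mean(x)
    return a, b


def sum_squared_residuals(xs, ys, a, b):
    """Sum of squared vertical distances between the points and y = a + b*x."""
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


anxiety = [10, 8, 2, 1, 5, 6]  # X
score = [2, 3, 9, 7, 6, 5]     # Y

a, b = least_squares_line(anxiety, score)
best = sum_squared_residuals(anxiety, score, a, b)

# Nudging the intercept or slope away from the fitted values raises the total
for da, db in [(0.5, 0.0), (-0.5, 0.0), (0.0, 0.1), (0.0, -0.1)]:
    assert sum_squared_residuals(anxiety, score, a + da, b + db) > best

print(f"intercept a = {a:.3f}, slope b = {b:.3f}")
```

The negative slope agrees with the negative correlation found for the same data: higher anxiety goes with lower predicted test scores.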

Regression Equation
The regression equation describes the regression line mathematically:
• Intercept
• Slope
[Figure: scatter diagram with fitted regression line. x-axis: Wt (kg), 60 to 120; y-axis ticks: 80 to 220.]

Linear Equations
[Figure: straight line Y = bX + a drawn on X and Y axes, where a = Y-intercept and b = slope = (change in Y) / (change in X).]

$$\hat{y} = a + bX$$

Hours studying and grades

Regressing grades on hours
[Figure: Linear regression of final grade in course (y-axis, 70 to 90) on number of hours spent studying (x-axis, 2 to 10). Fitted line: Final grade in course = 59.95 + 3.17 * study; R-Square = 0.88.]
Predicted final grade in class = 59.95 + 3.17*(number of hours you study per week)

Predict the final grade of…
• Someone who studies for 12 hours: Final grade = 59.95 + (3.17*12) = 97.99
• Someone who studies for 1 hour: Final grade = 59.95 + (3.17*1) = 63.12
Predicted final grade in class = 59.95 + 3.17*(hours of study)
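
The same predictions can be reproduced with a one-line function. This is a minimal sketch; the names `INTERCEPT`, `SLOPE` and `predicted_grade` are illustrative, while the two coefficients are the ones quoted for the fitted line.

```python
INTERCEPT = 59.95  # predicted grade at zero hours of study
SLOPE = 3.17       # extra grade points per additional hour of study


def predicted_grade(hours: float) -> float:
    """Predicted final grade from weekly study hours, using the fitted line."""
    return INTERCEPT + SLOPE * hours


for hours in (1, 12):
    print(hours, round(predicted_grade(hours), 2))
```

The plotted study hours run from about 2 to 10, so both the 1-hour and the 12-hour predictions sit slightly outside the observed range and are best read as extrapolations.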

Exercise
A sample of 6 persons was selected. Their age (x variable) and weight (y variable) are shown in the following table. Find the regression equation. What is the predicted weight when age is 8.5 years?

| Serial no. | Age (x) | Weight (y) |
|---|---|---|
| 1 | 7 | 12 |
| 2 | 6 | 8 |
| 3 | 8 | 12 |
| 4 | 5 | 10 |
| 5 | 6 | 11 |
| 6 | 9 | 13 |

Answer

| Serial no. | Age (x) | Weight (y) | xy | x² | y² |
|---|---|---|---|---|---|
| 1 | 7 | 12 | 84 | 49 | 144 |
| 2 | 6 | 8 | 48 | 36 | 64 |
| 3 | 8 | 12 | 96 | 64 | 144 |
| 4 | 5 | 10 | 50 | 25 | 100 |
| 5 | 6 | 11 | 66 | 36 | 121 |
| 6 | 9 | 13 | 117 | 81 | 169 |
| Total | 41 | 66 | 461 | 291 | 742 |

$$\bar{x} = \frac{41}{6} = 6.83, \qquad \bar{y} = \frac{66}{6} = 11$$

$$b = \frac{461 - \frac{41 \times 66}{6}}{291 - \frac{(41)^2}{6}} = \frac{10}{10.83} = 0.923$$

Regression equation

$$\hat{y}_{(x)} = 11 + 0.923(x - 6.83)$$

$$\hat{y}_{(x)} = 4.69 + 0.923x$$

$$\hat{y}_{(8.5)} = 4.69 + 0.923 \times 8.5 = 12.54 \text{ kg}$$

$$\hat{y}_{(7.5)} = 4.69 + 0.923 \times 7.5 = 11.61 \text{ kg}$$

[Figure: regression line of weight (in kg, y-axis, 11.4 to 12.6) against age (in years, x-axis, 7 to 9).]
We create a regression line by plotting two estimated values for y against their x component, then extending the line right and left.

Exercise 2
The following are the age (in years) and systolic blood pressure of 20 apparently healthy adults.

| Age (x) | B.P (y) | Age (x) | B.P (y) |
|---|---|---|---|
| 20 | 120 | 46 | 128 |
| 43 | 128 | 53 | 136 |
| 63 | 141 | 60 | 146 |
| 26 | 126 | 20 | 124 |
| 53 | 134 | 63 | 143 |
| 31 | 128 | 43 | 130 |
| 58 | 136 | 26 | 124 |
| 46 | 132 | 19 | 121 |
| 58 | 140 | 31 | 126 |
| 70 | 144 | 23 | 123 |

• Find the correlation between age and blood pressure using the simple and Spearman's correlation coefficients, and comment.
• Find the regression equation.
• What is the predicted blood pressure for a man aged 25 years?

| Serial | x | y | xy | x² |
|---|---|---|---|---|
| 1 | 20 | 120 | 2400 | 400 |
| 2 | 43 | 128 | 5504 | 1849 |
| 3 | 63 | 141 | 8883 | 3969 |
| 4 | 26 | 126 | 3276 | 676 |
| 5 | 53 | 134 | 7102 | 2809 |
| 6 | 31 | 128 | 3968 | 961 |
| 7 | 58 | 136 | 7888 | 3364 |
| 8 | 46 | 132 | 6072 | 2116 |
| 9 | 58 | 140 | 8120 | 3364 |
| 10 | 70 | 144 | 10080 | 4900 |
| 11 | 46 | 128 | 5888 | 2116 |
| 12 | 53 | 136 | 7208 | 2809 |
| 13 | 60 | 146 | 8760 | 3600 |
| 14 | 20 | 124 | 2480 | 400 |
| 15 | 63 | 143 | 9009 | 3969 |
| 16 | 43 | 130 | 5590 | 1849 |
| 17 | 26 | 124 | 3224 | 676 |
| 18 | 19 | 121 | 2299 | 361 |
| 19 | 31 | 126 | 3906 | 961 |
| 20 | 23 | 123 | 2829 | 529 |
| Total | 852 | 2630 | 114486 | 41678 |

$$b = \frac{\sum xy - \frac{\sum x \sum y}{n}}{\sum x^2 - \frac{(\sum x)^2}{n}} = \frac{114486 - \frac{852 \times 2630}{20}}{41678 - \frac{(852)^2}{20}} = 0.4547$$

$$\hat{y} = 112.13 + 0.4547x$$

For age 25:
B.P = 112.13 + 0.4547 × 25 ≈ 123.5 mm Hg
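
The Exercise 2 regression can be re-derived from the raw data as a cross-check. This is a minimal sketch that recomputes the column totals, the slope b, the intercept a, and the prediction at age 25; the variable names are chosen here for illustration.

```python
age = [20, 43, 63, 26, 53, 31, 58, 46, 58, 70,
       46, 53, 60, 20, 63, 43, 26, 19, 31, 23]
bp = [120, 128, 141, 126, 134, 128, 136, 132, 140, 144,
      128, 136, 146, 124, 143, 130, 124, 121, 126, 123]

n = len(age)
sum_x, sum_y = sum(age), sum(bp)
sum_xy = sum(x * y for x, y in zip(age, bp))
sum_x2 = sum(x * x for x in age)

# Slope from the sums formula, intercept from the two means
b = (sum_xy - sum_x * sum_y / n) / (sum_x2 - sum_x**2 / n)
a = sum_y / n - b * sum_x / n

predicted = a + b * 25  # predicted systolic B.P at age 25, in mm Hg

print(sum_x, sum_y, sum_xy, sum_x2)
print(f"b = {b:.4f}, a = {a:.2f}, B.P at 25 = {predicted:.1f}")
```

The four totals agree with the last row of the table, and the prediction rounds to the same 123.5 mm Hg. Carrying full precision gives a slope a digit higher in the fourth decimal place than the 0.4547 quoted, which does not change the rounded answer.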