Biomedical Statistics; Department: Department of Electrical Engineering, National Central University (NCUEE); Advisor: 蔡章仁...


Page 1: Biomedical Statistics; Department: Department of Electrical Engineering, National Central University (NCUEE); Advisor: 蔡章仁 (Jang-Zern Tsai); Name: 凃建宇 (Jacky Tu)

Biomedical Statistics

Department: Department of Electrical Engineering, National Central University (NCUEE)

Advisor: 蔡章仁 (Jang-Zern Tsai)

Name: 凃建宇 (Jacky Tu)

Page 2:

Outline

Two Sample Hypothesis Testing for Correlation

Multiple Correlation

Spearman’s Rank Correlation

Page 3:

Two Sample Hypothesis Testing for Correlation

Case 1: Independent samples

Case 2: Dependent samples

Page 4:

Two Sample Hypothesis Testing for Correlation with independent samples

Example:

A sample of 40 couples from London is taken comparing the husband’s IQ with his wife’s. The correlation coefficient for the sample is .77. Is this significantly different from the correlation coefficient of .68 for a sample of 30 couples from Paris?

Page 5:

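Some Excel functions:

FISHER(r) = $\frac{1}{2}\ln\frac{1+r}{1-r}$ (the Fisher transformation of r)

SQRT(number) = the square root of number

NORMSDIST(z) = $\Phi(z)$, the standard normal cumulative distribution function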

Then we can perform either one of the following tests:
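The London and Paris example can be worked through with a short Python sketch. It assumes the usual Fisher z-test for two independent correlations, z = (r1' - r2') / sqrt(1/(n1 - 3) + 1/(n2 - 3)), where r' is the Fisher transformation; the slide's own formula is not preserved, so this is an assumption, and the function names are illustrative only.

```python
import math

def fisher(r):
    """Fisher transformation, the quantity Excel's FISHER returns."""
    return 0.5 * math.log((1 + r) / (1 - r))

def norm_s_dist(z):
    """Standard normal CDF, the counterpart of Excel's NORMSDIST."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def compare_independent_correlations(r1, n1, r2, n2):
    """Return (z, two-tailed p-value) for H0: the two correlations are equal."""
    standard_error = math.sqrt(1 / (n1 - 3) + 1 / (n2 - 3))
    z = (fisher(r1) - fisher(r2)) / standard_error
    p_value = 2 * (1 - norm_s_dist(abs(z)))
    return z, p_value

# London: r = .77, n = 40; Paris: r = .68, n = 30 (values from the example)
z, p_value = compare_independent_correlations(0.77, 40, 0.68, 30)
print(f"z = {z:.3f}, two-tailed p-value = {p_value:.3f}")
```

Under these assumptions z is about 0.76 and the p-value is about 0.45, so the two correlations would not be judged significantly different at α = .05. Comparing |z| with the critical value 1.96 leads to the same decision.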

Page 6:

Two Sample Hypothesis Testing for Correlation with dependent samples

What is the difference?

The samples are dependent when the two correlations have one variable in common, or when the same two variables are correlated at one moment in time and again at another moment in time.

Example:

IQ tests are given to 20 couples. The oldest son of each couple is also given the IQ test, with the scores displayed in Figure 1. We would like to know whether the correlation between son and mother is significantly different from the correlation between son and father.

Page 7:

use the following test statistic

S is the 3 × 3 sample correlation matrix and

Since p-value = .042 < .05 = α, we reject the null hypothesis and conclude that the correlation between mother and son is significantly different from the correlation between father and son.
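One statistic that fits this description is Williams's t for two dependent correlations that share a variable. The sketch below assumes that this is the statistic intended: t = (r_xz - r_yz) * sqrt((n - 1)(1 + r_xy) / (2 (n - 1)/(n - 3) |S| + r̄² (1 - r_xy)³)), with r̄ = (r_xz + r_yz) / 2, |S| = 1 - r_xy² - r_xz² - r_yz² + 2 r_xy r_xz r_yz, and n - 3 degrees of freedom. The function name and the input correlations are hypothetical; they are not the values from Figure 1.

```python
import math

def williams_t(r_xz, r_yz, r_xy, n):
    """Williams's t for H0: rho_xz == rho_yz, where both share variable z.

    Returns (t, degrees_of_freedom) with degrees_of_freedom = n - 3.
    """
    # Determinant of the 3 x 3 sample correlation matrix S
    det_s = 1 - r_xy**2 - r_xz**2 - r_yz**2 + 2 * r_xy * r_xz * r_yz
    r_bar = (r_xz + r_yz) / 2
    denominator = 2 * (n - 1) / (n - 3) * det_s + r_bar**2 * (1 - r_xy) ** 3
    t = (r_xz - r_yz) * math.sqrt((n - 1) * (1 + r_xy) / denominator)
    return t, n - 3

# Hypothetical correlations (NOT the values from Figure 1):
# z = son, x = mother, y = father, 20 families
t, df = williams_t(r_xz=0.5, r_yz=0.2, r_xy=0.3, n=20)
print(f"t = {t:.3f} on {df} degrees of freedom")
```

The p-value then comes from the t distribution with n - 3 degrees of freedom, for example via Excel's TDIST.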

Page 8:

Multiple Correlation

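We can also calculate the correlation between more than two variables.

Definition 1: Given variables x, y, and z, where x and y are the independent variables and z is the dependent variable, the multiple correlation coefficient is

$$R_{z,xy} = \sqrt{\frac{r_{xz}^2 + r_{yz}^2 - 2 r_{xz} r_{yz} r_{xy}}{1 - r_{xy}^2}}$$

Its square, $R_{z,xy}^2$ (or simply $R^2$), is the multiple coefficient of determination.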

Page 9:

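Multiple Correlation (Cont.)

Definition 2: The adjusted multiple coefficient of determination is

$$R_{adj}^2 = 1 - \frac{(1 - R^2)(n - 1)}{n - k - 1}$$

where k = the number of independent variables and n = the number of observations in the sample.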
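A minimal Python sketch of the two definitions, assuming the standard formulas R² = (r_xz² + r_yz² - 2 r_xz r_yz r_xy) / (1 - r_xy²) and adjusted R² = 1 - (1 - R²)(n - 1)/(n - k - 1). The function names and the input correlations are hypothetical and serve only as an illustration.

```python
import math

def multiple_r_squared(r_xz, r_yz, r_xy):
    """Multiple coefficient of determination of z on x and y."""
    return (r_xz**2 + r_yz**2 - 2 * r_xz * r_yz * r_xy) / (1 - r_xy**2)

def adjusted_r_squared(r_squared, n, k):
    """Adjust R^2 for sample size n and k independent variables."""
    return 1 - (1 - r_squared) * (n - 1) / (n - k - 1)

# Hypothetical pairwise correlations and sample size
r_squared = multiple_r_squared(r_xz=0.6, r_yz=0.5, r_xy=0.3)
multiple_r = math.sqrt(r_squared)
adjusted = adjusted_r_squared(r_squared, n=50, k=2)
print(f"R = {multiple_r:.3f}, R^2 = {r_squared:.3f}, adjusted R^2 = {adjusted:.3f}")
```

The adjustment always pulls R² downward, and more strongly when n is small relative to k.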

Page 10:

Example

By using Excel’s Correlation data analysis tool, we can get correlation coefficients for data in Example 

Page 11:

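We use the data above to obtain the values $r_{PW}$, $r_{PI}$, and $r_{WI}$.

Definition 3: The partial correlation of x and z holding y constant is

$$r_{xz,y} = \frac{r_{xz} - r_{xy} r_{yz}}{\sqrt{(1 - r_{xy}^2)(1 - r_{yz}^2)}}$$

In the semi-partial correlation, the correlation between x and y is eliminated, but not the correlations between x and z or between y and z:

$$r_{z(x,y)} = \frac{r_{xz} - r_{xy} r_{yz}}{\sqrt{1 - r_{xy}^2}}$$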

Page 12:

Example

Suppose we want to know the relationship between GPA (grade point average), salary, and IQ.

GPA and salary may be correlated, but perhaps only because IQ correlates well with both GPA and salary.

To test this, we need to determine the correlation between GPA and salary while eliminating the influence of IQ,

that is, the partial correlation $r_{GS,I}$.
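The GPA and salary question can be sketched in Python from the three pairwise correlations. The sketch assumes the standard formulas: the partial correlation of x and z holding y constant is (r_xz - r_xy r_yz) / sqrt((1 - r_xy²)(1 - r_yz²)), and the semi-partial correlation divides the same numerator by sqrt(1 - r_xy²) only. The function names and the correlation values are hypothetical, since the slides give no data for this example.

```python
import math

def partial_corr(r_xz, r_xy, r_yz):
    """Correlation of x and z with y held constant."""
    return (r_xz - r_xy * r_yz) / math.sqrt((1 - r_xy**2) * (1 - r_yz**2))

def semi_partial_corr(r_xz, r_xy, r_yz):
    """Correlation of z with the part of x that is not explained by y."""
    return (r_xz - r_xy * r_yz) / math.sqrt(1 - r_xy**2)

# Hypothetical correlations: x = GPA (G), z = salary (S), y = IQ (I)
r_gs, r_gi, r_si = 0.5, 0.6, 0.7
r_gs_given_i = partial_corr(r_gs, r_gi, r_si)
r_s_g_without_i = semi_partial_corr(r_gs, r_gi, r_si)
print(f"partial = {r_gs_given_i:.3f}, semi-partial = {r_s_g_without_i:.3f}")
```

With these made-up inputs the raw correlation of 0.5 between GPA and salary shrinks to roughly 0.14 once IQ is held constant, which is the pattern the example describes.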

Page 13:

If we go on to calculate $r_{PW,I}$ and $r_{P(W,I)}$,

we can then prove the property by:

Page 14:

Since the coefficient of determination is a measure of the portion of variance attributable to the variables involved, we can interpret the concepts defined above using the following Venn diagram, where the rectangle represents the total variance of the poverty variable.

We can then calculate the breakdown of the variance for poverty:

Page 15:

We can calculate B in a number of ways:

(A + B) – A, (B + C) – C, (A + B + C) – (A + C)

where D = 1 – (A + B + C)
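The breakdown can be sketched in Python. The labelling is an assumption, because the diagram itself is not preserved: A is taken as the share of variance explained only by the first independent variable x, C as the share explained only by the second independent variable y, B as the share they explain jointly, and D as the unexplained remainder. This gives A + B = r_zx², B + C = r_zy², and A + B + C = R². The function name and input correlations are hypothetical.

```python
def variance_breakdown(r_zx, r_zy, r_xy):
    """Split the variance of z into regions A, B, C, D of the Venn diagram."""
    r_squared = (r_zx**2 + r_zy**2 - 2 * r_zx * r_zy * r_xy) / (1 - r_xy**2)
    a = r_squared - r_zy**2  # unique to x: (A + B + C) - (B + C)
    c = r_squared - r_zx**2  # unique to y: (A + B + C) - (A + B)
    b = r_zx**2 - a          # shared part: (A + B) - A
    d = 1 - r_squared        # unexplained: 1 - (A + B + C)
    return a, b, c, d

# Hypothetical correlations (not the poverty data from the slides)
a, b, c, d = variance_breakdown(r_zx=0.6, r_zy=0.5, r_xy=0.3)
print(f"A = {a:.3f}, B = {b:.3f}, C = {c:.3f}, D = {d:.3f}")
```

Under this labelling A and C are the squared semi-partial correlations. B can come out negative when one variable suppresses the other, in which case the Venn picture is only a loose analogy.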

Page 16:

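Following Property 1, if the independent variables are mutually independent ($r_{xy} = 0$), the multiple coefficient of determination is the sum of the individual coefficients of determination:

$$R_{z,xy}^2 = r_{xz}^2 + r_{yz}^2$$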

Page 17:

Spearman’s Rank Correlation

Definition: Spearman’s rho is the correlation coefficient r computed on the ranks of the data rather than on the raw values. Like r, it ranges from -1 to 1.

Example:

We want to know whether IQ is associated with the number of hours spent listening to rap music per month.

Pearson’s correlation = CORREL(A4:A13,B4:B13) = -0.036

Spearman’s rho = CORREL(C4:C13,D4:D13) = -0.115

The ranks can be obtained with Excel’s function RANK.AVG(A4,A$4:A$13,1).

Page 18:

Both coefficients show that there isn’t much of a correlation between IQ and listening to rap music, although Pearson’s correlation is closer to zero than Spearman’s rho (a value near zero indicates little or no association).
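The two-step Excel recipe (rank each column, then apply CORREL to the ranks) can be sketched in plain Python. `rank_avg` is a hypothetical helper that mimics RANK.AVG with ascending order, so tied values share their average rank. The data are made up for illustration and are not the values from the slide.

```python
import math

def rank_avg(values):
    """Ascending ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

def correl(xs, ys):
    """Pearson correlation coefficient, like Excel's CORREL."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman_rho(xs, ys):
    """Spearman's rho: Pearson correlation of the ranks."""
    return correl(rank_avg(xs), rank_avg(ys))

# Hypothetical data (not the values from the slide)
iq = [100, 110, 95, 120, 105]
hours = [10, 5, 10, 2, 7]
print(rank_avg(hours))
print(f"Pearson = {correl(iq, hours):.3f}, Spearman = {spearman_rho(iq, hours):.3f}")
```

Because only the ranks enter the calculation, Spearman's rho is unaffected by any monotone rescaling of either variable.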

If we plot the example

When there are no ties in the ranking, there is an alternative way of calculating Spearman’s rho using the following property:

$$\rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)}$$

where $d_i$ = rank $x_i$ – rank $y_i$

Page 19:

If we use the property above to do the example again,

we get the same result as CORREL(C4:C13,D4:D13) = -0.115.
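The agreement between the two calculations can be checked with a small Python sketch. It assumes the standard no-ties formula rho = 1 - 6 Σ d_i² / (n (n² - 1)). The helper names and the data are illustrative; the data contain no ties and are not the values from the slide.

```python
import math

def ranks_no_ties(values):
    """Ascending 1-based ranks; assumes all values are distinct."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]

def spearman_from_differences(xs, ys):
    """rho = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1)); valid only without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2
                    for rx, ry in zip(ranks_no_ties(xs), ranks_no_ties(ys)))
    return 1 - 6 * d_squared / (n * (n**2 - 1))

def pearson(xs, ys):
    """Pearson correlation coefficient, like Excel's CORREL."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Illustrative data without ties (not the values from the slide)
x = [86, 97, 99, 100, 101, 103, 106, 110, 112, 113]
y = [2, 20, 28, 27, 50, 29, 7, 17, 6, 12]

rho_from_d = spearman_from_differences(x, y)
rho_from_ranks = pearson(ranks_no_ties(x), ranks_no_ties(y))
print(f"difference formula: {rho_from_d:.4f}, CORREL on ranks: {rho_from_ranks:.4f}")
```

The shortcut is exact only when no ranks are tied. With ties, the correlation of the average ranks is the safer calculation.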

Page 20:

Example:

Repeat the analysis for the example from One Sample Hypothesis Testing for Correlation using Spearman’s rho:

A study is designed to check the relationship between smoking and longevity. A sample of 15 men 50 years and older was taken, and the average number of cigarettes smoked per day and the age at death were recorded, as summarized in the table in Figure 1. Can we conclude from the sample that longevity is independent of smoking?

Spearman’s rho is the correlation coefficient on the ranked data, namely CORREL(C5:C19,D5:D19) = -.674.

Page 21:

We now use the Spearman’s Rho Table to find the critical value for the two-tail test where n = 15 and α = .05. Interpolating between the values for n = 14 and 16, we get a critical value of .525. Since the absolute value of rho is larger than the critical value, we reject the null hypothesis that there is no correlation between cigarette smoking and longevity.

Since n = 15 ≥ 10, we can use a t-test instead of the table, with test statistic $t = \rho\sqrt{(n-2)/(1-\rho^2)}$ on n – 2 = 13 degrees of freedom.

Since |t| = 3.29 > 2.16 = tcrit = TINV(.05,13), we again conclude that there is a significant negative correlation between the number of cigarettes smoked and longevity.
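The arithmetic behind this t-test can be reproduced with a few lines of Python, assuming the usual statistic t = rho * sqrt((n - 2) / (1 - rho²)) with n - 2 degrees of freedom. The critical value 2.16 is copied from the text (TINV(.05,13)) rather than computed, and the function name is illustrative.

```python
import math

def spearman_t(rho, n):
    """t statistic and degrees of freedom for H0: no rank correlation."""
    t = rho * math.sqrt((n - 2) / (1 - rho**2))
    return t, n - 2

rho, n = -0.674, 15            # values from the smoking example
t, df = spearman_t(rho, n)

T_CRIT = 2.16                  # TINV(.05, 13), as quoted in the text
reject_null = abs(t) > T_CRIT
print(f"|t| = {abs(t):.2f} on {df} degrees of freedom, reject H0: {reject_null}")
```

The sign of t follows the sign of rho, so the negative value here matches the conclusion that heavier smoking goes with a shorter life.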

Page 22:

Thank you for listening