Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

27
Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2

Transcript of Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Page 1: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Measuring Association

September 10, 2001

Statistics for Psychosocial Research

Lecture 2

Page 2: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Today’s Topics

• Covariance

• Pearson correlation

• Spearman correlation

• Association with non-linear data– tetrachoric / polychoric correlation– odds ratios

• Association matrices

Page 3: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Measuring Associations

• Goal: Evaluate assocations between pairs of variables being used to measure a construct of interest

• Examples:– depression: sleeping problems ~ guilt?

– disability: time to walk 10 m ~ self-reported difficulty walking 10 m?

– schizophrenia: social class ~ schizophrenia?

– SES: education ~ income?

Page 4: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Associations in Psychosocial Research

• Crucial to the process of defining a construct(1) “too” associated?

(2) not associated?• not appropriately describing “construct”

• measuring different dimensions of “construct” (e.g. mood versus somatic symptoms of depression)

Page 5: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Associations between variables affect….

• Reliability

• Validity

• Factor Analysis

• Latent Class Analysis

• Structural Equation Models

Measurement Issue

Page 6: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Variance and Covariance

• Variance: Measures variability in one variable, X.

• Covariance: Measures how to two variables, X and Y, covary.

s x xx N i xi

N2 1

12 2

1

( )

xy

N

iiiNxy yyxxs

11

1 ))((

Page 7: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Examples of Variance

-10 -5 0 5 10 15

0.0

0.1

0.2

0.3

0.4

X-10 -5 0 5 10 15

0.0

0.1

0.2

0.3

0.4

Y

Page 8: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Examples of Covariance

X-2 0 2 4

-10

-50

510

X-3 -2 -1 0 1 2 3

-10

-50

510

Page 9: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Correlation, r

Correlation is a scaled version of covariance

-1 < r < 1

r = 1 perfect positive correlation

r = -1 perfect negative correlation

r = 0 uncorrelated

22yx

xyxy

ss

sr

Page 10: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Covariance and Correlation

• When are they appropriate measures of association?

• What type of association do they describe?

• Transformations

• Scatterplots

• Outliers

Page 11: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Spearman Correlation

• Use when:– skewed data– outliers– sparse data

• Effect:– downweights outliers– smooths a curve to a straight line

Page 12: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Spearman Correlation

• Method:– sort x and y

– replace data with ranks

– calculate pearson correlation on ranks.

data ranks

x y x* y*

0.1 0.41 1

0.3 0.62 3

0.5 0.5 3 2

0.6 0.94 4

0.8 1.85 6

1.0 1.26 5

r=0.79 r=0.89

x0.2 0.4 0.6 0.8 1.0

0.4

0.8

1.2

1.6

Page 13: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Spearman Correlation

x0.0 0.5 1.0 1.5 2.0 2.5

-50

510

15

x10 20 40 60 80 100

020

4060

8010

0

Pearson r = 0.72 Spearman r = 0.59

Page 14: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Problems with Correlation/Covariance between variables

What if one (or both) variables is (are) not really continuous?

e.g. number of pregnancies and education level

r = -0.6

0 2 4 6 8

0.0

0.05

0.15

Number of Pregnancies1 2 3 4

0.0

0.1

0.2

0.3

0.4

Education Level

Page 15: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Is correlation appropriate?

Education Level1.0 1.5 2.0 2.5 3.0 3.5 4.0

02

46

8

Page 16: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Other issues

• Highly skewed or “floor” or “ceiling” effects– e.g. number of hospital admissions, percent

humidity daily in Baltimore in July, mini-mental exam score

• Ordinal: Takes finite number of values– e.g. on a scale of 1 to 5

• Binary: r = 0.35

x0.0 0.2 0.4 0.6 0.8 1.0

0.0

0.4

0.8

Page 17: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Binary Example: Disability

• Two types of association– redundancy: b and c cells are

close to 0

– hierarchy: either b OR c is close to 0, but other is not.

• Pearson correlation mixes up association and similarity of “marginal”distribution

• Consequences: If hierarchy is relevant, you get low reliability, consistency, and misleading internal validity by using pearson correlation.

Difficulty Walking1/4 Mile

No Yes

Difficulty Walking 1 mile

No 40 0 40

Yes 40 20 60

80 20 100

Page 18: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Alternative Measures

• Tetrachoric Correlation– binary variables

• Polychoric Correlation– ordinal variables

• Odds Ratio– binary variables

Page 19: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Tetrachoric Correlation

• Estimates what the correlation between two binary variables would be if the “ratings” were made on a continuous scale.

• Example: difficulty walking up 10 steps and difficulty lifting 10 lbs.

Level of Difficultyno difficulty difficulty

Page 20: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Tetrachoric Correlation

• Assumes that both “traits” are normally distributed

• Correlation, r, measures how narrow the ellipse is.

• a, b, c, d are the proportions in each quadrant

a

cd

b

Page 21: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Tetrachoric Correlation

For = ad/bc,

Approximation 1:

Approximation 2 (Digby):

Q

1

1

Q

3 4

3 4

1

1

Page 22: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Tetrachoric Correlation

• Example:– Tetrachoric correlation

= 0.61

– Pearson correlation = 0.41

– Odds ratio = 6

• Interpretation? – Same as Pearson

correlation.

Difficulty WalkingUp 10 Steps

No Yes

Difficulty Lifting 10 lb.

No 40 10 50

Yes 20 30 50

60 40 100

Page 23: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Odds Ratio

• Measure of association between two binary variables

• Risk associated with x given y.

• Example:odds of difficulty walking

up 10 steps to the odds of difficulty lifting 10 lb:

O Rp pp p

adbc

1 1

2 2

11

4 0 3 02 0 1 0 6

/ ( )/ ( )

( )( )( )( )

Page 24: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Odds Ratio

adbc ( )( )

( )( )4 0 2 04 0 0

Difficulty Walking1/4 Mile

No Yes

Difficulty Walking 1 mile

No 40 0 40

Yes 40 20 60

80 20 100

Page 25: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Pros and Cons

• Tetrachoric correlation– same interpretation as spearman and pearson correlation

– “difficult” to calculate

• Odds Ratio– easy to understand, but no “perfect” association that is

manageable

– easy to calculate

– not comparable to correlations

• May give you different results/inference!

Page 26: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Association Matrices

• Age, income, education

• Correlation Matrix grade income age

grade 1.00 0.45 -0.25

income 0.45 1.00 -0.13

age -0.25 -0.13 1.00

• Covariance Matrix grade income age grade 6.61 28.18 -5.77

income 28.18 592.69 -29.10

age -5.77 -29.10 81.23

Page 27: Measuring Association September 10, 2001 Statistics for Psychosocial Research Lecture 2.

Association Matrices

• Depression: depressed mood, sleep problems, fatigue

• Odds Ratio Matrix

depress sleep fatigue

depress --- 8.17 10.91

sleep 8.17 --- 16.12

fatigue 10.91 16.12 ---