Accuracy lecture 131024

ACCURACY (AND OTHER VALIDATION MEASURES) Adina L. Feldman, M.Sc. Karolinska Institutet Department of Medical Epidemiology and Biostatistics e-mail: [email protected] tel. 08 5248 2313 24 October 2013 Adina L. Feldman 1

description

Lecture on validation measures in epidemiology for master's students in public health and epidemiology at Karolinska Institutet in Stockholm, Sweden, on 24 October 2013.

Transcript of Accuracy lecture 131024

Page 1: Accuracy lecture 131024

ACCURACY (AND OTHER VALIDATION MEASURES)

Adina L. Feldman, M.Sc.

Karolinska Institutet Department of Medical Epidemiology and Biostatistics

e-mail: [email protected] tel. 08 5248 2313

24 October 2013 Adina L. Feldman 1

Page 2: Accuracy lecture 131024

[Figure: SYSTEMATIC ERROR (horizontal axis, Low to High) against RANDOM ERROR (vertical axis, Low to High)]

Page 3: Accuracy lecture 131024

Validity

Accuracy concerns systematic error (potential bias)

(Random error/precision is related to power, e.g. size of study sample)

Validity is what we call the certainty (accuracy) of a proxy measure/test

Why is knowing the validity of a measure important?

Consider these examples:

What is the validity of breast cancer screening (mammography)?

What is the validity of home pregnancy tests?

What is the validity of self-reported height? …weight?

What is the validity of register-based Parkinson’s disease diagnoses?


Page 4: Accuracy lecture 131024


Page 5: Accuracy lecture 131024


Page 6: Accuracy lecture 131024


Page 7: Accuracy lecture 131024


Page 8: Accuracy lecture 131024

Gold Standard

= The best available measure against which the measure under study is validated

Discuss: What gold standard was used in these validations?

Breast cancer screening (mammography)?

Home pregnancy tests?

Self-reported height? …weight?

Register-based Parkinson’s disease diagnoses?


Page 9: Accuracy lecture 131024

                         | Gold Standard: Binary | Gold Standard: Continuous
Test measure: Binary     |                       |
Test measure: Continuous |                       |

Discuss: Where do these validations fit in?

Breast cancer screening (mammography)?

Home pregnancy tests?

Self-reported height? …weight?

Register-based Parkinson’s disease diagnoses?

Page 10: Accuracy lecture 131024

                         | Gold Standard: Binary    | Gold Standard: Continuous
Test measure: Binary     | Reg PDx                  | X
Test measure: Continuous | Preg test, BC screening  | Height, Weight

Discuss: Where do these validations fit in?

Breast cancer screening (mammography)?

Home pregnancy tests?

Self-reported height? …weight?

Register-based Parkinson’s disease diagnoses?

Page 11: Accuracy lecture 131024

                         | Gold Standard: Binary           | Gold Standard: Continuous
Test measure: Binary     | Sensitivity, Specificity, etc.  | X
Test measure: Continuous | ROC-curves                      | Correlations, Bland-Altman plots

Different validation methods are used for different types of validation studies!

These are covered (or at least mentioned) today

Page 12: Accuracy lecture 131024

Validity measures for binary outcomes

(Print and pin to your office wall!)

                           | Gold Standard Positive +     | Gold Standard Negative -     |
Outcome measure Positive + | True Positive (TP)           | False Positive (FP)          | Positive Predictive Value (PPV) = TP / (TP + FP)
Outcome measure Negative - | False Negative (FN)          | True Negative (TN)           | Negative Predictive Value (NPV) = TN / (TN + FN)
                           | Sensitivity = TP / (TP + FN) | Specificity = TN / (TN + FP) |
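The four formulas in the table can be sketched as a small Python function. This is only an illustration: the function name `validity_measures` and the cell counts in the usage lines are hypothetical and do not come from the lecture.

```python
def validity_measures(tp, fp, fn, tn):
    """Return sensitivity, specificity, PPV and NPV from 2x2 cell counts."""
    return {
        "sensitivity": tp / (tp + fn),  # share of gold-standard positives detected
        "specificity": tn / (tn + fp),  # share of gold-standard negatives cleared
        "ppv": tp / (tp + fp),          # share of test positives that are true cases
        "npv": tn / (tn + fn),          # share of test negatives that are true noncases
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
measures = validity_measures(tp=40, fp=10, fn=20, tn=130)
for name, value in measures.items():
    print(f"{name}: {value:.1%}")
```

Sensitivity and specificity are computed down the Gold Standard columns, while PPV and NPV are computed across the outcome measure rows.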

Page 13: Accuracy lecture 131024


Page 14: Accuracy lecture 131024

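These are less commonly used measures, but still good to know

                           | Gold Standard Positive +         | Gold Standard Negative -                          |
Outcome measure Positive + | True Positive (TP)               | False Positive (FP)                               | False Positive Rate (FPR), cPPV (= 1 - PPV) = FP / (TP + FP)
Outcome measure Negative - | False Negative (FN)              | True Negative (TN)                                | False Negative Rate (FNR), cNPV (= 1 - NPV) = FN / (TN + FN)
                           | True Positive Rate = Sensitivity | FPR (OBS!!) (= 1 - Specificity) = FP / (FP + TN)  | Accuracy = (TP + TN) / (TP + TN + FP + FN)

OBS!! The name FPR appears twice with different denominators. FP / (TP + FP) is also known as the false discovery rate, and FN / (TN + FN) as the false omission rate.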
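A minimal Python sketch of these complement measures, using hypothetical counts that are not from the lecture. The dictionary keys are illustrative names. Note the parentheses around the accuracy numerator: without them, the division would bind more tightly than the addition.

```python
def complement_measures(tp, fp, fn, tn):
    """Return the less common 2x2 measures listed on the slide."""
    total = tp + fp + fn + tn
    return {
        "cPPV": fp / (tp + fp),         # 1 - PPV
        "cNPV": fn / (tn + fn),         # 1 - NPV
        "TPR": tp / (tp + fn),          # same as sensitivity
        "FPR": fp / (fp + tn),          # 1 - specificity
        "accuracy": (tp + tn) / total,  # all correct classifications over all subjects
    }


# Hypothetical counts, chosen only to illustrate the arithmetic.
extra = complement_measures(tp=40, fp=10, fn=20, tn=130)
for name, value in extra.items():
    print(f"{name}: {value:.1%}")
```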

Page 15: Accuracy lecture 131024

Misclassification

FN and FP are misclassifications

Consider cause of misclassification:

FN: Why are some cases not detected?

FP: Why are some noncases given erroneous diagnoses?

Differential misclassification: Non-random distribution of TP and FN (with regard to the exposure)
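One way to look for differential misclassification is to compare how gold-standard positive cases split into TP and FN within each exposure group. The sketch below uses hypothetical counts and hypothetical names; it only illustrates the comparison and is not a statistical test.

```python
def sensitivity(tp, fn):
    """Share of gold-standard positive cases that the proxy measure detects."""
    return tp / (tp + fn)


# Hypothetical counts of gold-standard positive cases, split by exposure status.
exposed = {"tp": 90, "fn": 10}
unexposed = {"tp": 70, "fn": 30}

sens_exposed = sensitivity(**exposed)
sens_unexposed = sensitivity(**unexposed)

# If the TP/FN split differs by exposure, cases are missed more often in one group.
print(f"Sensitivity among exposed:   {sens_exposed:.0%}")
print(f"Sensitivity among unexposed: {sens_unexposed:.0%}")
```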

Page 16: Accuracy lecture 131024

Misclassification

Discuss: What could be the cause of FP and FN in these validations? What could be the consequences of misclassification here?

Breast cancer screening (mammography)?

Home pregnancy tests?

Self-reported height? …weight?

Register-based Parkinson’s disease diagnoses?


Page 17: Accuracy lecture 131024

Fictional Example 1

Cohort study of 10,000 participants (random population-based sample)

Binary proxy measure, e.g. self-reported myocardial infarction (“heart attack”) ever/never

Binary Gold Standard, e.g. myocardial infarction confirmed according to best clinical practice

Page 18: Accuracy lecture 131024

Example 1

                           | Gold Standard Positive + | Gold Standard Negative - |
Outcome measure Positive + | 90                       | 5                        | PPV?
Outcome measure Negative - | 10                       | 9,895                    | NPV?
                           | Sens.?                   | Spec.?                   | GS prev.? OM prev.?

Page 19: Accuracy lecture 131024

Example 1: ↓prevalence, ↑PPV, ↑Sens.

                           | Gold Standard Positive + | Gold Standard Negative - |
Outcome measure Positive + | 90                       | 5                        | PPV 94.7%
Outcome measure Negative - | 10                       | 9,895                    | NPV 100% (99.899%)
                           | Sens. 90.0%              | Spec. 100% (99.95%)      | GS prev. 1.0%, OM prev. 0.95%
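The figures for Example 1 can be checked with a few lines of Python. The cell counts are those from the table; the variable names are only illustrative.

```python
# Cell counts from Fictional Example 1.
tp, fp, fn, tn = 90, 5, 10, 9895
n = tp + fp + fn + tn  # 10,000 participants

ppv = tp / (tp + fp)
npv = tn / (tn + fn)
sens = tp / (tp + fn)
spec = tn / (tn + fp)
gs_prev = (tp + fn) / n  # prevalence according to the Gold Standard
om_prev = (tp + fp) / n  # prevalence according to the outcome measure

for label, value in [("PPV", ppv), ("NPV", npv), ("Sens.", sens),
                     ("Spec.", spec), ("GS prev.", gs_prev), ("OM prev.", om_prev)]:
    print(f"{label}: {value:.3%}")
```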

Page 20: Accuracy lecture 131024

Fictional Example 2

Cohort study of 10,000 participants (random population-based sample)

Binary proxy measure, e.g. self-reported influenza during one winter season yes/no

Binary Gold Standard, e.g. laboratory-confirmed infection with influenza virus

Page 21: Accuracy lecture 131024

Example 2

                           | Gold Standard Positive + | Gold Standard Negative - |
Outcome measure Positive + | 1950                     | 1400                     | PPV?
Outcome measure Negative - | 50                       | 6600                     | NPV?
                           | Sens.?                   | Spec.?                   | GS prev.? OM prev.?

Page 22: Accuracy lecture 131024

Example 2: ↑prevalence, ↓PPV, ↑Sens.

                           | Gold Standard Positive + | Gold Standard Negative - |
Outcome measure Positive + | 1950                     | 1400                     | PPV 58.2%
Outcome measure Negative - | 50                       | 6600                     | NPV 99.2%
                           | Sens. 97.5%              | Spec. 82.5%              | GS prev. 20.0%, OM prev. 33.5%
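PPV depends on prevalence as well as on sensitivity and specificity. The sketch below rewrites PPV = TP / (TP + FP) in terms of those three quantities (Bayes' theorem) and reproduces the PPV of both examples. The function name and the 1% comparison prevalence are assumptions added for illustration.

```python
def ppv_from_prevalence(sensitivity, specificity, prevalence):
    """PPV expressed through sensitivity, specificity and Gold Standard prevalence."""
    true_pos = sensitivity * prevalence               # expected share of TP
    false_pos = (1 - specificity) * (1 - prevalence)  # expected share of FP
    return true_pos / (true_pos + false_pos)


# Example 2: sensitivity 97.5%, specificity 82.5%, GS prevalence 20%.
ppv_ex2 = ppv_from_prevalence(0.975, 0.825, 0.20)

# Same test characteristics at a hypothetical prevalence of 1%.
ppv_ex2_low_prev = ppv_from_prevalence(0.975, 0.825, 0.01)

# Example 1: sensitivity 90%, specificity 9895/9900, GS prevalence 1%.
ppv_ex1 = ppv_from_prevalence(0.90, 9895 / 9900, 0.01)

print(f"Example 2 PPV: {ppv_ex2:.1%}")
print(f"Example 2 test at 1% prevalence: {ppv_ex2_low_prev:.1%}")
print(f"Example 1 PPV: {ppv_ex1:.1%}")
```

Example 1 keeps a high PPV at low prevalence only because its specificity is very close to 100%. With the specificity of Example 2, the same low prevalence gives a much lower PPV.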

Page 23: Accuracy lecture 131024

Discussion points

Many validation studies have available only one of the following:

Only Gold Standard positive cases

Only proxy outcome positive cases

What validity measures can be calculated in each instance?

Two-phase screening is a very common approach to diagnosing disease, e.g. breast cancer (mammography followed by ultrasound, cytology)

What type of validity is most important in each phase?

Page 24: Accuracy lecture 131024


Page 25: Accuracy lecture 131024


Page 26: Accuracy lecture 131024

                         | Gold Standard: Binary           | Gold Standard: Continuous
Test measure: Binary     | Sensitivity, Specificity, etc.  | X
Test measure: Continuous | ROC-curves                      | Correlations, Bland-Altman plots

Different validation methods are used for different types of validation studies!

These are covered (or at least mentioned) today

Page 27: Accuracy lecture 131024

Measures with discrimination threshold for binary outcomes

[Figure: frequency of cases (vertical axis) against a continuous test measure, e.g. biomarker concentration in blood (horizontal axis), with separate distributions for GS- and GS+]

Page 28: Accuracy lecture 131024

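Measures with discrimination threshold for binary outcomes

[Figure: frequency of cases (vertical axis) against a continuous test measure, e.g. biomarker concentration in blood (horizontal axis), with the GS- and GS+ distributions divided into regions labelled TN, FN, FP and TP]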

Page 29: Accuracy lecture 131024


Page 30: Accuracy lecture 131024


Gold Standard: Reduced insulin sensitivity based on established clinical index cutoff

Proxy test: Appendicular lean body mass (LBM) index (kg/m2)

The threshold for LBM is varied and for each step the sensitivity and 1-specificity for the GS are calculated and plotted

The goal is to determine the optimal threshold for LBM in predicting reduced insulin sensitivity

AUC = Area Under the Curve (%) (Bigger = Better)
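The threshold sweep described above can be sketched in plain Python. Everything here is an assumption for illustration: the data are made up (not LBM values), higher test values are taken to indicate a positive result, and Youden's J (sensitivity + specificity - 1) is only one common criterion for an "optimal" threshold, not necessarily the one used in the study shown.

```python
def roc_points(values, gs_positive):
    """Return (threshold, 1 - specificity, sensitivity) for each candidate threshold.

    A subject counts as test-positive when value >= threshold.
    """
    n_pos = sum(gs_positive)
    n_neg = len(gs_positive) - n_pos
    points = []
    for threshold in sorted(set(values), reverse=True):
        tp = sum(1 for v, g in zip(values, gs_positive) if v >= threshold and g)
        fp = sum(1 for v, g in zip(values, gs_positive) if v >= threshold and not g)
        points.append((threshold, fp / n_neg, tp / n_pos))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule, starting from (0, 0)."""
    curve = [(0.0, 0.0)] + [(fpr, sens) for _, fpr, sens in points]
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Hypothetical continuous test values and Gold Standard status (1 = GS+).
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
gs = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

points = roc_points(values, gs)
area = auc(points)

# Youden's J = sensitivity - (1 - specificity); pick the threshold that maximises it.
best_threshold, best_fpr, best_sens = max(points, key=lambda p: p[2] - p[1])

print(f"AUC: {area:.0%}")
print(f"Threshold with highest Youden's J: {best_threshold}")
```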

Page 31: Accuracy lecture 131024


Page 32: Accuracy lecture 131024


Page 33: Accuracy lecture 131024

                         | Gold Standard: Binary           | Gold Standard: Continuous
Test measure: Binary     | Sensitivity, Specificity, etc.  | X
Test measure: Continuous | ROC-curves                      | Correlations, Bland-Altman plots

Different validation methods are used for different types of validation studies!

These are covered (or at least mentioned) today

Page 34: Accuracy lecture 131024


Page 35: Accuracy lecture 131024


Page 36: Accuracy lecture 131024


Page 37: Accuracy lecture 131024


Page 38: Accuracy lecture 131024


Page 39: Accuracy lecture 131024


Page 40: Accuracy lecture 131024


Pearson correlation coefficient overall = 0.61

Page 41: Accuracy lecture 131024

Afternoon group exercise: Ad hoc study of the validity of self-reported height

Define:

Gold Standard

Method of ascertainment of self-reported height

Collect data:

Proxy

Gold Standard

Using Excel:

Plot correlation (scatter plot)

Bland-Altman plot

Draw conclusion
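The exercise uses Excel, but the same quantities can be sketched in Python. The heights below are hypothetical, the function names are illustrative, and the limits of agreement use the conventional mean difference ± 1.96 standard deviations. Plotting `means` against `diffs` gives the Bland-Altman plot; plotting the two height lists against each other gives the scatter plot.

```python
from statistics import mean, stdev


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def bland_altman(proxy, gold):
    """Return pairwise means, differences, mean difference and limits of agreement."""
    diffs = [p - g for p, g in zip(proxy, gold)]
    means = [(p + g) / 2 for p, g in zip(proxy, gold)]
    bias = mean(diffs)
    sd = stdev(diffs)
    return means, diffs, bias, (bias - 1.96 * sd, bias + 1.96 * sd)


# Hypothetical heights in cm: self-reported (proxy) and measured (Gold Standard).
self_reported = [170, 182, 165, 178, 160, 175]
measured = [168.5, 181.0, 165.5, 176.0, 159.0, 174.5]

r = pearson_r(self_reported, measured)
means, diffs, bias, (lower, upper) = bland_altman(self_reported, measured)

print(f"Pearson r: {r:.3f}")
print(f"Mean difference (proxy - Gold Standard): {bias:.2f} cm")
print(f"Limits of agreement: {lower:.2f} to {upper:.2f} cm")
```

A high correlation does not rule out systematic over- or under-reporting, which is why the mean difference and the Bland-Altman plot are examined as well.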

Page 42: Accuracy lecture 131024


Welcome to my PhD dissertation defence, 10 January 2014 at 9 am in Andreas Vesalius, Karolinska Institutet Campus Solna

Dissertation title: “If I Only Had a Brain – Epidemiological Studies of Parkinson’s Disease”

Thank You! (See you this afternoon)