ROC Tutorial
-
Upload
chouxxpastry -
Category
Documents
-
view
251 -
download
0
Transcript of ROC Tutorial
-
8/10/2019 ROC Tutorial
1/62
Diagnostic tests: ROC curves,sensitivity, and specificity
Michael Walker
-
8/10/2019 ROC Tutorial
2/62
Tutorial articles
Spitalnic, S. Test Properties 1: Sensitivity,
Specificity and Predictive Values. HospitalPhysician, September 2004, p. 27-31.
Spitalnic, S. Test Properties 2: Likelihood ratios,
Bayes formula, and receiver operatingcharacteristic curves. Hospital Physician,October 2004, p. 53-58.
www.turner-white.com
-
8/10/2019 ROC Tutorial
3/62
Clinical objectives
Describe the diagnostic accuracy of a test.
Compare accuracy of different tests.
-
8/10/2019 ROC Tutorial
4/62
Suppose we wish to know if the
expression level of a gene in a tumor canpredict if patients will have a recurrence oftheir cancer.
High expression => high probability ofrecurrence
-
8/10/2019 ROC Tutorial
5/62
Not cancer Cancer
Gene expression
Threshold
In this case, gene expression separates cancer from non-cancer perfectly.
-
8/10/2019 ROC Tutorial
6/62
Not cancer Cancer
Gene expression
Gene expression levels for cancer and non-cancer overlap.
What threshold should we choose to predict recurrence?
-
8/10/2019 ROC Tutorial
7/62
Threshold
Choose a high threshold:
Few false positive predictions,but
lots of false negatives
Not cancer
Cancer
-
8/10/2019 ROC Tutorial
8/62
Threshold
Non-diseasedcases
Diseasedcases
Choose a low threshold:Many false positive predictions,
butfew false negatives
-
8/10/2019 ROC Tutorial
9/62
Notation for conditional probability
Suppose that the patient has the disease (according togold standard)
Notation to specify the probability that the test for thepatient is positive, given that the patient has the disease:
P(Test positive | patient has disease)
P(T+ | D+)
This notation describes the conditional probability.
-
8/10/2019 ROC Tutorial
10/62
Sensitivity and Specificity
We want the test to be positive when the patient
has the disease: Sensitivity
= P(Test positive | patient has disease)
We want the test to be negative when the patient
does not have the disease: Specificity
= P(Test negative | patient does not have disease)
-
8/10/2019 ROC Tutorial
11/62
Sensitivity and Specificity
Sensitive => find ALL disease
Sensitivity= P(Test positive | patient has disease)
Specific => find ONLY disease
Specificity= P(Test negative | patient does not have
disease)
-
8/10/2019 ROC Tutorial
12/62
Sensitivity and specificity depend on
How well the test separates the two groups What threshold we choose
-
8/10/2019 ROC Tutorial
13/62
Not cancer Cancer
Gene expression
Threshold
If gene expression separates cancer from non-cancer perfectly:
Good sensitivity = P(Test positive | patient has disease)
Good specificity = P(Test negative | patient does not have disease)
-
8/10/2019 ROC Tutorial
14/62
Not cancer Cancer
Gene expression
If gene expression levels for cancer and non-caner overlap:
Sensitivity and specificity depend on what threshold we choose
-
8/10/2019 ROC Tutorial
15/62
Threshold
High threshold:Few false positive predictions,
but
lots of false negatives
Not cancer
Cancer
Poor sensitivity = P(Test positive |patient has disease)Many false negatives
Good specificity = P(Test negative |patient does not have disease)Few false positives
-
8/10/2019 ROC Tutorial
16/62
Threshold
Not cancer
Cancer
Low threshold:Many false positive predictions,
but
few false negatives
Good sensitivity= P(Test positive | patient has disease)Few false negatives
Poor specificity= P(Test negative | patient does not have disease)Many false positives
-
8/10/2019 ROC Tutorial
17/62
Sensitivity and Specificity
Sensitivity
= P(Test positive | patient has disease)= P(T+ | D+)
= True positive rate
Specificity
= P(Test negative | patient does not have disease)
= P(T- | D-)
=True negative rate
-
8/10/2019 ROC Tutorial
18/62
False positive rate (FPR) = 1 specificity
False negative rate (FNR) = 1 - sensitivity
-
8/10/2019 ROC Tutorial
19/62
Threshold
High threshold:Few false positive predictions,
but
lots of false negatives
Not cancer
Cancer Poor sensitivity = P(Test positive |patient has disease)Many false negatives
Good specificity = P(Test negative |patient does not have disease)Few false positives
Low False positive rate (FPR) = 1 specificity
High False negative rate (FNR) = 1 - sensitivity
-
8/10/2019 ROC Tutorial
20/62
Sensitivity and specificity tell us about the
test result, given that we know if thepatient has the disease or not.
In clinic, we dont know if the patient hasthe disease; thats what we want the testto tell us.
-
8/10/2019 ROC Tutorial
21/62
PPV and NPV
Positive predictive value (PPV)
= P(patient has disease | Test positive )= P(D+ | T+)
Negative predictive value (NPV)
= P(patient does not have disease | Test
negative )
= P(D- | T-)
-
8/10/2019 ROC Tutorial
22/62
Sensitivity and Specificity
Sensitivity = P(T+ | D+)
Specificity = P(T- | D-) PPV = P(D+ | T+)
NPV = P(D- | T-)
-
8/10/2019 ROC Tutorial
23/62
PPV and NPV are a function of the
prevalence (the proportion of thepopulation that has the disease), as wewill see shortly
Sensitivity and specificity do not dependon the prevalence. They are conditional on
the patient either having or not having thedisease.
-
8/10/2019 ROC Tutorial
24/62
Spitalnic DVT Example 1
Disease +ve Disease -ve Total
Test +ve 90 160 250
Test -ve 10 240 250
Total 100 400 500
-
8/10/2019 ROC Tutorial
25/62
Example
Disease
+ve
Disease -
ve
Total
Test +ve 90 160 250
Test -ve 10 240 250
Total 100 400 500
Disease+ve
Disease -ve Total
Test
+ve True +ve False +ve Total T+
Test -ve False -ve True -ve Total T-
Total Total D+ Total D- Total pts
-
8/10/2019 ROC Tutorial
26/62
A B A+B
C D C+D
A+C B+D A+B+C+D
Disease
+ve
Disease -
ve
Total
Test +ve 90 160 250
Test -ve 10 240 250
Total 100 400 500
-
8/10/2019 ROC Tutorial
27/62
Sensitivity = A/(A+C) = 0.9
Specificity = D/(B+D) = 0.6
FPR = 1 - Specificity = B/(B+D) = 0.4
FNR = 1 - Sensitivity = C / (A+C) = 0.1
PPV = P(D+|T+) = A/(A+B) 0.36
NPV = P(D-|T-) = D/(C+D) 0.96
-
8/10/2019 ROC Tutorial
28/62
Later, well see how to calculate PPV and
NPV using Bayes rule
-
8/10/2019 ROC Tutorial
29/62
Tutorial article
Spitalnic, S. Test Properties 2: Likelihood ratios,
Bayes formula, and receiver operatingcharacteristic curves. Hospital Physician,October 2004, p. 53-58.
-
8/10/2019 ROC Tutorial
30/62
Pre-test probability = estimated probability
that the patient has the disease beforegetting the diagnostic test result. The pre-test probability = prior probability
Post-test probability = estimatedprobability that the patient has the diseaseafter getting the diagnostic test result. The post-test probability = posterior
probability
-
8/10/2019 ROC Tutorial
31/62
ROC curves
We want to be able to compare the
accuracy of diagnostic tests. Sensitivity and specificity are candidate
measures for accuracy, but have someproblems, as well see.
ROC curves are an alternative measure
-
8/10/2019 ROC Tutorial
32/62
ROC curves
We plot sensitivity against 1 specificity to
create the ROC curve for a test
-
8/10/2019 ROC Tutorial
33/62
Not cancer Cancer
Gene expression
Threshold
A test that perfectly separates the two groups
Sensitivity = P(Test positive | patient has disease) = 1.0
Specificity = P(Test negative | patient does not have disease) = 1.0
-
8/10/2019 ROC Tutorial
34/62
TPF,sensitivity
FPF, 1-specificity
ROC curve for a perfect test
Sensitivity = 1Specificity = 1
1 specificity = 0
-
8/10/2019 ROC Tutorial
35/62
For a single diagnostic test, sensitivity and
specificity vary with the threshold we use.
-
8/10/2019 ROC Tutorial
36/62
Threshold
TPF,sensitivity
FPF, 1-specificity
High threshold
Not cancer
Cancer
High threshold:Good specificity = P(T-| D-)
Medium sensitivity = P(T+|D+)
-
8/10/2019 ROC Tutorial
37/62
Threshold
TPF
,sensitivity
FPF, 1-specificity
Medium threshold
Not cancer
Cancer
Medium threshold:Medium specificity = P(T-| D-)
Medium sensitivity = P(T+|D+)
-
8/10/2019 ROC Tutorial
38/62
Threshold
TPF
,sensitivity
FPF, 1-specificity
Low threshold
Cancer
Not cancer
Low threshold:Medium specificity = P(T-| D-)
Good sensitivity = P(T+|D+)
E t l th h ld
-
8/10/2019 ROC Tutorial
39/62
Threshold
Not cancer
Cancer TPF
,sensitivity
FPF, 1-specificity
Extreme low threshold:No specificity = P(T-| D-)
Perfect sensitivity = P(T+|D+)
-
8/10/2019 ROC Tutorial
40/62
TPF
,sensitivity
FPF, 1-specificity
For a test that cannot separate
the two classes, the ROCcurve is a straight 45 degreeline.
Good tests approach the topleft corner of the ROC curve. c
hance
line
Good tests approach the top left corner of the
-
8/10/2019 ROC Tutorial
41/62
TPF
,sensitivity
FPF, 1-specificity
Good tests approach the top left corner of theROC curve.
chance
line
The area under the ROC curve describes test accuracy
-
8/10/2019 ROC Tutorial
42/62
TPF
,sensitivity
FPF, 1-specificity
The area under the ROC curve describes test accuracy
chance
line
Poor test: ROC area near 0.5
Good test: ROC area near 1.0
-
8/10/2019 ROC Tutorial
43/62
Sensitivity and specificity dont always
make it clear which of two diagnostic testsis better
Which test is better?
-
8/10/2019 ROC Tutorial
44/62
Which test is better?
False Positive Fraction
= 1.0 Specificity
True
Positive
Fraction
=
Se
nsitivity
1.0
1.0
0.00.0
Test A
Test B
Different ROC for each test
-
8/10/2019 ROC Tutorial
45/62
False Positive Fraction
= 1.0 Specificity
True
Positive
Fraction
=
Se
nsitivity
1.0
1.0
0.00.0
Test A
Test BTest B is betterGreater ROC area
Different ROC for each test
-
8/10/2019 ROC Tutorial
46/62
False Positive Fraction
= 1.0 Specificity
True
Positive
Fraction
=
Se
nsitivity
1.0
1.0
0.00.0
Test A
Test BTest A is betterGreater ROC area
Both tests are on the same ROC
-
8/10/2019 ROC Tutorial
47/62
Both tests are on the same ROC
False Positive Fraction
= 1.0 Specificity
True
Positive
Fraction
=
Se
nsitivity
1.0
1.0
0.0
0.0
Test A
Test B Tests have same
area under ROC
-
8/10/2019 ROC Tutorial
48/62
ROC curves may cross.
In this case, total area under the
ROC curve may not be a goodmeasure for comparing tests
Potential ROC issues
-
8/10/2019 ROC Tutorial
49/62
Potential ROC issues
Lack of gold standard for diagnosis
Lack of reproducibility E.g., disagreement among pathologists
Bias in sample selection, spectrum of
disease used in evaluating test Choose sickest patients, healthy controls
Problems in ascertainment Genetic disease may not be manifest
Cant always reliably measure ROC area
Bayes rule
-
8/10/2019 ROC Tutorial
50/62
Bayes rule
How to use Bayes rule to determine the
posterior probability
Bayes rule
-
8/10/2019 ROC Tutorial
51/62
y
P(D+) (prior probability), the prior probability
that the patient has the disease in theabsence of any test data (prevalence)
P(T+): probability of a positive test result
(including both true positive and false
positive)
P(D|T) (posteriori probability), the probability
of disease given the test result
Derivation of Bayes rule
-
8/10/2019 ROC Tutorial
52/62
Derivation of Bayes rule
)(
)()|(
)|( TPDPDTP
TDP =
)()|()()|( DPDTPTPTDP =
Bayes rule
-
8/10/2019 ROC Tutorial
53/62
Bayes rule
)()()|(
)|( TPDPDTP
TDP =
)()|()()|()()|()|(
+++++
+++=++
DPDTPDPDTPDPDTP
TDP
)()|()()|()()|()|( +++=++
DPDTPDPDTPDPDTPTDP
-
8/10/2019 ROC Tutorial
54/62
P(D+|T+) = Probability of disease given test +ve P(T+|D+) = sensitivity
P(D+) = prevalence
P(T+|D-) = 1 specificity
P(D-) = 1- prevalence
)()|()()|()|(
+++++++
DPDTPDPDTPTDP
Example: sensitivity = 0 9 specificity = 0 8
-
8/10/2019 ROC Tutorial
55/62
Example: sensitivity = 0.9, specificity = 0.8,
prevalence=0.5 P(T+|D+) = Sensitivity = 0.9
P(D+) = 0.5 P(T+|D-) = 1 specificity = 1 - 0.8 = 0.2
P(D-) = 1- prevalence = 1 0.5 = 0.5
82.0)|( 5.0*2.05.0*9.05.0*9.0
=+
=++
TDP
)()|()()|(
)()|()|(++++
+++=++
DPDTPDPDTP
DPDTPTDP
Likelihood ratio version of Bayes
-
8/10/2019 ROC Tutorial
56/62
rule This section is optional
We can express Bayes rule in terms ofodds and likelihood ratios
Well start by defining odds and likelihoodratios
Odds
-
8/10/2019 ROC Tutorial
57/62
Odds
Probability of an event is in the interval [0,1] If the probability of an event is P = 0.5,
Then the odds = P/(1-P) = 0.5/(1-0.5) = 0.5/0.5 = 1
P
POdds
=
1
-
8/10/2019 ROC Tutorial
58/62
-
8/10/2019 ROC Tutorial
59/62
Likelihood ratio (LR)
-
8/10/2019 ROC Tutorial
60/62
Likelihood ratio (LR)
Likelihood ratio for a negative test (LR-)
yspecificit
ysensitivit
DTP
DTP
LR
=
+=
1
)|(
)|(
Likelihood version of Bayes rule
-
8/10/2019 ROC Tutorial
61/62
Likelihood version of Bayes rule
Post-test odds = LR * pre-test odds
See example in Spitalnic, S. TestProperties 2: Likelihood ratios, Bayesformula, and receiver operatingcharacteristic curves. Hospital Physician,October 2004, p. 53-58.
-
8/10/2019 ROC Tutorial
62/62
Power and sample size for ROC curve
Example using NCSS PASS