University of Copenhagen, Department of Biostatistics
Faculty of Health Sciences
R²-type Curves for Dynamic Predictions from
Joint Longitudinal-Survival Models
Inference & application to prediction of kidney graft failure
Paul Blanche
joint work with
M-C. Fournier & E. Dantan (Nantes, France)
July 2015
Slide 1/29
Context & Motivation
• Medical researchers hope to improve patient management using earlier diagnoses
• Statisticians can help by fitting prediction models
• The making of so-called "dynamic" predictions has recently received a lot of attention
• To be useful for medical practice, predictions should be "accurate"
How should we evaluate dynamic prediction accuracy?
Slide 1/29
Data & Clinical goal
• Data:
  • DIVAT cohort data of kidney transplant recipients (subsample, n = 4,119)
• Clinical goal:
  • Dynamic prediction of the risk of kidney graft failure (death or return to dialysis)
  • Using repeated measurements of serum creatinine
Slide 2/29
DIVAT data sample (n = 4,119)
• French cohort
• Adult recipients
• Transplanted after 2000
• Creatinine measured every year
• 6 centers
(www.divat.fr)
Slide 3/29
Statistical challenges discussed
How to evaluate and/or compare dynamic predictions?
• Using concepts of:
  • Discrimination
  • Calibration
• Accounting for:
  • Dynamic setting
  • Censoring issue
Slide 4/29
Basic idea & concepts for evaluating predictions
Basic idea: comparing predictions and observations (simple!)
Concepts:
• Discrimination: A model has high discriminative power if the range of predicted risks is wide and subjects with low (high) predicted risk are less (more) likely to experience the event.
• Calibration: A model is calibrated if we can expect that x subjects out of 100 experience the event among all subjects that receive a predicted risk of x% ("weak" definition).
Slide 5/29
Dynamic prediction
• s: landmark time at which predictions are made (varies)
• t: prediction horizon (fixed)
[Figure: serum creatinine (µmol/L, axis from 100 to 300) against follow-up time (years) for one subject, with the landmark time s = 4 years and the end of the prediction window s + t = 9 years marked (horizon t = 5 years). A second axis shows 1 − Pr(graft failure in (s, s+t]) from 0% to 100%; for this subject, 1 − π_i(s, t) = 64%.]
Slide 6/29
Right censoring issue
[Figure: subject timelines along a time axis, with the landmark time s (when predictions are made) and the time s + t (end of the prediction window) marked; symbols distinguish uncensored from censored subjects.]
For a subject i censored within (s, s + t], the status
$$D_i(s, t) = \mathbb{1}\{\text{event occurs in } (s, s + t]\}$$
is unknown.
Slide 7/29
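The three possible situations at a landmark time can be sketched in Python. This is a minimal illustration, assuming each subject is recorded as an observed time (the minimum of event time and censoring time) and an event flag; the function name `event_status` and the toy data are hypothetical and not part of the talk.

```python
from typing import Optional


def event_status(obs_time: float, event: bool, s: float, t: float) -> Optional[int]:
    """Return D_i(s, t) for a subject at risk at s, or None when it is unknown.

    obs_time is min(T_i, C_i); event is True when the failure was observed.
    """
    if obs_time <= s:
        raise ValueError("subject is not at risk at landmark time s")
    if obs_time > s + t:
        return 0      # still followed at s + t: known to be event-free
    if event:
        return 1      # failure observed in (s, s + t]
    return None       # censored in (s, s + t]: status unknown


# Toy subjects (observed time in years, event flag), landmark s = 4, horizon t = 5.
subjects = [(6.0, True), (10.0, False), (7.5, False), (12.0, True)]
statuses = [event_status(u, d, s=4.0, t=5.0) for u, d in subjects]
print(statuses)  # [1, 0, None, 0]
```

The third subject is the problematic case: follow-up stops inside the prediction window, so neither 0 nor 1 can be assigned.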
Notations for population parameters
• Indicator of event in (s, s + t]:
$$D_i(s, t) = \mathbb{1}\{s < T_i \le s + t\}$$
where $T_i$ is the time-to-event.
• Dynamic predictions:
$$\pi_i(s, t) = P_{\xi}\big(D_i(s, t) = 1 \,\big|\, T_i > s, Y_i(s), X_i\big)$$
  • $\xi$: previously estimated parameters (from independent training data)
  • $Y_i(s)$: marker measurements observed before time s
  • $X_i$: baseline covariates
Slide 8/29
Predictive accuracy
How close are the predicted risks $\pi_i(s, t)$ to the "true underlying" risk $P(\text{event occurs in } (s, s+t] \mid \text{information at } s)$?
• Prediction Error:
$$PE_{\pi}(s, t) = E\big[\{D(s, t) - \pi(s, t)\}^2 \,\big|\, T > s\big]$$
  • the lower the better
  • PE ≈ Bias² + Variance
  • evaluates both calibration and discrimination
  • depends on $P(\text{event occurs in } (s, s+t] \mid \text{at risk at } s)$
  • often called "Expected Brier Score"
Slide 9/29
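How does the PE relate to calibration and discrimination?
$$PE_{\pi}(s,t) = E\big[\{D(s,t) - \pi(s,t)\}^2 \mid T > s\big]$$
$$= E\big[\{D(s,t) - E[D(s,t) \mid \mathcal{H}^{\pi}(s)] + E[D(s,t) \mid \mathcal{H}^{\pi}(s)] - \pi(s,t)\}^2 \mid T > s\big]$$
$$= \underbrace{E\big[\mathrm{Var}\{D(s,t) \mid \mathcal{H}^{\pi}(s)\} \mid T > s\big]}_{\text{Discrimination: does NOT depend on } \pi(s,t)} + \underbrace{E\big[\{E[D(s,t) \mid \mathcal{H}^{\pi}(s)] - \pi(s,t)\}^2 \mid T > s\big]}_{\text{Calibration: depends on } \pi(s,t)}$$
Here $E[D(s,t) \mid \mathcal{H}^{\pi}(s)]$ is the "true underlying" risk and $\mathcal{H}^{\pi}(s) = \{\mathcal{X}^{\pi}(s), T > s\}$ denotes the subject-specific history at time s.
• The more discriminating $\mathcal{H}^{\pi}(s)$, the smaller $\mathrm{Var}\{D(s,t) \mid \mathcal{H}^{\pi}(s)\}$.
• $E[D(s,t) \mid \mathcal{H}^{\pi}(s)] - \pi(s,t) \equiv 0$ defines "strong" calibration.
Slide 10/29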
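The decomposition can be checked numerically on a small discrete population. The sketch below assumes, purely for illustration, that the history takes three values (strata) and that both the true risk and the predicted risk are constant within a stratum; the weights and risks are made-up numbers.

```python
import math

# Hypothetical at-risk population at landmark s, described by three strata:
# (share of the population, true risk E[D | H], predicted risk pi).
strata = [(0.5, 0.10, 0.15), (0.3, 0.40, 0.30), (0.2, 0.80, 0.90)]

# Prediction error E[(D - pi)^2]: D is Bernoulli(true risk) within a stratum.
pe = sum(w * (r * (1 - p) ** 2 + (1 - r) * p ** 2) for w, r, p in strata)

# Discrimination term E[Var(D | H)]: does not involve the predictions.
discrimination = sum(w * r * (1 - r) for w, r, _ in strata)

# Calibration term E[(E[D | H] - pi)^2]: zero under strong calibration.
calibration = sum(w * (r - p) ** 2 for w, r, p in strata)

assert math.isclose(pe, discrimination + calibration)
print(round(pe, 5), round(discrimination, 5), round(calibration, 5))
```

Replacing each predicted risk by the true risk of its stratum sets the calibration term to zero and leaves the discrimination term unchanged, which is the point made on the slide.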
R²_π-type criterion
• Benchmark PE_0
The best "null" prediction tool, which gives the same (marginal) predicted risk
$$S(s + t \mid s) = E[D(s, t) \mid \mathcal{H}^{0}(s)], \quad \mathcal{H}^{0}(s) = \{T > s\},$$
to all subjects, leads to
$$PE_0(s, t) = \mathrm{Var}\{D(s, t) \mid T > s\} = S(s + t \mid s)\{1 - S(s + t \mid s)\}.$$
• Simple idea
$$R^2_{\pi}(s, t) = 1 - \frac{PE_{\pi}(s, t)}{PE_0(s, t)}$$
Slide 11/29
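As a simple illustration of the criterion, the following Python sketch computes the prediction error and the R²-type value for subjects at risk at a landmark time, assuming no censoring so that every status D_i(s, t) is observed. The function names and the toy data are hypothetical, and the benchmark uses the empirical marginal risk of the same sample.

```python
from typing import Sequence


def prediction_error(d: Sequence[int], pi: Sequence[float]) -> float:
    """Mean squared difference between observed status and predicted risk."""
    return sum((di - pi_i) ** 2 for di, pi_i in zip(d, pi)) / len(d)


def r_squared(d: Sequence[int], pi: Sequence[float]) -> float:
    """1 - PE_pi / PE_0, with PE_0 the variance of the observed status."""
    marginal_risk = sum(d) / len(d)
    pe_null = marginal_risk * (1.0 - marginal_risk)
    return 1.0 - prediction_error(d, pi) / pe_null


# Toy statuses and predicted risks for 8 subjects at risk at the landmark time.
d = [1, 0, 0, 1, 0, 0, 0, 1]
pi = [0.8, 0.2, 0.1, 0.6, 0.3, 0.2, 0.1, 0.7]

r2_model = r_squared(d, pi)                      # about 0.744
r2_null = r_squared(d, [sum(d) / len(d)] * 8)    # null tool: 0 up to rounding
```

A prediction tool that does worse than the marginal risk gets a negative value, so the criterion is not bounded below by zero.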
Why bother?
• R²_π(s, t) aims to circumvent the difficult interpretation of:
  • the scale on which PE(s, t) is measured
  • the trend of PE(s, t) vs s
• Because the meaning of the scale on which PE(s, t) is measured changes with s, an increasing/decreasing trend can be due to changes in:
  • the quality of the predictions
  and/or
  • the at-risk population
Slide 12/29
Changes in the quality of the predictions
"Essentially, all models are wrong, but some are useful." (G. Box)
• The prediction model from which we have obtained the predictions can be "more wrong" for some s than for others.
  • Calibration term of PE(s, t) changes with s
  • We can work on it!
Slide 13/29
Changes in the at risk population
An example:
• Patients with cardiovascular history (CV) all die early.
• Only those without CV remain at risk for late s.
• Then:
  • the later s, the more homogeneous the at-risk population
  • CV is useful for prediction for early s but useless for late s.
• The available information can be more informative for some s than for others.
  • Discrimination term of PE(s, t) changes with s
  • This is just how it is, there is nothing we can do! (we can only work with the data we have)
Slide 14/29
R²_π(s, t) interpretation
• Always true: a measure of how the prediction tool π(s, t) performs compared to the benchmark null prediction tool, which gives the same predicted risk to all subjects (marginal risk).
• When predictions are calibrated (strongly):
$$R^2_{\pi}(s,t) = \frac{\mathrm{Var}\{\pi(s,t) \mid T > s\}}{\mathrm{Var}\{D(s,t) \mid T > s\}} \quad \text{(explained variation)}$$
$$= \mathrm{Corr}^2\{D(s,t), \pi(s,t) \mid T > s\} \quad \text{(correlation)}$$
$$= E\{\pi(s,t) \mid D(s,t) = 1, T > s\} - E\{\pi(s,t) \mid D(s,t) = 0, T > s\} \quad \text{(mean risk difference)}$$
(after a little algebra)
Slide 15/29
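These three identities can be verified exactly on a toy distribution. The sketch below assumes a hypothetical at-risk population made of three strata in which the predicted risk equals the true risk (strong calibration); all numbers are illustrative.

```python
import math

# Hypothetical strata: (share of the at-risk population, predicted = true risk).
strata = [(0.5, 0.10), (0.3, 0.40), (0.2, 0.80)]

# Joint distribution of (pi, D) as a list of (probability, pi, d).
joint = []
for w, r in strata:
    joint.append((w * r, r, 1))
    joint.append((w * (1 - r), r, 0))


def mean(f):
    """Expectation of f(pi, d) under the joint distribution."""
    return sum(p * f(pi, d) for p, pi, d in joint)


m_pi = mean(lambda pi, d: pi)
m_d = mean(lambda pi, d: d)
var_pi = mean(lambda pi, d: (pi - m_pi) ** 2)
var_d = mean(lambda pi, d: (d - m_d) ** 2)
cov = mean(lambda pi, d: (pi - m_pi) * (d - m_d))

r2 = 1 - mean(lambda pi, d: (d - pi) ** 2) / var_d       # 1 - PE / PE_0
explained_variation = var_pi / var_d
squared_correlation = cov ** 2 / (var_pi * var_d)
mean_risk_difference = (mean(lambda pi, d: pi * d) / m_d
                        - mean(lambda pi, d: pi * (1 - d)) / (1 - m_d))

assert math.isclose(r2, explained_variation)
assert math.isclose(r2, squared_correlation)
assert math.isclose(r2, mean_risk_difference)
```

If the predicted risks are changed so that they no longer match the true risks, the equalities generally break, which is why the slide restricts these interpretations to calibrated predictions.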
Observations & IPCW PE estimator
• Observations (i.i.d.):
$$\{(\tilde{T}_i, \Delta_i, \pi_i(\cdot,\cdot)),\ i = 1, \dots, n\}, \quad \text{where } \tilde{T}_i = T_i \wedge C_i,\ \Delta_i = \mathbb{1}\{T_i \le C_i\}$$
• Indicator of "observed event occurrence" in (s, s + t]:
$$\tilde{D}_i(s,t) = \mathbb{1}\{s < \tilde{T}_i \le s + t, \Delta_i = 1\} = \begin{cases} 1 & \text{event occurred} \\ 0 & \text{event did not occur, or censored observation} \end{cases}$$
• Inverse Probability of Censoring Weighting (IPCW) estimator:
$$\widehat{PE}_{\pi}(s,t) = \frac{1}{n}\sum_{i=1}^{n} \widehat{W}_i(s,t)\{\tilde{D}_i(s,t) - \pi_i(s,t)\}^2$$
and
$$\widehat{R}^2_{\pi}(s,t) = 1 - \frac{\widehat{PE}_{\pi}(s,t)}{\widehat{PE}_0(s,t)}$$
Slide 16/29
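Inverse Probability of Censoring Weights
$$\widehat{W}_i(s,t) = \frac{\mathbb{1}\{s < \tilde{T}_i \le s + t\}\Delta_i}{\widehat{G}(\tilde{T}_i \mid s)} + \frac{\mathbb{1}\{\tilde{T}_i > s + t\}}{\widehat{G}(s + t \mid s)} + 0$$
with $\widehat{G}(u \mid s)$ the Kaplan-Meier estimator of $P(C > u \mid C > s)$. The final 0 is the weight given to subjects censored within (s, s + t].
[Figure: subject timelines along a time axis with the landmark time s and the time s + t marked; symbols distinguish uncensored from censored subjects.]
Slide 17/29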
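The weights and the resulting estimators can be sketched in plain Python. This is an illustration under stated assumptions rather than the authors' implementation: there are no tied observed times, the average is taken over the subjects still at risk at s, the benchmark uses an IPCW estimate of the marginal risk, and all function names and data are hypothetical.

```python
from typing import List, Sequence


def censoring_survival(times: Sequence[float], events: Sequence[bool],
                       s: float, u: float) -> float:
    """Kaplan-Meier estimate of G(u|s) = P(C > u | C > s).

    Uses only subjects still followed at s and treats censoring as the
    event of interest. Assumes no tied observed times.
    """
    at_risk = [(x, e) for x, e in zip(times, events) if x > s]
    g = 1.0
    for x, e in sorted(at_risk):
        if x > u:
            break
        if not e:  # a censoring time in (s, u]
            n_at_risk = sum(1 for y, _ in at_risk if y >= x)
            g *= 1.0 - 1.0 / n_at_risk
    return g


def ipcw_weights(times, events, s: float, t: float) -> List[float]:
    """Weight per subject: event in window, event-free at s + t, or 0."""
    g_end = censoring_survival(times, events, s, s + t)
    weights = []
    for x, e in zip(times, events):
        if x > s + t:
            weights.append(1.0 / g_end)
        elif x > s and e:
            weights.append(1.0 / censoring_survival(times, events, s, x))
        else:  # censored in (s, s + t], or not at risk at s
            weights.append(0.0)
    return weights


def ipcw_prediction_error(times, events, preds, s: float, t: float) -> float:
    weights = ipcw_weights(times, events, s, t)
    n_at_risk = sum(1 for x in times if x > s)
    total = 0.0
    for x, e, p, w in zip(times, events, preds, weights):
        d = 1.0 if (s < x <= s + t and e) else 0.0
        total += w * (d - p) ** 2
    return total / n_at_risk


def ipcw_r_squared(times, events, preds, s: float, t: float) -> float:
    weights = ipcw_weights(times, events, s, t)
    n_at_risk = sum(1 for x in times if x > s)
    # IPCW estimate of the marginal risk, used as the null prediction.
    marginal_risk = sum(w for x, e, w in zip(times, events, weights)
                        if s < x <= s + t and e) / n_at_risk
    pe_null = ipcw_prediction_error(times, events,
                                    [marginal_risk] * len(times), s, t)
    return 1.0 - ipcw_prediction_error(times, events, preds, s, t) / pe_null


# Toy data: observed times (years), event flags, predicted risks at s = 4.
times = [5.0, 6.0, 7.0, 8.0, 10.0, 11.0]
events = [True, False, True, False, True, False]
preds = [0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
weights = ipcw_weights(times, events, s=4.0, t=5.0)
pe_hat = ipcw_prediction_error(times, events, preds, s=4.0, t=5.0)
r2_hat = ipcw_r_squared(times, events, preds, s=4.0, t=5.0)
```

In this toy example the two subjects censored inside the window get weight 0, and their mass is redistributed to subjects with a known status at later times. The sketch does not guard against an estimated censoring survival of zero, which would need handling in real data.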
Asymptotic i.i.d. representation
Lemma: Assume that the censoring time C is independent of $(T, \eta, \pi(\cdot,\cdot))$ and let $\theta$ denote either $PE_{\pi}$, $R^2_{\pi}$, or a difference in PE or $R^2_{\pi}$. Then
$$\sqrt{n}\big(\widehat{\theta}(s,t) - \theta(s,t)\big) = \frac{1}{\sqrt{n}}\sum_{i=1}^{n} IF_{\theta}(\tilde{T}_i, \Delta_i, \pi_i(s,t), s, t) + o_p(1)$$
where the terms $IF_{\theta}(\tilde{T}_i, \Delta_i, \pi_i(s,t), s, t)$ are:
• zero-mean and i.i.d.
• easy to estimate (using Nelson-Aalen & Kaplan-Meier)
Slide 18/29
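Pointwise confidence interval (fixed s)
• Asymptotic normality:
$$\sqrt{n}\big(\widehat{\theta}(s,t) - \theta(s,t)\big) \xrightarrow{D} \mathcal{N}(0, \sigma^2_{s,t})$$
• 95% confidence interval:
$$\Big\{\widehat{\theta}(s,t) \pm z_{1-\alpha/2}\frac{\widehat{\sigma}_{s,t}}{\sqrt{n}}\Big\}$$
where $z_{1-\alpha/2}$ is the $1 - \alpha/2$ quantile of $\mathcal{N}(0, 1)$, with α = 0.05 for a 95% interval.
• Variance estimator:
$$\widehat{\sigma}^2_{s,t} = \frac{1}{n}\sum_{i=1}^{n}\big\{\widehat{IF}_{\theta}(\tilde{T}_i, \Delta_i, \pi_i(s,t), s, t)\big\}^2$$
Slide 19/29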
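Once estimated influence-function values are available, the interval is a few lines of arithmetic. The sketch below takes those values as given (how to estimate them is not shown here); the function name `pointwise_ci` and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist
from typing import Sequence, Tuple


def pointwise_ci(theta_hat: float, if_values: Sequence[float],
                 alpha: float = 0.05) -> Tuple[float, float]:
    """Wald-type interval theta_hat +/- z * sigma_hat / sqrt(n).

    if_values holds one estimated influence-function value per subject
    for a fixed landmark time s and horizon t.
    """
    n = len(if_values)
    sigma_hat = math.sqrt(sum(v * v for v in if_values) / n)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    half_width = z * sigma_hat / math.sqrt(n)
    return theta_hat - half_width, theta_hat + half_width


# Made-up estimate and influence-function values for four subjects.
lower, upper = pointwise_ci(0.30, [0.5, -0.5, 0.5, -0.5])
```

Here the variance estimator equals 0.25, so the half-width is about 1.96 × 0.5 / 2 = 0.49; real applications would of course involve far more subjects.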
Simultaneous confidence band over s ∈ S
$$\Big\{\widehat{\theta}(s,t) \pm \widehat{q}^{(S,t)}_{1-\alpha}\frac{\widehat{\sigma}_{s,t}}{\sqrt{n}}\Big\}, \quad s \in S$$
Computation of $\widehat{q}^{(S,t)}_{1-\alpha}$ by the simulation algorithm (≈ Wild Bootstrap):
1. For b = 1, ..., B, say B = 4000, do:
   1. Generate $\{\omega^b_1, \dots, \omega^b_n\}$ from n i.i.d. $\mathcal{N}(0, 1)$.
   2. Using the plug-in estimator $\widehat{IF}_{\theta}(\cdot)$, compute:
   $$\Upsilon^b = \sup_{s \in S}\left|\frac{1}{\sqrt{n}}\sum_{i=1}^{n}\omega^b_i\,\frac{\widehat{IF}_{\theta}(\tilde{T}_i, \Delta_i, \pi_i(s,t), s, t)}{\widehat{\sigma}_{s,t}}\right|$$
2. Compute $\widehat{q}^{(S,t)}_{1-\alpha}$ as the 100(1 − α)th percentile of $\{\Upsilon^1, \dots, \Upsilon^B\}$.
(Conditional multiplier central limit theorem)
Slide 20/29
un iver s i ty of copenhagen department of b io stat i s t i c s
DIVAT sample
• Population-based study of kidney recipients (n=4,119)
• Split the data into training (2/3) and validation (1/3) samples
• T : time from 1 year after transplantation to graft failure, defined as death OR return to dialysis
• Censoring due to:
  • delayed entries: 2000-2013
  • end of follow-up: 2014
• Baseline covariates: age, sex, cardiovascular history
• Longitudinal biomarker (yearly): serum creatinine
Descriptive statistics & censoring issue
• s ∈ S = {0, 0.5, . . . , 5}
• t = 5 years
[Figure: number of subjects by landmark time s (years), for s = 0, 1, ..., 5, split into three groups: censored in (s, s+5], known as event-free at s+5, and observed failure in (s, s+5].]
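The three groups shown for each landmark time can be derived from the observed follow-up time and event indicator. The sketch below is a minimal illustration, assuming each subject is given as a hypothetical `(time, event)` pair with time measured from the time origin of T; the function names and labels are made up for this example.

```python
from collections import Counter

FAILURE = "observed failure in (s, s+t]"
CENSORED = "censored in (s, s+t]"
EVENT_FREE = "known as event-free at s+t"


def landmark_status(time, event, s, horizon=5.0):
    """Classify one subject at landmark time s; None if no longer at risk at s."""
    if time <= s:
        return None  # failed or censored before (or at) the landmark time
    if time > s + horizon:
        return EVENT_FREE  # followed beyond s + horizon without failure
    return FAILURE if event else CENSORED


def landmark_counts(subjects, s, horizon=5.0):
    """Count subjects per group among those still at risk at landmark time s."""
    counts = Counter()
    for time, event in subjects:
        status = landmark_status(time, event, s, horizon)
        if status is not None:
            counts[status] += 1
    return counts
```

Calling `landmark_counts` for each s in S gives the bar heights of a plot like the one described above. Only the censored group is problematic for evaluating predictions, since its 5-year outcome is unknown.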
Joint model
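▸ Longitudinal
\[
\log[Y_i(t_{ij})] = (\beta_0 + b_{0i}) + \beta_{0,age} AGE_i + \beta_{0,sex} SEX_i + (\beta_1 + b_{1i} + \beta_{1,age} AGE_i) \times t_{ij} + \varepsilon_{ij} = m_i(t) + \varepsilon_{ij}
\]
▸ Survival (hazard)
\[
h_i(t) = h_0(t) \exp\left\{ \gamma_{age} AGE_i + \gamma_{CV} CV_i + \alpha_1 m_i(t) + \alpha_2 \frac{d m_i(t)}{dt} \right\}
\]
(fitted using package JM)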
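To make the two sub-models concrete, the following sketch evaluates the subject-specific trajectory m_i(t), its slope, and the multiplicative part of the hazard for given parameter values. It is not the JM package interface: the function names, the dictionary keys, and any parameter values passed in are hypothetical and serve only to mirror the symbols in the formulas.

```python
import math


def trajectory(t, age, sex, beta, b):
    """Return (m_i(t), dm_i(t)/dt) for the linear mixed model on log-creatinine.

    beta: fixed effects with hypothetical keys b0, b0_age, b0_sex, b1, b1_age
    b:    pair (b0i, b1i) of subject-specific random effects
    """
    intercept = beta["b0"] + b[0] + beta["b0_age"] * age + beta["b0_sex"] * sex
    slope = beta["b1"] + b[1] + beta["b1_age"] * age
    return intercept + slope * t, slope


def relative_hazard(t, age, sex, cv, beta, b, gamma, alpha):
    """exp{...} factor multiplying the baseline hazard h_0(t)."""
    m, dm = trajectory(t, age, sex, beta, b)
    return math.exp(
        gamma["age"] * age + gamma["cv"] * cv + alpha[0] * m + alpha[1] * dm
    )
```

Because m_i(t) is linear in t, its derivative is constant over time for a given subject, so in this parameterization the α2 term shifts the log-hazard by a subject-specific constant, while the α1 term changes with t.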
$R^2_\pi(s, t)$ vs s (t = 5 years)
[Figure: $R^2_\pi(s, t)$ (in %) against landmark time s (years), with 95% pointwise confidence intervals and a 95% simultaneous confidence band.]
Comparing $R^2_\pi(s, t)$ vs s for different π(s, t)
[Figure: $R^2_\pi(s, t)$ (in %) against landmark time s (years) for three predictions:
  • T ~ Age + CV + m(t) + m'(t) (JM)
  • T ~ Age + CV + Y(t=0)
  • T ~ Age + CV]
[Figure: difference in $R^2_\pi(s, t)$ (in %) against landmark time s.]
Calibration plot (example for s = 3 years)
[Figure: predicted vs observed 5-year risk (%) by risk group (quantile groups, in %). Values as printed on the plot:]

Quantile group (%)   Predicted risk range (%)   Predicted   Observed
(0,10)               (0;3.5]                    2%          8(4)%
(10,20)              (3.5;5.3]                  4%          15(6)%
(20,30)              (5.3;8]                    7%          15(5)%
(30,40)              (8;10.4]                   9%          13(5)%
(40,50)              (10.4;13.1]                12%         16(6)%
(50,60)              (13.1;16.5]                15%         21(7)%
(60,70)              (16.5;22.3]                19%         26(6)%
(70,80)              (22.3;33]                  27%         29(7)%
(80,90)              (33;55.1]                  42%         36(9)%
(90,100)             (55.1;100]                 77%         74(8)%
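A calibration table of this kind can be sketched as follows. The slides do not state which estimator gives the observed risk; the code below assumes a Kaplan-Meier estimate within each quantile group, which is one common way to handle censoring and is an assumption of this example. Inputs are hypothetical: `pred` holds predicted risks for subjects still at risk at the landmark time, and `times` are follow-up times measured from the landmark time.

```python
def km_risk(times, events, horizon):
    """Observed risk by `horizon`: 1 minus the Kaplan-Meier survival estimate."""
    data = sorted(zip(times, events))
    surv, at_risk, i = 1.0, len(data), 0
    while i < len(data) and data[i][0] <= horizon:
        t = data[i][0]
        tied = [e for tt, e in data if tt == t]  # everyone leaving the risk set at t
        failures = sum(1 for e in tied if e)
        if failures:
            surv *= 1.0 - failures / at_risk
        at_risk -= len(tied)
        i += len(tied)
    return 1.0 - surv


def calibration_table(pred, times, events, horizon, n_groups=10):
    """Rows of (mean predicted risk, observed risk) per quantile group of pred.

    Assumes at least n_groups subjects, so that no group is empty.
    """
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    rows = []
    for g in range(n_groups):
        idx = order[g * len(order) // n_groups:(g + 1) * len(order) // n_groups]
        mean_pred = sum(pred[i] for i in idx) / len(idx)
        observed = km_risk([times[i] for i in idx],
                           [events[i] for i in idx], horizon)
        rows.append((mean_pred, observed))
    return rows
```

Each row corresponds to one pair of bars in the plot: the closer the two values within a group, the better the (weak) calibration at that landmark time.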
Area under the ROC(s, t) curve vs s
[Figure: $AUC_\pi(s, t)$ (in %) against landmark time s (years) for three predictions:
  • T ~ Age + CV + m(t) + m'(t) (JM)
  • T ~ Age + CV + Y(t=0)
  • T ~ Age + CV]
Summing up
▸ $R^2$-type curve
  • summarizes calibration and discrimination simultaneously
  • has an understandable trend
▸ Simple model-free inference
  • predictions can be obtained from any model
  • we do not assume any model to hold
  • allows fair comparisons of different predictions
▸ The method accounts for:
  • censoring
  • the dynamic setting (the at-risk population changes)
Discussion
▸ The strong calibration assumption allows different interesting interpretations:
  • Explained variation
  • Correlation
  • Mean risk difference
▸ Unfortunately
  • the strong calibration cannot be checked (curse of dimensionality)
▸ However
  • weak and strong definitions are closely related:
    • strong calibration implies weak calibration
    • weak calibration can "often" be seen as a reasonable approximation of strong calibration in practice
  • weak calibration can be assessed (plots)
Thank you for your attention!