On a dynamic approach to the analysis of multivariate failure time data

On a dynamic approach to the analysis of multivariate failure time data Odd Aalen Section of Medical Statistics, University of Oslo, Norway


Page 1: On a dynamic approach to the analysis of multivariate failure time data

On a dynamic approach to the analysis of multivariate failure time data

Odd Aalen, Section of Medical Statistics, University of Oslo, Norway

Page 2: On a dynamic approach to the analysis of multivariate failure time data


Coworkers

Ørnulf Borgan: Institute of Mathematics, University of Oslo

Johan Fosen: Section of Medical Statistics, University of Oslo

Harald Fekjaer: The Norwegian Cancer Registry

Page 3: On a dynamic approach to the analysis of multivariate failure time data

What are multivariate survival data?

1. Repeated events over time.
2. Possibly, each individual may have several "units on test".
3. A complication (as always) is censoring.
4. The long-term ambition is to tackle complex event histories with many different types of events.

Page 4: On a dynamic approach to the analysis of multivariate failure time data

Example: Small bowel motility

Study the cyclic pattern of motility (spontaneous movements) of the small bowel in humans. The focus is on MMC complexes, which occur at irregular intervals (lasting from minutes to several hours).

Motility is very important from a clinical point of view.

The data were studied by frailty models in Aalen & Husebye (1991). Dynamic models will be applied here instead.

Page 5: On a dynamic approach to the analysis of multivariate failure time data


Small bowel motility data (Husebye)

Page 6: On a dynamic approach to the analysis of multivariate failure time data

Other examples

Duration of amalgam fillings: each patient contributes a number of fillings.
Repeated tumors in animal experiments.
Continuous registration of sleep, moving into and out of various sleep states.

Page 7: On a dynamic approach to the analysis of multivariate failure time data

How are such data analysed?

By marginal models. Dependence in the data is generally ignored, except for estimating standard errors.
By frailty models. One assumes that each individual has a separate risk of the event occurring. Frailty models are random effect models. Censoring is handled well by frailty models. Excellent state-of-the-art book: Hougaard, 2000.

We will propose an alternative method based on regression on dynamic covariates.

Page 8: On a dynamic approach to the analysis of multivariate failure time data

Cox 1972: Two major papers

The famous one: Cox models for ordinary survival data. JRSSB, 1972, 34, 187-220.

The ignored one: Cox models for point processes, introducing dynamic covariates, e.g. "time since last event". In: P.A.W. Lewis, "Stochastic point processes: Statistical analysis, theory and applications", Wiley, 1972, 55-66.

It is time to take up the challenge of Cox's second paper.

Page 9: On a dynamic approach to the analysis of multivariate failure time data

Counting process framework

What is a counting process? Observing events occurring over time. Examples of events:
  waking up during the night
  an amalgam filling failing
  detecting a tumor

Counting the number of events as they come along yields a counting process.

The counting process is denoted N(t), where t is time. The process is constant between events and jumps one unit at each event.
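A minimal sketch of the definition of N(t), assuming the event times are available as a plain list. The function name `counting_process` and the example times are hypothetical and only serve as illustration.

```python
import numpy as np

def counting_process(event_times, t):
    """Return N(t): the number of events observed at or before time t."""
    event_times = np.sort(np.asarray(event_times, dtype=float))
    # side="right" counts events with T_j <= t, so N jumps by one at each event
    return np.searchsorted(event_times, t, side="right")

# Hypothetical event times (e.g. hours at which a person woke up during the night)
wake_times = [1.5, 3.0, 3.7, 6.2]
grid = np.array([0.0, 1.5, 2.0, 3.7, 7.0])
n_values = counting_process(wake_times, grid)
print(n_values)
```

Between two event times the returned value does not change, which is the "constant between events" property stated on the slide.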

Page 10: On a dynamic approach to the analysis of multivariate failure time data

Illustration of a counting process

[Figure: step function of a counting process; vertical axis: number of events (1 to 5); horizontal axis: time.]

Page 11: On a dynamic approach to the analysis of multivariate failure time data

The intensity process of a counting process

Definition of the intensity process:

\[ \lambda(t)\,dt = P\bigl(N(t+dt) - N(t) = 1 \mid \text{past}\bigr) \]

Extremely fruitful reformulation of the definition: the fact that a counting process N(t) has an intensity process λ(t) can be made precise by the following mathematical statement:

\[ M(t) = N(t) - \int_0^t \lambda(s)\,ds \]

is a zero mean martingale.

Page 12: On a dynamic approach to the analysis of multivariate failure time data

Martingale for simulated Poisson process with rate 1

[Figure: sample path of the martingale; horizontal axis: time (0 to 10); vertical axis from -2.0 to 1.5.]
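A simulation of this kind can be sketched as follows. For a Poisson process with rate 1 the intensity is constant, so the martingale is N(t) minus t. The seed, the number of paths and the helper names are assumptions made for illustration, not the settings behind the slide.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

def simulate_event_times(rate, t_end, rng):
    """Event times of a homogeneous Poisson process on [0, t_end]."""
    times = []
    t = rng.exponential(1.0 / rate)
    while t <= t_end:
        times.append(t)
        t += rng.exponential(1.0 / rate)
    return np.array(times)

def martingale_path(event_times, grid, rate):
    """M(t) = N(t) - rate * t evaluated on a time grid."""
    n_t = np.searchsorted(event_times, grid, side="right")
    return n_t - rate * grid

grid = np.linspace(0.0, 10.0, 101)
paths = np.array([
    martingale_path(simulate_event_times(1.0, 10.0, rng), grid, 1.0)
    for _ in range(2000)
])
mean_at_end = paths[:, -1].mean()  # should be close to 0 (zero mean martingale)
var_at_end = paths[:, -1].var()    # should be close to rate * t = 10
```

A single row of `paths` is one trajectory of the type shown in the figure; averaging over many rows illustrates the zero mean property.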

Page 13: On a dynamic approach to the analysis of multivariate failure time data

Stochastic integrals

If M(t) is a martingale, then the following is also a martingale:

\[ I(t) = \int_0^t H(s)\,dM(s) \]

where H(t) is (essentially) any stochastic process dependent on the past, and with left-continuous sample functions.

Useful properties of stochastic integrals include explicit formulas for variances and central limit theorems.
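As a numerical illustration, the sketch below takes a Poisson process with rate 1 and chooses H(s) to be the number of events strictly before s, which depends only on the past and is left-continuous. This choice of H, the time horizon and the seed are assumptions made for the example.

```python
import numpy as np

def stochastic_integral(event_times, t_end, rate=1.0):
    """I(t_end) for H(s) = N(s-), the number of events strictly before s."""
    event_times = np.sort(np.asarray(event_times, dtype=float))
    n = len(event_times)
    # Integral of H dN: at the j-th jump, H equals the j - 1 earlier events
    jump_part = n * (n - 1) / 2.0
    # Integral of H(s) * rate ds: each event contributes from its own time until t_end
    compensator_part = rate * np.sum(t_end - event_times)
    return jump_part - compensator_part

rng = np.random.default_rng(0)  # arbitrary seed
t_end = 5.0
values = []
for _ in range(20000):
    n_events = rng.poisson(1.0 * t_end)
    # Given the count, event times of a homogeneous Poisson process are i.i.d. uniform
    values.append(stochastic_integral(rng.uniform(0.0, t_end, size=n_events), t_end))
values = np.array(values)
print(values.mean())  # close to 0 if I(t) has zero mean
print(values.var())   # for this H and rate 1, should be near t^2/2 + t^3/3
```

The variance comparison uses the fact that the variance of the integral equals the expected integral of H squared times the intensity, which for this particular H works out to t^2/2 + t^3/3.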

Page 14: On a dynamic approach to the analysis of multivariate failure time data

Following up on Cox 1972b: Dynamic models

Dynamic models incorporate past observations in the analysis.

Example: Frailty induces dependence over time.
  e.g. several previous events increase the likelihood of a new event
  hence frailty models can be analyzed as dynamic models
  this connection can be made mathematically precise

In addition, there are real dynamic effects (which may be difficult to distinguish from frailty effects).
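One standard instance of this connection, given here as an illustration under stated assumptions rather than as the result the slide has in mind: suppose that, given a frailty Z, the process is Poisson with rate Z, and that Z is gamma distributed with shape α and rate ν (both symbols are introduced only for this example). Observing N(t-) events up to time t updates the distribution of Z to a gamma distribution with shape α + N(t-) and rate ν + t, so the intensity given the observed past is

```latex
\[
\lambda(t) \;=\; E\bigl[Z \mid \text{past}\bigr] \;=\; \frac{\alpha + N(t-)}{\nu + t}.
\]
```

The frailty model is thus equivalent to a dynamic model in which the number of previous events enters the intensity linearly, with an effect that decreases over time.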

Page 15: On a dynamic approach to the analysis of multivariate failure time data

Regression on dynamic covariates

Dynamic covariates may be defined as:
  number of previous events
  time since last occurrence
  or in fact any function of the past (due to martingale theory)

Dynamic covariates are continuously updated.

Dynamic regression may be more flexible than a frailty approach, since effects may change over time.
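The two covariates named above can be computed from the event times alone. The sketch below is a hypothetical helper, assuming that time is measured from the start of observation and that, before the first event, "time since last occurrence" is taken to be the time since start.

```python
import numpy as np

def dynamic_covariates(event_times, t):
    """Covariates at time t built only from events strictly before t (the past)."""
    event_times = np.sort(np.asarray(event_times, dtype=float))
    past = event_times[event_times < t]  # strict inequality: use the past only
    n_previous = len(past)
    # Before the first event, fall back to time since start (an assumption)
    time_since_last = t - past[-1] if n_previous > 0 else t
    return n_previous, time_since_last

# Hypothetical event times for one individual
print(dynamic_covariates([1.0, 2.5, 4.0], 3.0))
```

Using only events strictly before t is what makes the covariate a function of the past, so that an event at time t does not influence its own covariate value.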

Page 16: On a dynamic approach to the analysis of multivariate failure time data

Types of regression

The data are a number of counting processes, with many possible events in each, observed in parallel.

Can use:
  Cox regression
  additive regression

Will focus on additive regression. This is a local approach, as opposed to Cox regression.

Basic idea: Whenever an event occurs, a linear model is estimated, with the dependent variable being a vector of 0's, except for a 1 in the process where the event happened.

Individually, these estimates are not informative, but summing them up over time yields something sensible.

Page 17: On a dynamic approach to the analysis of multivariate failure time data

Additive intensity regression

The intensity process for individual i in the risk set is defined as a linear function of the covariates:

\[ \lambda_i(t) = K_i(t)\Bigl\{\beta_0(t) + \sum_{k=1}^{p} \beta_k(t)\,Z_{i,k}(t)\Bigr\} \]

where K_i(t) is the number of units at risk (e.g. amalgam fillings) for the individual.

The regression functions (the β's) are arbitrary functions, while the covariates (the Z's) are arbitrary predictable processes (e.g. adapted with left-continuous sample paths).

Page 18: On a dynamic approach to the analysis of multivariate failure time data

Why additivity (linearity)?

Seems unnatural since the intensity should be positive. However:
  Additivity yields complete flexibility as to how the effects of covariates change over time.

Also:
  Additivity yields exact martingales in several settings, which is technically convenient. The Cox model never yields exact martingales.
  In practice, effects are not always (in fact, usually not?) proportional.
  The additive model can be connected with other linear models, e.g. for the covariates, and integrated into path analysis.

Page 19: On a dynamic approach to the analysis of multivariate failure time data

Additive model: Local least squares estimation

It is easiest to estimate the cumulative regression functions:

\[ B_k(t) = \int_0^t \beta_k(s)\,ds \]

The slope of these gives information about the regression functions. The estimate is defined as a stochastic integral of the counting process, for a suitable design matrix Y(t):

\[ \hat{B}(t) = \int_0^t Y(s)^{-}\,dN(s) \]

where Y(t)^- is a generalized inverse of Y(t).
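Because the counting processes only change at event times, the stochastic integral reduces to a sum over jump times of a generalized inverse of the design matrix applied to the vector of jumps. The sketch below assumes the Moore-Penrose inverse as the generalized inverse (which gives ordinary least squares), one unit at risk per individual, and a small made-up data set; the function and variable names are hypothetical.

```python
import numpy as np

def aalen_cumulative_regression(event_times, event_ids, design_at):
    """Estimated cumulative regression functions, evaluated at each event time.

    event_times : sorted 1-D array of jump times (all processes pooled)
    event_ids   : index of the process that jumped at each time
    design_at   : function t -> (n, p + 1) design matrix Y(t); first column is the
                  at-risk indicator, remaining columns are covariates just before t
    """
    increments = []
    for t, i in zip(event_times, event_ids):
        Y = design_at(t)
        dN = np.zeros(Y.shape[0])
        dN[i] = 1.0  # vector of 0's with a 1 in the process where the event happened
        increments.append(np.linalg.pinv(Y) @ dN)  # local least squares estimate
    return np.cumsum(increments, axis=0)

# Made-up example: four processes, always at risk, one fixed binary covariate
Z = np.array([0.0, 0.0, 1.0, 1.0])

def design_at(t):
    return np.column_stack([np.ones_like(Z), Z])

B = aalen_cumulative_regression(np.array([0.5, 1.0, 1.2]), np.array([2, 3, 0]), design_at)
print(B)  # column 0: cumulative baseline, column 1: cumulative covariate effect
```

With a single binary covariate the first column is the Nelson-Aalen estimate for the group with covariate 0, and the second column is the difference between the two groups' Nelson-Aalen estimates.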

Page 20: On a dynamic approach to the analysis of multivariate failure time data

Residual processes

Martingale residual processes are defined as follows:

\[ M_{\mathrm{res}}(t) = \int_0^t \bigl(I - Y(s)\,Y(s)^{-}\bigr)\,J(s)\,dN(s) \]

These are exact martingales.

For judging the influence of outliers, one may look at the sum of the hat matrices over the jump times T_i:

\[ \sum_i Y(T_i)\,Y(T_i)^{-} \]
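At a single jump time, the residual increment and the hat matrix can be sketched as below. The Moore-Penrose inverse is assumed as the generalized inverse, the factor J(s) is left out on the assumption that the design matrix has full column rank, and the design matrix is a made-up example.

```python
import numpy as np

def residual_increment(Y, dN):
    """Residual increment (I - H) dN at one jump time, with hat matrix H = Y Y^-."""
    H = Y @ np.linalg.pinv(Y)  # Moore-Penrose inverse as generalized inverse (assumption)
    return (np.eye(Y.shape[0]) - H) @ dN, H

# Made-up design matrix: intercept plus one binary covariate, four processes
Y = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
dN = np.array([0.0, 0.0, 1.0, 0.0])  # the third process has an event
res, H = residual_increment(Y, dN)
print(res)         # residual increment for each process
print(np.diag(H))  # diagonal of the hat matrix at this jump time
```

Summing `res` over jump times gives the residual processes, and summing the hat matrices (for example their diagonals) over jump times gives a per-process influence measure of the kind described above.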

Page 21: On a dynamic approach to the analysis of multivariate failure time data

Theory for additive model

There exists much theory, including:
  asymptotics, testing, residuals, density type estimation of regression functions, ridge regression, optimal procedures
  estimating covariate effects on transition probabilities in Markov chains

Most theory is based on stochastic integrals and martingales.

See e.g. the book by Andersen, Borgan, Gill and Keiding (1993), and Aalen et al., Biometrics, 2001.

Page 22: On a dynamic approach to the analysis of multivariate failure time data

Dynamic analysis of frailty: Simulation

Simulating 40 independent Poisson processes.
The rate in each process is simulated from an exponential distribution with expectation 1. The rate serves as a frailty variable.
For each counting process we define a dynamic covariate to be the number of previous events in the process.
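The simulation set-up can be sketched as follows. The time horizon of 4, the seed and the function names are assumptions made for illustration; the slide does not state them.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

def simulate_frailty_processes(n_processes, t_end, rng):
    """Poisson processes whose rates are drawn from an exponential distribution with mean 1."""
    rates = rng.exponential(1.0, size=n_processes)  # frailty variable of each process
    counts = rng.poisson(rates * t_end)
    # Given the count, event times of a homogeneous Poisson process are i.i.d. uniform
    times = [np.sort(rng.uniform(0.0, t_end, size=c)) for c in counts]
    return rates, times

def previous_events(event_times, t):
    """Dynamic covariate: number of events strictly before t."""
    return int(np.searchsorted(event_times, t, side="left"))

rates, times = simulate_frailty_processes(40, 4.0, rng)
covariate_at_2 = np.array([previous_events(ev, 2.0) for ev in times])
later_events = np.array([len(ev) for ev in times]) - covariate_at_2
print(np.corrcoef(covariate_at_2, later_events)[0, 1])
```

Although each process is Poisson given its rate, the number of previous events carries information about the unobserved rate, so it is positively associated with the number of later events. This is the dependence that the dynamic covariate is meant to pick up.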

Page 23: On a dynamic approach to the analysis of multivariate failure time data

Cumulative regression function for dynamic covariate

[Figure: cumulative regression function with 95% confidence limits. Covariate: sum of events divided by time. Horizontal axis: time (0 to 4); vertical axis: cumulative regression function (-50 to 350).]

Page 24: On a dynamic approach to the analysis of multivariate failure time data

Standardized residual processes

Cumulative residual processes are shown on the left and kernel estimated processes on the right.

Upper panels: No covariate

Lower panels: Dynamic covariate

[Figure: four panels of standardized residual processes plotted against time.]

Page 25: On a dynamic approach to the analysis of multivariate failure time data

Small bowel motility

Dynamic covariates:
  Number of previous events
  Time since last event (cut point at 50 minutes)

Cox regression can be applied. Hazard ratios with 95% confidence intervals:
  Number of previous events: 0.98 (0.76, 1.27)
  Time since last event: 4.66 (2.36, 9.19)

Illustrated by the additive model on the next slides, which show:
  Number of previous occurrences has no effect
  Time since last occurrence does have an effect

Page 26: On a dynamic approach to the analysis of multivariate failure time data

Dynamic covariate I: number of previous occurrences (upper and lower curves give pointwise 95% confidence intervals)

There is clearly no effect of the covariate

Page 27: On a dynamic approach to the analysis of multivariate failure time data

Dynamic covariate II: time since last occurrence (above or below 50 minutes)

Page 28: On a dynamic approach to the analysis of multivariate failure time data

Example: Repeated tumors

Gail, Santner and Brown (1980).
Carcinogen injected at day 0 in 76 female rats, which were then treated with retinyl acetate for 60 days.
The 48 animals which were still tumor free were randomised to continued retinoid prophylaxis, or to control.
The animals were followed until 182 days after the initial injection. Several tumors were observed in most animals, and the time of each tumor was recorded.

Page 29: On a dynamic approach to the analysis of multivariate failure time data

Data on repeated mammary tumors. Additive model with one fixed and three dynamic covariates.

[Figure: four panels of cumulative regression functions against time (days, 60 to 200): cumulative baseline intensity; treatment effect; number of previous occurrences; time since previous event.]

Page 30: On a dynamic approach to the analysis of multivariate failure time data

Residual plots I

Fitting all covariates (treatment and dynamic ones). The left panel shows standardized residuals. The right panel shows the mean and standard deviation of the standardized residuals.

[Figure: two panels plotted against time (days, 60 to 180); the right panel has curves labelled SD and MEAN.]

Page 31: On a dynamic approach to the analysis of multivariate failure time data

Residual plots II

Using only treatment as covariate. The left panel shows standardized residuals. The right panel shows the mean and standard deviation of the standardized residuals.

[Figure: two panels plotted against time (days, 60 to 180); one panel has curves labelled SD and MEAN.]

Page 32: On a dynamic approach to the analysis of multivariate failure time data

Influence plot

Cumulative sum of hat matrices.

The straight line marks the limit for influential processes.

[Figure: cumulative sum of hat matrices (vertical axis, 0 to 50) against sequence number (horizontal axis, 1 to 210).]

Page 33: On a dynamic approach to the analysis of multivariate failure time data

Conclusion for mammary tumors

Findings:
  Constant rate over time.
  Effect of treatment.
  Effect of number of previous occurrences.
  No effect of time since last occurrence.

Type of process:
  Markovian.
  Several Poisson processes with varying rate.

Page 34: On a dynamic approach to the analysis of multivariate failure time data

Example of more complex event history: Sleep data

Data collected at the Max Planck Institute in Munich concerning sleep patterns. Analysing the tendency to fall asleep, wake up, have REM periods etc. Clearly, there are many occurrences of events each night.

Example of counting process: number of times you wake up.

Dynamic covariates:
  measurement of cortisol (stress hormone)
  number of previous events divided by elapsed time
  log duration of ongoing sleep period

Page 35: On a dynamic approach to the analysis of multivariate failure time data

Kernel estimates of regression functions

[Figure: four panels of smoothed regression functions against hours since first time asleep (0 to 6): base hazard; log cortisol; number of previous events divided by elapsed time; log time since not being under risk.]
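Kernel estimates of a regression function are typically obtained by smoothing the increments of the estimated cumulative regression function. The sketch below assumes an Epanechnikov kernel and a user-chosen bandwidth; neither choice is stated on the slide, and the function name is hypothetical.

```python
import numpy as np

def kernel_smooth(times, increments, grid, bandwidth):
    """Smooth the jumps of a cumulative regression function into a rate estimate."""
    times = np.asarray(times, dtype=float)
    increments = np.asarray(increments, dtype=float)
    grid = np.asarray(grid, dtype=float)
    u = (grid[:, None] - times[None, :]) / bandwidth
    # Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero outside
    weights = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return weights @ increments / bandwidth

# Made-up jump times and increments of one cumulative regression function
jump_times = [0.5, 1.0, 1.2]
jump_sizes = [0.0, 0.5, -0.5]
grid = np.linspace(0.0, 2.0, 5)
print(kernel_smooth(jump_times, jump_sizes, grid, bandwidth=0.5))
```

A larger bandwidth gives a smoother but more biased curve. Near the ends of the observation period fewer jumps fall inside the kernel window, so estimates there should be read with care.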

Page 36: On a dynamic approach to the analysis of multivariate failure time data

Interpretation

Strong dynamic effects. Could be:
  Frailty
  Real causal effects

Only additional information will tell us which is which.

The approach is purely empirical, as opposed to the latent variable thinking of frailty models.

Page 37: On a dynamic approach to the analysis of multivariate failure time data

Warning

Dynamic covariates may "steal" from the effect of fixed covariates. One therefore has to be careful, using orthogonalization or path analysis type methods. This is presently being developed. The "locality" of the additive approach makes this easy to handle.

Page 38: On a dynamic approach to the analysis of multivariate failure time data

General event histories

Many events of many different types. Examples:
  individual histories of sick leave, part-time work and full-time work, with many transitions between different states
  individual histories of illness

There are no good tools for handling complex event histories. The present approach is one attempt that we will develop further.

Page 40: On a dynamic approach to the analysis of multivariate failure time data

References

Hougaard, P. (2000). Analysis of Multivariate Survival Data. Springer-Verlag, New York.
Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York.
Aalen, O.O., Borgan, Ø. and Fekjær, H. (2001). Covariate adjustment of event histories estimated from Markov chains: The additive approach. Biometrics, 57: 993-1001.