
On a dynamic approach to the analysis of multivariate failure time data

Odd Aalen

Section of Medical Statistics,

University of Oslo, Norway

2

Coworkers

Ørnulf Borgan: Institute of Mathematics, University of Oslo

Johan Fosen: Section of Medical Statistics, University of Oslo

Harald Fekjaer: The Norwegian Cancer Registry

3

What are multivariate survival data?

1. Repeated events over time.

2. Possibly, each individual may have several ”units on test”.

3. Complication (as always) is censoring.

4. The long-term ambition is to tackle complex event histories with many different types of events.

4

Example: Small bowel motility

Study of the cyclic pattern of motility (spontaneous movements) of the small bowel in humans. Focus on MMCs (migrating motor complexes), which occur at irregular intervals (lasting from minutes to several hours).

Motility is very important from a clinical point of view.

Data studied by frailty models in Aalen & Husebye (1991). Will here apply dynamic models instead.

5

Small bowel motility data (Husebye)

6

Other examples

Duration of amalgam fillings

    Each patient contributes a number of fillings

Repeated tumors in animal experiments

Continuous registration of sleep, moving into and out of various sleep states

7

How are such data analysed?

By marginal models. Dependence in the data is generally ignored, except for estimating standard errors.

By frailty models. One assumes that each individual has a separate risk of the event occurring. Frailty models are random effect models. Censoring is handled well by frailty models. Excellent state-of-the-art book: Hougaard, 2000.

Will propose an alternative method based on regression on dynamic covariates.

8

Cox 1972: Two major papers

The famous one: Cox models for ordinary survival data.

    JRSSB, 1972, 34, 187-220

The ignored one: Cox models for point processes, introducing dynamic covariates, e.g. ”time since last event”.

    In: P.A.W. Lewis, ”Stochastic point processes: Statistical analysis, theory and applications”, Wiley, 1972, 55-66

It is time to take up the challenge of Cox’s second paper.

9

Counting process framework

What is a counting process?

Observing events occurring over time. Examples of events:

    waking up during the night

    an amalgam filling failing

    detecting a tumor

Counting the number of events as they come along yields a counting process.

The counting process is denoted N(t), where t is time. The process is constant between events and jumps one unit at each event.
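The definition of N(t) can be made concrete with a minimal Python sketch. The function name `counting_process` and the waking-up times are hypothetical and only serve as illustration.

```python
import bisect

def counting_process(event_times):
    """Return a function N such that N(t) is the number of events at or before t."""
    times = sorted(event_times)

    def N(t):
        # bisect_right counts the event times <= t, so N jumps by one at each event
        return bisect.bisect_right(times, t)

    return N

# Hypothetical times (hours after falling asleep) at which a sleeper wakes up
N = counting_process([1.5, 3.2, 3.9, 6.0])
print([N(t) for t in (0.0, 1.5, 3.5, 7.0)])  # [0, 1, 2, 4]
```

Between two event times the returned value does not change, which is the "constant between events" property stated above.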

10

Illustration of a counting process

[Figure: sample path of a counting process; vertical axis: number of events (1 to 5), horizontal axis: Time]

11

The intensity process of a counting process

Definition of the intensity process:

$$\lambda(t)\,dt = P\big(N(t+dt) - N(t) = 1 \mid \text{past}\big)$$

Extremely fruitful reformulation of the definition: The fact that a counting process N(t) has an intensity process λ(t) can be made precise by the following mathematical statement:

$$M(t) = N(t) - \int_0^t \lambda(s)\,ds \quad \text{is a zero mean martingale}$$

12

Martingale for simulated Poisson process with rate 1

[Figure: simulated martingale path; horizontal axis: Time (0 to 10), vertical axis: -2.0 to 1.5]
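A path like the one on this slide can be simulated with a short sketch, assuming a homogeneous Poisson process with rate 1 on the interval from 0 to 10, so that the martingale is N(t) - t. The function name, the seed and the number of replications are choices made for illustration only.

```python
import numpy as np

def poisson_martingale(rate, horizon, grid, rng):
    """Simulate M(t) = N(t) - rate * t on a time grid for a homogeneous Poisson process."""
    # Given the number of events, their times are independent and uniform on [0, horizon]
    n_events = rng.poisson(rate * horizon)
    event_times = np.sort(rng.uniform(0.0, horizon, size=n_events))
    counts = np.searchsorted(event_times, grid, side="right")  # N(t) on the grid
    return counts - rate * grid

rng = np.random.default_rng(0)
grid = np.linspace(0.0, 10.0, 101)
paths = np.array([poisson_martingale(1.0, 10.0, grid, rng) for _ in range(2000)])

# The zero mean property says the average of M(10) over many paths should be near 0
mean_at_end = paths[:, -1].mean()
var_at_end = paths[:, -1].var()
print(f"mean of M(10): {mean_at_end:.3f}, variance of M(10): {var_at_end:.3f}")
```

A single row of `paths` plotted against `grid` gives one simulated martingale path. Averaging over rows illustrates the zero mean property, and the variance at time 10 should be close to 10 for rate 1.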

13

Stochastic integrals

If M(t) is a martingale, then the following is also a martingale:

$$I(t) = \int_0^t H(s)\,dM(s)$$

where H(t) is (essentially) any stochastic process dependent on the past, and with left-continuous sample functions.

Useful properties of stochastic integrals include explicit formulas for variances and central limit theorems.
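As one instance of such a variance formula, consider the case where M is the martingale of a counting process N with intensity process λ. This is a standard result, stated here under the assumption that the integrals are finite, and it is not taken from the slides.

```latex
% Assumption: M(t) = N(t) - \int_0^t \lambda(s)\,ds, with \lambda the intensity process of N,
% and I(t) = \int_0^t H(s)\,dM(s) with H predictable.
\langle I \rangle(t) = \int_0^t H(s)^2 \, \lambda(s)\, ds,
\qquad
\operatorname{Var}\, I(t) = E \int_0^t H(s)^2 \, \lambda(s)\, ds .
```

For H identically 1 and a Poisson process with rate 1, this gives Var M(t) = t.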

14

Following up on Cox 1972b: Dynamic models

Dynamic models incorporate past observations in the analysis.

Example: Frailty induces dependence over time.

    e.g. several previous events increase the likelihood of a new event

    hence frailty models can be analyzed as dynamic models

    this connection can be made mathematically precise

In addition, there are real dynamic effects. (Which may be difficult to distinguish from frailty effects.)

15

Regression on dynamic covariates

Dynamic covariates may be defined as:

    number of previous events

    time since last occurrence

    or in fact any function of the past (due to martingale theory)

Dynamic covariates are continuously updated.

Dynamic regression may be more flexible than a frailty approach, since effects may change over time.
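The two dynamic covariates named on this slide can be computed from the event times of one process, as in the following minimal sketch. The function name is hypothetical, and measuring time from 0 when there is no earlier event is an assumption. Only events strictly before t are used, so the covariates depend on the past alone.

```python
import bisect

def dynamic_covariates(event_times, t):
    """Return (number of previous events, time since last event) just before time t."""
    times = sorted(event_times)
    n_previous = bisect.bisect_left(times, t)  # events with time strictly less than t
    if n_previous == 0:
        # Assumption: with no earlier event, time is measured from the start at 0
        time_since_last = t
    else:
        time_since_last = t - times[n_previous - 1]
    return n_previous, time_since_last

# Hypothetical event times in minutes
events = [20.0, 95.0, 130.0]
print(dynamic_covariates(events, 95.0))   # (1, 75.0): the event at 95 itself is not counted
print(dynamic_covariates(events, 100.0))  # (2, 5.0)
```

Excluding the event at t itself mirrors the left-continuity required of the covariate processes.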

16

Types of regression

The data are a number of counting processes, with many possible events in each, and observed in parallel.

Can use:

    Cox regression

    additive regression

Will focus on additive regression.

This is a local approach, as opposed to Cox regression.

Basic idea: Whenever an event occurs, a linear model is estimated with the dependent variable being a vector of 0’s, except for a 1 in the process where the event happened.

Individually, these estimates are not informative, but summing them up over time yields something sensible.

17

Additive intensity regression

For each individual in the risk set, the intensity process for individual i is defined as a linear function of the covariates:

$$\lambda_i(t) = K_i(t)\Big\{\beta_0(t) + \sum_{k=1}^{p} \beta_k(t)\, Z_{i,k}(t)\Big\}$$

where K_i(t) is the number of units at risk (e.g. amalgam fillings) for the individual.

The regression functions (β’s) are arbitrary functions, while the covariates (Z’s) are arbitrary predictable processes (e.g. adapted with left-continuous sample paths).

18

Why additivity (linearity)?

Seems unnatural since the intensity should be positive. However:

Additivity yields complete flexibility as to how the effects of covariates change over time.

Additivity yields exact martingales in several settings, which is technically convenient. The Cox model never yields exact martingales.

In practice, effects are not always (in fact, usually not?) proportional.

The additive model can be connected with other linear models, e.g. for the covariates, and incorporated into path analysis.

19

Additive model: Local least square estimation

Easiest to estimate cumulative regression functions:

$$B_k(t) = \int_0^t \beta_k(s)\,ds$$

The slope of these gives information about the regression functions. The estimate is defined as a stochastic integral of the counting process, for a suitable design matrix Y(t):

$$\hat{B}(t) = \int_0^t Y(s)^{-}\,dN(s)$$

where Y(t)^- is a generalized inverse.
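The local least squares idea can be sketched in Python as follows. This is an illustration under stated assumptions, not the Addreg implementation: each process has one unit at risk, there are no tied event times, the Moore-Penrose pseudoinverse from NumPy is used as the generalized inverse, and the function and argument names are hypothetical.

```python
import numpy as np

def cumulative_regression(event_times, event_index, design_at):
    """Sum the least squares increments Y(t)^- dN(t) over the event times.

    event_times : increasing sequence of event times
    event_index : event_index[j] is the process that jumps at event_times[j]
    design_at   : function t -> design matrix Y(t) with one row per process
                  (intercept column first; a zero row for a process not at risk)
    """
    increments = []
    for t, i in zip(event_times, event_index):
        Y = design_at(t)
        dN = np.zeros(Y.shape[0])
        dN[i] = 1.0  # vector of 0's with a 1 for the process where the event happened
        increments.append(np.linalg.pinv(Y) @ dN)  # least squares generalized inverse
    return np.cumsum(increments, axis=0)  # row j holds the estimate just after event j

# Hypothetical data: four processes, one fixed binary covariate, three events
Y_fixed = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
B_hat = cumulative_regression([1.0, 2.0, 3.0], [2, 3, 0], lambda t: Y_fixed)
print(B_hat)
```

In this toy example the two events in the covariate group raise the second column by 0.5 each, and the later event in the baseline group raises the first column by 0.5 and lowers the second by 0.5, so the final row is close to (0.5, 0.5). A single increment carries little information, but the running sum estimates the cumulative regression functions.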

20

Residual processes

Martingale residual processes are defined as follows:

$$M_{\mathrm{res}}(t) = \int_0^t \big(I - Y(s)\,Y(s)^{-}\big)\,J(s)\,dN(s)$$

These are exact martingales.

For judging the influence of outliers, one may look at the sum of the hat matrices over jump times:

$$\sum_i Y(T_i)\,Y(T_i)^{-}$$
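Both quantities on this slide can be accumulated over the jump times, as in the sketch below. It assumes that the design matrix has full rank at every jump (so the factor J(s) equals 1 throughout), uses the Moore-Penrose pseudoinverse as the generalized inverse, and reports only the diagonal of the summed hat matrices as a per-process influence measure. All names are hypothetical.

```python
import numpy as np

def residuals_and_leverage(event_index, designs):
    """Accumulate (I - Y Y^-) dN and the diagonal of Y Y^- over the jump times.

    event_index : event_index[j] is the process that jumps at the j-th event
    designs     : designs[j] is the design matrix Y at the j-th event time
    """
    n = designs[0].shape[0]
    residual = np.zeros(n)
    leverage = np.zeros(n)
    for i, Y in zip(event_index, designs):
        hat = Y @ np.linalg.pinv(Y)  # hat matrix at this jump time
        dN = np.zeros(n)
        dN[i] = 1.0
        residual += dN - hat @ dN    # increment of the martingale residual process
        leverage += np.diag(hat)     # diagonal of the running sum of hat matrices
    return residual, leverage

# Hypothetical data: four processes, one fixed binary covariate, three events
Y_fixed = np.array([[1.0, 0.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0]])
residual, leverage = residuals_and_leverage([2, 3, 0], [Y_fixed] * 3)
print(residual, leverage)
```

With an intercept in the design the residuals sum to zero at every jump. A process whose accumulated leverage is much larger than that of the others would be a candidate for closer inspection.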

21

Theory for additive model

There exists much theory, including:

    asymptotics, testing, residuals, density type estimation of regression functions, ridge regression, optimal procedures

    estimating covariate effects on transition probabilities in Markov chains

Most theory is based on stochastic integrals and martingales.

See e.g. the book by Andersen, Borgan, Gill and Keiding (1993), and Aalen et al., Biometrics, 2001.

22

Dynamic analysis of frailty: Simulation

Simulating 40 independent Poisson processes.

The rate in each process is simulated from an exponential distribution with expectation 1. The rate serves as a frailty variable.

For each counting process we define a dynamic covariate to be the number of previous events in the process.
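A simulation of this kind can be sketched as follows. The observation period of 4 time units, the seed, and the split into an early and a late half are assumptions made for illustration. The sketch does not fit the additive model; it only shows why the number of previous events carries information when the rate varies between processes.

```python
import numpy as np

def simulate_frailty_counts(n_processes, horizon, rng):
    """Event counts in the first and second half of [0, horizon] for each process."""
    # Frailty: each process gets its own rate, exponential with expectation 1
    rates = rng.exponential(scale=1.0, size=n_processes)
    # Given the rate, counts in disjoint intervals are independent Poisson variables
    first = rng.poisson(rates * horizon / 2)
    second = rng.poisson(rates * horizon / 2)
    return first, second

rng = np.random.default_rng(1)
first, second = simulate_frailty_counts(40, 4.0, rng)
correlation = np.corrcoef(first, second)[0, 1]
print(f"correlation between early and late event counts: {correlation:.2f}")
```

Within each process the early and late counts are independent given the rate, yet across processes they are positively correlated, because a high rate produces many events in both halves. This is the dependence that a dynamic covariate such as the number of previous events picks up.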

23

Cumulative regression function for dynamic covariate

[Figure: cumulative regression function with 95% confidence limits. Covariate: Sum events divided by time. Horizontal axis: Time (0 to 4); vertical axis: cumulative regression function (-50 to 350)]

24

Standardized residual processes

Cumulative residual processes shown left and kernel estimated processes shown right.

Upper panels: No covariate

Lower panels: Dynamic covariate

[Figure: four panels of standardized residual processes against TIME; vertical axes from -8 to 8]

25

Small bowel motility

Dynamic covariates:

    Number of previous events

    Time since last event (cut point at 50 minutes)

Cox regression can be applied. Hazard ratios with 95% confidence intervals:

    0.98 (0.76, 1.27)

    4.66 (2.36, 9.19)

Illustrated by the additive model on the next slides, which show:

    Number of previous occurrences has no effect

    Time since last occurrence does have an effect
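The reported intervals can be turned into approximate test statistics with a small calculation. This sketch assumes that the intervals are Wald intervals computed on the log hazard ratio scale, and that the two hazard ratios are listed in the same order as the two dynamic covariates. Neither assumption is stated on the slide, and the function name is hypothetical.

```python
import math

def wald_from_ci(hr, lower, upper, z_crit=1.96):
    """Recover the log hazard ratio, its standard error and the Wald z from a 95% interval."""
    log_hr = math.log(hr)
    # On the log scale a Wald interval has half-width z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    return log_hr, se, log_hr / se

# Assumed pairing of hazard ratios and covariates (order of listing on the slide)
reported = {
    "number of previous events": (0.98, 0.76, 1.27),
    "time since last event": (4.66, 2.36, 9.19),
}
for name, (hr, lower, upper) in reported.items():
    log_hr, se, z = wald_from_ci(hr, lower, upper)
    print(f"{name}: log HR = {log_hr:.2f}, SE = {se:.2f}, z = {z:.2f}")
```

Under these assumptions the first z statistic is close to zero and the second is well above 1.96, in line with the conclusion that only time since last occurrence has an effect.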

26

Dynamic covariate I: number of previous occurrences (upper and lower curves give pointwise 95% confidence intervals)

There is clearly no effect of the covariate

27

Dynamic covariate II: time since last occurrence (above or below 50 minutes)

28

Example: Repeated tumors

Gail, Santner and Brown (1980).

Carcinogen injected at day 0 in 76 female rats. Then treated with retinyl acetate for 60 days.

The 48 animals which were still tumor free were randomised to continued retinoid prophylaxis, or to control.

The animals were followed until 182 days after the initial injection. Several tumors were observed in most animals, and the time of each tumor was recorded.

29

Data on repeated mammary tumors. Additive model with one fixed and three dynamic covariates.

[Figure: four panels of estimated cumulative regression functions against Time (days), 60 to 200. Panels: Cumulative baseline intensity; Treatment effect; Number of previous occurrences; Time since previous event]

30

Residual plots I

Fitting all covariates (treatment and dynamic ones). Left panel shows standardized residual. Right panel shows mean and standard deviation of standardized residuals.

[Figure: two panels against Time (days), 60 to 180: standardized residuals, and MEAN and SD of standardized residuals]

31

Residual plots II

Using only treatment as covariate. Left panel shows standardized residual. Right panel shows mean and standard deviation of standardized residuals.

[Figure: two panels against Time (days), 60 to 180: standardized residuals, and MEAN and SD of standardized residuals]

32

Influence plot

Cumulative sum of hat matrices

Straight line marks the limit for influential processes

[Figure: cumulative sum of hat matrices (0 to 50) plotted against sequence number (1 to 210)]

33

Conclusion for mammary tumors

Findings

    Constant rate over time.

    Effect of treatment.

    Effect of number of previous occurrences.

    No effect of time since last occurrence.

Type of process:

    Markovian.

    Several Poisson processes with varying rate.

34

Example of more complex event history: Sleep data

Data collected at the Max Planck Institute in Munich concerning sleep patterns.

Analysing tendency to fall asleep, wake up, have REM periods etc.

Clearly, many occurrences of events each night.

Example of counting process: number of times you wake up

Dynamic covariates:

    measurement of cortisol (stress hormone)

    number of previous events divided by elapsed time

    log duration of ongoing sleep period

35

Kernel estimates of regression functions

[Figure: four panels of smoothed regression functions against hours since first time asleep (0 to 6). Panels: Base hazard; Log cortisol; Number of previous events divided by elapsed time; Log time since not being under risk]
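Smoothed regression functions of this kind are commonly obtained by kernel smoothing the jumps of the estimated cumulative regression function. The sketch below uses an Epanechnikov kernel; the kernel choice, the bandwidth and the function name are assumptions, since the slides do not state which smoother was used.

```python
import numpy as np

def kernel_smooth(t_grid, jump_times, increments, bandwidth):
    """Kernel estimate of a regression function from the jumps of its cumulative estimate."""
    t_grid = np.asarray(t_grid, dtype=float)
    jump_times = np.asarray(jump_times, dtype=float)
    increments = np.asarray(increments, dtype=float)
    # Scaled distance between every grid point and every jump time
    u = (t_grid[:, None] - jump_times[None, :]) / bandwidth
    # Epanechnikov kernel: 0.75 * (1 - u^2) on [-1, 1], zero outside
    weights = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return (weights @ increments) / bandwidth

# Hypothetical example: a single jump of size 2 at time 1, bandwidth 0.5
smoothed = kernel_smooth([1.0, 2.0], [1.0], [2.0], bandwidth=0.5)
print(smoothed)  # [3. 0.]
```

A larger bandwidth gives a smoother curve at the cost of blurring changes over time, and estimates within one bandwidth of the ends of the observation period are affected by boundary bias.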

36

Interpretation

Strong dynamic effects. Could be:

    Frailty

    Real causal effects

Only additional information will tell us which is which.

The approach is purely empirical, in contrast to the latent variable thinking of frailty models.

37

Warning

Dynamic covariates may ”steal” from the effect of fixed covariates. One therefore has to be careful, using orthogonalization or path analysis type methods. This is presently being developed. The ”locality” of the additive approach makes this easy to handle.

38

General event histories

Many events of many different types. Examples:

    individual histories of sick leave, part-time work, full-time work, with many transitions between different states

    individual histories of illness

There are no good tools for handling complex event histories.

The present approach is one attempt that we will develop further.

39

Additive regression in practice

An S-Plus computer program named Addreg may be found at:

    www.med.uio.no/imb/stat/addreg/

Information on research in event history analysis in Oslo may be found on the web page:

    www.med.uio.no/imb/stat/norevent/

40

References

Hougaard, P. (2000). Analysis of Multivariate Survival Data. Springer-Verlag, New York.

Andersen, P.K., Borgan, Ø., Gill, R.D. and Keiding, N. (1993). Statistical Models Based on Counting Processes. Springer-Verlag, New York.

Aalen, O.O., Borgan, Ø. and Fekjær, H. (2001). Covariate adjustment of event histories estimated from Markov chains: The additive approach. Biometrics, 57, 993-1001.