Estimating the mortality reduction produced by each round … · 2017-06-21



Estimating the mortality reduction produced by each round of cancer screening
J. Hanley (McGill U., Canada) & Sisse Njor (Aarhus U., Denmark) [email protected]

FOR THOSE IN A HURRY: OUR MESSAGE IS SUMMARIZED IN THE BOTTOM LINE(s)

Rate Reductions: time-pattern NOT SAME as if using . . . to ↓ (risk of) . . .

• ADULT CIRCUMCISION: (HIV). VACCINATION: (MEASLES, POLIO, ..), Ultrasound SCREENING: (AAA rupture)

METHODS

The design of MASS is described in detail elsewhere.4

Briefly, a population based sample of 67 770 men aged 65-74 was recruited during 1997-9 from four centres in the UK and randomised to receive an invitation to screening for abdominal aortic aneurysm (invited group) or not (control group). Among the 33 883 men invited to screening, principally in a primary care setting, 27 204 (80%) attended and 1334 aneurysms (diameter ≥3.0 cm) were detected. Within this group of detected aneurysms, surveillance involved rescanning: annually for those with diameters of 3.0-4.4 cm and every three months for those of 4.5-5.4 cm. Patients were referred to a hospital outpatient clinic for possible elective surgery when the aneurysm reached 5.5 cm, the aneurysm had expanded by 1.0 cm or more in one year, or symptoms attributable to the aneurysm were reported.

We collected additional data from local hospital records on follow-up ultrasound scanning done within medical imaging departments and surgery for abdominal aortic aneurysm. The UK Office for National Statistics notified us of deaths up to 31 March 2008, after matching on the unique National Health Service (NHS) number for each participant. Follow-up ranged from 8.9 to 11.2 years (mean 10.1 years). The primary outcome of interest, deaths related to abdominal aortic aneurysm, is defined as all deaths within 30 days of any surgery (elective or emergency) for abdominal aortic aneurysm plus all deaths with codes 441.3-441.6 (international classification of diseases, ninth revision; see table 1).

We used unadjusted Cox regression to compare deaths related to abdominal aortic aneurysm (censoring other causes of death) and all cause mortality between the two randomised groups. Life years gained was derived as the area between the Kaplan-Meier curves of deaths related to abdominal aortic aneurysm for the control and invited groups, adjusting for the effect of deaths from other causes.14 We also obtained an unbiased randomisation based estimate of the benefit of attending initial screening.15 This estimate was calculated by subtracting from the controls a group that is equivalent to the non-attending group among those invited, thus leaving a control group comparable to those attending in the invited group.

We estimated the cost effectiveness of screening from a UK health service perspective, for follow-up truncated at 10 years. The relevant unit costs are taken from a recent UK Department of Health report16; these are based on a detailed costing exercise at 2000-1 prices17 uplifted to reflect 2008-9 prices. Events costed include each invitation to screening (£1.74; €2.02; $2.88), reinvitation to screening (£1.70), initial scan (£25.31), recall scan (£61.07), referral for consideration for elective surgery (£411.07), elective surgery (£9165), and emergency surgery (£14 825). We applied discounting at the currently recommended rate of 3.5% per year for both costs and effects. Incremental costs and the cost effectiveness ratio take into account censoring at the end of follow-up by dividing the follow-up into intervals of six months.18 19 We used Fieller’s method to calculate the confidence interval for the incremental cost effectiveness ratio.20
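The discounting step described above can be illustrated with a minimal sketch. It assumes simple annual compounding at the quoted 3.5% rate (the trial analysis itself worked in six month intervals), and the function names and the choice of a year-5 event are illustrative assumptions, not details from the paper.

```python
# Minimal sketch of discounting at 3.5% per year.
# Assumption: annual compounding; the event timing (year 5) is hypothetical.

def discount_factor(years_from_start: float, annual_rate: float = 0.035) -> float:
    """Factor that converts a future amount into its present value."""
    return 1.0 / (1.0 + annual_rate) ** years_from_start


def present_value(amount: float, years_from_start: float,
                  annual_rate: float = 0.035) -> float:
    """Present value of a cost (or effect) incurred some years after the start."""
    return amount * discount_factor(years_from_start, annual_rate)


# Unit cost of elective surgery quoted in the text (GBP).
elective_surgery_cost = 9165.0
pv = present_value(elective_surgery_cost, years_from_start=5)
print(f"Elective surgery in year 5, discounted to year 0: {pv:.2f} GBP")
```

The same factor would apply to effects (for example life years), since the text states that costs and effects are discounted at the same rate.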

RESULTS

The flow of participants in the trial is as reported previously,5 except for two features. Firstly, of the 1334 men with abdominal aortic aneurysm detected at initial scan, 72% (n=963) had complete clinical follow-up to 10 years according to the protocol; this compares with 76% at seven years. Secondly, inability to follow up deaths because some men may have moved was 2.7% at 10 years, compared with 2.1% at seven

Table 1 | Deaths related to abdominal aortic aneurysm*, ruptured abdominal aortic aneurysm, and other causes of death

Category                                   Control group (n=33 887)   Invited group (n=33 883)
Deaths related to aneurysm:
  <30 days after elective surgery†         13                         21
  Ruptured aneurysm‡                       251                        110
  Ruptured aneurysm of unspecified site§   32                         24
  Total No                                 296                        155
  Hazard ratio (95% CI)                    1 (reference)              0.52 (0.43 to 0.63)
Ruptured aneurysm:
  Non-fatal rupture                        78                         42
  Total incidence of rupture¶              374                        197
  Hazard ratio (95% CI)                    1 (reference)              0.52 (0.44 to 0.62)
Other causes of death:
  Ischaemic heart disease                  2448                       2324
  Other cardiovascular                     1391                       1430
  Non-cardiovascular**                     6346                       6365
All deaths                                 10 481                     10 274
  Hazard ratio (95% CI)                    1 (reference)              0.97 (0.95 to 1.00)

*Codes 441.3-6 (international classification of diseases, ninth revision), or equivalently codes I71.3-4 and 8-9 (international classification of diseases, 10th revision).
†Those with ICD-9 codes 441.3-6 who died within 30 days of elective surgery are classified here.
‡ICD-9 codes 441.3 (ruptured abdominal aortic aneurysm) and 441.4 (abdominal aortic aneurysm without mention of rupture), and all deaths occurring within 30 days of emergency surgery for abdominal aortic aneurysm.
§ICD-9 codes 441.5 (ruptured aortic aneurysm at unspecified site) and 441.6 (aortic aneurysm at unspecified site without mention of rupture).
¶Deaths related to abdominal aortic aneurysm plus incidence of non-fatal ruptured abdominal aortic aneurysm.
**Includes 19 deaths of unknown cause.
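As a rough check on Table 1, the counts can be turned into crude risk ratios. This is only a sketch: a crude ratio of cumulative counts ignores time to event and censoring, so it should approximate, not reproduce, the Cox hazard ratios in the table. The function and variable names are illustrative.

```python
# Crude risk ratios from the Table 1 counts (invited v control).
# Assumption: simple ratios of cumulative counts; the published hazard
# ratios come from Cox regression and need not match exactly.

CONTROL_N = 33_887
INVITED_N = 33_883


def crude_risk_ratio(invited_events: int, control_events: int) -> float:
    """Ratio of cumulative risks, invited group relative to control group."""
    return (invited_events / INVITED_N) / (control_events / CONTROL_N)


# (control events, invited events) as reported in Table 1
rows = {
    "Deaths related to aneurysm": (296, 155),
    "Total incidence of rupture": (374, 197),
    "All deaths": (10_481, 10_274),
}

for label, (control, invited) in rows.items():
    print(f"{label}: crude ratio = {crude_risk_ratio(invited, control):.2f}")
```

The component counts also add up as the table implies: 13 + 251 + 32 = 296 in the control group and 21 + 110 + 24 = 155 in the invited group.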

[Figure: cumulative mortality related to abdominal aortic aneurysm (vertical axis, 0 to 0.010) against years since randomisation (0 to 10), control group and invited group. Men at risk, control / invited: 33 887 / 33 883; 32 103 / 32 076; 29 992 / 30 101; 27 664 / 27 860; 25 000 / 25 388; 13 242 / 13 385.]

Fig 1 | Cumulative deaths related to abdominal aortic aneurysm, by time since randomisation

RESEARCH · page 2 of 6 · BMJ | ONLINE FIRST | bmj.com · Downloaded from bmj.com on 26 June 2009

↓ virtually immediate, and sustained

• BLOOD THINNERS: (STROKE/MI). STATINS: LDL CHOLESTEROL

↓ disappears when agent removed

Typically: 1 Hazard Ratio (HR)

“Average f.-up: 8.8y. Rate ratio for death from prostate cancer in screening group: 0.80.”

With sustained screening, the steady-state mortality reduction would be more than the 20% observed after just the 3 trial rounds.

Some time after screening ceases, mortality rates revert to those in unscreened, e.g., as in the 30 y. FOBT trial [next column]. Baker calls this dilution “post screening noise.” Nor should there be mortality deficits in the 21st year if lung cancer screening lasted just 6 years.

Bottom Line (1)
The unprincipled 1-number hazard-ratio (HR) measure ignores 1. how many screens, 2. when the last screen was, 3. when follow-up ended or 4. when mortality deficits are expected to manifest.

First Principles
Screening: pursuit of earlier Dx (& earlier Tx). Because of the Detectability : Curability trade-off, the course of many cancers, 'otherwise' fatal at T = t, is not altered by screen at T = 0. They are too early/late to be detected/cured. Mortality deficits manifest after some delay, and disappear at some point after last screen.

Principles → HR function

The depth & duration of the mortality deficits produced by 3 screenings. In women screened from 50-69, deficits would reach their max. at ≈ age 56 & maintain this level for many age-bins.
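The idea of an HR function that dips after a delay and then returns to 1 can be sketched numerically. The per-screen curve below (a triangle peaking 8 years after a screen) and all its parameter values are hypothetical choices made only for illustration; they are not the poster's fitted 2-parameter form. Only the multiplication of "not helped" probabilities across screens follows the poster's stated model.

```python
# Sketch: time-pattern of the hazard ratio after 3 biennial screens.
# Assumption: the shape and parameters of prob_helped() are hypothetical.

def prob_helped(years_since_screen: float, max_helped: float = 0.10,
                peak: float = 8.0, half_width: float = 6.0) -> float:
    """Hypothetical probability that a cancer otherwise fatal at this time
    is averted by a screen carried out `years_since_screen` earlier."""
    distance = abs(years_since_screen - peak)
    return max(0.0, max_helped * (1.0 - distance / half_width))


def hazard_ratio(year: float, screen_years: list) -> float:
    """Product over earlier screens of the probability of NOT being helped."""
    hr = 1.0
    for screen_year in screen_years:
        if screen_year < year:
            hr *= 1.0 - prob_helped(year - screen_year)
    return hr


three_rounds = [0, 2, 4]  # years of the 3 screens
for year in range(0, 23, 2):
    reduction = 100.0 * (1.0 - hazard_ratio(year, three_rounds))
    print(f"year {year:2d}: mortality reduction {reduction:4.1f}%")
```

Under these assumed values the reduction is zero in the first years, builds to a maximum around year 10, and is back to zero by year 22, which is the qualitative pattern a single constant HR cannot represent.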

FOBT Screening. HR function

Ovarian Cancer. HR function

Bottom Line (2)
IT’S ABOUT TIME: to not just recognize the importance of the HR function & its determinants, but to use them in data analysis

Pop’ln Mammography Programs
• Norway (NEJM): Some counties only in 2nd or 6th year, too short for full impact to manifest. (cf. Hanley, Epi Reviews, 2011)
• Funen, Denmark: 22 years’ experience. 'Constant HR' model, data to end of 2009. Data now extended to end of 2015.

Njor S, et al., J Med Scr 2015

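[Figure: Br. Ca. Deaths / 100,000 WY, Rest of Denmark (0) and FUNEN (1); values shown: 67, 68, 60, 47; HR=0.78, 22% REDUCTION; time axis 1979, 1994, 2000, 2005, 2010, 2015.]

[Figure: age by year, with invitations marked for FUNEN only (none in FUNEN before the programme, or in the 'Rest' of Denmark). Age labels: 52, 57, 62, 67. Age (No. of Invitations): 65 (7), 78 (7), 83 (4), 88 (2), 60 (6), 55 (3), 50 (1). 7 (of 41) birth-cohorts are shown.]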

Funen-‘RoD’ differences in Rates
Average, and followup-year-specific, differences in breast cancer mortality, in 3 birth cohorts, each 5 years wide (color-coded). In modified Lexis diagram in bottom panel, grey circles indicate invitations to those Funen women who attained the indicated ages in the years indicated. Numbers are numbers of deaths from breast cancer in the 3 age-bands. Percentage differences in upper panel:
• Dotted line: age-year-matched M-H ‘average’.
• 3 lines: age-matched M-H year-specific.
• 3 smooth patterns: cohort-specific spline fits.

[Figure, upper panel: Percentage difference in mortality rate [from RoD], vertical axis from 20% to -40%, overall and by year, 1994 to 2015.]

[Figure, lower panel (modified Lexis diagram): Attained Age by Year, 1994 to 2015; attained ages run from 55 to 69, 66 to 80 and 78 to 92 in the 3 cohorts. Numbers of breast cancer deaths shown in the 3 age-bands:
0 0 0 4 2 5 6 5 7 7 6 5 6 3 4 5 9 7 9 11 4 1 2
0 0 1 2 2 9 2 8 5 7 8 6 6 8 4 14 11 16 9 11 10 6 4
0 1 4 2 5 5 7 7 5 10 8 9 12 8 10 8 10 8 6 13 10 11 1]
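The Mantel-Haenszel (M-H) 'average' mentioned in the caption can be sketched with the standard M-H rate ratio estimator for person-time data. The three strata below are the three example rows the poster displays in its data table (suffix 0 = Rest of Denmark, 1 = Funen); combining just these three rows is purely illustrative and is not the poster's analysis, and the function name is an assumption.

```python
# Sketch: Mantel-Haenszel rate ratio across age-year strata.
# Assumption: only 3 illustrative strata are used; the real analysis
# would pool all age-year cells.

# Each stratum: (D0, D1, PY0, PY1) = deaths and person-years,
# Rest of Denmark (0) and Funen (1).
strata = [
    (11, 1, 16_827, 2_101),  # year 2014, age 87
    (24, 3, 17_034, 2_227),  # year 2013, age 81
    (18, 1, 19_788, 2_491),  # year 2012, age 75
]


def mantel_haenszel_rate_ratio(strata):
    """M-H rate ratio: sum(D1*PY0/T) / sum(D0*PY1/T), with T = PY0 + PY1."""
    numerator = 0.0
    denominator = 0.0
    for d0, d1, py0, py1 in strata:
        total_py = py0 + py1
        numerator += d1 * py0 / total_py
        denominator += d0 * py1 / total_py
    return numerator / denominator


rate_ratio = mantel_haenszel_rate_ratio(strata)
percent_difference = 100.0 * (rate_ratio - 1.0)
print(f"M-H rate ratio: {rate_ratio:.2f} ({percent_difference:+.0f}% vs RoD)")
```

With so few deaths in the Funen cells, a ratio from three strata is very imprecise; the point is only to show how the stratum-specific counts are weighted and pooled.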

Effects of 1,2,3, ... , 7 screens
Data for, and fitting of, HR model

                  No. Deaths    Person Years       Invitation History ('Design' Matrix)
Year[y]  Age[a]   D0     D1     PY0      PY1       How many years earlier
2014     87       11     1      16,827   2,101     20 18
2013     81       24     3      17,034   2,227     19 17 15 13
2012     75       18     1      19,788   2,491     17 15 13 11 9 7 5
etc.     ..       ..     .      ..,...   .,...     etc.

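$$D_1 + D_0 = D \ \text{(fixed)} \;\rightarrow\; D_1 \sim \mathrm{Binomial}(D, \pi)$$

with

$$\pi = \frac{HR_{ay} \times PY_1}{HR_{ay} \times PY_1 + 1 \times PY_0}$$

$$HR_{ay} = \prod_{\mathrm{AgeAtS} < a} \Pr(\text{not helped by screen at age } \mathrm{AgeAtS})$$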
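A minimal sketch of how one data row contributes to the likelihood under this binomial formulation. The row (2014, age 87) is taken from the poster's table; the per-screen "not helped" probabilities of 0.9 are hypothetical placeholders, since the poster's 2-parameter form for them is described elsewhere (Liu et al.), and the function names are illustrative.

```python
import math

# Sketch: binomial likelihood contribution of one age-year cell.
# Assumption: the 0.9 "not helped" probabilities are hypothetical.


def hazard_ratio(probs_not_helped):
    """HR_ay: product over earlier screens of Pr(not helped by that screen)."""
    hr = 1.0
    for p in probs_not_helped:
        hr *= p
    return hr


def pi_screened(hr, py1, py0):
    """Expected share of the D deaths that occur in the screened region."""
    return hr * py1 / (hr * py1 + 1.0 * py0)


def binomial_loglik(d1, d, pi):
    """Log-likelihood of d1 deaths out of d under Binomial(d, pi)."""
    return (math.log(math.comb(d, d1))
            + d1 * math.log(pi)
            + (d - d1) * math.log(1.0 - pi))


# Row from the poster: year 2014, age 87, two earlier invitations.
d0, d1, py0, py1 = 11, 1, 16_827, 2_101

for label, hr in [("no effect", 1.0),
                  ("hypothetical effect", hazard_ratio([0.9, 0.9]))]:
    pi = pi_screened(hr, py1, py0)
    ll = binomial_loglik(d1, d0 + d1, pi)
    print(f"{label}: HR = {hr:.2f}, pi = {pi:.4f}, log-lik = {ll:.3f}")
```

Fitting the model would amount to summing such log-likelihood terms over all age-year cells and maximising over the parameters that determine the per-screen probabilities.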

Model for impact of 1,2, .. ,7 rounds of screening

[Figure: HR (1 down to 0.1) and the corresponding Reduction (0%, 10%, 20%, 30%) in otherwise-fatal cancers, against years after 1st screen (0 to 22). The curves show the deaths averted because of (biennial) screen no. 1, 2, ..., 7; the model parameters τ and θ are marked on the plot.]

Further descriptions of 2 model parameters and model fitting, and examples are available in Liu, Hanley, Saarela, Dendukuri. Int. Stat. Rev, 2015.

Fitted Percentage Reductions
Fitted reductions (%) based on parameters (τ̂, θ̂) of model for effect of 1 round of screening, and on the variations in numbers of invitations.

Age     Age      Fitted reductions (%)
68,69   89,90    0 0 2 4 6 7 8 7 7 6 5 4 3 3 2 2 1 1 1 0 0 0
66,67   87,88    0 0 2 4 8 11 13 14 14 13 12 10 8 7 5 4 3 2 2 1 1 1
64,65   85,86    0 0 2 4 8 11 15 18 19 19 18 17 15 12 10 8 7 5 4 3 2 2
62,63   83,84    0 0 2 4 8 11 15 18 21 23 23 23 21 19 16 14 11 9 7 5 4 3
60,61   81,82    0 0 2 4 8 11 15 18 21 23 25 26 26 25 23 20 17 15 12 9 7 6
58,59   79,80    0 0 2 4 8 11 15 18 21 23 25 26 27 28 27 26 24 21 18 15 12 10
56,57   77,78    0 0 2 4 8 11 15 18 21 23 25 26 27 28 29 29 28 27 24 21 18 15
54,55   75,76    0 0 2 4 8 11 15 18 21 23 25 26 27 28 29 29 30 30 29 27 24 22
52,53            0 0 2 4 8 11 15 18 21 23 25 26 27 28 29
50,51            0 0 2 4 8 11 15 18 21 23 25 26 27 28 29
                 0 0 2 4 8 11 15 18 21 23 25 26 27 28
                 0 0 2 4 8 11 15 18 21 23 25 26 27
                 0 0 2 4 8 11 15 18 21 23 25 26
                 0 0 2 4 8 11 15 18 21 23 25
                 0 0 2 4 8 11 15 18 21 23
                 0 0 2 4 8 11 15 18 21
                 0 0 2 4 8 11 15 18
                 0 0 2 4 8 11 15
                 0 0 2 4 8 11
                 0 0 2 4 8
                 0 0 2 4
                 0 0 2
                 0 0 0

Year axis: 1995 1997 1999 2001 2003 2005 2007 2009 2011 2013 2015

Fitted Percent Differences ('Reductions')
The relatively small number of events in the screened population makes it difficult to show that the 2-parameter model fits significantly better than the prevailing (but un-principled) constant-hazard-ratio (proportional-hazards) model.

The fitted model assumes the same 2 parameter values for both the initial ('prevalence') and follow-up screens, and for all ages at the first screen.

With sufficient data, it could readily be extended to allow these to vary.

THE BOTTOM LINE
• This first principles model can use RCT or population data to pursue more realistic measures of mortality reductions, and better inputs for cost effectiveness calculations.
• To more precisely measure reductions due to mammography, we wish to collaborate with those already holding suitable population data.