Methods for Summarizing the Evidence: Meta-Analyses and Pooled Analyses

Methods for Summarizing the Evidence:

Meta-Analyses and Pooled Analyses

Donna Spiegelman, Sc.D.Departments of Epidemiology and

BiostatisticsHarvard School of Public [email protected]

Methods for Summarizing the Evidence

Narrative review

Meta-analysis of published data Pooled analysis of primary data (meta-

analysis of individual data)• Retrospectively planned• Prospectively planned

Narrative Review

Fruits, Vegetables and Breast Cancer: Summary of Case-Control Studies

Total number of studies: 12

Studies showing a statistically significant protective association: 8 (67%)

Studies showing no statistically significant protective association: 4 (33%)

WCRF/AICR 1997

Specific Fruits and Vegetables and Breast Cancer: Summary of Case-Control Studies

# of % of Total Food Group studies risk null risk

Fruits 4 75 0 25

Citrus Fruits 3 33 0 66

Vegetables <3 -- -- --

Green Veg. 6 83 17 0

Carrots 4 75 25 0

WCRF/AICR 1997

Fruits, Vegetables and Breast Cancer: Conclusion of Narrative Review

A large amount of evidence has accumulated regarding vegetable and fruit consumption and the risk of breast cancer. Almost all of the data from epidemiological studies show either decreased risk with higher intakes or no relationship; the evidence is more abundant and consistent for vegetables, particularly green vegetables, than for fruits.

Diets high in vegetables and fruits probably decrease the risk of breast cancer.

WCRF/AICR 1997

Narrative Review

Strengths• Short timeframe• Inexpensive

Limitations• Provide qualitative summary only frequently

tabulate results• Subjective• Selective inclusion of studies• May be influenced by publication bias

Publication Bias: Funnel Plots

Sterne 2001

Where Can Publication Bias Occur?

Project dropped when preliminary analyses suggest results are negative

Authors do not submit negative study Results reported in small, non-indexed journal Editor rejects manuscript Reviewers reject manuscript Author does not resubmit rejected manuscript Journal delays publication of negative study Results not reported by news, policy makers,

or narrative reviewsMontori 2000

Meta-Analyses

Rationale for Combining Studies

“Many of the groups…are far too small to allow of any definite opinion being formed at all, having regard to the size of the probable error involved.”

Pearson, 1904

Definition

“Meta-analysis refers to the analysis of analyses…the statistical analysis of a large collection of analysis results from individual studies for the purpose of integrating findings. It connotes a rigorous alternative to the casual, narrative discussions of research studies which typify our attempts to make sense of the rapidly expanding literature…”

Glass, 1976

Meta-Analyses: # of Publications by Year

010002000300040005000

1986 1991 1996 2001Year of Publication

Why conduct a meta-analysis?

Summarize published literature• A more objective summary of literature than narrative

review• Estimate average effect from all available studies

Increase statistical power more precise estimate of effect size

Identify between study heterogeneity

Identify research needs

Outline for Conducting Meta-Analyses

Objective, hypothesis

Define outcome, exposure, population

Study inclusion criteria

Search strategy

Data extraction

Quality assessment

Statistical analysis

Meta-Analyses: Study Sources

Published literature

• citation indexes

• abstract databases

• reference lists

• contact with authors

Unpublished literature

Uncompleted research reports

Work in progress

Meta-Analyses: Data Extraction

Publication year

Performing year

Study design

Characteristics of study population (n, age, sex)

Geographical setting

Assessment procedures

Risk estimate and variance

Covariates

Meta-Analyses: Quality Assessment

Study components• Study design• Outcome measurement• Exposure measurement• Response rate/follow-up rate

Analytic strategy• Adjustment for confounding

Quality of reporting

Meta-Analyses: Statistical Analyses

Define analytic strategy

Investigate between study heterogeneity

Decide whether results can be combined

Estimate summary effect, if appropriate

Conduct sensitivity analysis• Stratification• Meta-regression analyses

Fixed Effects

Assumes that all studies are estimating the same true effect

Variability only from sampling of people within each study

Precision depends mainly on study size

Fixed Effects Model

, , . . . ,

lo g( var )

lo g( ) , , . . . ,

( ) ( )

, [ ( )]

( ) [ ( )] [

s s

s

s

s

s s

s ss

S

ss

S s s

ss

S

s S stu d ies

true co m m o n re la tive riskra n d om w ith in stu d y sa m p lin g ia b ility

estim a ted rela tive risk in stu dy sE s S

V a r V a r

P o o ledw

ww V a r

V a r V a r w

1

0 1

1

1

1

1

1

1

ss

S

im u m ia n ce w eigh ts

]

m in va r

1

1

Random Effects

Studies allowed to have different underlying or true effects

Allows variation between studies as well as within studies

Random Effects Model , , . . . ,

lo gva r

( va r lo g

( ) ( ) , , . . . ,

( ) ( ), , . . . ,( ) , , . . . ,

s s s

s

s

s

s s

s s

s B

s ss

S

ss

S

b s S stud ies

true co m m o n rela tive riskb ran do m betw een stud ies ia b ility

ran d om w ith in stud y sam p ling iab ility

estim a ted re la tive risk in stud y sE E b s S

V a r V ar s SV a r b s S

P o o ledw

w

1

0 1

112

1

1

, [ ( ) ]

( ) [ ( ) ]

m in var

w V a r

V a r V a r w

im um ia nce w eig h ts

s s B

s Bs

S

ss

S

2 1

2 1

1

1

1

1

Random Effects Model, continued

2

2

2

1

1

1

1 1

mod ,ˆ , 1,...,

( )

( 1)ˆ max 0,

ˆˆ[ ( )] [ ( )]

s s s

s B

B S

sSs

s Ss

ss

s s s

Under random effects el

b s S studies

Var b

Q S

ww

w

w Var Var

Test for Heterogeneity

U nder random effects e l

Q w

poo led rela tive risk fixed effects el

study specific estim ated rela tive risk

w V ar

Q under H

s ss

S

s

s s

S B

m od ,

( ) ,

log , m od log

[ ( )]

~ :

2

1

1

12

02 0

Fixed Effects vs Random Effects Model

Random effects generally yield larger variances and CI • Why? Incorporate

If heterogeneity between studies is large,

will dominate the weights and all studies will be weighted more equally

Model weight for large studies less in random vs fixed effects model

2B

2B

Sources of Between Study Heterogeneity

Different study designs

Different incidence rates among unexposed

Different length of follow-up

Different distributions of effect modifiers

Different statistical methods/models used

Different sources of bias

Study quality

Meta-Analyses: Sensitivity Analyses

Exclude studies with particular heterogeneous results

Conduct separate analyses based on• Study design• Geographic location• Time period• Study quality

Meta-regression (Stram, Biometrics, 1996)

Purpose: to identify heterogeneity of effects by covariates that are constant within study (e.g. gender, smoking status)

Model: β s = β 0 + β1 GENDERs + β2 CURRENTs

+ β2 PASTs + bs + εs

GENDERs = 1 if studys is male; 0 if femaleCURRENTs = 1 if studys has current smokers only, 0 otherwise PASTs = 1 if studys has past smokers only, 0 otherwise

H0: β1 = 0 no effect-modification by gender

Standard method for mixed effects models can be used to test hypotheses and estimate parameters

^

Analysis of between-studies heterogeneity p-value for test for heterogeneity is a function of the

power of the pooled analysis to detect between-studies differences. This power is believed to be low.

• a simulation study was conducted and published (Takkouche, Cardoso-Suárez, Spiegelman, AJE, 1999) which investigated the power of several old and some newly developed test statistics to detect heterogeneity of different plausible magnitudes, as quantified by CV and R2 (to be defined)

• CV = /Ι β Ι , with S=ranging from 7 to 33

12

2B

Analysis of between-studies heterogeneity Explored maximum likelihood methods for estimation of and testing H0: =0 (i.e. no between-studies

heterogeneity)

• ML methods have power roughly equivalent to D&L’s, but assume bs ~N (0, ) and εs ~ N [ 0, Vars ( βs ) ]

• LRT for H0: = 0 has no known asymptotic distribution because hypothesis is on the boundary of the parameter space

• A simulation-based bootstrap approach for constructing the empirical distribution function of the test statistic was developed

14

^

2B

2B

2B

2B

Quantification of heterogeneity CVB = /β• between-studies variance expressed relative to the magnitude of

the overall association • if the association is small, CV 'blows up’

proportion of the variance of the pooled estimate due to between-studies variation

• Useful for when β is near 0 as well as when it is far from it • Heterogeneity is evaluated relative to within-studies contribution

to the variance, and can appear large if the participating studies yield precise estimates

Further experience with these measures will give us more insight as to their relative merits

Confidence intervals for CVB, R2 (Identical to I2 (Higgins et al., 2006))

1

^ 2 2 2

1

ˆ/ [ ( ) / ]S

B B ss

R Var S

2B

2B

Q

LRT‡

Q*

LRT*

2- bootstrap

RI§ = 0.25 S = 7 S = 20 S = 40

14.38 25.90 38.12

7.85

17.90 30.64

12.18 24.18 35.10

14.68 25.97 37.81

15.32 24.92 37.94

RI = 0.5 S = 7 S = 20 S = 40

37.54 70.56 90.88

27.14 64.11 88.61

27.60 58.54 79.76

39.10 69.21 90.21

38.50 68.62 91.14

RI = 0.75 S = 7 S = 20 S = 40

77.66 98.66

100.00

70.21 97.51 99.93

58.84 93.64 99.64

78.31 99.12 99.94

75.02 98.34 99.94

* Parametric bootstrap version of the test.

† Odds ratio = 2.‡ WLS, weighted least squares; LRT, likelihood ratio test. § R2, proportion of the total variance due to between-study variance: /(( + (S x Var())).

Power of Test of Heterogeneity

13

2B 2

B

Fruit & Vegetable and Breast Cancer Meta-Analysis

Objective: analyze published results that explore the relationship between breast cancer risk and the consumption of fruits and vegetables

Search strategy

• MEDLINE search of studies published January 1982 – April 1997

• Review of reference lists Gandini, 2000

Fruit & Vegetable and Breast Cancer Meta-Analysis: Inclusion Criteria

Relative risks and confidence intervals reported or could be estimated• Comparisons: tertiles, quartiles, quintiles

Studies were independent Diet assessed by food frequency

questionnaire Populations were homogeneous, not limited

to specific subgroup

Gandini, 2000

Fruit & Vegetable and Breast Cancer Meta-Analysis: Data Extraction and Analysis

Selected risk estimate for total fruits and total vegetables, when possible• Otherwise, selected nutrient dense food

Extracted the most adjusted relative risk comparing the highest vs. lowest intake • Comparisons: tertiles, quartiles, quintiles

Used random effects model to calculate summary estimate

Sensitivity analyses

Gandini, 2000

Fruit and Vegetable and Breast Cancer Meta-Analysis: Results

RR (95% CI) p for het high vs low

Total fruits 0.94 (0.79-1.11) <0.001

Total vegetables 0.75 (0.66-0.85) <0.001

Gandini, 2000

Fruits and Vegetables and Breast Cancer Meta-Analysis: Sensitivity Analysis

# of RR Significance Studies of factors

Study design Case-control 14 0.71Cohort 3 0.73 0.30

Validated FFQYes 6 0.85No 11 0.66 0.13

Gandini, 2000

Fruits and Vegetables and Breast Cancer: Conclusion of Meta-Analysis

The quantitative analysis of the published studies…suggests a moderate protective effect for high consumption of vegetables… For fruit intake, study results were less clear. Only two studies show a significant protective effect of high fruit intake for breast cancer.

This analysis confirms the association between intake of vegetables and, to a lesser extent, fruits and breast cancer risk from published sources.

Gandini, 2000

Limitations of Meta-Analyses

Heterogeneity across studies Difficult to evaluate dose-response

associations Difficult to examine population subgroups Errors in original work cannot be checked Errors in data extraction Limited by quality of the studies included May be influenced by publication bias

Pooled Analyses

Outline for Conducting Pooled Analyses

• Search strategy

• Study inclusion criteria

• Obtain primary data

• Prepare data for pooled analysis

• Estimate study-specific effects

• Examine whether results are heterogeneous

• Estimate pooled result

• Conduct sensitivity analyses

Friedenreich 1993

Pooling Project of Prospective Studies of Diet and Cancer

• Collaborative project to re-analyze the primary data in multiple cohort studies using standardized analytic criteria to generate summary estimates

• Retrospectively-plannedRetrospectively-planned meta-analysis of individual patient data

• Established in 1991

• http://www.hsph.harvard.edu/poolingproject/about.html

Pooling Project of Prospective Studies of Diet and Cancer: Inclusion Criteria

Prospective study with a publication on diet and cancer

Usual dietary intake assessed

Validation study of diet assessment method

Minimum number of cases specific for each cancer site examined

Sweden Mammography

CohortNetherlands Cohort Study

New York State Cohort

Iowa Women’s Health Study

Canadian National Breast Screening Study

Cohort Studies in the Pooling Project of Prospective Studies of Diet and Cancer

Cancer Prevention Study II Nutrition Cohort

Adventist Health Study

Alpha-Tocopherol Beta-Carotene Cancer Prevention Study

New York University

Women’s Health Study Health Professionals

Follow-up Study, Nurses’ Health Study, Women’s Health Study, Nurses’

Health Study II

Breast Cancer Detection Demonstration Project

ORDET

Total=948,983

CA Teacher’s Study

Analytic Strategy

Study-specific analyses

Data collection Pooling

Nutrients

Food groups

Foods

Non-dietary risk factors

Main effect

Population subgroups

Effect modification

Annual meeting

Pooling Project: Data Management

Receive primary data–Different media–Different formats

Check data follow-up with investigators Apply exclusion criteria Calculate energy-adjusted nutrient intakes Create standardized name and format for

each variable

Pooling Project: Data Checks Dates

• Questionnaire return• Follow-up time• Diagnosis• Death

Non-dietary variables• Frequency distributions, means• Consistency checks

–Parity and age at first birth–Smoking status and # cigarettes

Nutrients and foods: means, range

Data Management: Inconsistent UnitsExample: Vitamin A, Carotene

Units:g retinol equivalentsg • International units

How determine units• Original data sheets• Published values• Correspondence

Data Management: Standardizing Food Data

Created standardized name for each food on the FFQs (range: 46-276 items)

• Multiple foods on same line

• “Other” categories Standardized quantity information as grams/d

• For some studies, calculated gram intakes by frequency * portion size * gram weight

Defined food groups

Pooling Project: Foods

Food Std Name Std Serv

Orange juice juior 6 oz

Orange or grapefruit juice juici 6 oz

Oranges orang one

Oranges, grapefruits cifru average

Oranges, orange juice orang one orange

Oranges, tangerines orang one

Oranges, tangerines, grapefruit cifru average

Pooling Project: Primary Analysis Programs

Read data• Study name, exposure, covariates

Data management Analysis - Cox proportional hazards model

(SAS, Epicure)• Continuous• Categorical• Splines• Interactions

Pooling Project: Primary Analysis Programs, cont’d

Output for each study

• File with number of cases and covariates included in the model

• File identifying whether any of the relative risks > 10

• For quantile analyses, file with the mean, range and number of cases for each quantile

• Data set with beta, variance, covariance, likelihood for each variable

Pooling Project: Pooling Programs

Read study-specific output data sets Calculate summary relative risk by using random

effects model

• Weight studies by the inverse of their variance

• Test for between studies heterogeneity

• Test for effect modification by sex Output table

Pooling Project: Analytic Strategy for Breast Cancer Analyses

Analysis• Cohorts analyzed using Cox proportional

hazards model• Canadian National Breast Screening Study is a

nested case-control study with a 1:2 ratio of cases:controls

• Netherlands Cohort Study is a case-cohort study–Subcohort: 1812 women

Pooling Project: Exposures Total fruits

• Fruits, excluding fruit juice

• Fruit juice

Total vegetables

Total fruits and vegetables

Botanical groups

Specific foods

Pooled Multivariate Relative Risks for Breast Cancer and Fruits and Vegetables

1 1.00 (ref) 1.00 (ref)

2 0.94 (0.87-1.01) 0.99 (0.90-1.08)

3 0.92 (0.86-0.99) 0.97 (0.90-1.05)

4 0.93 (0.86-1.00) 0.96 (0.89-1.04)

p for trend 0.08 0.54

p for hetero. 0.94 0.73

Quartile Total Fruits Total Vegetables RR (95% CI) RR (95% CI)

Smith-Warner, et al. 2001

Contrast toVegetable and Breast Cancer Meta-Analysis: Results

RR (95% CI) p for het high vs low

Total fruits 0.94 (0.79-1.11) <0.001

Total vegetables 0.75 (0.66-0.85) <0.001

Gandini, 2000

Specific Fruits and Vegetables and Breast Cancer: Summary of Case-Control Studies

# of % of Total Food Group studies risk null risk

All Fruits 4 75 0 25

Citrus Fruits 3 33 0 66

All Vegetables <3 -- -- --

Green Veg. 6 83 17 0

Carrots 4 75 25 0

WCRF/AICR 1997

CONTRAST TOFruits, Vegetables and Breast Cancer:

Conclusion of Narrative ReviewA large amount of evidence has accumulated regarding vegetable and fruit consumption and the risk of breast cancer. Almost all of the data from epidemiological studies show either decreased risk with higher intakes or no relationship; the evidence is more abundant and consistent for vegetables, particularly green vegetables, than for fruits.

Diets high in vegetables and fruits probably decrease the risk of breast cancer.

WCRF/AICR 1997

Pooled Multivariate Relative Risks of Breast Cancer and Botanically Defined Fruit and Vegetable Groups and Individual Foods

RR (95% CI)

Cruciferae 0.96 (0.87-1.06)

Leguminosae 0.97 (0.87-1.08)

Rutaceae 0.99 (0.97-1.01)

Carrots 0.95 (0.81-1.12)

Potatoes 1.03 (0.98-1.08)

Apples 0.96 (0.92-1.01)

Increment=100 g/dSmith-Warner 2001

Pooled Multivariate Relative Risks of Breast Cancer and F & V by Menopausal Status

Multi RR (95% CI) p-for - (increment=100 g/d) interaction

Total fruits Premeno. 0.98 (0.94-1.02) Postmeno. 0.99 (0.98-1.01) 0.80

Total vegetables Premeno. 0.99 (0.93-1.06) Postmeno. 1.00 (0.97-1.02) 0.54

Smith-Warner, et al. 2001

Pooling Study-Specific Results vs. Combining Primary Data into One Dataset

Difficult to distinguish population-specific differences in true intake from artifactual differences due to differences in dietary assessment methods• exception: when the unit of measurement

is standard (e.g. alcohol, body mass index)

Pooling allows for study-specific differences in the adjustment for confounders

Multivariate meta-analysis for data consortia, individual patient meta-analysis, and pooling

projectsRitz J, Demidenko E, Spiegelman D

Journal of Statistical Planning and Inference, 2008; 138: 1919-1933

Maximum likelihood and estimating equations metnods for combining results from multiple studies in pooling projects

The univariate meta-analysis model is generalized to a multivariate method, and eficiency advantages are investigated

The test for heterogeneity is generalized to a multivariate one4

Multivariate Pooling-Let βs = β + bs + es + + , s = 1,… s studies

dim (β)= p model covariates E(bs) = E (es) = 0, Var(bs) = ΣB, Var(es) = Cov (βs)

pxp

5

-If ΣB and Cov (βs) are not diagonal, more efficient estimates of β can be obtained

-Weighted estimating equation ideas are used

-Test statistics for H0: ΣB = 0and major submatrices are given assuming normality in bs and es

and asymptotically

^

^

^

ResultsFrom Smith-Warner, et al. 2001

“Types of dietary fat and breast cancer: a pooled analysis of cohort studies”

ARE ( compared to corresponding univariate estimate)

p=3 p=18* p=3 p=18*Saturated fat 86% 81% 93% 84%Mono-unsaturated fat 76% 72% 36% 32%Poly-unsaturated fat 91% 88% 1.35% 1.21%

*Using estimates adjusted for total energy, % protein intake, bmi, parity x age at first birth, height (4 levels), dietary fiber (4 quintiles)

MLE EE

Measurement error correction for pooled estimates

Measurement error correction study-specific estimate using the measurement error model developed from each study's validation data

Pool measurement error corrected β's in the usual way -- note that pooled variance will reflect additional uncertainty in the study-specific measurement error corrected estimates

17

Pooled Analyses vs. Meta-Analyses

Strengths• Increased standardization• Can examine rare exposures• Can analyze population subgroups• Reduce publication bias

Limitations• Expensive• Time-consuming• Requires close cooperation with many

investigators• Errors in study design multiplied (prospective)

Another example: Dairy intake and ovarian cancer incidence (Genkinger et al., 2005)

• From pooling project of prospective studies of diet and cancer

• 12 studies, 2,087 epithelial ov ca, 560,035 women

• Located in North America and Europe• Follow-up between 1976-2002, of between 7

to 20 years• Studies had to have 50 or more cases to be

included (range 52 to 315)

Daily Mean Intakes of Dairy Nutrients and Foods by Cohort Study in the Ovarian Cancer Analyses in the Pooling Project

Table 1. Daily Mean Intakes of Dairy Nutrients and Foods by Cohort Study in the Ovarian Cancer Analyses in the Pooling Project of Prospective Studies of Diet and Cancer

Mean (SD) Intake3 Cohort1

Follow-up Years

Baseline Cohort Size2

Number of Cases

Dietary Calcium (mg/day)

Total Calcium4 (mg/day)

Lactose (g/day)

Dietary Vitamin D (IU/day)

Total Vitamin

D4 (IU/day)

Total Milk

(g/day) 5

Hard Cheese

(g/day) 5

Yogurt (g/day) 5

AHS 1976-1988 18,402 53 832 (124) 878 (139) 18 (14) --- --- 419 (349) 8 (8) --- BCDDP 1987-1999 32,885 142 862 (369) 1186 (2979) 19 (14) 206 (122) 341 (279) 260 (269) 13 (20) --- CNBSS 1980-2000 56,837 223 672 (252) --- 8 (7) --- --- 199 (198) 22 (23) 29 (60) CPS II 1992-2001 60,796 233 888 (381) 1140 (585) 18 (14) 197 (119) 343 (259) 269 (269) 6 (12) 44 (71) IWHS 1986-2001 28,486 208 748 (285) 1029 (483) 15 (11) 223 (111) 382 (292) 275 (265) 11 (13) 12 (39) NLCS 1986-1995 62,412 208 869 (261) --- 14 (8) --- --- 187 (153) 23 (18) 52 (56) NYSC 1980-1987 22,550 77 828 (209) 873 (220) 15 (9) 203 (68) 371 (227) 137 (87) --- --- NYU 1985-1998 12,401 65 810 (307) 867 (317) 14 (11) --- --- 202 (243) 17 (22) 38 (61) NHS80 1980-1986 80,195 120 722 (298) 731 (310) 14 (11) 167 (107) 279 (262) 215 (241) 14 (15) 21 (54) NHS86 1986-2000 59,538 315 718 (254) 1056 (492) 13 (10) 182 (100) 319 (243) 221 (230) 13 (13) 28 (55) NHS II 1991-2000 91,502 52 787 (271) 910 (381) 16 (11) 223 (109) 351 (231) 268 (255) 12 (12) 31 (55) SMC 1987-2003 61,103 287 913 (256) --- 16 (10) 199 (130) --- 156 (130) 27 (19) 104 (108) WHS 1993-2002 32,466 104 729 (258) 940 (442) 14 (10) 217 (104) 324 (216) 215 (222) 9 (11) 36 (64)

1 AHS=Adventist Health Study, BCDDP=Breast Cancer Detection Demonstration Project, CNBSS=Canadian National Breas t Screening Study, CPS II=Cancer Prevention Study II Nutrition Cohort, IWHS=Iowa Women’s Health Study, NLCS=Netherlands Cohort Study, NYSC=New York State Cohort, NYU=New York University Women’s Health Study, NHS80=Nurses’ Health Study (part a), NHS86=Nurses’ Health Study (part b), NHS II=Nurses’ Health Study II, SMC=Sweden Mammography cohort, WHS=Womens’ Health Study 2 Baseline cohort size determined after specific exclusions (i.e., prior cancer diagnosis other than non -melanoma skin cancer at baseline, had a bilateral oophorectomy prior to baseline, or if they had loge-transformed energy intakes beyond three standard deviations from the study-specific loge-transformed mean energy intake of the population ) 3 Studies which have a --- did not estimate that nutrient or did not ask on their questionnaire about the consumption of that food item 4 Total calcium and vitamin D includes dietary and supplemental sources.

Multivariate1 Adjusted Relative Risks (RR) and 95% Confidence Intervals (CI)

for Ovarian Cancer According to Lactose Intake (>30g/day compared to <10g/day)

by Study

F ig u r e 2 . M u lt iv a r ia te 1 A d ju s te d R e la t iv e R is k s (R R ) a n d 9 5 % C o n f id e n c e I n te r v a ls (C I ) fo r O v a r ia n C a n c e r A c c o r d in g to L a c to s e I n ta k e (> 3 0 g /d a y c o m p a r e d to < 1 0 g /d a y ) b y S tu d y

R e la tiv e Ris k.2 .5 1 2 5

C o m b i n e d

W o m e n 's H e a l th S tu d y

N u rse s' H e a l th S t u d y 1 9 8 0

N u rse s' H e a l th S tu d y I I

N e th e r l a n d s C o h o rt S tu d y

S w e d e n M a m m o g ra p h y C o h o rt

N u rse s' H e a l th S t u d y 1 9 8 6

Io w a W o m e n 's H e a l t h S tu d y

C a n c e r P re v e n t i o n S tu d y I I

B re a st C a n c e r D e m o n st ra t i o n D e te c t i o n P ro g ra m

N e w Y o rk U n i v e rsi t y W o m e n 's H e a l th S tu d y

N e w Y o rk S ta t e C o h o rt

C a n a d i a n N a t i o n a l B re a st S c re e n i n g S tu d y

A d v e n t i st H e a l t h S tu d y

1 M u ltiv a r ia te re la t iv e r isk s w e re a d ju s te d fo r a g e a t m e n a rc h e (< 1 3 , 1 3 , > 1 3 y e a r s ) , m e n o p a u sa l s ta tu s a t b a se l in e , o ra l c o n tra c e p tiv e u se (e v e r , n e v e r ) , h o r m o n e re p la c e m e n t th e ra p y u s e a m o n g p o s t- m e n o p a u s a l w o m e n ( n e v e r , p a s t , c u r re n t) , p a r ity (0 , 1 , 2 , > 2 ) , b o d y m a ss in d e x (< 2 3 , 2 3 -2 4 .9 , 2 5 -2 9 .9 , > 3 0 k g /m 2 ) , s m o k in g s ta tu s (n e v e r , p a s t, c u r r e n t) , p h y s ic a l a c tiv ity ( lo w , m e d iu m , h ig h ) , a n d e n e rg y in ta k e (c o n tin u o u s ly ) , m o d e le d id e n t ic a lly a c ro s s s tu d ie s . . E u n y o u n g p a p e r [T h e b la c k s q u a re s a n d h o r iz o n ta l l in e s c o r re sp o n d to th e s tu d y -sp e c i f ic re la t iv e r is k s a n d 9 5 % c o n f id e n c e in te rv a ls fo r > 3 0 g /d a y la c to s e in ta k e . T h e a re a o f th e b la c k s q u a re s re f le c ts th e s tu d y -sp e c i f ic w e ig h ts ( in v e rse o f th e v a r ia n c e ) , w h ic h is re la te d to s a m p le s iz e a n d in ta k e v a r ia t io n . T h e d ia m o n d re p re s e n ts th e p o o le d m u l tiv a r ia te re la t iv e r is k a n d 9 5 % c o n f id e n c e in te rv a l. T h e d a s h e d lin e re p re se n ts th e p o o le d m u lt iv a r ia te re la tiv e r is k . ]

Data and SAS Program to analyze Lactose and Ovarian Cancer – Pooling Project

AHS 0.496 1.23 BCDDP 0.873 1.46 CNBSS 0.659 2.76 CPS2 1.033 1.48 IWHS 1.225 1.95 NLCS 1.568 3.32 NYS 0.767 3.53 NYU 0.787 2.69 NHSa 1.619 2.90 NHSb 1.240 1.90 NHSII 1.573 3.51 SMC 1.346 2.07 WHS 1.699 3.23 options ps=58 ls=120 nodate nonumber; %include "newmeta.mac"; data dat; *infile "no_conf.dat" lrecl=2000; infile "lactose.dat" lrecl=2000; input study $ beta ub; beta = log(beta); var = ((log(ub)-beta)/1.96)**2; w=1/var; %meta(beta = beta, var = var, data = _last_, labels=study, wt=1,name='pooling project -- lactose and ov ca',modlab=1, out=metaout); proc mixed data=dat; class study; model beta = /s; random study/s; repeated; weight w; parms (0) (1) / eqcons=2; proc mixed data=dat; class study; model beta = /s; random study/s; repeated; weight w; parms (0) (1) / eqcons=1 to 2;

SAS Output for Analysis of Lactose Data The SAS System Meta-Analysis for variable : 'pooling project -- lactose and ov ca' Model : 1 Weight used for Odds Ratios etc., is 1 Inverse-variance weighted average of estimates ----------------------------------------------------------------------------------------------- Pooled Lower Upper Z-score for Chi-sq Estimate S.E. 95% CL 95% CL H0: OR=1 Prob ----------------------------------------------------------------------------------------------- Beta 0.164614 0.082208 0.003487 0.325741 2.002423 0.045239 OR 1.178938 1.003494 1.385057 ----------------------------------------------------------------------------------------------- Test of heterogeneity ======================= Q : 10.550490 df : 12 Prob : 0.567783 ======================= Estimate of among-study variability = 0 Weighted average of estimates using the DerSimonian and Laird random effects model ----------------------------------------------------------------------------------------------- Pooled Lower Upper Z-score for Chi-sq Estimate S.E. 95% CL 95% CL H0: OR=1 Prob ----------------------------------------------------------------------------------------------- Beta 0.164614 0.082208 0.003487 0.325741 2.002423 0.045239 OR 1.178938 1.003494 1.385057 ----------------------------------------------------------------------------------------------- Input data: ========================================================================================================== Label Beta Variance OR Lower 95% Upper 95% Fixed wt Random wt ========================================================================================================== AHS -0.701179 0.214706 0.496000 0.200013 1.230000 0.03 0.03 BCDDP -0.135820 0.068841 0.873000 0.522006 1.460000 0.10 0.10 CNBSS -0.417032 0.533990 0.659000 0.157348 2.760000 0.01 0.01 CPS2 0.032467 0.033656 1.033000 0.721006 1.480000 0.20 0.20 IWHS 0.202941 0.056258 1.225000 0.769551 1.950000 0.12 0.12 NLCS 0.449801 0.146487 1.568000 0.740549 3.320000 0.05 0.05 NYS -0.265268 0.606623 0.767000 0.166654 3.530000 0.01 0.01 NYU -0.239527 0.393224 0.787000 0.230249 2.690000 0.02 0.02 NHSa 0.481809 0.088446 1.619000 0.903849 2.900000 0.08 0.08 NHSb 0.215111 0.047405 1.240000 0.809263 1.900000 0.14 0.14 NHSII 0.452985 0.167695 1.573000 0.704937 3.510000 0.04 0.04 SMC 0.297137 0.048223 1.346000 0.875225 2.070000 0.14 0.14 WHS 0.530040 0.107438 1.699000 0.893685 3.230000 0.06 0.06 ==========================================================================================================

SAS Output: Random effects regression model The SAS System The Mixed Procedure Model Information Data Set WORK.DAT Dependent Variable beta Weight Variable w Covariance Structure Variance Components Estimation Method REML Residual Variance Method Parameter Fixed Effects SE Method Model-Based Degrees of Freedom Method Containment Class Level Information Class Levels Values study 13 AHS BCDDP CNBSS CPS2 IWHS NHSII NHSa NHSb NLCS NYS NYU SMC WHS Dimensions Covariance Parameters 2 Columns in X 1 Columns in Z 13 Subjects 1 Max Obs Per Subject 13 Observations Used 13 Observations Not Used 0 Total Observations 13 Parameter Search CovP1 CovP2 Res Log Like -2 Res Log Like 0 1.0000 -5.3070 10.6140 Iteration History Iteration Evaluations -2 Res Log Like Criterion 1 1 10.61404477 0.00000000 Convergence criteria met.

SAS Output: Random effects regression model The SAS System The Mixed Procedure Covariance Parameter Estimates Cov Parm Estimate study 0 Residual 1.0000 Fit Statistics -2 Res Log Likelihood 10.6 AIC (smaller is better) 10.6 AICC (smaller is better) 10.6 BIC (smaller is better) 10.6 PARMS Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 0 0.00 1.0000 Solution for Fixed Effects Standard Effect Estimate Error DF t Value Pr > |t| Intercept 0.1646 0.08221 12 2.00 0.0684

SAS Output: Fixed effects regression model

fixed effects model The Mixed Procedure Model Information Data Set WORK.DAT Dependent Variable beta Weight Variable w Covariance Structure Variance Components Estimation Method REML Residual Variance Method Parameter Fixed Effects SE Method Model-Based Degrees of Freedom Method Containment Class Level Information Class Levels Values study 13 AHS BCDDP CNBSS CPS2 IWHS NHSII NHSa NHSb NLCS NYS NYU SMC WHS Dimensions Covariance Parameters 2 Columns in X 1 Columns in Z 13 Subjects 1 Max Obs Per Subject 13 Observations Used 13 Observations Not Used 0 Total Observations 13 Parameter Search CovP1 CovP2 Res Log Like -2 Res Log Like 0 1.0000 -5.3070 10.6140 Iteration History Iteration Evaluations -2 Res Log Like Criterion 1 1 10.61404477 0.00000000 Convergence criteria met.

SAS Output: Fixed effects regression model The SAS System The Mixed Procedure Covariance Parameter Estimates Cov Parm Estimate study 0 Residual 1.0000 Fit Statistics -2 Res Log Likelihood 10.6 AIC (smaller is better) 10.6 AICC (smaller is better) 10.6 BIC (smaller is better) 10.6 PARMS Model Likelihood Ratio Test DF Chi-Square Pr > ChiSq 0 0.00 1.0000 Solution for Fixed Effects Standard Effect Estimate Error DF t Value Pr > |t| Intercept 0.1646 0.08221 12 2.00 0.0684

Prospectively PlannedProspectively Planned Pooled Analyses

Protocol standardized across studies for

• hypotheses

• data collection methods

• definition of variables

• analyses

European Prospective Investigation into Cancer and Nutrition (EPIC): Study Design

Multicenter prospective study• 22 centers in 9 countries• Initiated 1993-1998

Objective:• Improve scientific knowledge on nutritional

factors involved in diet• Provide scientific bases for public health

interventions Baseline cohort: 484,042

Riboli 2001

EPIC: Study Design, cont’d

Measures• Questionnaires• Anthropometry• Blood samples (n=387,256)

Outcomes• Cancer registry (n=6)• Combination (n=3)

–Health insurance records–Cancer and pathology registries–Active followup

• Mortality registries (n=9)

Riboli 2001

EPIC: Diet Assessment Methods

Self-administered questionnaire (n=7 countries)• 300-350 foods

Interviewer-administered questionnaire (n=2 centers)• Similar to self-administered questionnaire

Food frequency questionnaire + 7-day diet record (n=2 centers)

24-hr recall from 8-10% random sample from each cohort

Riboli 2001

Why Are Pooled Analyses Time-Consuming?

Data management Add updated case information Errors found in data Individual study wants to publish their findings

prior to submitting pooled results Manuscript review Signature sheets

“Pooling decreases the variation caused by random error (increasing the sample size) but does not eliminate any bias (systematic error)”.

Blettner 1999

REFERENCES

1. Blettner M, Sauerbrei W, Schlehofer B, Scheuchenpflug T, Friedenreich C. Traditional reviews, meta-analyses and pooled analyses in epidemiology. International Journal of Epidemiology, 1999; 28:1-9.

2. Costa-Bouzas J, Takkouche B, Cadarso-Suárez C, Spiegelman D. HEpiMA: Software for the identification of heterogeneity in meta-analysis. Computer Methods and Programs in Biomedicine, 2000; 64(2):101-107.

3. DerSimonian R, Laird N. Meta-analysis in clinical trials. Controlled Clinical Trials, 1986; 7:177-188.

4. Friedenreich CM, Methods for pooled analyses of epidemiologic studies. Epidemiology; 1993; 4:295-302.

5. Gandini S, Merzenich H, Robertson C, Boyle P. Meta-analysis of studies on breast cancer risk and diet: the role of fruit and vegetable consumption and the intake of associated micronutrients. European Journal of Cancer, 2000; 36:636.

6. Smith-Warner S, Spiegelman D, Adami H, et al. Intake of fruits and vegetables and risk of breast cancer: A pooled analysis of cohort studies. JAMA, 2001; 285:769-776.

7. Steinberg KK, Smith SJ, Striup DF, et al. Comparison of effect estimates from a meta-analysis of summary data from published studies and from a meta-analysis using individual patient data for ovarian cancer studies. American Journal of Epidemiology, 1997; 145:917-925.

8. Stram DO. Meta-analysis of publixhed data using a linear mixed-effects model. Biometrics, 1996; 52:536-544.

9. Takkouche B, Cardarso-Suárez C, Spiegelman D. An evaluation of old and new tests for heterogeneity in meta-analysis for epidemiologic research. American Journal of Epidemiology, 1999; 150:206-215.

10. World Cancer Research Fund, American Institute for Cancer Research Expert Panel (J.D. Potter, Chair). Food, nutrition and the prevention of cancer: A global perspective. Washington DC: American Institute for Cancer Research, 1997.

11. Smith-Warner SA, et al. Types of dietary fat and breast cancer: A pooled analysis of cohort studies. International Journal of Cancer, 2001; 92:767-774.

12. Ritz J, Demidenko E, Spiegelman D. Multivariate pooling for efficiency. Journal of Statistical Planning and Inference, 2008; 138:1919-1933.

13. Higgins JPT, T hompson SG, Deeks JJ, Altman DG. Measuring inconsistency in meta-analysis. British Journal of Medicine, 2003; 327:557-560.

14. Sterne J, Egger M. Funnel plots for detecting bias in meta-analytics: Guidelines in choice of axis. Journal of Clinical Epidemiology, 2001; 54:1046-1055.

15. Genkinger JM, Hunter DJ, Spiegelman D, Anderson KE, Arslan A, Beeson WL, Buring JE, Fraser GE, Freudenheim JL, Goldbohm RA, Hankinson SE, Jacobs DR Jr, Koushik A, Lacey JV Jr, Larsson SC, Leitzmann M, McCullough ML, Miller AB, Rodriguez C, Rohan TE, Schouten LJ, Shore R, Smit E, Wolk A, Zhang SM, Smith-Warner SA. Dairy products and ovarian cancer: A pooled analysis of 12 cohorts. Cancer Epidemiol Biomarkers Prev, 2006; 15:364-72.

16. Riboli E. The European prospective investigation into cancer and nutrition (EPIC): Plans and progress. Journal of Nutrition, 2001; 131(1):170S-175S.

Methods for Summarizing the Evidence: Meta-Analyses and Pooled Analyses

Documents

Transcript of Methods for Summarizing the Evidence: Meta-Analyses and Pooled Analyses