Download - 1 Basic epidemiological study designs and its role in measuring disease exposure association M. A. Yushuf Sharker Assistant Scientist Center for Communicable.

1

Basic epidemiological study designs and its role in measuring disease exposure

association

M. A. Yushuf SharkerAssistant Scientist

Center for Communicable diseasesIcddr,b

[email protected]

2

Measures of Disease Exposure Association

Does coffee drinking affect the risk of pancreatic cancer?

Coffee drinking is the exposure (E) or risk factor measured in binary scale

Exposed = E = 1Not exposed = E’ = 0

Pancreatic cancer is the disease outcome (D) also measured in binary scale

Disease = D = 1Not Disease = D’ = 0

We are interested in the association between the risk factor and disease

3

Data

• We have collected information from the sampled individuals and

arranged in the following table

Disease Not Disease

Exposed a b a+b

Not Exposed c d c+d

a+c b+d N

4

Excess Risk (ER)

• -1 ≤ ER ≤ 1• ER = 0 is the null value, D and E are independent• ER > 0 indicates greater risk of disease when exposed than

when unexposed• ER < 0 is the lower risk of disease

• ER is the excess number of cases as a fraction of the population size when population members are all exposed compared to them all being unexposed

5

Example: Excess Risk (ER)

Risk of Pancreatic cancer associated with coffee drinking is .07Or,We would expect the pancreatic cancer increase by 7% as all

people appear as coffee drinker compared to all people being non-coffee drinker

Pancreatic Cancer

D not D

Coffee Drinking

E 7 52 59not E 7 134 141

14 186 200

6

Relative Risk (RR)

• RR is always non-negative• RR = 1 is the null value• RR > 1 indicates greater risk of disease when exposed than

when unexposed• RR < 1 is the lower risk of disease

• RR is not symmetric over the role of D and E

7

Example: Relative Risk (RR)

The risk of having pancreatic cancer is 2.4 time higher among coffee drinkers than non-coffee drinkers

Pancreatic Cancer

D not D

Coffee Drinking

E 7 52 59not E 7 134 141

14 186 200

8

Odds Ratio (OR)

• OR is always non-negative• OR = 1 is the null value• OR > 1 indicates greater risk of disease when exposed than

when unexposed• OR < 1 is the lower risk of disease • OR > RR• OR is symmetric over the role of D and E

• Under rare disease assumption, OR closely approximates RR

9

Example: Odds Ratio (OR)

Coffee drinkers are 2.6 time more likely to have cancer than non coffee drinkers

Pancreatic cancer patients are 2.6 time more likely to be exposed in coffee drinking than non-diseased

Pancreatic Cancer

D not D

Coffee Drinking

E 7 52 59not E 7 134 141

14 186 200

10

Does associations indicate causal association?

Not necessarily

11

Why causality assessment is so important in health research?• We are interested to identify the risk factors that influence

the risk of disease

• The objective of health research is to develop a sustainable intervention to reduce the risk of disease

• Interventions are designed to reduce the exposure to the risk factors to control the infection

12

Causal Association

Association of disease occurrence with the risk factor is said to be causal if

1. The level of exposure is determined before the occurrence of the disease (level of exposure and Chronology)

2. The other conditions for both exposed and unexposed will remain identical

3. The factors which might be the outcome of exposure were not controlled by the identical condition

We might require additional criterion to follow to be an association as causal, We are highlighting these three because these are linked with study designs.

13

Ideal experiment for causal association

An ideal experiment

Group Coffee (E) Not Coffee (E') Number1 D D Np12 D D' Np23 D' D Np34 D' D' Np4

Think about the causal effect of Coffee drinking (E) and Pancreatic Cancer (D)

14

Real world Experiment

A real world experiment

• Unobserved response is called counterfactual

• Measures of causal effect depends on the distribution of counterfactuals

Counterfactual of subject 1



15


Let us consider a specific distribution of counterfactual

Group 1 and 3 we observed the response to coffee drinking and 2 and 4 to non coffee drinking

Group Coffee (E) Not Coffee (E') Number1 D Np12 D' Np23 D' Np34 D' Np4

16


Population data on coffee drinking and pancreatic cancer with specific counterfactual distribution

Group D Not D Total

E Np1 Np3 N(p1+p3)

E’ 0 N(p2+p4) N(p2+p4)

Total Np1 N(p2+p3+p4) NWe might think of different distribution of counterfactual that might give us different values between 0 to inf.

Such specific distribution does not guarantee RR = RRcausal

17


Suppose for each group, the toss of a coin determines whether an individual of a group whether he/she is a coffee drinker or abstainer. We can expect the following 2 X 2 table

Group D Not D Total

E N(p1+p2)/2 N(p1+p4)/2 N/2

E’ N(p1+p3)/2 N(p2+p4)/2 N/2

Total N(2p1+p2+p3)/2 N(p2+p3+2p4)/2 N

Even if the distribution of exposure is random, Then on average, the real world experiment might determine the causal association

18

Study Designs

We design our studies following the basic requirements so that our observed association could translate into causal association.

The basic study designs in epidemiological research are as follows

• Observational• Cross Sectional• Case-Control study• Cohort Study

• Experimental• Randomized control trial

19

• Select a random sample of size n from the study population

• Example: Coffee drinking and pancreatic cancer

Cross sectional survey

Pancreatic Cancer

D not D

Coffee Drinking

E 7 52 59not E 7 134 141

14 186 200

20

• Choice of an individual (under either exposure) is random

• This fits with the required condition to be

• In addition, Joint, marginal and conditional probabilities can be estimated using data from the cross sectional survey which really useful to translate the disease process.


21

Disease Status

D not D

Coffee Drinking

E 7 52 59

not E 7 134 141

14 186 200

0.1197

,drinker coffee among diseaseof Proportion 591R

0.05114

7R abstainer, coffee among diseaseof Proportion 0

2.390.05

0.119RR

RR0

1


22

• Not suitable for infectious or high fatal disease

• Exposure and outcome examined simultaneously

• Cannot infer the temporal sequence of E-D


23

Case-Control Studies

• Identify the two sub-group of population of diseased and not diseased

• Select a random samples from diseased and not diseased group separately

• Measure subsequently the presence and absence of E of each individuals in the past in both samples

24


• The distribution of exposure condition is random

• Confirms that exposure happened before disease

• We can only estimate the conditional probabilities of exposure for given either disease condition

• We can only estimate the Odds Ratio using the symmetric properties of OR

• In rare disease condition, OR closely approximates RR

25


Example: Possible data from a Case-Control study (nd = nd’ = 100) Pancreatic cancer

Low Normal

Coffee Drinking

E 50 28 78not E 50 72 122

100 100 2002.5728507250

bcad

OR

Ratio, Odds

The probability of getting coffee drinker is 2.6 time more likely in diseased group than not diseased group

26

Limitations of Case-Control Studies

• Exposure measurement error

• Selection bias through inappropriate choice of controls

• Clear definition of case and control and predefined method of control selection could reduce the chance of bias but we have very few tools to control the exposure measurement error

27

Variants of Case-Control Studies

• Risk set sampling

• In a dynamic cohort, all disease does not happened at the same time. If a patient was identified at time t, we select controls from the population of individuals who are at risk at time t.

• Case cohort design

• In this approach we select a set of diseased individuals within a time interval [0, T] through risk set sampling.

• m controls are chosen at random from the entire population at risk

28

Cohort Studies

• Identify two sub-groups of the population on the basis of their exposure level and follow-up

• Take a random sample from each of the two subgroups

• Measure subsequently the presence or absence of disease of the selected individuals of the both subgroups

Sampling is carried out separately for sub populations at different levels of exposure

29

Example data from Cohort Studies

Pancreatic Cancer

D not D

Coffee Drinking

E 12 88 100

not E 5 95 100

17 183 200• Joint and marginal probabilities are not estimable

• Only conditional probabilities conditioning on exposure are estimable

• RR, OR and ER are estimable from such study design

• Possible data from a cohort study (n=200)

30

future

time

Studypopulation

Exposure

Non-exposure

outcome

no outcome

outcome

no outcome

baseline

RANDOMIZATION

Randomized Control Trial (RCT)

31

A Randomized Control Trial

• Provides high degree of validity to establish causality

• Maintain all the advantage of cohort design

• Ethical challenges of experimental research

• Most costly & time consuming

– Randomization, blinding…

• Not feasible for outcomes that are rare or have long lag times

Randomized Control Trial (RCT)

32

Hierarchy of Epidemiologic Study Designs

Tower & Spector, 2007

Case reports

Case series

Ecologic studies

Cross-sectional studies

Case-control studies

Cohort studies

Randomized controlled trials

Generates hypotheses

Establish causality

33

Common Errors and biases in causal association

All study design for causal association suffers the following Errors and biases

• Measurement error of disease and exposures

• Information bias

• Selection Bias

• Confounder

34

Take Home Messages

• Statistical association are not necessarily causal association

• Study designs provides situations so that the real world experiments close approximate the outcomes to ideal experiments

• Different study designs provide different levels of confident for causal association