Confounding Lecture

Preparatory lecture Confounding and Bias

description

bias and confounders in medical studies

Transcript of Confounding Lecture

Page 1: Confounding Lecture

Preparatory lecture

Confounding and Bias

Page 2: Confounding Lecture


Introduction

• Most epidemiological studies measure disease frequency in two (or more) groups that differ only on the exposure of interest. The two measures of disease frequency are combined into a single measure of association – risk or rate ratio, odds ratio, risk or rate difference

Page 3: Confounding Lecture


Introduction

• The next step is to evaluate whether the result that has been observed in the data is true, or whether the observed result is false and there is an alternate explanation. This is the process of assessing validity of a study result.

Page 4: Confounding Lecture

Accuracy vs. precision

Accuracy: obtaining results close to truth

[Figure: results of Survey 1, Survey 2 and Survey 3 shown against the real population value]

Page 5: Confounding Lecture

Accuracy vs. precision

Precision: obtaining similar results with repeated measurement (may or may not be accurate)

Page 6: Confounding Lecture

Accuracy vs. precision

Poor precision (from small sample size) with reasonable accuracy (without bias):

Page 7: Confounding Lecture

Accuracy vs. precision

Good precision (from large sample size) with reasonable accuracy (without bias):

Page 8: Confounding Lecture

Accuracy vs. precision

Good precision (from large sample size), but with poor accuracy (with bias):

Page 9: Confounding Lecture

In sum…

• Sampling error
– Difference between survey result and population value due to random selection of sample
– Greater with smaller sample sizes
– Induces lack of precision

• Bias
– Difference between survey result and population value due to error in measurement, selection of a non-representative sample, or other factors
– Due to factors other than sample size
– Therefore, a large sample size cannot guarantee absence of bias
– Induces lack of accuracy, even with good precision
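The contrast between sampling error and bias can be illustrated with a small simulation. This is only a sketch under assumed values: the true population value (120), the noise level (standard deviation 10), the systematic offset (+5) and the function name simulate_surveys are all hypothetical and do not come from the lecture.

```python
import random
import statistics

def simulate_surveys(true_value, sample_size, bias=0.0, n_surveys=200, seed=0):
    """Return (mean, spread) of the estimates from repeated surveys."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_surveys):
        # Each measurement = true value + random noise + a fixed systematic offset.
        sample = [rng.gauss(true_value, 10.0) + bias for _ in range(sample_size)]
        estimates.append(statistics.fmean(sample))
    return statistics.fmean(estimates), statistics.stdev(estimates)

# Hypothetical population value of 120; all numbers here are assumptions.
small = simulate_surveys(120.0, sample_size=25)
large = simulate_surveys(120.0, sample_size=400)
large_biased = simulate_surveys(120.0, sample_size=400, bias=5.0)

print("small sample, no bias :", small)         # wide spread, centred near 120
print("large sample, no bias :", large)         # narrow spread, centred near 120
print("large sample, biased  :", large_biased)  # narrow spread, centred near 125
```

In this sketch the spread of the estimates shrinks as the sample grows, while the offset of about 5 stays the same however large the sample is.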

Page 10: Confounding Lecture

Definitions

ERROR:
1. A false or mistaken result obtained in a study or experiment
2. Random error is the portion of variation in measurement that has no apparent connection to any other measurement or variable, generally regarded as due to chance
3. Systematic error often has a recognizable source, e.g., a faulty measuring instrument, or pattern, e.g., it is consistently wrong in a particular direction

(Last)

Page 11: Confounding Lecture

Bias

• Deviation of results or inferences from the truth, or processes leading to such deviation. Any trend in the collection, analysis, interpretation, publication, or review of data that can lead to conclusions that are systematically different from the truth. (Last)

• A process at any stage of inference tending to produce results that depart systematically from true values (Fletcher)

Page 12: Confounding Lecture

What is meant by bias in research?

• Bias is the term used to describe differences between the study findings and truth

• “Any effect at any stage of investigation or inference tending to produce results that depart systematically from the true values (to be distinguished from random error)”

Page 13: Confounding Lecture

Bias

• Bias is a systematic error in an epidemiologic study that results in an incorrect estimation of the association between exposure and outcome

Page 14: Confounding Lecture

What can be wrong in the study?

Random error

Results in low precision of the epidemiological measure → measure is not precise, but true

1 Imprecise measuring
2 Too small groups

Systematic errors (= bias)

Results in low validity of the epidemiological measure → measure is not true

1 Selection bias
2 Information bias
3 Confounding

Page 15: Confounding Lecture

Random errors

Page 16: Confounding Lecture

Errors in epidemiological studies

[Figure: error plotted against study size, with separate curves for systematic error (bias) and random error (chance)]

Page 17: Confounding Lecture

Estimation

• When we measure an OR, we obtain a point estimate
– Will never know the true value

• Confidence interval indicates precision or amount of random error
– Wide interval → low precision
– Narrow interval → high precision

• OR = 4.5 (2.0 – 10)
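A point estimate and its confidence interval can be sketched as follows. The lecture does not say how the interval was obtained; this example assumes the common log-based (Woolf) approximation and uses illustrative cell counts borrowed from the TBE/dog table elsewhere in this lecture, which happen to give an OR of 4.5. The function name odds_ratio_ci is hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """OR and approximate 95% CI (Woolf / log method) for a 2x2 table.

    a = exposed cases, b = unexposed cases,
    c = exposed controls, d = unexposed controls.
    """
    or_ = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper

# Assumed counts (illustrative only).
small_study = odds_ratio_ci(24, 16, 20, 60)
# Same proportions with ten times as many subjects: same OR, narrower interval.
large_study = odds_ratio_ci(240, 160, 200, 600)

print("small study: OR=%.1f (%.1f - %.1f)" % small_study)
print("large study: OR=%.1f (%.1f - %.1f)" % large_study)
```

The point estimate is identical in both tables; only the width of the interval, i.e. the amount of random error, changes with study size.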

Page 18: Confounding Lecture

Classification of bias

There are three broad categories of bias:
• selection bias
• confounding
• measurement bias

Page 19: Confounding Lecture

Systematic error

• Does not decrease with increasing sample size

• Selection bias
• Information bias
• Confounding

Page 20: Confounding Lecture

Bias

Systematic deviations in study findings from the truth
– Results from errors in the collection, analysis, interpretation, publication, or review of data

Page 21: Confounding Lecture

Selection Bias

Error due to systematic difference between the characteristics of the people selected for a study and those who are not.

Page 22: Confounding Lecture

Selection bias

• Errors due to systematic differences in characteristics between those who are selected for study and those who are not. (Last; Beaglehole)

• When comparisons are made between groups of patients that differ in ways other than the main factors under study, that affect the outcome under study. (Fletcher)

Page 23: Confounding Lecture

What is Selection Bias?

“ Error due to systematic differences in characteristics between those who are selected for study and those who are not.”

Page 24: Confounding Lecture

Examples of Selection bias

• Subjects: hospital cases under the care of a physician
• Excluded:
1. Die before admission – acute/severe disease
2. Not sick enough to require hospital care
3. Do not have access due to cost, distance etc.
• Result: conclusions cannot be generalized
• Also known as ‘Ascertainment Bias’ (Last)

Page 25: Confounding Lecture

Ascertainment Bias

• Systematic failure to represent equally all classes of cases or persons supposed to be represented in a sample. This bias may arise because of the nature of the sources from which the persons come, e.g., a specialized clinic;

Page 26: Confounding Lecture

Case ascertainment

• Who is your case?
– Patient?
– Deceased person?

• What is the definition of the case?
– Cancer (clinically? Pathologically?)
– Virus carriers (asymptomatic patients) → You need to screen for the antibody

Page 27: Confounding Lecture

Who will be controls?

• Control ≠ non-case
– Controls must also be at risk of developing the disease in the future.
– In a case-control study of gastric cancer, a person who has had a gastrectomy cannot be a control.
– In a case-control study of car accidents, a person who does not drive a car cannot be a control.

Page 28: Confounding Lecture

Selection bias with ‘volunteers’

• Also known as ‘response bias’
• Systematic error due to differences in characteristics b/w those who choose or volunteer to take part in a study and those who do not

Page 29: Confounding Lecture

Selection bias with ‘Survival Cohorts’

• Patients are included in study because they are available, and currently have the disease

• For lethal diseases patients in survival cohort are the ones who are fortunate to have survived, and so are available for observation

• For remitting diseases patients are those who are unfortunate enough to have persistent disease

• Also known as ‘Available patient cohorts’

Page 30: Confounding Lecture

Selection bias due to ‘Loss to Follow-up’

• Also known as ‘Migration Bias’
• In nearly all large studies some members of the original cohort drop out of the study
• If drop-outs occur randomly, such that characteristics of lost subjects in one group are on average similar to those who remain in the group, no bias is introduced
• But ordinarily the characteristics of the lost subjects are not the same

Page 31: Confounding Lecture

Healthy worker effect

• A phenomenon observed initially in studies of occupational diseases: workers usually exhibit lower overall death rates than the general population, because the severely ill and chronically disabled are ordinarily excluded from employment. Death rates in the general population may be inappropriate for comparison if this effect is not taken into account.

(Last)

Page 32: Confounding Lecture

Example…. ‘healthy worker effect ’

• Question: association b/w formaldehyde exposure and eye irritation

• Subjects: factory workers exposed to formaldehyde

• Bias: those who suffer most from eye irritation are likely to leave the job at their own request or on medical advice

• Result: remaining workers are less affected; association effect is diluted

Page 33: Confounding Lecture

Information Bias (Observation Bias, Measurement Bias)

Error due to systematic differences in the way data on exposure or outcome are obtained from various groups, leading to misclassification of study subjects

Page 34: Confounding Lecture

Measurement bias

• Systematic error arising from inaccurate measurements (or classification) of subjects or study variables. (Last)

• Occurs when individual measurements or classifications of disease or exposure are inaccurate (i.e. they do not measure correctly what they are supposed to measure) (Beaglehole)

• If patients in one group stand a better chance of having their outcomes detected than those in another group. (Fletcher)

Page 35: Confounding Lecture

Measurement / (Mis) classification

• Exposure misclassification occurs when exposed subjects are incorrectly classified as unexposed, or vice versa

• Disease misclassification occurs when diseased subjects are incorrectly classified as non-diseased, or vice versa

(Norell)

Page 36: Confounding Lecture

Causes of misclassification

1. Measurement gap: gap between the measured and the true value of a variable
- Observer / interviewer bias
- Recall bias
- Reporting bias

2. Gap b/w the theoretical and empirical definition of exposure / disease

Page 37: Confounding Lecture

Example… ‘gap b/w definitions’

Theoretical definition
• Exposure: passive smoking – inhalation of tobacco smoke from other people’s smoking
• Disease: Myocardial infarction – necrosis of the heart muscle tissue

Empirical definition
• Exposure: passive smoking – time spent with smokers (having smokers as room-mates)
• Disease: Myocardial infarction – certain diagnostic criteria (chest pain, enzyme levels, signs on ECG)

Page 38: Confounding Lecture

Exposure misclassification – Non-differential

•Misclassification does not differ between cases and non-cases

•Generally leads to dilution of effect, i.e. bias towards RR=1 (no association)

Page 39: Confounding Lecture

Example…Non-differential Exposure Misclassification

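Before misclassification

                        X-ray exposure
Breast cancer      Present     Absent      Total
Present                 40         80        120
Total                10000      40000      50000

RR = (40/10000) / (80/40000) = 2

After non-differential misclassification of exposure

                        X-ray exposure
Breast cancer      Present     Absent      Total
Present                 60         60        120
Total                20000      30000      50000

RR = (60/20000) / (60/30000) = 1.5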
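The dilution of the risk ratio in this example can be reproduced with a short sketch. The slide's counts are consistent with 25% of the truly unexposed being recorded as exposed, among cases and in the whole group alike; that 25% figure is inferred from the numbers, not stated in the lecture, and the function names are hypothetical.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)

def misclassify_unexposed(cases_exp, total_exp, cases_unexp, total_unexp, fraction):
    """Record the same fraction of unexposed cases and unexposed subjects as exposed."""
    moved_cases = cases_unexp * fraction
    moved_total = total_unexp * fraction
    return (cases_exp + moved_cases, total_exp + moved_total,
            cases_unexp - moved_cases, total_unexp - moved_total)

# Counts from the slide: (exposed cases, exposed total, unexposed cases, unexposed total).
true_table = (40, 10000, 80, 40000)
# Assumption: 25% of the unexposed are misrecorded as exposed, regardless of disease.
observed_table = misclassify_unexposed(*true_table, fraction=0.25)

rr_true = risk_ratio(*true_table)
rr_observed = risk_ratio(*observed_table)
print(observed_table)        # exposed cases 60, exposed total 20000, unexposed cases 60, unexposed total 30000
print(rr_true, rr_observed)  # about 2 and about 1.5
```

Because the same error rate applies to cases and non-cases, the observed RR moves from 2 towards 1.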

Page 40: Confounding Lecture

An example of non-differential misclassification in an exposure variable

We want to compare mean of blood pressure levels between cases and controls.

The blood pressure monitor is faulty and always reads 5 mmHg higher than the true value.

All subjects were examined by the same blood pressure checker.

→ no problem for internal comparison

Page 41: Confounding Lecture

Exposure misclassification - Differential

• Misclassification differs between cases and non-cases

• Introduces a bias towards RR = 0 (negative / protective association), or towards RR = ∞ (infinity) (strong positive association)

Page 42: Confounding Lecture

Example…Differential Exposure Misclassification

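Before misclassification

                        X-ray exposure
Breast cancer      Present     Absent      Total
Present                 40         80        120
Absent                9960      39920      49880
Total                10000      40000      50000

RR = (40/10000) / (80/40000) = 2

After differential misclassification of exposure

                        X-ray exposure
Breast cancer      Present     Absent      Total
Present                 40         80        120
Absent               19940      29940      49880
Total                19980      30020      50000

RR = (40/19980) / (80/30020) = 0.75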
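The reversal in this example can be reproduced from the cell counts. In the slide's second table only the non-cases change: 9980 unexposed non-cases (25% of 39920) appear as exposed, while the cases are unchanged. The 25% figure is inferred from the counts, and the function name is hypothetical.

```python
def risk_ratio_from_cells(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Risk ratio computed from the four cells of a cohort 2x2 table."""
    risk_exp = cases_exp / (cases_exp + noncases_exp)
    risk_unexp = cases_unexp / (cases_unexp + noncases_unexp)
    return risk_exp / risk_unexp

# Cells from the slide: exposed cases, exposed non-cases, unexposed cases, unexposed non-cases.
cases_exp, noncases_exp, cases_unexp, noncases_unexp = 40, 9960, 80, 39920

# Assumption: 25% of unexposed NON-cases are misrecorded as exposed; cases are recorded correctly.
moved = int(noncases_unexp * 0.25)  # 9980

rr_true = risk_ratio_from_cells(cases_exp, noncases_exp, cases_unexp, noncases_unexp)
rr_observed = risk_ratio_from_cells(cases_exp, noncases_exp + moved,
                                    cases_unexp, noncases_unexp - moved)
print(rr_true, round(rr_observed, 2))  # about 2 and about 0.75
```

Since the error affects only one disease group, the observed RR does not just shrink towards 1; here it crosses to the other side of it.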

Page 43: Confounding Lecture

Causes of Differential Exposure Misclassification

• Recall Bias: Systematic error due to differences in accuracy or completeness of recall to memory of past events or experience.

E.g. patients suffering from MI are more likely to recall and report ‘lack of exercise’ in the past than controls

Page 44: Confounding Lecture

Causes of Differential Exposure Misclassification

• Measurement bias:

e.g. analysis of Hb by different methods (cyanmethemoglobin and Sahli's) in cases and controls.

e.g. biochemical analysis of the two groups from two different laboratories, which give consistently different results

Page 45: Confounding Lecture

Causes of Differential Exposure Misclassification

• Interviewer / observer bias: systematic error due to observer variation (failure of the observer to measure or identify a phenomenon correctly)

e.g. in patients with thromboembolism, the interviewer looks for a history of oral contraceptive pill (OCP) use more aggressively

Page 46: Confounding Lecture

Confounding

1. A relationship b/w the effects of two or more causal factors as observed in a set of data such that it is not logically possible to separate the contribution that any single causal factor has made to an effect (Last)

Page 47: Confounding Lecture

Confounding

When another exposure exists in the study population (besides the one being studied) and is associated both with disease and the exposure being studied. If this extraneous factor, itself a determinant of or risk factor for the health outcome, is unequally distributed b/w the exposure subgroups, it can lead to confounding (Beaglehole)

Page 48: Confounding Lecture

Confounding

Confounders are risk factors for the outcome.

Confounders are related to exposure of your interest.

Confounders are NOT in the process of causal relationship between the exposure and the outcome of your interest.

Page 49: Confounding Lecture

Example of “not” confounder - pineal hormone is not a confounder

[Diagram: EMF → down regulation of pineal hormone → breast cancer (causation?)]

EMF: electro-magnetic field

• EMF exposure induces down regulation of pineal hormone
• Decrease of pineal hormone may be a risk factor for breast cancer

If EMF exposure causes breast cancer only through down regulation of pineal hormone, pineal hormone is not a confounder.

Page 50: Confounding Lecture

Examples … confounding

SMOKING → LUNG CANCER

AGE
(If the average ages of the smoking and non-smoking groups are very different)
(As age advances, chances of lung cancer increase)

Page 51: Confounding Lecture

Examples … confounding

COFFEE DRINKING → HEART DISEASE

SMOKING
(Coffee drinkers are more likely to smoke)
(Smoking increases the risk of heart disease)

Page 52: Confounding Lecture

Examples … confounding

ALCOHOL INTAKE → MYOCARDIAL INFARCTION

SEX
(Men are more at risk for MI)
(Men are more likely to consume alcohol than women)

Page 53: Confounding Lecture

Why do we have to consider confounding?

We want to know the “real” causal association, but a distorted relationship remains if you do not adjust for the effects of confounding factors.

Page 54: Confounding Lecture

Example … multiple biases

• Study: Is there an association b/w regular exercise and risk of CHD?
• Methodology: employees of a plant were offered an exercise program; some volunteered, others did not. Coronary events were detected by regular voluntary check-ups, including a careful history, ECG, and checking routine health records
• Result: the group that exercised had lower CHD rates

Page 55: Confounding Lecture

Biases operating

• Selection: volunteers might have had initial lower risk (e.g. lower lipids etc.)

• Measurement: exercise group had a better chance of having a coronary event detected since more likely to be examined more frequently

• Confounding: if exercise group smoked cigarettes less, a known risk factor for CHD

Page 56: Confounding Lecture

Methods for controlling Selection Bias

During study design
1. Randomization
2. Restriction
3. Matching

During analysis
1. Stratification
2. Adjustment
a) Simple / standardization
b) Multiple / multivariate adjustment

Page 57: Confounding Lecture

Randomization

• The only way to equalize all extraneous factors, or ‘everything else’ is to assign patients to groups randomly so that each has an equal chance of falling into the exposed or unexposed group

• Equalizes even those factors which we might not know about!

• But it is not always possible

Page 58: Confounding Lecture

Restriction

• Subjects chosen for study are restricted to only those possessing a narrow range of characteristics, to equalize important extraneous factors

Page 59: Confounding Lecture

Example… restriction

• Study: effect of age on prognosis of MI
• Restriction: Male / White / Uncomplicated anterior wall MI
• Important extraneous factors controlled for: sex / race / severity of disease
• Limitation: results not generalizable to females, people of non-white community, those with complicated MI

Page 60: Confounding Lecture

• For example: “Babies who are breast-fed have less illness than babies who are bottle-fed.”

Which illnesses? How is feeding type defined? How large a difference in risk?

• A better example: “Babies who are exclusively breast-fed for three months or more will have a reduction in the incidence of hospital admissions for gastroenteritis of at least 30% over the first year of life.”

Page 61: Confounding Lecture

Matching - definition

• The process of making a study group and a comparison group comparable with respect to extraneous factors (Last)

• For each patient in one group there are one or more patients in the comparison group with the same characteristics, except for the factor of interest (Fletcher)

Page 62: Confounding Lecture

Types of Matching

• Caliper matching: process of matching comparison group to study group within a specific distance for a continuous variable (e.g., matching age to within 2 years)

• Frequency matching: the frequency distributions of the matched variable(s) must be similar in study and comparison groups

• Category matching: matching the groups in broad classes such as relatively wide age ranges or occupational groups

Page 63: Confounding Lecture

• Matching is often done for age, sex, race, place of residence, severity of disease, rate of progression of disease, previous treatment received etc.

• Limitations:
- Controls for bias for only those factors involved in the match
- Usually not possible to match for more than a few factors because of the practical difficulties of finding patients that meet all matching criteria
- If categories for matching are relatively crude, there may be room for substantial differences b/w matched groups

Page 64: Confounding Lecture

Stratification

• The process of or the result of separating a sample into several sub-samples according to specified criteria such as age groups, socio-economic status etc. (Last)

• The effect of confounding variables may be controlled by stratifying the analysis of results

• After data are collected, they can be analyzed and results presented according to subgroups of patients, or strata, of similar characteristics (Fletcher)

Page 65: Confounding Lecture

Examples … confounding

Crude

                    Alcohol exposure
MI                 Present     Absent
Present                140        100
Total                30000      30000

RR = (140/30000) / (100/30000) = 1.4

Stratified by sex

                           Alcohol exposure
                   Present                Absent
MI                 Male      Female       Male      Female
Present             120          20         60          40
Total             20000       10000      10000       20000

RR (M) = (120/20000) / (60/10000) = 1
RR (F) = (20/10000) / (40/20000) = 1
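The crude and sex-specific risk ratios in this example can be checked with a few lines. The counts are those on the slide; the dictionary layout and function name are just one possible way to organise them.

```python
def risk_ratio(cases_exp, total_exp, cases_unexp, total_unexp):
    """Risk in the exposed divided by risk in the unexposed."""
    return (cases_exp / total_exp) / (cases_unexp / total_unexp)

# Per stratum: (MI among drinkers, all drinkers, MI among non-drinkers, all non-drinkers).
strata = {
    "male": (120, 20000, 60, 10000),
    "female": (20, 10000, 40, 20000),
}

# Collapsing over sex gives the crude table (140, 30000, 100, 30000).
crude_table = tuple(sum(column) for column in zip(*strata.values()))

print("crude RR:", round(risk_ratio(*crude_table), 2))  # 1.4
for sex, table in strata.items():
    print(sex, "RR:", round(risk_ratio(*table), 2))     # 1.0 in each stratum
```

The crude association of 1.4 disappears within each sex, which is the pattern expected when sex confounds the alcohol-MI relationship.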

Page 66: Confounding Lecture

Standardization

A set of techniques used to remove as far as possible the effects of differences in age or other confounding variables when comparing two or more populations

The method uses weighted averaging of rates specific for age, sex, or some other potentially confounding variable(s), according to some specified distribution of these variables

(Last)

Page 67: Confounding Lecture

Example … direct standardization

HOSPITAL ‘A’

Preop        Pts      Deaths     %
High         500      30         6
Medium       400      16         4
Low          300      2          0.67
Total        1200     48         4

HOSPITAL ‘Std’

Preop        Pts      Rate (%)   Exp. deaths
High         400      6          24
Medium       400      4          16
Low          400      0.67       2.68
Total        1200                42.68 (3.6%)
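Direct standardization as used in this example can be sketched as follows: the stratum-specific death rates of Hospital A are applied to the standard population's strata and the expected deaths are summed. The counts are from the slide; the function and variable names are hypothetical. The sketch uses unrounded rates, so the low-risk stratum contributes 2.67 expected deaths rather than the slide's 2.68.

```python
# Hospital A: (patients, deaths) per pre-operative stratum, from the slide.
hospital_a = {"high": (500, 30), "medium": (400, 16), "low": (300, 2)}
# Standard population: patients per stratum, from the slide.
standard = {"high": 400, "medium": 400, "low": 400}

def directly_standardized(observed, standard_population):
    """Apply observed stratum-specific rates to the standard population."""
    expected_deaths = sum(
        standard_population[stratum] * deaths / patients
        for stratum, (patients, deaths) in observed.items()
    )
    return expected_deaths, expected_deaths / sum(standard_population.values())

total_patients = sum(patients for patients, _ in hospital_a.values())
total_deaths = sum(deaths for _, deaths in hospital_a.values())
crude_rate = total_deaths / total_patients

expected_deaths, standardized_rate = directly_standardized(hospital_a, standard)
print("crude rate        : %.1f%%" % (100 * crude_rate))         # 4.0%
print("expected deaths   : %.2f" % expected_deaths)              # 42.67
print("standardized rate : %.1f%%" % (100 * standardized_rate))  # 3.6%
```

The standardized rate is lower than the crude rate because Hospital A treats relatively more high-risk patients than the standard population contains.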

Page 68: Confounding Lecture

Multivariate adjustment

• Simultaneously controlling the effects of many variables to determine the independent effects of one

• Can select from a large no. of variables a smaller subset that independently and significantly contributes to the overall variation in outcome, and can arrange variables in order of the strength of their contribution

• Only feasible way to deal with many variables at one time during the analysis phase

Page 69: Confounding Lecture

Examples… Multivariate adjustment

• CHD is the joint result of lipid abnormalities, HT, smoking, family history, DM, exercise, personality type.

• Start with 2x2 tables using one variable at a time

• Contingency tables, i.e. stratified analyses, examining the effect of one variable changed in the presence/absence of one or more variables

Page 70: Confounding Lecture

Dealing with measurement bias

1. Blinding
- Subject
- Observer / interviewer
- Analyser

2. Strict definition / standard definition for exposure / disease / outcome

3. Equal efforts to discover events in all the groups

Page 71: Confounding Lecture

Controlling confounding

• Similar to controlling for selection bias
• Use randomization, restriction, matching, stratification, standardization, multivariate analysis etc.

Page 72: Confounding Lecture

How can we solve the problem of confounding?

“Prevention” at study design

• Limitation
• Randomization in an intervention study
• Matching in a cohort study (but not in a case-control study)

Page 73: Confounding Lecture

How can we solve the problem of confounding?

“Treatment” at statistical analysis

• Stratification by a confounder
• Multivariate analysis

Page 74: Confounding Lecture

Error & Bias

• Error: random error

• Bias: systematic error
– differential misclassification
– non-differential misclassification

This is a problem!

Page 75: Confounding Lecture

EXAMPLES OF RANDOM ERROR, BIAS, MISCLASSIFICATION AND CONFOUNDING IN THE SAME STUDY:

STUDY: In a cohort study, babies of women who bottle feed and women who breast feed are compared, and it is found that the incidence of gastroenteritis, as recorded in medical records, is lower in the babies who are breast-fed.

Page 76: Confounding Lecture

EXAMPLE OF RANDOM ERROR

By chance, there are more episodes of gastroenteritis in the bottle-fed group in the study sample.

Or, also by chance, no difference in risk was found.

Page 77: Confounding Lecture

EXAMPLE OF RANDOM MISCLASSIFICATION

Lack of good information on feeding history results in some breast-feeding mothers being randomly classified as bottle-feeding, and vice versa.

Page 78: Confounding Lecture

EXAMPLE OF BIAS

The medical records of bottle-fed babies only are less complete (perhaps bottle fed babies go to the doctor less) than those of breast fed babies, and thus record fewer episodes of gastro-enteritis in them only. This is called bias because the observation itself is in error.

Page 79: Confounding Lecture

EXAMPLE OF CONFOUNDING

The mothers of breast-fed babies are of higher social class, and the babies thus have better hygiene, less crowding and perhaps other factors that protect against gastroenteritis. Less crowding and better hygiene are truly protective against gastroenteritis, but we mistakenly attribute their effects to breast feeding. This is called confounding because the observation is correct, but its explanation is wrong.

Page 80: Confounding Lecture

Prevention of Bias

• Sampling
• Sample size
• Study design
• Sources of data collection
• Methods of data collection
• Content of information

Page 81: Confounding Lecture

Selection bias

• Error because the exposure-disease association is different for participants and non-participants in the study

• Errors in the
– procedures to select participants
– factors that influence participation

Page 82: Confounding Lecture

Examples of selection bias

• Self-selection bias
• Non-response
• Loss to follow-up

Page 83: Confounding Lecture

Self selection bias

• Selection bias is the distortion of statistics by the way in which a sample is selected.

• Self-selection bias is the distortion caused when the sample chooses itself: certain characteristics are over-represented because they correlate with willingness to be included.

Page 84: Confounding Lecture

Non response

• Non-response occurs when certain questions in a survey are not answered by a respondent.

• Non-response also takes place when a randomly sampled individual cannot be contacted or refuses to participate in a survey.

Page 85: Confounding Lecture

Sources of selection bias

Inappropriate selection of study subjects from the study population

– non-random selection of subjects from the same population

– selection of subjects from different or ill-defined study populations

– failure to locate or unwillingness of people to participate

– loss of persons from the study population because of the health outcome, e.g. selective survival

Page 86: Confounding Lecture

Example of selection bias

Suppose we would like to conduct a case–control study of the association between liver cancer and smoking. Cases (those identified as having liver cancer) could be all available individuals in all the hospitals in town during the year of the study. Controls (individuals without history of liver cancer) would be recruited by local mass media advertisements; hence they would be volunteers. The study results would most probably show a strong association between smoking and liver cancer, not necessarily because smoking and liver cancer are related, but because the selection process was different for cases and controls. Although the cases were arguably sampled from the population at large, the controls were sampled from a population of volunteers!

Page 87: Confounding Lecture

Preventing selection bias

• Same selection criteria
• High response-rate
• High rate of follow-up

Page 88: Confounding Lecture

Information bias

• Error because the measurement of exposure or disease is different between the comparison groups.

• Errors in the
– procedures to measure exposure
– procedures to diagnose disease

Page 89: Confounding Lecture

Examples of information bias

• Diagnostic bias
• Recall bias
• Researcher influence

Page 90: Confounding Lecture

Measurement bias

Inaccurate measurement of study variables can lead to bias

Sources of inaccurate measurement:
• Subject error – error within the individual for any reason, e.g. imperfect recall of past exposures
• Instrument error – e.g. equipment not properly calibrated, wording of question
• Observer error – error in use of instrument or recording

Page 91: Confounding Lecture

Types of measurement error

Non-differential error
• the inaccuracies of measurement are the same among subgroups of subjects
• Non-differential measurement error in exposure and outcome will usually lead to bias towards finding no effect

Differential error
• the inaccuracies of measurement are different among subgroups of subjects
• can lead to bias towards or away from no effect

Page 92: Confounding Lecture

Misclassification

True

                Dog     No dog
TBE cases        20         20       OR = ad/bc = 3.0
Controls         20         60

Differential

                Dog     No dog
TBE cases        24         16       OR = ad/bc = 4.5
Controls         20         60

Non-differential

                Dog     No dog
TBE cases        24         16       OR = ad/bc = 2.8
Controls         28         52
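The three tables above can be generated from the true table by moving subjects from "no dog" to "dog". In this sketch the differential table moves 4 cases only, and the non-differential table additionally moves 8 controls; those counts are read off the slide, while the function names are hypothetical.

```python
def odds_ratio(cases_exp, cases_unexp, controls_exp, controls_unexp):
    """OR = ad/bc for a case-control 2x2 table."""
    return (cases_exp * controls_unexp) / (cases_unexp * controls_exp)

def record_as_exposed(exposed, unexposed, n):
    """Misrecord n truly unexposed subjects as exposed."""
    return exposed + n, unexposed - n

true_cases, true_controls = (20, 20), (20, 60)   # (dog, no dog)

# Differential: only cases over-report dog ownership.
cases_mis = record_as_exposed(*true_cases, 4)          # (24, 16)
# Non-differential on the slide: controls are misrecorded as well.
controls_mis = record_as_exposed(*true_controls, 8)    # (28, 52)

or_true = odds_ratio(*true_cases, *true_controls)
or_differential = odds_ratio(*cases_mis, *true_controls)
or_non_differential = odds_ratio(*cases_mis, *controls_mis)
print(or_true, or_differential, round(or_non_differential, 1))  # 3.0 4.5 2.8
```

Misclassification in one group only pushes the OR away from the true value of 3.0, while misclassification in both groups pulls it below 3.0.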

Page 93: Confounding Lecture

Non-differential misclassification

• Same degree of misclassification in both cases and controls

• OR will be underestimated
– True value is higher

• If no causal effect found, ask:
– Could it be due to non-differential misclassification?

Page 94: Confounding Lecture
Page 95: Confounding Lecture

Preventing information bias

• Clear definitions
• Good measuring methods
• Blinding
• Standardised procedures
• Quality control

Page 96: Confounding Lecture

Minimising measurement bias

1. use valid reliable tools to measure all study subjects
2. train staff and monitor their use of research tools
3. regular quality checks of research tools
4. blinding of study subjects and assessors
5. subjects in C-C study unaware of study hypothesis
6. consider sub-study to determine validity and reliability of measurements

Page 97: Confounding Lecture

Confounding

It occurs when there is a confounder, which is associated with both exposure and disease independently.

[Diagram: Exposure → Disease, with the Confounder linked to both]

Page 98: Confounding Lecture

Confounding

• defined as: a situation in which the measure of effect of exposure on disease is distorted because of the association of the study factor with other factors that influence the outcome.

• These other factors are called confounders

Page 99: Confounding Lecture

A variable can be a confounder if all the following conditions are met:

• It is associated with the exposure of interest (causally or not).

• It is causally related to the outcome.

• AND ... It is not part of the exposure outcome causal pathway

Page 100: Confounding Lecture

Confounding: Example

[Diagram: Alcohol → Lung cancer, with Smoking linked to both]

Page 101: Confounding Lecture

Confounding: example

                  Lung cancer    No lung cancer
Drinker                    50                50
Non-drinker                50               150
Total                     100               200

50% of cases are drinkers, but only 25% of controls are drinkers.

Therefore, it appears that drinking is strongly associated with lung cancer.

Page 102: Confounding Lecture

Confounding: example

Smoker

                  Lung cancer    No lung cancer
Drinker                    45                15
Non-drinker                30                10
Total                      75                25

Among smokers, 45/75 = 60% of lung cancer cases drink and 15/25 = 60% of controls drink.

Non-smoker

                  Lung cancer    No lung cancer
Drinker                     5                35
Non-drinker                20               140
Total                      25               175

Among non-smokers, 5/25 = 20% of lung cancer cases drink and 35/175 = 20% of controls drink.

Page 103: Confounding Lecture

Stratification: “Series of 2x2 tables”

Idea: Take a 2x2 table and break it into a series of smaller 2x2 tables (one table at each of J levels of the confounder yields J tables).

Example: in testing for an association between lung cancer and alcohol drinking (yes/no), separate smokers and non-smokers.
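The "series of 2x2 tables" idea can be sketched with the lung cancer and drinking counts given in this lecture, with J = 2 levels of the confounder (smoker, non-smoker). The data layout and function name are hypothetical choices for illustration.

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc with a, b = exposed cases/controls and c, d = unexposed cases/controls."""
    return (a * d) / (b * c)

# Per smoking stratum: (drinker cases, drinker controls, non-drinker cases, non-drinker controls).
strata = {
    "smoker": (45, 15, 30, 10),
    "non-smoker": (5, 35, 20, 140),
}

# Adding the strata cell by cell recovers the crude table (50, 50, 50, 150).
crude_table = tuple(sum(cells) for cells in zip(*strata.values()))

print("crude OR:", odds_ratio(*crude_table))    # 3.0
for level, table in strata.items():
    print(level, "OR:", odds_ratio(*table))     # 1.0 in each stratum
```

The crude OR of 3.0 is not seen in either stratum, where the OR is 1.0, which is consistent with smoking confounding the drinking-lung cancer association in these data.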

Page 104: Confounding Lecture

An Example

Maternal coffee consumption during

pregnancy

Delivery of low birth weight infant

?

Page 105: Confounding Lecture

Example

               Low Birth Weight    Normal Birth Weight
Coffee                      170                     96
No Coffee                    90                     88

Crude OR = (170)(88) / (96)(90) = 1.73

Page 106: Confounding Lecture

Smokers

               Low Birth Weight    Normal Birth Weight
Coffee                      160                     16
No Coffee                    80                      8

Stratum-specific OR = (160)(8) / (16)(80) = 1.00

Page 107: Confounding Lecture

Non-smokers

               Low Birth Weight    Normal Birth Weight
Coffee                       10                     80
No Coffee                    10                     80

Stratum-specific OR = (10)(80) / (80)(10) = 1.00

Page 108: Confounding Lecture

Evidence of Confounding

OR crude = 1.73
OR smokers = 1.00
OR non-smokers = 1.00

The association between coffee consumption and having a low birth weight baby is confounded by smoking. This is demonstrated by the lack of effect in each stratum.
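When the stratum-specific ORs agree, they are often combined into a single adjusted estimate. The lecture does not name a pooling method; the sketch below assumes the Mantel-Haenszel estimator, a standard choice that is introduced here only for illustration, applied to the coffee and birth weight counts from the slides.

```python
def odds_ratio(a, b, c, d):
    """OR = ad/bc for one 2x2 table."""
    return (a * d) / (b * c)

def mantel_haenszel_or(tables):
    """Pooled OR across strata: sum(a*d/n) / sum(b*c/n)."""
    numerator = sum(a * d / (a + b + c + d) for a, b, c, d in tables)
    denominator = sum(b * c / (a + b + c + d) for a, b, c, d in tables)
    return numerator / denominator

# Per stratum: (coffee & low BW, coffee & normal BW, no coffee & low BW, no coffee & normal BW).
smokers = (160, 16, 80, 8)
non_smokers = (10, 80, 10, 80)
crude = tuple(s + n for s, n in zip(smokers, non_smokers))  # (170, 96, 90, 88)

print("crude OR   : %.2f" % odds_ratio(*crude))                            # 1.73
print("adjusted OR: %.2f" % mantel_haenszel_or([smokers, non_smokers]))    # 1.00
```

The gap between the crude OR of 1.73 and the smoking-adjusted OR of 1.00 summarises the confounding described above in a single comparison.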

Page 109: Confounding Lecture

Strategy #1: Does the variable meet the criteria to be a confounder?

Hypothetical case-control study of risk factors for malaria. 150 cases, 150 controls; gender distribution.

               Cases    Controls
Males             88          68
Females           62          82
Total            150         150

OR = [88 x 82] ÷ [68 x 62] = 1.71

Question: Is male gender causally related to the risk of malaria?

• Yes
• No
• Further study is needed

Page 110: Confounding Lecture

Confounder for a male gender-malaria association?

[Diagram: Male gender → Malaria (?), with an unidentified confounder (?) linked to both]

Page 111: Confounding Lecture

Confounder for a male gender-malaria association?

[Diagram: Male gender → Malaria (?), with Outdoor occupation as the candidate confounder]

Page 112: Confounding Lecture

First criterion: Is the putative confounder associated with exposure?

[Diagram: Male gender → Malaria (?), with the link between Outdoor occupation and Male gender in question (?)]

Page 113: Confounding Lecture

First criterion: Is the putative confounder associated with exposure?

               Males         Females
               N (%)         N (%)
Outdoor        68 (43.6)     13 (9.0)
Indoor         88            131
Total          156 (100)     144 (100)

OR = 7.8

Question: Is outdoor occupation associated with male gender?

• Yes
• No

Page 114: Confounding Lecture

Second criterion: Is the putative confounder associated with the outcome (case-control status)?

[Diagram: Male gender, Malaria and Outdoor occupation, with the links under question marked "?"]

Page 115: Confounding Lecture

Second criterion: Is the putative confounder associated with case-control status?

            Cases N (%)    Controls N (%)
Outdoor      63 (42.0)       18 (12.0)
Indoor       87             132
Total       150 (100)       150 (100)

Question: Is outdoor occupation (or something for which this variable is a marker, e.g., exposure to mosquitoes) causally related to malaria?

• Yes

• No

OR = 5.3
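As a rough check of the first two criteria, the odds ratios quoted on these slides can be recomputed from the tables. The sketch below is illustrative only. The function and variable names are assumptions, and the counts are the hypothetical malaria data shown above.

```python
# Sketch: recompute the three odds ratios from the hypothetical malaria study.

def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d) / (b*c) for a 2x2 table laid out as
    [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Crude exposure-outcome association: rows = males/females, columns = cases/controls.
or_gender_malaria = odds_ratio(88, 68, 62, 82)

# First criterion: rows = outdoor/indoor occupation, columns = males/females.
or_outdoor_gender = odds_ratio(68, 13, 88, 131)

# Second criterion: rows = outdoor/indoor occupation, columns = cases/controls.
or_outdoor_malaria = odds_ratio(63, 18, 87, 132)

print("male gender vs malaria:       ", round(or_gender_malaria, 2))
print("outdoor occupation vs gender: ", round(or_outdoor_gender, 1))
print("outdoor occupation vs malaria:", round(or_outdoor_malaria, 1))
```

The three values round to 1.71, 7.8 and 5.3, matching the slides. Outdoor occupation is associated with both the exposure and the outcome, so it meets the first two criteria.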

Page 116: Confounding Lecture

Third criterion: Is the putative confounder in the causal pathway exposure → outcome?

[Diagram: Male gender → ? → Outdoor occupation → ? → Malaria]

• Yes, it could be

• Probably not

Note: Judgment and knowledge about the socio-cultural context are critical to answer this question.

Page 117: Confounding Lecture

Question: Provided that:

• the crude association between male gender and malaria is OR = 1.71, and

• outdoor occupation is more frequent among males, and

• outdoor occupation is associated with greater risk of malaria,

what would be the expected magnitude of the association between male gender and malaria after controlling for occupation (i.e., assuming the same degree of outdoor occupation in males and females)?

• The (adjusted) association estimate will be smaller than 1.71

• The (adjusted) association estimate will be equal to 1.71

• The (adjusted) association estimate will be greater than 1.71

Page 118: Confounding Lecture

Controlling confounding

In the design

• Restriction of the study

• Matching

In the analysis

• Restriction of the analysis

• Stratification

• Multivariable regression

Page 119: Confounding Lecture

Control confounding at the design stage

Strategy: Specification ("Include only non-smokers.")

Advantages:

• Easily understood

Disadvantages:

• Limits generalizability

• May limit sample size

Strategy: Matching ("Match smoking status of cases and controls.")

Advantages:

• Useful for eliminating the influence of strong constitutional confounders such as age and sex

Disadvantages:

• The decision to match must be made at the design stage and can have irreversible adverse effects on the analysis

• Time consuming

• Cannot analyze associations of matched variables with the outcome

Page 120: Confounding Lecture

Control confounding at the analysis stage

Strategy: Stratification ("Conduct analysis separately for smokers and non-smokers.")

Advantages:

• Easily understood

• Reversible

Disadvantages:

• May be limited by sample size for each stratum

• Difficult to control for multiple confounders

Strategy: Statistical adjustment ("Conduct multivariate analysis controlling (adjusting) for smoking status.")

Advantages:

• Multiple confounders can be controlled

• Reversible

Disadvantages:

• Needs advanced statistical techniques

• Results may be difficult to understand

Page 121: Confounding Lecture

Restriction

We study only mothers of a certain age.

[Diagram: Many children → Down's syndrome, restricted to 35-year-old mothers]

Page 122: Confounding Lecture

Matching

"Selection of controls to be identical to the cases with respect to distribution of one or more potential confounders".

[Diagram: Many children → Down's syndrome, with Maternal age as the matched confounder]

Page 123: Confounding Lecture

Multivariable regression

• Analyse the data in a statistical model that includes both the presumed cause and possible confounders

• Measure the odds ratio (OR) for each of the exposures, independently of the others

• Logistic regression is the most common model in epidemiology
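To make the idea concrete, here is a minimal sketch of a logistic regression fitted to the coffee / low-birth-weight counts used earlier in this lecture. It assumes a plain Newton-Raphson fit written in standard-library Python. The helper names `solve` and `fit_logistic` are hypothetical, and in practice a statistical package would normally be used instead.

```python
import math

# Grouped data: (coffee, smoker, low-birth-weight count, normal-birth-weight count)
cells = [
    (1, 1, 160, 16),
    (0, 1, 80, 8),
    (1, 0, 10, 80),
    (0, 0, 10, 80),
]

def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def fit_logistic(cells, use_smoking, iterations=25):
    """Fit logit P(low birth weight) = b0 + b1*coffee (+ b2*smoker) by Newton-Raphson."""
    n_params = 3 if use_smoking else 2
    beta = [0.0] * n_params
    for _ in range(iterations):
        gradient = [0.0] * n_params
        information = [[0.0] * n_params for _ in range(n_params)]
        for coffee, smoker, events, non_events in cells:
            x = [1.0, float(coffee), float(smoker)][:n_params]
            total = events + non_events
            p = 1.0 / (1.0 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
            for i in range(n_params):
                gradient[i] += x[i] * (events - total * p)
                for j in range(n_params):
                    information[i][j] += x[i] * x[j] * total * p * (1.0 - p)
        step = solve(information, gradient)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

crude = fit_logistic(cells, use_smoking=False)    # coffee only
adjusted = fit_logistic(cells, use_smoking=True)  # coffee + smoking

print("crude OR for coffee:   ", round(math.exp(crude[1]), 2))
print("adjusted OR for coffee:", round(math.exp(adjusted[1]), 2))
```

With coffee as the only predictor the model reproduces the crude OR of about 1.73. Once smoking is added as a covariate, the OR for coffee drops to 1.00, in line with the stratum-specific results.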

Page 124: Confounding Lecture

Example: miners' exposure and lung cancer

• One group of miners is exposed to the underground environment and the other group is not exposed

• Hence, if the two groups are otherwise comparable, any difference in lung cancer rates would be due to working underground

Page 125: Confounding Lecture

Example: Selection bias and confounding

Bias occurs when the exposed and non-exposed groups have different risks of developing the outcome of interest for reasons other than being exposed. This can be due to selection bias or confounding, e.g. more underground workers smoke.

Page 126: Confounding Lecture

Confounding variables

In our study of miners:

1. Smoking is an independent risk factor (cause) of the disease (lung cancer)

2. More underground miners smoke, i.e. smoking is unevenly distributed among the exposed and non-exposed

3. Smoking is not on the causal pathway between exposure and disease

Page 127: Confounding Lecture

• When examining the relationship between an explanatory factor and an outcome, we are interested in identifying factors that may modify the factor's effect on the outcome (effect modifiers). We must also be aware of potential bias or confounding in a study, because these can cause a reported association (or lack thereof) to be misleading. Bias and confounding are related to the measurement and study design. Let's define these terms:

Page 128: Confounding Lecture

• If the method used to select subjects or collect data results in an incorrect association,

THINK >> Bias!

• If an observed association is not correct because a different (lurking) variable is associated with both the potential risk factor and the outcome, but it is not a causal factor itself,

THINK >> Confounding!

• If an effect is real but the magnitude of the effect is different for different groups of individuals (e.g., males vs females or blacks vs whites),

THINK >> Effect modification!

Page 129: Confounding Lecture

A confounding factor

• is one that is associated with both the exposure (the risk factor under study) and the disease, and that may therefore distort the relationship between the two and confound (confuse) the study results.

Page 130: Confounding Lecture

Confounding

[Diagram: Third variable linked to both Exposure and Outcome]

Page 131: Confounding Lecture

Confounding

[Diagram: Smoking linked to both Coffee and CHD]

Smoking is correlated with coffee drinking and is a risk factor for CHD even among those who do not drink coffee.

Page 132: Confounding Lecture

Confounding factor:

– Apparent finding: drinking coffee causes CHD

– Alternative explanation: drinking coffee may not be the cause of CHD; the association may instead arise because smokers also tend to be coffee drinkers.

Page 133: Confounding Lecture

Confounding

Risk factor (independent variable): Coffee

Disease (dependent variable): CHD

Covariable (confounder): Smoking

Page 134: Confounding Lecture

Example:

• In a study of the association between tobacco smoking and lung cancer, age would be a confounding factor if the average ages of the non-smoking and smoking groups in the study population were very different, since lung cancer incidence increases with age.

Page 135: Confounding Lecture

Another example:

• The possible association between meat consumption and colon cancer may be due to other accompanying factors, such as decreased intake of vegetables or increased intake of fat, rather than the meat consumption itself.

Page 136: Confounding Lecture

Problem

• The annual report of POF Hospital for the year 2006 shows 200 cases of myocardial infarction, 35 cases of cholecystitis, 105 cases of pneumonia and 350 cases of acute gastroenteritis. The results of this report cannot be generalized to the total population of Faisalabad on account of:

a. Confounding bias

b. Memory bias

c. Selection bias

d. Berksonian bias

e. Interviewer's bias

Key: d

Page 137: Confounding Lecture

•Mother’s education is therefore a potential confounding variable.

•In order to give a true picture of the relationship between bottle-feeding and diarrhea of under-twos, the influence of mother’s education should be controlled.

•This could either be addressed in the research design, e.g., by selecting only mothers with a specific level of education, or it could be taken into account during the analysis of the findings by analyzing the relation between bottle-feeding and diarrhea separately for mothers with different levels of education.

Page 138: Confounding Lecture

A study was done to compare the lung capacity of coal miners to the lung capacity of farm workers. The researcher studied 200 workers of each type. Other factors that might affect lung capacity are smoking habits and exercise habits. The smoking habits of the two worker types are similar, but the coal miners generally exercise less than the farm workers.

1. Which of the following is the explanatory variable in this study?

a. Exercise

b. Lung capacity

c. Smoking or not

d. Occupation

2. Which of the following is a confounding variable in this study?

a. Exercise

b. Lung capacity

c. Smoking or not

d. Occupation

Page 139: Confounding Lecture

• Essential principles (features) of properly designed clinical trials:

• Control of variables surrounding the experimental subjects

• The investigator has control of the subjects, the intervention, outcome measurements, and sets the conditions under which the experiment is conducted. In particular, the investigator determines who will be exposed to the intervention and who will not. This selection is done in such a way that the comparison of outcome measure between the exposed and unexposed groups is as free of bias as possible.

Page 140: Confounding Lecture

• Randomization refers to the practice of assigning subjects to experimental or treatment groups in a completely random manner. Thus each subject has an equal chance of being placed in the experimental group. This avoids the potential bias of the researcher choosing subjects s/he feels would be most likely to benefit from the intervention for the intervention group, and a similar possible bias if the choice were left up to the subjects.
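A minimal sketch of simple randomization is shown below. It assumes a two-arm trial with equal-sized groups. The function name `randomize`, the subject IDs and the seed are all hypothetical and chosen only for illustration. Real trials often use more elaborate schemes (for example, blocked or stratified randomization) that are not described in this lecture.

```python
import random

def randomize(subject_ids, seed=None):
    """Randomly split subjects into two equal-sized arms (shuffle, then split in half).

    Every subject has the same chance of landing in the treatment arm, and neither
    the researcher nor the subject chooses the allocation.
    """
    rng = random.Random(seed)  # a fixed seed makes the allocation reproducible for auditing
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return {"treatment": shuffled[:half], "control": shuffled[half:]}

subjects = [f"S{i:03d}" for i in range(1, 21)]  # hypothetical subject IDs S001..S020
arms = randomize(subjects, seed=2024)

print("treatment:", sorted(arms["treatment"]))
print("control:  ", sorted(arms["control"]))
```

Because the allocation depends only on the random shuffle, characteristics such as age or smoking status tend to be balanced between the arms on average, including confounders that were never measured.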

Page 141: Confounding Lecture

• Blindness refers to the practice in which the researcher remains uninformed and unaware of the identities of experimental and control groups throughout the period of experimentation and data gathering. Thus, the researcher can remain unbiased in judging the responses of any particular subject or group.

Page 142: Confounding Lecture

• When studies involve human subjects, it is important that the subjects also remain uninformed as to whether they have been placed in the experimental group (receiving the treatment) or control group (receiving the placebo). Such a procedure is referred to as double-blind (neither researcher nor subjects know who is receiving the treatment).

Page 143: Confounding Lecture

• This is important because some people begin to feel better if they believe they have received a treatment. Only at the end of the study would the 'code' (known by the statistician) be broken and the results analyzed according to who had been taking the drug and who had not.

Page 144: Confounding Lecture

• So, the gold standard design for clinical trials, i.e., the least prone to bias, is the randomized double-blind controlled trial.