R.A. Spasoff, MD Epidemiology & Community Medicine


Page 1: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 1

Back to Basics, 2011
POPULATION HEALTH (1): Epidemiology Methods, Critical Appraisal, Biostatistical Methods

R.A. Spasoff, MD
Epidemiology & Community Medicine

Other resources available on Individual & Population Health web site

Page 2: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 2

THE PLAN

• These lectures are based around the MCC Objectives for Qualifying Examination

• Emphasis is on core ‘need to know’ rather than on depth and justification

• Focus is on topics not well covered in the Toronto Notes (UTMCCQE)

Page 3: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 3

THE PLAN (2)

• First class
– Mainly lectures

• Other classes
– About 2 hours of lectures
– Review MCQs for 60 minutes

• A 10-minute break about halfway through

• You can interrupt for questions, etc. if things aren’t clear.

Page 4: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 4

THE PLAN (3)

• Session 1 (April 11, 13:00-16:00)
– Diagnostic tests
  • Sensitivity, specificity, validity, PPV
– Critical Appraisal
– Intro to Biostatistics
– Brief overview of epidemiological research methods

Page 5: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 5

Reliability

• = reproducibility. Does it produce the same result every time?

• Related to chance error

• Averages out in the long run, but in patient care you hope to do a test only once; therefore, you need a reliable test

Page 6: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 6

Validity

• Whether the test measures what it purports to measure in the long run, viz., presence or absence of disease

• Normally use criterion validity, comparing test results to a gold standard

• Link to SIM web on validity

Page 7: R.A. Spasoff, MD Epidemiology & Community Medicine

March 30, 2010 7

Reliability and Validity: the metaphor of target shooting. Here, reliability is represented by consistency, and validity by aim.

[Figure: 2x2 grid of targets, with reliability (low, high) across the columns and validity (low, high) down the rows.]

Page 8: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 8

Test Properties (1)

            Diseased               Not diseased           Total
Test +ve    90 (true positives)    5 (false positives)     95
Test -ve    10 (false negatives)   95 (true negatives)    105
Total       100                    100                    200

Page 9: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 9

2x2 Table for Testing a Test (columns)

                 Gold standard
                 Disease Present          Disease Absent
Test Positive    a (TP)                   b (FP)
Test Negative    c (FN)                   d (TN)

                 Sensitivity = a/(a+c)    Specificity = d/(b+d)

Page 10: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 10

Test Properties (2)

            Diseased   Not diseased   Total
Test +ve    90         5               95
Test -ve    10         95             105
Total       100        100            200

Sensitivity = 90/100 = 0.90    Specificity = 95/100 = 0.95
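The column calculation above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `sensitivity_specificity` and its argument names are assumptions made for this example, not part of the lecture.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from the four cells of a 2x2 table.

    Hypothetical helper for illustration: tp=a, fp=b, fn=c, tn=d in the slide notation.
    """
    sensitivity = tp / (tp + fn)  # a/(a+c): computed down the 'diseased' column
    specificity = tn / (fp + tn)  # d/(b+d): computed down the 'not diseased' column
    return sensitivity, specificity


# Counts from the Test Properties table: 90 TP, 5 FP, 10 FN, 95 TN
sens, spec = sensitivity_specificity(tp=90, fp=5, fn=10, tn=95)
print(f"Sensitivity = {sens:.2f}, Specificity = {spec:.2f}")
```

Both values depend only on their own column, which is why they do not change when the ratio of diseased to non-diseased people changes.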

Page 11: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 11

Test Properties (6)

• Sensitivity = Pr(test positive in a person with disease)

• Specificity = Pr(test negative in a person without disease)

• Range: 0 to 1
– > 0.9: Excellent
– 0.8-0.9: Not bad
– 0.7-0.8: So-so
– < 0.7: Poor

Page 12: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 12

Test Properties (7)

• Values depend on cutoff point

• Generally, high sensitivity is associated with low specificity and vice-versa.

• Not affected by prevalence, if severity is constant

• Do you want a test to have high sensitivity or high specificity?
– Depends on cost of ‘false positive’ and ‘false negative’ cases
– PKU: one false negative is a disaster
– Ottawa Ankle Rules: insisted on sensitivity of 1.00

Page 13: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 13

Test Properties (8)

• Sens/Spec not directly useful to clinician, who knows only the test result

• Patients don’t ask: “If I’ve got the disease, how likely is a positive test?”

• They ask: “My test is positive. Does that mean I have the disease?”

• → Predictive values.

Page 14: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 14

Predictive Values

• Based on rows, not columns

– PPV = a/(a+b); interprets positive test

– NPV = d/(c+d); interprets negative test

• Depend upon prevalence of disease, so must be determined for each clinical setting

• Immediately useful to clinician: they provide the probability that the patient has the disease

Page 15: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 15

2x2 Table for Testing a Test (rows)

            Gold standard
            Disease Present   Disease Absent
Test +      a (TP)            b (FP)           PPV = a/(a+b)
Test -      c (FN)            d (TN)           NPV = d/(c+d)
Total       a+c               b+d              N

Page 16: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 16

Test Properties (9)

            Diseased   Not diseased   Total
Test +ve    90         5               95     PPV = 90/95 = 0.95
Test -ve    10         95             105     NPV = 95/105 = 0.90
Total       100        100            200
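As a rough sketch, the row calculation for the same table looks like this in Python. The helper name `predictive_values` is hypothetical and chosen only for this example.

```python
def predictive_values(tp, fp, fn, tn):
    """Return (PPV, NPV) from the four cells of a 2x2 table.

    Hypothetical helper for illustration: tp=a, fp=b, fn=c, tn=d in the slide notation.
    """
    ppv = tp / (tp + fp)  # a/(a+b): computed along the 'test positive' row
    npv = tn / (fn + tn)  # d/(c+d): computed along the 'test negative' row
    return ppv, npv


# Counts from the Test Properties table: 90 TP, 5 FP, 10 FN, 95 TN
ppv, npv = predictive_values(tp=90, fp=5, fn=10, tn=95)
print(f"PPV = {ppv:.2f}, NPV = {npv:.2f}")
```

Because each row mixes diseased and non-diseased people, these values shift whenever the prevalence in the table changes.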

Page 17: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 17

Prevalence of Disease

• Is your best guess about the probability that the patient has the disease, before you do the test

• Also known as Pretest Probability of Disease

• (a+c)/N in 2x2 table

• Is closely related to Pre-test odds of disease: (a+c)/(b+d)

Page 18: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 18

Test Properties (10)

            Diseased   Not diseased   Total
Test +ve    a          b              a+b
Test -ve    c          d              c+d
Total       a+c        b+d            a+b+c+d = N

Prevalence odds = (a+c)/(b+d)

Prevalence proportion = (a+c)/N

Page 19: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 19

Prevalence and Predictive Values

• Predictive values of a test are dependent on the pre-test prevalence of the disease

– Tertiary hospitals see more pathology than family physicians do; hence, their positive tests are more often true positives.

• How to ‘calibrate’ a test for use in a different setting?

• Relies on the stability of sensitivity & specificity across populations.

Page 20: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 20

Methods for Calibrating a Test

Four methods can be used:
– Apply definitive test to a consecutive series of patients (rarely feasible)
– Hypothetical table
– Bayes’s Theorem
– Nomogram

You need to be able to do one of the last 3. By far the easiest is using a hypothetical table. E.g., sens = 0.90, spec = 0.95

Page 21: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 21

Calibration by hypothetical table

Fill cells in the following order:

            “Truth”
            Disease Present   Disease Absent   Total           PV
Test Pos    4th               7th              8th             10th
Test Neg    5th               6th              9th             11th
Total       2nd               3rd              1st (10,000)

Page 22: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 22

Test Properties (11)

Tertiary care: research study. Prev = 0.5

            Diseased   Not diseased   Total
Test +ve    450        25               475
Test -ve    50         475              525
Total       500        500            1,000

Sens = 0.90    Spec = 0.95

PPV = 450/475 = 0.95

Page 23: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 23

Test Properties (12)

Primary care: Prev = 0.01

            Diseased   Not diseased   Total
Test +ve    90         495               585
Test -ve    10         9,405           9,415
Total       100        9,900          10,000

Sens = 0.90    Spec = 0.95

PPV = 90/585 = 0.1538
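The hypothetical-table method can be sketched as a short Python function that fills the cells in the same order as the slide (total first, then column totals, then the cells, then the predictive values). The function name `hypothetical_table` and the dictionary keys are assumptions made for this illustration.

```python
def hypothetical_table(prevalence, sens, spec, total=10_000):
    """Build a hypothetical 2x2 table and return its cells and predictive values.

    Hypothetical helper for illustration; 'total' is an arbitrary round number.
    """
    diseased = prevalence * total        # column total: disease present
    not_diseased = total - diseased      # column total: disease absent
    tp = sens * diseased                 # true positives
    fn = diseased - tp                   # false negatives
    tn = spec * not_diseased             # true negatives
    fp = not_diseased - tn               # false positives
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "ppv": tp / (tp + fp),           # row calculation for positive tests
        "npv": tn / (tn + fn),           # row calculation for negative tests
    }


# Primary care example: prevalence 1%, sens 0.90, spec 0.95
table = hypothetical_table(prevalence=0.01, sens=0.90, spec=0.95)
print(f"TP = {table['tp']:.0f}, FP = {table['fp']:.0f}, PPV = {table['ppv']:.4f}")
```

Calling the function again with a different `prevalence` is all that 'calibrating' the test for another setting requires, provided sensitivity and specificity really are stable.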

Page 24: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 24

Calibration by Bayes’ Theorem

• You don’t need to learn Bayes’ theorem

• Instead, work with the Likelihood Ratio (+ve)

• (Equivalent process exists for Likelihood Ratio (–ve), but we shall not calculate it here)

Page 25: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 25

Test Properties (13)

            Diseased   Not diseased   Total
Test +ve    90         5               95
Test -ve    10         95             105
Total       100        100            200

Pre-test odds = 100/100 = 1.00

Post-test odds (+ve) = 90/5 = 18.0

Post-test odds (+ve) = LR(+) * Pre-test odds = 18.0 * 1.0 = 18.0, but of course you do not know the LR(+)

Page 26: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 26

Calibration by Bayes’s Theorem

• You can convert sens and spec to likelihood ratios

– LR(+) = sens/(1-spec)

• LR(+) is fixed across populations just like sensitivity & specificity.

• Bigger is better.

• Post-test odds(+) = pre-test odds * LR(+)
– Convert to post-test probability if desired…

Page 27: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 27

Converting odds to probabilities

• Pre-test odds = prevalence/(1-prevalence)
– if prevalence = 0.20, then pre-test odds = 0.20/0.80 = 0.25

• Post-test probability = post-test odds/(1+post-test odds)
– if post-test odds = 0.25, then prob = 0.25/1.25 = 0.20

Page 28: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 28

Calibration by Bayes’s Theorem

• How does this help?

• Remember:
– Post-test odds(+) = pre-test odds * LR(+)

• To ‘calibrate’ your test for a new population:
– Use the LR(+) value from the reference source
– Estimate the pre-test odds for your population
– Compute the post-test odds
– Convert to post-test probability to get PPV

Page 29: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 29

Example of Bayes’s Theorem (‘new’ prevalence 1%, sens 90%, spec 95%)

• LR(+) = .90/.05 = 18 (>>1, pretty good)

• Pretest odds = .01/.99 = 0.0101

• Positive Posttest odds = .0101*18 = .1818

• PPV = .1818/1.1818 = 0.1538 = 15.38%

• Compare to the ‘hypothetical table’ method (PPV=15.38%)
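The likelihood-ratio steps above can be sketched in Python as follows. The helper names (`prob_to_odds`, `odds_to_prob`, `ppv_from_likelihood_ratio`) are hypothetical and exist only for this illustration.

```python
def prob_to_odds(p):
    """Convert a probability to odds: p/(1-p)."""
    return p / (1 - p)


def odds_to_prob(odds):
    """Convert odds back to a probability: odds/(1+odds)."""
    return odds / (1 + odds)


def ppv_from_likelihood_ratio(pretest_prob, sens, spec):
    """Post-test probability of disease after a positive test (the PPV)."""
    lr_pos = sens / (1 - spec)                 # LR(+) = sens/(1-spec)
    pretest_odds = prob_to_odds(pretest_prob)  # prevalence expressed as odds
    posttest_odds = pretest_odds * lr_pos      # post-test odds(+) = pre-test odds * LR(+)
    return odds_to_prob(posttest_odds)


# 'New' population: prevalence 1%, sens 90%, spec 95%
ppv_bayes = ppv_from_likelihood_ratio(pretest_prob=0.01, sens=0.90, spec=0.95)
print(f"PPV = {ppv_bayes:.4f}")
```

The result agrees with the hypothetical-table answer because both methods are the same arithmetic arranged differently.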

Page 30: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 30

Calibration with Nomogram

• Graphical approach avoids some arithmetic

• Expresses prevalence and predictive values as probabilities (no need to convert to odds)

• Draw lines from pretest probability (=prevalence) through likelihood ratios; extend to estimate posttest probabilities

• Only useful if someone gives you the nomogram!

Page 31: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 31

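Example of Nomogram (pretest probability 1%, LR+ 18, LR– 0.105)

[Figure: nomogram with three scales: pretest probability, likelihood ratio, posttest probability.]

• Pretest probability 1% through LR+ 18: posttest probability about 15%

• Pretest probability 1% through LR– 0.105: posttest probability about 0.1%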

Page 32: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 32

Are sens & spec really constant?

• Generally, assumed to be constant. BUT…

• Sensitivity and specificity usually vary with severity of disease, and may vary with age and sex

• Therefore, you can use sensitivity and specificity only if they were determined on patients similar to your own

• Risk of spectrum bias (populations may come from different points along the spectrum of disease)

Page 33: R.A. Spasoff, MD Epidemiology & Community Medicine

Cautionary Tale #1: Data Sources

April 2011 33

The Government is extremely fond of amassing great quantities of statistics. These are raised to the nth degree, the cube roots are extracted, and the results are arranged into elaborate and impressive displays. What must be kept ever in mind, however, is that in every case, the figures are first put down by a village watchman, and he puts down anything he damn well pleases!

Sir Josiah Stamp, Her Majesty’s Collector of Internal Revenue.

Page 34: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 34

78.2: CRITICAL APPRAISAL (1)

• “Evaluate scientific literature in order to critically assess the benefits and risks of current and proposed methods of investigation, treatment and prevention of illness”

• UTMCCQE does not present hierarchy of evidence (e.g., as used by Task Force on Preventive Health Services)

Page 35: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 35

Hierarchy of evidence (lowest to highest quality, approximately)

• Expert opinion
• Case report/series
• Ecological (for individual-level exposures)
• Cross-sectional
• Case-Control
• Historical Cohort
• Prospective Cohort
• Quasi-experimental
• Experimental (Randomized)

}similar/identical

Page 36: R.A. Spasoff, MD Epidemiology & Community Medicine

Cautionary Tale #2: Analysis

April 2011 36

Consider a precise number: the normal body temperature of 98.6F. Recent investigations involving millions of measurements have shown that this number is wrong: normal body temperature is actually 98.2F. The fault lies not with the original measurements - they were averaged and sensibly rounded to the nearest degree: 37C. When this was converted to Fahrenheit, however, the rounding was forgotten and 98.6 was taken as accurate to the nearest tenth of a degree.

Page 37: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 37

BIOSTATISTICS
Core concepts (1)

• Sample: A group of people, animals, etc. which is used to represent a larger ‘target’ population.
– Best is a random sample
– Most common is a convenience sample.
  • Subject to strong risk of bias.

• Sample size: the number of units in the sample

• Much of statistics concerns how samples relate to the population or to each other.

Page 38: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 38

BIOSTATISTICS
Core concepts (2)

• Mean: average value. Measures the ‘centre’ of the data. Will be roughly in the middle.

• Median: The middle value: 50% above and 50% below. Used when data is skewed.

• Variance: A measure of how spread out the data are. Defined by subtracting the mean from each observation, squaring, adding them all up and dividing by the number of observations.

• Standard deviation: square root of the variance.

Page 39: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 39

Core concepts (3)

• Standard error: SD/√n, where n is sample size. It is the standard deviation of the sample mean, so it measures the variability of that mean.

• Confidence Interval: A range of numbers which tells us where we believe the correct answer lies. For a 95% confidence interval, we are 95% sure that the true value lies somewhere in the interval.
– Usually computed as: mean ± 2 SE

Page 40: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 40

Example of Confidence Interval

• If sample mean is 80, standard deviation is 20, and sample size is 25 then:
– SE = 20/√25 = 20/5 = 4. We can be 95% confident that the true mean lies within the range 80 ± (2*4) = (72, 88).

• If the sample size were 100, then SE = 20/√100 = 20/10 = 2.0, and the 95% confidence interval is 80 ± (2*2) = (76, 84). More precise.
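A minimal Python sketch of this calculation, assuming the slide's rule of thumb of mean ± 2 SE (rather than the more exact multiplier 1.96); the function name `confidence_interval_95` is hypothetical.

```python
import math


def confidence_interval_95(mean, sd, n):
    """Return (SE, (lower, upper)) using the rule of thumb mean +/- 2 SE."""
    se = sd / math.sqrt(n)   # standard error = SD divided by the square root of n
    margin = 2 * se          # approximate 95% margin, as on the slide
    return se, (mean - margin, mean + margin)


# Sample mean 80, SD 20, for the two sample sizes in the example
se_25, ci_25 = confidence_interval_95(mean=80, sd=20, n=25)
se_100, ci_100 = confidence_interval_95(mean=80, sd=20, n=100)
print(f"n=25:  SE = {se_25}, CI = {ci_25}")
print(f"n=100: SE = {se_100}, CI = {ci_100}")
```

Quadrupling the sample size only halves the standard error, because n enters through its square root.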

Page 41: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 41

Core concepts (4)

• Random Variation (chance): every time we measure anything, errors will occur. In addition, by selecting only a few people to study (a sample), we will get people with values different from the mean, just by chance. These are random factors which affect the precision (SD) of our data but not the validity. Statistics and bigger sample sizes can help here.

Page 42: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 42

Core concepts (5)

• Bias: A systematic factor which causes two groups to differ. For example, a study uses a collapsible measuring scale for height which was incorrectly assembled (with a 1” gap between the upper and lower section).
– Over-estimates height by 1” (a bias).

• Bigger numbers and statistics don’t help much; you need good design instead.

Page 43: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 43

BIOSTATISTICS
Inferential Statistics

• Draws inferences about populations, based on samples from those populations. Inferences are valid only if samples are representative (to avoid bias).

• Polls, surveys, etc. use inferential statistics to infer what the population thinks based on talking to a few people.

• RCTs use them to infer treatment effects, etc.

• 95% confidence intervals are a very common way to present these results.

Page 44: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 44

[Figure: diagram with the labels "Population from which sample is drawn", "Sample", "Target population", "Your practice patients" and "Inferences drawn". Caption: confidence interval used to indicate accuracy of extrapolating results to the broader population from which the sample was drawn.]

Page 45: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 45

Effects of bias and random error on study results

[Figure: results from different samples plotted around the population parameter, along two axes: increasing random error and increasing systematic error (bias).]

Page 46: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 46

Hypothesis Testing

• Used to compare two or more groups.
– We first assume that the two groups are the same.
– Compute some statistic which, under this null hypothesis (H0), should be ‘0’.
– If we find a large value for the statistic, then we can conclude that our assumption (hypothesis) is unlikely to be true (reject the null hypothesis).

• Formal methods use this approach by determining the probability that a value at least as large as the one you observe could occur by chance alone (p-value). Reject H0 if the statistic exceeds the critical value expected from chance alone (equivalently, if the p-value is below the chosen α).

Page 47: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 47

Hypothesis Testing (2)

• Common methods used are:
– T-test
– Z-test
– Chi-square test
– ANOVA

• Approach can be extended through the use of regression models
– Linear regression
  • Toronto Notes are wrong in saying this relates 2 variables. It can relate many independent variables to one dependent variable.
– Logistic regression
– Cox models

Page 48: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 48

Hypothesis Testing (3)

• Interpretation requires a p-value and understanding of type 1 and type 2 errors.

• P-value: the probability of observing a value of your statistic which is as big as or bigger than the one you found, IF the null hypothesis were true
– This is not quite the same as saying the chance that the difference is ‘real’

• Power: The chance you will find a difference between groups when there really is a difference (of a given amount). Depends on how big a difference you treat as important

Page 49: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 49

Hypothesis testing (4)

                              Actual Situation
Results of Stats Analysis     No effect            Effect
No effect                     No error             Type 2 error (β)
Effect                        Type 1 error (α)     No error

Page 50: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 50

Example of significance test

• Association between sex and smoking: 35 of 100 men smoke but only 20 of 100 women smoke

• Calculated chi-square is 5.64. The critical value is 3.84 (from table, for α = 0.05). Therefore reject H0

• P=0.018 (from a table). Under H0 (chance alone), a chi-square value as large as 5.64 would occur only 1.8% of the time.
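The chi-square figure in this example can be reproduced with a short Python sketch. It assumes the uncorrected (no continuity correction) Pearson chi-square for a 2x2 table, which is the version that matches 5.64; the function name `chi_square_2x2` is hypothetical. For 1 degree of freedom the p-value can be obtained from the complementary error function in the standard library.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) and p-value for a 2x2 table.

    Rows are the two groups, columns are outcome yes/no:
        a  b
        c  d
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x/2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# 35 of 100 men smoke; 20 of 100 women smoke
chi2, p_value = chi_square_2x2(a=35, b=65, c=20, d=80)
print(f"chi-square = {chi2:.2f}, p = {p_value:.3f}")
print("Reject H0" if chi2 > 3.84 else "Do not reject H0")
```

Comparing the statistic with 3.84 and comparing the p-value with 0.05 are two views of the same decision.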

Page 51: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 51

How to improve your chance of finding a difference

• Increase sample size

• Improve precision of the measurement tools used (reduces standard deviation)

• Use better statistical methods

• Use better designs

• Reduce bias

Page 52: R.A. Spasoff, MD Epidemiology & Community Medicine

Cautionary Tale #3: Anecdotes

April 2011 52

Laboratory and anecdotal clinical evidence suggest that some common non-antineoplastic drugs may affect the course of cancer. The authors present two cases that appear to be consistent with such a possibility: that of a 63-year-old woman in whom a high-grade angiosarcoma of the forehead improved after discontinuation of lithium therapy and then progressed rapidly when treatment with carbamazepine was started, and that of a 74-year-old woman with metastatic adenocarcinoma of the colon which regressed when self-treatment with a non-prescription decongestant preparation containing antihistamine was discontinued. The authors suggest ...... ‘that consideration be given to discontinuing all nonessential medications for patients with cancer.’

Page 53: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 53

Epidemiology overview

• Key study designs to examine (SIM web link)
– Case-control
– Cohort
– Randomized Controlled Trial (RCT)

• Confounding

• Relative Risks/odds ratios
– All ratio measures have the same interpretation
  • 1.0 = no effect
  • < 1.0 protective effect
  • > 1.0 increased risk
– Values over 2.0 are of strong interest

Page 54: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 54

The Epidemiological Triad

Host Agent

Environment

Page 55: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 55

Terminology

• Incidence: The probability (chance) that someone without the outcome will develop it over a fixed period of time. Relates to new cases of disease. Useful for studying causes of illness.

• Prevalence: The probability that a person has the outcome of interest today. Relates to existing cases of disease. Useful for measuring burden of illness.

Page 56: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 56

Prevalence

• On July 1, 2007, 140 graduates from the U. of O. medical school start working as interns.

• Of this group, 100 had insomnia the night before.

• Therefore, the prevalence of insomnia is:

100/140 = 0.71 = 71%

Page 57: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 57

Incidence Proportion (risk)

• On July 1, 2007, 140 graduates from the U. of O. medical school start working as interns.

• Over the next year, 30 develop a stomach ulcer.

• Therefore, the incidence proportion (risk) of an ulcer is:

30/140 = 0.214 = 214/1,000 over 1 yr

Page 58: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 58

Incidence Rate (1)

• Incidence rate is the ‘speed’ with which people get ill.

• Everyone dies (eventually). It is better to die later, i.e., when the death rate is lower.

• Compute with person-time denominator:

PT = # people * duration of follow-up

IR = (# new cases) / (PT of follow-up), expressed per person-year when follow-up is measured in years

Page 59: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 59

Incidence rate (2)

• 140 U. of O. medical students, followed during their residency
– 50 did 2 years of residency
– 90 did 4 years of residency
– Person-time = 50 * 2 + 90 * 4 = 460 PY’s

• During follow-up, 30 developed ‘stress’.

• Incidence rate of stress is:

IR = 30/460 = 0.065/PY = 65/1,000 PY
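The person-time calculation can be sketched in Python as below. The function name `incidence_rate` and the `(people, years)` tuple layout are assumptions made for this illustration.

```python
def incidence_rate(new_cases, follow_up_groups):
    """Return (person_years, rate per person-year).

    follow_up_groups is a list of (number_of_people, years_followed) pairs.
    """
    person_years = sum(people * years for people, years in follow_up_groups)
    return person_years, new_cases / person_years


# 50 residents followed for 2 years, 90 followed for 4 years, 30 new cases of 'stress'
person_years, rate = incidence_rate(new_cases=30, follow_up_groups=[(50, 2), (90, 4)])
print(f"Person-time = {person_years} PY")
print(f"Incidence rate = {rate:.3f}/PY = {rate * 1000:.0f}/1,000 PY")
```

Note that the denominator is 460 person-years rather than 140 people, which is what distinguishes a rate from the incidence proportion.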

Page 60: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 60

Prevalence & incidence

• As long as conditions are ‘stable’ and disease is fairly rare, we have this relationship:

• That is, prevalence ≈ incidence rate * average disease duration

P ≈ IR * d

Page 61: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 61

Case-control study

• Selects subjects based on their final outcome.
– Select a group of people with the outcome/disease (cases)
– Select a group of people without the outcome (controls)
– Ask them about past exposures
– Compare the frequency of exposure in the two groups
  • If exposure increases risk, there should be more exposed cases than controls
– Compute an Odds Ratio
– Under many conditions, OR ≈ RR

Page 62: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 62

Case-control (2)

                Disease?
                YES     NO
Exp?    YES     a       b       a+b
        NO      c       d       c+d
                a+c     b+d     N

ODDS RATIO

Odds of exposure in cases = a/c
Odds of exposure in controls = b/d

If exposure increases the rate of getting disease, you would expect to find more exposed cases than exposed controls. That is, the odds of exposure for cases would be higher (a/c > b/d). This can be assessed by the ratio of one to the other:

Odds ratio (OR) = (Exp odds in cases) / (Exp odds in controls) = (a/c)/(b/d) = ad/bc

Page 63: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 63

Case-control (3)

                   Disease
                   Yes     No
Apgar   Low 0-3    42      18
        OK 4-6     43      67
        Total      85      85

Odds of exp in cases = 42/43 = 0.977
Odds of exp in controls = 18/67 = 0.269

Odds ratio (OR) = Odds in cases/odds in controls
= 0.977 / 0.269 = (42*67) / (43*18)
= 3.6
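A small Python sketch of the odds ratio calculation for this Apgar table, using the cross-product form ad/bc. The function name `odds_ratio` is hypothetical.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio from a 2x2 table laid out as on the slides.

                  cases   controls
    exposed         a        b
    unexposed       c        d
    """
    odds_cases = a / c        # odds of exposure in cases
    odds_controls = b / d     # odds of exposure in controls
    return odds_cases / odds_controls   # algebraically equal to (a*d)/(b*c)


# Low Apgar (exposed): 42 cases, 18 controls; OK Apgar: 43 cases, 67 controls
or_apgar = odds_ratio(a=42, b=18, c=43, d=67)
print(f"OR = {or_apgar:.1f}")
```

In a case-control study the row totals are fixed by the investigator's sampling, which is why the odds ratio is used here rather than a risk ratio.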

Page 64: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 64

[Figure: case-control design. The study begins by selecting subjects with disease (cases) and with no disease (controls); records are then reviewed to classify each group as exposed or unexposed.]

Page 65: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 65

Cohort study

• Selects non-diseased subjects based on their exposure status; follows to determine outcome.
– Select a group of people with the exposure of interest
– Select a group of people without the exposure
  • Can also simply select a group of people and study a range of exposures.
– Follow the group to determine what happens to them.
– Compare the incidence of the disease in exposed and unexposed people
  • If exposure increases risk, there should be more cases in exposed subjects than unexposed subjects
– Compute a relative risk.

Page 66: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 66

Cohorts (2)

                Disease
                YES     NO
Exp     YES     a       b       a+b
        NO      c       d       c+d
                a+c     b+d     N

RISK RATIO

Risk in exposed = a/(a+b)
Risk in non-exposed = c/(c+d)

If exposure increases risk, you would expect a/(a+b) to be larger than c/(c+d). How much larger can be assessed by the ratio of one to the other:

Risk ratio (RR) = (Exp risk) / (Non-exp risk) = (a/(a+b)) / (c/(c+d))

Page 67: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 67

Cohorts (3)

                   Death
                   YES     NO      Total
Apgar   Low 0-3    42      80      122
        OK 4-6     43      302     345
        Total      85      382     467

Risk in exposed = 42/122 = 0.344
Risk in non-exposed = 43/345 = 0.125

Risk ratio (RR) = (Exp risk) / (Non-exp risk)
= 0.344/0.125
= 2.8
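The same cohort calculation as a short Python sketch; the function name `risk_ratio` is hypothetical and used only for illustration.

```python
def risk_ratio(a, b, c, d):
    """Return (risk in exposed, risk in non-exposed, risk ratio).

                  outcome   no outcome
    exposed          a          b
    unexposed        c          d
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed


# Low Apgar (exposed): 42 deaths, 80 survivors; OK Apgar: 43 deaths, 302 survivors
risk_exp, risk_unexp, rr = risk_ratio(a=42, b=80, c=43, d=302)
print(f"Risk exposed = {risk_exp:.3f}, risk non-exposed = {risk_unexp:.3f}, RR = {rr:.1f}")
```

Unlike the case-control table, the row totals here come from following everyone in each exposure group, so the risks themselves are meaningful.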

Page 68: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 68

[Figure: cohort design. When the study begins, subjects are in an exposed group or an unexposed group; each group is followed over time to its outcomes (disease or no disease).]

Page 69: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 69

Confounding

• Mixing of effects of two causes. Can be positive or negative

• Confounder is an extraneous factor which is associated with both exposure and outcome, and is not an intermediate step in causal pathway

Page 70: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 70

The Confounding Triangle

[Figure: triangle linking Exposure, Outcome and Confounder, with the links labeled "Causal" and "Association".]

Page 71: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 71

[Figure: confounding triangle. Exposure: birth order 4th or higher. Outcome: birth of an infant with Down’s syndrome. Confounder: maternal age > 35 years.]

1. Study observes a statistical association between being born fourth or later in order among siblings (exposure) and having Down’s syndrome (outcome).

2. When maternal age is taken into account, it becomes clear that higher birth order is linked to higher maternal age, and that maternal age is a more plausible causal factor for Down’s syndrome.

3. Adjusting statistically for maternal age causes the association between birth order and Down’s to disappear, suggesting that maternal age was, indeed, a confounding factor.

Page 72: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 72

Confounding (example)

• Does heavy alcohol drinking cause mouth cancer? We get OR=3.4 (95% CI: 2.1-4.8).

• Smoking causes mouth cancer

• Heavy drinkers tend to be heavy smokers.

• Smoking is not part of causal pathway for alcohol.

• Therefore, we have confounding.

• We do a statistical adjustment (logistic regression is most common): OR=1.3 (95% CI: 0.92-1.83)

Page 73: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 73

Attributable Risk (SIM web link)

• Set upper limit on amount of preventable disease. Meaningful only if association is causal.

• Tricky area since there are several measures with similar names.

• Attributable risk. The amount of disease due to exposure in the exposed subjects. The same as the risk difference. Can also express as attributable fraction.

• Can also look at the risk attributed to the exposure in the general population (depends on how common the exposure is).

Page 74: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 74

Attributable risks (2)

• In exposed subjects

[Figure: bars showing incidence in exposed (Iexp) and unexposed (Iunexp) subjects; the gap between them is the Risk Difference or Attributable Risk.]

RD = AR = Iexp - Iunexp

AR(%) = AF = (Iexp - Iunexp) / Iexp

Page 75: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 75

Attributable risks (3)

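• In population

[Figure: bars showing incidence in exposed (Iexp) and unexposed (Iunexp) subjects and in the whole population (Ipop); the gap between Ipop and Iunexp is the Population Attributable Risk.]

PRD = PAR = Ipop - Iunexp

PAR(%) = PAF = (Ipop - Iunexp) / Ipop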
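The four attributable-risk measures can be sketched together in Python. The slides give no worked numbers for them, so this sketch reuses the Apgar cohort counts from the earlier cohort example purely as an assumed illustration; the function name `attributable_risks` and the dictionary keys are hypothetical.

```python
def attributable_risks(a, b, c, d):
    """Attributable risk measures from a cohort 2x2 table.

                  outcome   no outcome
    exposed          a          b
    unexposed        c          d
    """
    i_exp = a / (a + b)                    # incidence in exposed
    i_unexp = c / (c + d)                  # incidence in unexposed
    i_pop = (a + c) / (a + b + c + d)      # incidence in the whole population
    ar = i_exp - i_unexp                   # RD = AR
    par = i_pop - i_unexp                  # PRD = PAR
    return {
        "AR": ar,
        "AF": ar / i_exp,                  # AR(%) as a fraction of Iexp
        "PAR": par,
        "PAF": par / i_pop,                # PAR(%) as a fraction of Ipop
    }


# Assumed illustration: Apgar cohort counts (42/122 exposed, 43/345 unexposed died)
measures = attributable_risks(a=42, b=80, c=43, d=302)
for name, value in measures.items():
    print(f"{name} = {value:.3f}")
```

The population measures are smaller than the exposed-group measures because only part of the population carries the exposure.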

Page 76: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 76

Randomized Controlled Trials

• Basically a cohort study where the researcher decides which exposure (treatment) each subject gets.
– Recruit a group of people meeting pre-specified eligibility criteria.
– Randomly assign some subjects (usually 50% of them) to get the control treatment and the rest to get the experimental treatment.
– Follow up the subjects to determine the risk of the outcome in both groups.
– Compute a relative risk or otherwise compare the groups.

Page 77: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 77

Randomized Controlled Trials (2)

• Some key design features
– Allocation concealment
– Blinding (masking)
  • Patient
  • Treatment team
  • Outcome assessor
  • Statistician
– Monitoring committee

• Two key problems
– Contamination
  • Control group gets the new treatment
– Co-intervention
  • Some people get treatments other than those under study

Page 78: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 78

Randomized Controlled Trials: Analysis

• Outcome is an adverse event

• RR is expected to be <1

• Absolute risk reduction, ARR = Incidence(control) - Incidence(treatment) (=|attributable risk|)

• Relative risk reduction, RRR = ARR/incidence(control) = 1 - RR

• Number needed to treat, NNT (to prevent one adverse event) = 1/ARR

Page 79: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 79

RCT – Example of Analysis

             Asthma attack   No attack   Total   Inc
Treatment    15              35          50      .30
Control      25              25          50      .50

Relative Risk = 0.30/0.50 = 0.60

Absolute Risk Reduction = 0.50-0.30 = 0.20

Relative Risk Reduction = 0.20/0.50 = 40%

Number Needed to Treat = 1/0.20 = 5
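The trial summary measures can be sketched in Python as follows; the function name `rct_summary` and the dictionary keys are assumptions for this illustration.

```python
def rct_summary(events_treatment, n_treatment, events_control, n_control):
    """Relative risk, ARR, RRR and NNT for an adverse outcome in a two-arm trial."""
    inc_treatment = events_treatment / n_treatment
    inc_control = events_control / n_control
    arr = inc_control - inc_treatment          # absolute risk reduction
    return {
        "RR": inc_treatment / inc_control,     # relative risk
        "ARR": arr,
        "RRR": arr / inc_control,              # relative risk reduction = 1 - RR
        "NNT": 1 / arr,                        # number needed to treat
    }


# Asthma example: 15 of 50 attacks on treatment, 25 of 50 on control
summary = rct_summary(events_treatment=15, n_treatment=50,
                      events_control=25, n_control=50)
print(f"RR = {summary['RR']:.2f}, ARR = {summary['ARR']:.2f}, "
      f"RRR = {summary['RRR']:.0%}, NNT = {summary['NNT']:.0f}")
```

When 1/ARR is not a whole number, NNT is conventionally rounded up to the next whole patient; that rounding is left out here because the example divides evenly.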

Page 80: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 80

Epidemiology Methods

Multiple Choice Questions for discussion

Page 81: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 81

A group of 50 people are exposed to virus “A”. Of those 50 people, 9 develop a mild infection, 10 become seriously ill, and 3 die. The attack rate of virus “A” in the population would be:

a) 22/50

b) 9/50

c) 10/50

d) 19/50

e) 13/50

Page 82: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 82

Examples of secondary prevention would include all of the following EXCEPT:

a) Pap smear for cervical cancer

b) chemoprophylaxis in a recent TB converter

c) proctoscopy for rectal cancer

d) immunization for Haemophilus influenzae B

e) mammography for breast cancer

Page 83: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 83

Active immunization was important in control of each of the following childhood communicable diseases EXCEPT:

a) diphtheria

b) polio

c) measles

d) scarlet fever

e) pertussis

Page 84: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 84

The occurrence of an illness at a rate above that expected is called:

a) hyperendemic

b) epidemic

c) endemic

d) enzootic

e) pandemic

Page 85: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 85

The purpose of randomization is to:

a) make sure that there are equal numbers of men and women in test and control groups

b) increase the chances of getting a statistically significant difference

c) ensure that the numbers of cases and controls are equal

d) limit bias

e) all of the above

Page 86: R.A. Spasoff, MD Epidemiology & Community Medicine

April 2011 86

Which of the following types of studies usually provides only a measure of prevalence?

a) descriptive

b) cross-sectional

c) randomized controlled trial

d) cohort

e) none of the above

Page 87: R.A. Spasoff, MD Epidemiology & Community Medicine

The major advantage of cohort studies over case-control studies is that:

a) they take less time and are less costly

b) they can utilize a more representative population

c) it is easier to obtain controls who are not exposed to the factor

d) they permit estimation of risk of disease in those exposed to the factor

e) they can be done on a “double-blind” basis

Page 88: R.A. Spasoff, MD Epidemiology & Community Medicine

The classical “epidemiological triad” of disease causation consists of factors which fall into which of the following categories:

a) host, reservoir, environment

b) host, vector, environment

c) host, agent, environment

d) reservoir, agent, vector

e) host, age, environment

Page 89: R.A. Spasoff, MD Epidemiology & Community Medicine

The incidence of a particular disease is greater in men than in women, but the prevalence shows no sex difference. The most probable explanation is that:

a) the mortality rate is greater in women

b) the case fatality rate is higher in women

c) the duration of the disease is longer in women

d) women receive less adequate medical care for the disease

e) this diagnosis is more often missed in women

Page 93: R.A. Spasoff, MD Epidemiology & Community Medicine

The following indicate the results of screening test “Q” in screening for disease “Z”:

The specificity of test “Q” would be:

a) 40/70

b) 120/130

c) 40/50

d) 120/150

e) 40/200

Page 94: R.A. Spasoff, MD Epidemiology & Community Medicine

The positive predictive value would be:

a) 40/70

b) 120/130

c) 40/50

d) 120/150

e) 70/200
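
The results table for test “Q” did not survive in this transcript. As a reminder of the definitions involved, the sketch below computes the four standard measures from a 2x2 table. The counts are hypothetical and are not the slide's data; the function name `diagnostic_measures` and the cell labels `tp`, `fp`, `fn`, `tn` are likewise illustrative.

```python
def diagnostic_measures(tp, fp, fn, tn):
    """Compute test characteristics from the four cells of a 2x2 table.

    tp: diseased, test positive      fp: not diseased, test positive
    fn: diseased, test negative      tn: not diseased, test negative
    """
    return {
        "sensitivity": tp / (tp + fn),  # positives among the diseased
        "specificity": tn / (tn + fp),  # negatives among the non-diseased
        "ppv": tp / (tp + fp),          # diseased among test positives
        "npv": tn / (tn + fn),          # non-diseased among test negatives
    }


# Hypothetical counts, chosen only for illustration.
example = diagnostic_measures(tp=90, fp=40, fn=10, tn=860)
for name, value in example.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are calculated within the diseased and non-diseased columns respectively, whereas the predictive values are calculated within the test-positive and test-negative rows.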

Page 95: R.A. Spasoff, MD Epidemiology & Community Medicine

In which of the following study designs is the odds ratio the statistic typically used to show an association between cause and effect?

a) a cross sectional/prevalence study

b) a randomized controlled trial

c) a cohort study

d) a case study

e) a case control study

Page 96: R.A. Spasoff, MD Epidemiology & Community Medicine

Alpha error is:

a) the probability of declaring a difference to be absent when it in fact is present

b) the probability of declaring a difference to be present when it is not

c) the probability of declaring a difference to be absent when it is indeed absent

d) the probability of declaring a difference to be present when it does exist

e) none of the above

Page 97: R.A. Spasoff, MD Epidemiology & Community Medicine

Which one of the following descriptors of a diagnostic test is influenced by the prevalence of the disease being tested for:

a) specificity

b) sensitivity

c) accuracy

d) positive predictive value

e) reliability

Page 98: R.A. Spasoff, MD Epidemiology & Community Medicine

Each of the following statements applies to case control studies EXCEPT:

a) starts with disease

b) suitable for rare diseases

c) relatively inexpensive

d) prolonged follow-up required

e) there may be a problem in selecting and matching controls

Page 99: R.A. Spasoff, MD Epidemiology & Community Medicine

A clinician who has been examining the patterns of mortality in your community says that the rates for heart disease and lung cancer are higher in this community than in an adjacent community. Which of the following questions should you ask first?

a) how did the clinician choose the comparison community?

b) have the rates been standardized for age?

c) are tobacco sales significantly different in the two communities?

d) are the facilities to treat these diseases comparable in the two areas?

e) are the numbers of deaths comparable in each area?

Page 100: R.A. Spasoff, MD Epidemiology & Community Medicine

The effectiveness of a preventive measure is assessed in terms of:

a) the effect in people to whom the measure is offered

b) the effect in people who comply with the measure

c) availability and the optimal use of resources

d) the cost in dollars versus the benefits in improved health status

e) all of the above

Page 101: R.A. Spasoff, MD Epidemiology & Community Medicine

All of the following statements about the Canada Health Act (1984) are true EXCEPT:

a) it did not define all medically necessary hospital and physician services

b) the CHA replaced the Hospital Insurance and Diagnostic Services Act of 1957

c) the CHA banned all forms of extra billing

d) according to the CHA, provinces must meet all the terms and conditions of Medicare to qualify for federal transfer payments

e) none of the above

Page 102: R.A. Spasoff, MD Epidemiology & Community Medicine

Of the five items listed below, the one which provides the strongest evidence for causality in an observed association between exposure and disease is:

a) a large attributable risk

b) a large relative risk

c) a small p-value

d) a positive result from a cohort study

e) a case report

Page 103: R.A. Spasoff, MD Epidemiology & Community Medicine

During a clinical trial, the difference in the success rates of two drugs was not statistically significant. This means that:

a) there is no difference in drug effectiveness

b) there is a sizeable probability that the demonstrated difference in the drugs’ effectiveness could occur due to chance alone

c) the demonstrated difference in the drugs’ effectiveness is too small to be clinically meaningful

d) the two samples of patients on which the drugs were tested came from the same population

e) none of the above is true

Page 104: R.A. Spasoff, MD Epidemiology & Community Medicine

All of the following statements about statistical tests are true EXCEPT:

a) linear regression is used to describe the relationship between two continuous variables

b) a confidence interval is a range of values giving information about the precision of an estimate

c) ANOVA tables are used to make comparisons among the means of 3 or more groups simultaneously

d) in a normal distribution, the mean, median and mode are equal

e) the chi-square test evaluates the statistical significance of 2 or more percentages of categorical outcomes

Page 105: R.A. Spasoff, MD Epidemiology & Community Medicine

More MCQs

• Here are some more questions that students can use to test their own knowledge:

http://www.medicine.uottawa.ca/sim/data/Self-test_Qs_Epi_Methods_e.htm

• (The questions contain comments on the answers, to illustrate why a given response is not correct)