MBP1010 – Lecture 8: March 1, 2011 1.Odds Ratio/Relative Risk Logistic Regression Survival...

Page 1

MBP1010 – Lecture 8: March 1, 2011

1. Odds Ratio/Relative Risk

• Logistic Regression

• Survival Analysis

Reading: papers on OR and survival analysis (Resources)
Ch 10 Multifactorial Analyses

Page 2

Assignment 3

Due: March 8

- solutions will be posted after the due date
- but marks will likely not be available prior to the exam

Page 3

Observational Studies with Binary Outcomes

- case/control and cohort studies: common in cancer research

Outcome: cancer/no cancer, dead/alive

- cross-sectional studies: classify subjects into categories of 2 binary variables

Page 4

Case Control Study

[Figure: cases (X) and controls (0), each group assessed for exposure (e.g., diet)]

Measure of risk: odds ratio (OR)

Page 5

Cohort Study

[Figure: cohort of subjects classified by exposure (e.g., diet) and followed for cancer (yes/no); X marks subjects with cancer, 0 subjects without]

Measure of risk: RR or OR

Page 6

Cross-sectional Study

• Subjects NOT selected on exposure or outcome

• Classify subjects into exposure and outcome

• OR or RR can be used to describe association with binary outcome

Page 7

Observational Studies with Binary Outcomes

-case/control, cohort studies, cross-sectional studies

Ways to examine association:

• chi square test for association (2 x 2 contingency table)
 - χ²

• odds ratio (OR) or relative risk (RR)
 - χ², plus magnitude of risk and CI

• logistic regression
 - χ², magnitude of risk, CI, and can include other variables of interest

Page 8

Relative Risk: Prospective Cohort Studies

RR = p1/p2

p1 = probability of disease for exposed individuals
p2 = probability of disease for unexposed individuals

RR = 1.0: no association

RR = 1.4: 1.4 times the risk (40% higher risk)

RR = 0.8: 0.8 times the risk (20% lower risk)

Page 9

MDM2 protein expression and breast cancer prognosis - cohort study

- women with invasive breast cancer at BCCA

- TMA stained for MDM2 protein expression

- data on outcome (dead/alive) available

Turbin et al, Modern Pathology 2006

Page 10

MDM2 protein expression and breast cancer prognosis

Prospective Cohort Study

p1 = 28/49 = 0.57
p2 = 94/313 = 0.30

χ² = 12.75, df = 1, p-value < 0.01

RR = (28/49)/(94/313) = 1.90

Women with MDM2 protein expression were at 1.9 times the risk of dying from breast cancer compared to women without MDM2 protein expression (p<0.01).
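As a quick check of the arithmetic on this slide, here is a minimal Python sketch that recomputes p1, p2 and RR from the counts given (28 deaths among 49 women with MDM2 expression, 94 deaths among 313 women without). The function and variable names are illustrative only and are not part of the lecture.

```python
# Counts read off the slide: deaths / group size
deaths_exposed, n_exposed = 28, 49        # MDM2 expression present
deaths_unexposed, n_unexposed = 94, 313   # MDM2 expression absent


def relative_risk(events_1, n_1, events_2, n_2):
    """Risk in group 1 divided by risk in group 2 (RR = p1/p2)."""
    p1 = events_1 / n_1
    p2 = events_2 / n_2
    return p1 / p2


rr = relative_risk(deaths_exposed, n_exposed, deaths_unexposed, n_unexposed)
print(f"p1 = {deaths_exposed / n_exposed:.2f}")
print(f"p2 = {deaths_unexposed / n_unexposed:.2f}")
print(f"RR = {rr:.2f}")
```

An RR above 1 means the exposed group has the higher risk; the value here agrees with the 1.90 reported above.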

Page 11

(from lecture 4)

Page 12

Case Control Study of Family History and Breast Cancer

- cases of breast cancer identified by cancer registry

- controls identified through provincial screening program

- data collected by questionnaire (after diagnosis in cases)

Page 13

Case Control Study of Family History and Breast Cancer

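2 x 2 Contingency Table (counts as used in the odds calculations later in the lecture)

                       Cases   Controls   Total
Family history (FHX)     238        180     418
No family history        862        920    1782
Total                   1100       1100    2200

Chi-square results with Yates’ continuity correction:

χ² = 9.60, df = 1, p-value = 0.00195 (< 0.01)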

We conclude that there is a statistically significant association between first degree family history of breast cancer and breast cancer risk (p<0.01). 22% of women with breast cancer have a first degree family history of breast cancer compared to 16% of women without breast cancer.
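The Yates-corrected chi-square above can be reproduced with a short sketch. It assumes the cell counts used in the odds calculations later in the lecture (238 cases and 180 controls with a family history, 862 cases and 920 controls without). The helper names are hypothetical. The p-value relies on the identity that, for df = 1, the upper tail of the chi-square distribution equals erfc(sqrt(χ²/2)).

```python
import math


def yates_chi_square(a, b, c, d):
    """Chi-square statistic with Yates' continuity correction for the
    2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (abs(a * d - b * c) - n / 2) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


def p_value_df1(chi2):
    """Upper-tail probability of a chi-square variable with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))


# Rows: family history yes / no; columns: cases / controls (assumed layout)
chi2_fhx = yates_chi_square(238, 180, 862, 920)
p_fhx = p_value_df1(chi2_fhx)
print(f"chi-square = {chi2_fhx:.2f}, p = {p_fhx:.5f}")
```

The same function applied to the MDM2 cohort counts (28, 21, 94, 219) gives the 12.75 quoted for that example.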

Page 14

Estimate of Risk from Case-Control Study

• we fixed the number with and without breast cancer

• we cannot estimate the probabilities of breast cancer in women with and without family history

- Relative Risk cannot be estimated

What can we do?

Page 15

A day at the racetrack.....

Gamblers calculate their chances of winning using a term called the odds.

Suppose that the horse is a favourite and it is declared to have a 1 in 4 chance of winning [1 / (1 + 3)].

The gambler might say that the horse has odds of 1 to 3 of winning. However, gamblers are much more likely to say that the odds of the horse losing are 3 to 1.

A horse that is a longshot may have only a 1 in 50 chance of winning. On the tote board the gambler will read that it has 49 to 1 odds against winning.

Page 16

Estimate of risk: Odds Ratio

If the probability of an event = p, then:

The odds in favour of an event = p/(1-p)

• ratio of the probability that the event occurs to the probability that it does not

Odds Ratio:

OR = (odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group)
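To connect the racetrack language to the formula odds = p/(1-p), here is a small sketch using exact fractions. The function name odds_in_favour is illustrative only; the two probabilities are the 1 in 4 favourite and the 1 in 50 longshot from the racetrack example.

```python
from fractions import Fraction


def odds_in_favour(p):
    """Odds in favour of an event with probability p: p / (1 - p)."""
    return p / (1 - p)


favourite = odds_in_favour(Fraction(1, 4))   # 1 in 4 chance of winning
longshot = odds_in_favour(Fraction(1, 50))   # 1 in 50 chance of winning

print("favourite: odds in favour =", favourite, "| odds against =", 1 / favourite)
print("longshot:  odds in favour =", longshot, "| odds against =", 1 / longshot)
```

The odds against are simply the reciprocal of the odds in favour, which is why a 1 in 50 chance reads as 49 to 1 against.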

Page 17

Odds Ratio

odds = p/(1-p)

odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32

odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94

OR = 1.32/0.94 = 1.41

Page 18

Simple method for calculating OR:

Odds = ratio of the number of times the event occurs to the number of times it doesn’t

(a, b = cases and controls with FHX; c, d = cases and controls with no FHX)

OR = (a/b)/(c/d) = (238/180)/(862/920) = 1.41

Alternate equation: OR = (a*d)/(b*c) = (238*920)/(180*862) = 1.41
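The equivalence of the different ways of computing the OR can be verified exactly. The sketch below assumes a and b are the cases and controls with a family history and c and d are the cases and controls without, which matches the numbers on the slides. Variable names are illustrative.

```python
from fractions import Fraction

# Cell counts of the 2 x 2 table (assumed layout, see note above)
a, b = 238, 180   # family history: cases, controls
c, d = 862, 920   # no family history: cases, controls


def odds(p):
    """Odds in favour of an event with probability p."""
    return p / (1 - p)


# 1. From probabilities, using odds = p/(1-p) in each exposure group
or_from_probabilities = odds(Fraction(a, a + b)) / odds(Fraction(c, c + d))
# 2. From counts: (a/b)/(c/d)
or_from_counts = Fraction(a, b) / Fraction(c, d)
# 3. Cross-product: (a*d)/(b*c)
or_cross_product = Fraction(a * d, b * c)

print(float(or_from_probabilities), float(or_from_counts), float(or_cross_product))
```

All three expressions reduce to the same fraction, so the cross-product form is just a shortcut and not a different estimator.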

Page 19

Confidence Interval for OR

- OR has a skewed distribution: limited at the lower end because it can’t be negative, but not limited at the upper end

- log(OR), however, can take any value and has an approximately normal distribution

Calculate limits on log(OR) and then “exponentiate”

SE for ln(OR) = sqrt(1/a + 1/b + 1/c + 1/d) = sqrt(1/238 + 1/180 + 1/862 + 1/920) = 0.1096

95% CI for ln(OR): ln(1.41) ± 1.96 x 0.1096 = 0.129 to 0.558

95% CI for OR: e^0.129 to e^0.558 = 1.14 to 1.75
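A minimal sketch of the confidence interval calculation, working on the log scale and then exponentiating. It uses the family-history counts a = 238, b = 180, c = 862, d = 920 and the usual 1.96 multiplier for a 95% interval. Variable names are illustrative.

```python
import math

a, b, c, d = 238, 180, 862, 920

odds_ratio = (a * d) / (b * c)
log_or = math.log(odds_ratio)

# Standard error of ln(OR)
se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)

# Limits on the log scale, then back-transform with exp()
lower_log = log_or - 1.96 * se_log_or
upper_log = log_or + 1.96 * se_log_or
lower, upper = math.exp(lower_log), math.exp(upper_log)

print(f"OR = {odds_ratio:.2f}, SE ln(OR) = {se_log_or:.4f}")
print(f"95% CI on log scale: {lower_log:.3f} to {upper_log:.3f}")
print(f"95% CI for OR: {lower:.2f} to {upper:.2f}")
```

Because the interval is symmetric on the log scale, it is asymmetric around the OR itself, with the upper limit further from 1.41 than the lower limit.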

Page 20

What is the interpretation of the OR?

The odds of breast cancer in women with a family history are about 1.41 times those in women without a family history.

Strictly speaking, the OR should be expressed in terms of “odds” (as above).

However, when the outcome is rare (as it generally is for cancer), the OR is approximately equal to the RR, and results are often expressed as risk (i.e., more or less likely to develop cancer).

Page 21

OR is reversible

Disease Odds Ratio:

(odds in favour of disease for the exposed group) / (odds in favour of disease for the unexposed group) = 1.41

Exposure Odds Ratio:

(odds in favour of being exposed for diseased subjects) / (odds in favour of being exposed for non-diseased subjects) = 1.41
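The reversibility of the OR can be checked directly on the family-history counts. This sketch assumes 238 exposed cases, 180 exposed controls, 862 unexposed cases and 920 unexposed controls, as used elsewhere in the lecture. Names are illustrative.

```python
from fractions import Fraction

cases_exposed, controls_exposed = 238, 180
cases_unexposed, controls_unexposed = 862, 920

# Disease OR: odds of disease among exposed / odds of disease among unexposed
disease_or = (Fraction(cases_exposed, controls_exposed)
              / Fraction(cases_unexposed, controls_unexposed))

# Exposure OR: odds of exposure among cases / odds of exposure among controls
exposure_or = (Fraction(cases_exposed, cases_unexposed)
               / Fraction(controls_exposed, controls_unexposed))

print(float(disease_or), float(exposure_or))
```

Both ratios reduce to the same cross-product (a*d)/(b*c). This is why a case-control study, which samples on disease status and observes exposure, can still estimate the disease odds ratio.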

Page 22

MDM2 protein expression and breast cancer prognosis

Prospective Cohort Study

p1 = 28/49 = 0.57
p2 = 94/313 = 0.30

χ² = 12.75, df = 1, p-value < 0.01

RR = (28/49)/(94/313) = 1.90

OR = (28*219)/(21*94) = 3.11

Proportion dying = 34%
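The gap between RR and OR in this cohort reflects how common the outcome is (about a third of the women died). The sketch below computes both measures for the MDM2 counts and, for contrast, for a purely hypothetical table with a rare outcome. The rare-outcome counts are invented for illustration and do not come from any study.

```python
def rr_and_or(a, b, c, d):
    """a, b = events and non-events in the exposed group;
    c, d = events and non-events in the unexposed group."""
    rr = (a / (a + b)) / (c / (c + d))
    odds_ratio = (a * d) / (b * c)
    return rr, odds_ratio


# MDM2 cohort: 28 of 49 exposed died, 94 of 313 unexposed died
mdm2_rr, mdm2_or = rr_and_or(28, 21, 94, 219)

# HYPOTHETICAL rare outcome: 20 of 10,000 exposed vs 10 of 10,000 unexposed
rare_rr, rare_or = rr_and_or(20, 9980, 10, 9990)

print(f"MDM2 cohort:    RR = {mdm2_rr:.2f}, OR = {mdm2_or:.2f}")
print(f"Rare (made up): RR = {rare_rr:.3f}, OR = {rare_or:.3f}")
```

With a common outcome the OR is noticeably further from 1 than the RR, so reading an OR as if it were a risk ratio overstates the effect. With a rare outcome the two nearly coincide.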

Page 23

Caution about Case/Control Studies

- “Recall” bias: subjects with disease may recall their exposures differently from controls

- Biological samples collected after diagnosis may be affected by presence of disease

- Selection of controls extremely important (different population?)

- Treatment of samples from cases and controls must be the same

- Posted paper: Sources of Bias in Specimens for Research About Molecular Markers for Cancer

Page 24

Copyright © American Society of Clinical Oncology

Ransohoff, D. F. et al. J Clin Oncol; 28:698-704 2010

Fig 1. The fundamental comparison in experimental and observational study design

- paper posted on website under resources

Page 25

Page 26

Nested Case-Control Study

Measure of risk: OR

[Figure: nested case-control design. Labels: cohort; measure exposure; follow to identify cases; select cases & subset of controls]

Page 27

Relative Risk

RR = p1/p2

p1 = probability of disease for exposed individuals
p2 = probability of disease for unexposed individuals

Nested Case-Control Study

• Do a prospective cohort study

• Identify cases

• Select controls (randomly) from the cohort study
 - usually matched to case
 - followed same length of time as case
 - match on other characteristics (e.g., age, site etc.)

• Perform measurements of exposure

• Analyze as case-control (Odds Ratio)

- Still requires a cohort study, but fewer measurements are required
- Controls come from the same population as cases
- Measurements from baseline (no recall bias)

Page 28

Page 29

Logistic Regression

• a generalization of chi square to examine the association of a binary variable with one or more independent variables (categorical or continuous)

• Logistic regression quantifies the relationship between a risk factor (or treatment) and a disease, after adjusting for other variables.

• Binary dependent variable: an event which is either present or absent (“success” or “failure”)

• Goal is to examine factors associated with the probability of an event

• uses the method of maximum likelihood rather than least squares

Page 30

How does logistic regression work?

• Logistic regression finds an equation that predicts an outcome variable that is binary from one or more x variables.

Outcome = probability of disease (p)

p = β0 + β1X1 + β2X2…

But… probabilities can only range from 0 to 1, and the right-hand side could be < 0 or > 1 for some values of X.

Solution: use the logit transformation

Page 31

How does logistic regression work?

logit transformation: logit(p) = ln(p/(1-p))

The natural logarithm of the odds can take on any value (negative or positive).

Logistic Regression Model:

ln(Odds) = β0 + β1X1 + β2X2…
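A small sketch of the logit transformation and its inverse, to show why modelling ln(odds) avoids the 0 to 1 restriction on probabilities. The function names logit and inv_logit are common conventions but are assumptions here, not names used in the lecture.

```python
import math


def logit(p):
    """Map a probability in (0, 1) to the log odds, which is unbounded."""
    return math.log(p / (1 - p))


def inv_logit(x):
    """Map any real-valued linear predictor back to a probability in (0, 1)."""
    return 1 / (1 + math.exp(-x))


for p in (0.01, 0.25, 0.5, 0.75, 0.99):
    print(f"p = {p:.2f}  ->  logit(p) = {logit(p):+.3f}")

# Any value of b0 + b1*x, however extreme, maps back to a valid probability
for x in (-10, -1, 0, 1, 10):
    print(f"linear predictor = {x:+d}  ->  p = {inv_logit(x):.5f}")
```

A probability of 0.5 corresponds to a log odds of 0, and probabilities below and above 0.5 give negative and positive log odds respectively.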

Page 32

Logistic Regression family history example

Coefficients:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)  -0.06512     0.04740   -1.374   0.16953
fhx           0.34443     0.10956    3.144   0.00167

ln(Odds) = -0.065 + 0.344x

Intercept (β0): ln(odds) in baseline group (x = 0)

Slope (β1): difference in ln(odds) for a 1-unit increase in x

To interpret, use transformation:

Odds(FHX) / Odds(no FHX) = e^β1 = e^0.344 = 1.41
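To illustrate what "maximum likelihood" means for the coefficient table on this slide, here is a minimal Newton-Raphson sketch that fits ln(odds) = b0 + b1*x to the grouped family-history counts (238 cases and 180 controls with FHX, 862 cases and 920 controls without). It is a hand-rolled illustration under those assumed counts, not the routine that produced the output shown, and the function name fit_logistic is hypothetical.

```python
import math

# (x, events, non-events): counts from the family-history 2 x 2 table
groups = [(1, 238, 180),   # family history: cases, controls
          (0, 862, 920)]   # no family history: cases, controls


def fit_logistic(groups, iterations=25):
    """Newton-Raphson maximum-likelihood fit of ln(odds) = b0 + b1*x
    for grouped binary data with a single predictor x."""
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0            # gradient of the log-likelihood
        h00 = h01 = h11 = 0.0    # information matrix (negative Hessian)
        for x, events, non_events in groups:
            n = events + non_events
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            resid = events - n * p
            w = n * p * (1.0 - p)
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(information) * gradient
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    # Standard errors from the inverse information matrix
    # (evaluated at the last iterate, which has converged by then)
    se0 = math.sqrt(h11 / det)
    se1 = math.sqrt(h00 / det)
    return b0, b1, se0, se1


b0, b1, se0, se1 = fit_logistic(groups)
print(f"(Intercept) {b0:.5f}  SE {se0:.5f}  z {b0 / se0:.3f}")
print(f"fhx         {b1:.5f}  SE {se1:.5f}  z {b1 / se1:.3f}")
print(f"OR = exp(b1) = {math.exp(b1):.2f}")
```

With a single binary predictor the fitted intercept equals ln(862/920) and the slope equals ln(OR), so the regression reproduces the 2 x 2 table result; its value comes from being able to add further variables.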

Page 33

Case Control Study of Family History and Breast Cancer

A little more detail on interpretation….

ln(Odds) = -0.065 + 0.344x

Since there are only 2 values for x (family history: yes/no):

For women with family history (x=1): ln(Odds) = β0 + β1
For women with no family history (x=0): ln(Odds) = β0

Page 34

odds of breast cancer with FHX = (238/418) / (1 - 238/418) = 1.32

odds of breast cancer with no FHX = (862/1782) / (1 - 862/1782) = 0.94

OR = 1.32/0.94 = 1.41

Page 35

Case Control Study of Family History and Breast Cancer

ln(Odds) = -0.065 + 0.344x

Since there are only 2 values for x (family history: yes/no):

For women with family history (x=1): ln(Odds) = β0 + β1 = -0.065 + 0.344 = 0.279 = ln(1.32)

For women with no family history (x=0): ln(Odds) = β0 = -0.065 = ln(0.94)

β1 = difference in ln(odds) between categories = ln(ratio of odds) = 0.279 - (-0.065) = 0.344

OR = 1.32/0.94 = 1.41; e^0.344 = 1.41
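The back-transformation on this slide can be sketched in a few lines. The coefficients are the estimates from the single-predictor model (-0.06512 and 0.34443); variable names are illustrative.

```python
import math

b0, b1 = -0.06512, 0.34443   # intercept and fhx coefficient from the fitted model

# ln(odds) in each group, then exponentiate to get the odds
odds_no_fhx = math.exp(b0)        # x = 0
odds_fhx = math.exp(b0 + b1)      # x = 1

# A difference of b1 on the log scale is a ratio of exp(b1) on the odds scale
odds_ratio = odds_fhx / odds_no_fhx

print(f"odds, no FHX = {odds_no_fhx:.2f}")
print(f"odds, FHX    = {odds_fhx:.2f}")
print(f"OR           = {odds_ratio:.2f}  (exp(b1) = {math.exp(b1):.2f})")
```

The two odds match 862/920 and 238/180 from the 2 x 2 table. Because this is a case-control sample, they describe the sampled data and not the population risk of breast cancer.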

Page 36

Multiple Logistic Regression – Family History Example

Coefficients:
              Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)  -0.015944    0.372387   -0.043   0.96585
fhx           0.355486    0.109996    3.232   0.00123
age          -0.003756    0.004749   -0.791   0.42897
bmi           0.003721    0.010040    0.371   0.71092
HRT           0.204735    0.091312    2.242   0.02495

Note: z test used for coefficients. For the 95% CI, use estimate ± 1.96 x SE.

Page 37

Multiple Logistic Regression – Family History Example

                    OR      2.5 %     97.5 %
(Intercept)  0.9841825  0.4741266   2.042309
fhx          1.4268734  1.1508620   1.771679
age          0.9962506  0.9870106   1.005564
bmi          1.0037280  0.9841700   1.023710
HRT          1.2271996  1.0262560   1.468051

(2.5 % = lower 95% CI limit; 97.5 % = upper 95% CI limit)

Interpretation: The odds of a woman with a family history developing breast cancer are 1.43 times (95% CI 1.15 to 1.77) those of a woman without a family history, after adjustment for age, BMI and HRT use.
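The adjusted odds ratios and approximate confidence limits can be recovered from the coefficient estimates and standard errors using exp(estimate ± 1.96 x SE). The sketch below applies this Wald-type calculation to the values from the multiple regression output. The function name wald_or_ci is hypothetical. The limits agree with the table to about two decimal places; the small differences suggest the table was produced by a slightly different interval method, which is an assumption on my part.

```python
import math

# (estimate, standard error) from the multiple logistic regression output
coefficients = {
    "fhx": (0.355486, 0.109996),
    "age": (-0.003756, 0.004749),
    "bmi": (0.003721, 0.010040),
    "HRT": (0.204735, 0.091312),
}


def wald_or_ci(estimate, se, z=1.96):
    """Odds ratio with an approximate 95% CI: exp(estimate +/- z * SE)."""
    return (math.exp(estimate),
            math.exp(estimate - z * se),
            math.exp(estimate + z * se))


results = {name: wald_or_ci(est, se) for name, (est, se) in coefficients.items()}
for name, (odds_ratio, lower, upper) in results.items():
    print(f"{name:4s} OR = {odds_ratio:.3f}  95% CI {lower:.3f} to {upper:.3f}")
```

An interval that excludes 1 (fhx and HRT here) corresponds to a coefficient whose z test is significant at the 5% level; intervals that include 1 (age and bmi) do not.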

Page 38

Studies with Binary Outcomes - Summary

Ways to examine association:

• chi square test for association (2 x 2 contingency table)

• odds ratio (OR) or relative risk (RR)*
 - test of association, magnitude of risk and CI

• logistic regression
 - OR as measure of risk, CI, and can include other variables of interest

* for a case-control study only the OR is appropriate; for cohort and cross-sectional studies both OR and RR are valid; if the outcome is rare, OR and RR will be similar