Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South...

39
Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental Science, Policy & Management University of California at Berkeley

Transcript of Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South...

Page 1: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Introduction to Likelihood

Meaningful Modeling of Epidemiologic Data, 2012AIMS, Muizenberg, South Africa

Steve Bellan, MPH, PhDDepartment of Environmental Science, Policy & Management

University of California at Berkeley

Page 2: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

barplot(dbinom(x = 0:100, size = 100, prob = .3), names.arg = 0:size)

In a population of 1,000,000 people with a true prevalence of 30%, the probability distribution of number of positive individuals if 100 are sampled:

Page 3: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

In a population of 1,000,000 people with a true prevalence of 30%, the probability distribution of number of positive individuals if 100 are sampled:

> rbinom(n = 1, size = 100, prob = .3)[1] 28

We sample 100 people once and 28 are positive:

Page 4: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

> rbinom(n = 1, size = 100, prob = .3)[1] 28

We sample 100 people once and 28 are positive:

We don’t know the true prevalence!

But we can calculate the probability of 28 or a more extreme value occurring for a given prevalence.

Page 5: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

We sample 100 people once and 28 are positive.

p-value = 0.74

> 2*pbinom(28,100,.3))

[1] 0.7535564

Cumulative Probability & P Values

for 30% prevalence:

p(28 or a more extreme value occurring) =

Page 6: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 7: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 8: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 9: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 10: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 11: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2

Page 12: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

x2x2

x2 x2x2

x2

Page 13: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Which hypotheses do we reject?

IF GIVEN THE HYPOTHESIS

p value < cutoff

THEN REJECT HYPOTHESIS

Cutoff usually chosen as α = 0.05

Page 14: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Which hypotheses do we reject?

Page 15: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Which hypotheses do we NOT reject: CONFIDENCE INTERVAL

Page 16: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

We don’t know the true prevalence, but the probability that we had exactly 28/100 with 30% prevalence is:

> dbinom(x = 28, size = 100, prob = .3)[1] 0.08041202

> rbinom(n = 1, size = 100, prob = .3)[1] 28

We sample 100 people once and 28 are positive:

Let’s take another approach

Page 17: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 18: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 19: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 20: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 21: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 22: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.
Page 23: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Which prevalence gives the greatest probability of observing exactly 28/100?

Page 24: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Which of these prevalence values is most likely given our data?

Page 25: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Maximum Likelihood Estimate parameter value giving greatest probability of the data having occurred.

MLE = 28/100 = 0.28

What do you think is the MLE here?

true unknown value = 0.30different null hypotheses

Page 26: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Defining Likelihood

• L(parameter | data) = p(data | parameter)

• Not a probability distribution.

• Probabilities taken from many different distributions.

function of x

PDF:

function of p

LIKELIHOOD:

Page 27: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Deriving the Maximum Likelihood Estimatemaximize

maximize

minimize

Page 28: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Deriving the Maximum Likelihood Estimate

Page 29: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Deriving the Maximum Likelihood Estimate

The proportion of positives!X X

Page 30: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Maximum Likelihood Estimate

Page 31: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Maximum Likelihood Estimate

Page 32: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Building Confidence IntervalsLikelihood Ratio Test

> qchisq(p = .95, df = 1)[1] 3.841459

So if our α = .05, then we reject any null hypothesis for which

When lMLE - lnull > 1.92,we reject that null hypothesis.

If the null hypothesis were true then

Page 33: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Let’s zoom in…

Building Confidence IntervalsLikelihood Ratio Test

Maximum Likelihood Estimate

Page 34: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Maximum Likelihood Estimate

Building Confidence IntervalsLikelihood Ratio Test

Page 35: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Building Confidence IntervalsLikelihood Ratio Test

Page 36: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Building Confidence IntervalsLikelihood Ratio Test

Page 37: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Comparing Confidence Intervals

Page 38: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

Advantages of Likelihood

• Practical method for estimating parameters estimating variance of our estimates

• Easily adaptable to different probability distributions & dynamic models

Page 39: Introduction to Likelihood Meaningful Modeling of Epidemiologic Data, 2012 AIMS, Muizenberg, South Africa Steve Bellan, MPH, PhD Department of Environmental.

This presentation is made available through a Creative Commons Attribution-Noncommercial license. Details of the license and permitted uses are available at

http://creativecommons.org/licenses/by-nc/3.0/

© 2010 Steve Bellan and the Meaningful Modeling of Epidemiological Data Clinic

Title: Introduction to LikelihoodAttribution: Steve Bellan, Clinic on the Meaningful Modeling of Epidemiological Data

Source URL: http://lalashan.mcmaster.ca/theobio/mmed/index.php/

For further information please contact Steve Bellan ([email protected]).