4.3 GENERALIZED LINEAR MODELS FOR COUNTS

STA 517 – Chp4: Introduction to Generalized Linear Models

count data - assume a Poisson distribution

counts in contingency tables with categorical response variables.

modeling count or rate data for a single discrete response variable.

4.3.1 Poisson Loglinear Models

The Poisson distribution has a positive mean µ. Although a GLM can model a positive mean using the identity link, it is more common to model the log of the mean.

Like the linear predictor, the log mean can take any real value.

The log mean is the natural parameter for the Poisson distribution, and the log link is the canonical link for a Poisson GLM.

A Poisson loglinear GLM assumes a Poisson distribution for Y and uses the log link.

Loglinear model

The Poisson loglinear model with explanatory variable X is

$\log \mu = \alpha + \beta x$

For this model, the mean satisfies the exponential relationship

$\mu = \exp(\alpha + \beta x) = e^{\alpha} (e^{\beta})^{x}$

A 1-unit increase in x has a multiplicative impact of $e^{\beta}$ on µ: the mean at x+1 equals the mean at x multiplied by $e^{\beta}$.
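
To make the multiplicative interpretation concrete, here is a minimal Python sketch (not part of the course's SAS code). It assumes the standard loglinear form log µ = α + βx; the function name `loglinear_mean` and the values of `alpha` and `beta` are hypothetical, chosen only for illustration.

```python
import math

def loglinear_mean(alpha: float, beta: float, x: float) -> float:
    """Mean of a Poisson loglinear model: mu = exp(alpha + beta * x)."""
    return math.exp(alpha + beta * x)

# Hypothetical coefficients, for illustration only (not fitted estimates).
alpha, beta = -3.0, 0.15

for x in (24.0, 25.0, 26.0):
    mu = loglinear_mean(alpha, beta, x)
    ratio = loglinear_mean(alpha, beta, x + 1) / mu
    print(f"x={x}: mu={mu:.3f}, mu(x+1)/mu(x)={ratio:.4f}")

# The ratio is the same at every x and equals exp(beta).
print(f"exp(beta)={math.exp(beta):.4f}")
```

Whatever the starting value of x, the ratio of successive means is constant, which is what "multiplicative impact" means here.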

4.3.2 Horseshoe Crab Mating Example

A study of nesting horseshoe crabs. Each female horseshoe crab had a male crab resident in her nest.

AIM: investigate factors affecting whether the female crab had any other males, called satellites, residing nearby.

Explanatory variables: C - the female crab's color, S - spine condition, Wt - weight, W - carapace width.

Outcome: number of satellites (Sa) of a female crab.

For now, we study only W (carapace width).

Number of satellites (Sa) = f(W)

Scatter plot: weakly linear? (N = 173)

Grouped plot: To get a clearer picture, we grouped the female crabs into width categories and calculated the sample mean number of satellites for female crabs in each category.

Figure 4.4 plots these sample means against the sample mean width for crabs in each category.

The sample means show a strong increasing trend.

WHY?

SAS code

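/* Horseshoe crab data: color, spine condition, width, weight, satellites */
data table4_3;
  input C S W Wt Sa @@;
  cards;
2 3 28.3 3.05 8 3 3 22.5 …
;

/* Poisson GLM with identity link: mu = alpha + beta*W */
proc genmod data=table4_3;
  model Sa = W / dist=poisson link=identity;
  ods output ParameterEstimates=PE1;
run;

/* Poisson loglinear GLM: log(mu) = alpha + beta*W */
proc genmod data=table4_3;
  model Sa = W / dist=poisson link=log;
  ods output ParameterEstimates=PE2;
run;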

/* Store the identity-link estimates as macro variables */
data _NULL_;
  set PE1;
  if Parameter="Intercept" then call symputx("intercp1", Estimate);
  if Parameter="W" then call symputx("b1", Estimate);
run;

/* Store the log-link estimates as macro variables */
data _NULL_;
  set PE2;
  if Parameter="Intercept" then call symputx("intercp2", Estimate);
  if Parameter="W" then call symputx("b2", Estimate);
run;

/* Fitted means from both models on a grid of widths */
data tmp;
  do W=22 to 32 by 0.01;
    mu1=&intercp1 + &b1*W;        /* identity link */
    mu2=exp(&intercp2 + &b2*W);   /* log link */
    output;
  end;
run;

Graphs

/* Overlay both fitted curves on the raw data */
proc sort data=table4_3; by W; run;
data tmp1; merge table4_3 tmp; by W; run;

symbol1 i=join line=1 color=green value=none;   /* identity-link fit */
symbol2 i=join line=2 color=red value=none;     /* log-link fit */
symbol3 i=none line=3 value=circle;             /* observed counts */

proc gplot data=tmp1;
  plot mu1*W mu2*W Sa*W / overlay;
run;

Group data

/* Group data: assign each crab to a 1-cm width bin labeled by its midpoint */
data table4_3a;
  set table4_3;
  W_g=round(W-0.75)+0.75;
  *if W<23.25 then W_g=22.5;
  *if W>29.25 then W_g=30.5;
run;

/* Per-bin counts, totals, sample means, and sample variances of Sa */
proc sql;
  create table table4_3g as
  select W_g, count(W_g) as Num_of_Cases,
         sum(Sa) as Num_of_Satellites,
         mean(Sa) as Sa_g, var(Sa) as Var_SA
  from table4_3a group by W_g;
quit;

proc print data=table4_3g; run;
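
The grouping rule `W_g = round(W - 0.75) + 0.75` places each width in the 1-cm bin [n + 0.25, n + 1.25) and labels it with the bin midpoint n + 0.75. The following Python sketch (not part of the course's SAS code) mimics that rule; the function name `width_group` is hypothetical, and the test widths are the two values visible in the data line plus the 23.25 boundary from the commented-out statement.

```python
import math

def width_group(w: float) -> float:
    """Midpoint label of the 1-cm bin [n + 0.25, n + 1.25) containing w."""
    # floor(v + 0.5) rounds halves up, matching SAS round() for positive
    # values; Python's built-in round() rounds halves to the nearest even
    # integer, which would put some boundary widths in a different bin.
    return math.floor((w - 0.75) + 0.5) + 0.75

for w in (22.5, 23.25, 28.3):
    print(w, "->", width_group(w))
```

A width exactly on a boundary, such as 23.25, falls into the upper bin.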

SAS output

Obs   W_g     Num_of_Cases   Num_of_Satellites   Sa_g      Var_SA
  1   20.75         1                 0          0.00000     .
  2   21.75         1                 0          0.00000     .
  3   22.75        12                14          1.16667    3.0606
  4   23.75        14                20          1.42857    8.8791
  5   24.75        28                67          2.39286    6.5437
  6   25.75        39               105          2.69231   11.3765
  7   26.75        22                63          2.86364    6.8853
  8   27.75        24                93          3.87500    8.8098
  9   28.75        18                71          3.94444   16.8791
 10   29.75         9                53          5.88889    9.8611
 11   30.75         2                 6          3.00000    0.0000
 12   31.75         2                 6          3.00000    2.0000
 13   33.75         1                 7          7.00000     .
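
Under a Poisson model the variance equals the mean, so the Sa_g and Var_SA columns can be compared directly as a rough check. The Python sketch below (not part of the course's SAS code) uses the values from the SAS output; groups with a single crab have no sample variance and are left out. The variable names are illustrative.

```python
# (W_g, Sa_g, Var_SA) copied from the SAS output; groups with a single
# crab (missing variance) are omitted.
groups = [
    (22.75, 1.16667, 3.0606),
    (23.75, 1.42857, 8.8791),
    (24.75, 2.39286, 6.5437),
    (25.75, 2.69231, 11.3765),
    (26.75, 2.86364, 6.8853),
    (27.75, 3.87500, 8.8098),
    (28.75, 3.94444, 16.8791),
    (29.75, 5.88889, 9.8611),
    (30.75, 3.00000, 0.0000),
    (31.75, 3.00000, 2.0000),
]

for w, mean, var in groups:
    print(f"W_g={w:5.2f}  mean={mean:.3f}  var={var:.3f}  var/mean={var / mean:.2f}")

# Count the groups whose sample variance exceeds the sample mean.
n_over = sum(var > mean for _, mean, var in groups)
print(f"{n_over} of {len(groups)} groups have variance > mean")
```

In the groups with more than a handful of crabs, the sample variance is well above the sample mean, which is the pattern that overdispersion refers to.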

Graphs

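/* Overlay both fitted curves on the grouped sample means */
proc sort data=table4_3g; by W_g; run;   /* merge requires sorted input */
data tmp2; merge table4_3g(rename=(W_g=W)) tmp; by W; run;

symbol1 i=join line=1 color=green value=none;   /* identity-link fit */
symbol2 i=join line=2 color=red value=none;     /* log-link fit */
symbol3 i=none line=3 value=circle;             /* grouped sample means */

proc gplot data=tmp2;
  plot mu1*W mu2*W Sa_g*W / overlay;
run;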

4.3.3 Overdispersion for Poisson GLMs

Solution?

4.3.4 Negative binomial GLMs

/* Fit a negative binomial GLM with identity link to account for overdispersion */
proc genmod data=table4_3;
  model Sa = W / dist=NEGBIN link=identity;
  ods output ParameterEstimates=PE3;
run;

4.3.6 Poisson GLM of independence in I × J contingency tables
