9.7 Poisson regressions for rates


STA 617 – Chp9 Loglinear/Logit Models

In Section 4.3 we introduced Poisson regression for modeling counts. When outcomes occur over time, space, or some other index of size, it is more relevant to model their rate of occurrence than their raw number.

We use a GLM with a Poisson distribution, a log link, and log(index) as an offset.


9.7.1 Analyzing Rates Using Loglinear Models with Offsets

When a response count $n_i$ has index equal to $t_i$, the sample rate is $n_i/t_i$. Its expected value is $\mu_i/t_i$.

With an explanatory variable x, a loglinear model for the expected rate has the form

$$\log(\mu_i / t_i) = \alpha + \beta x_i.$$

This model has the equivalent representation

$$\log \mu_i - \log t_i = \alpha + \beta x_i.$$

The adjustment term, $-\log t_i$, to the log link of the mean is called an offset. The fit corresponds to using $\log t_i$ as a predictor on the right-hand side and forcing its coefficient to equal 1.0.


Then the expected count

$$\mu_i = t_i \exp(\alpha + \beta x_i)$$

is proportional to the index, with proportionality constant depending on the value of x.
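A small numerical sketch of this proportionality. The values of alpha and beta are made up for illustration and are not estimates from any data set: doubling the index t doubles the expected count, while the expected rate stays the same.

```python
import math

# Hypothetical coefficients, chosen only for illustration.
ALPHA = -6.0
BETA = 1.2

def expected_count(t, x, alpha=ALPHA, beta=BETA):
    """Expected count mu = t * exp(alpha + beta * x) under the log-link rate model."""
    return t * math.exp(alpha + beta * x)

def expected_rate(t, x, alpha=ALPHA, beta=BETA):
    """Expected rate mu / t, which does not depend on the index t."""
    return expected_count(t, x, alpha, beta) / t

mu_1000 = expected_count(1000, x=1)
mu_2000 = expected_count(2000, x=1)

print(mu_2000 / mu_1000)                                    # ratio of expected counts
print(expected_rate(1000, x=1), expected_rate(2000, x=1))  # same rate for both t
```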

An alternative is the identity link, $\mu_i / t_i = \alpha + \beta x_i$. It is less useful, because the fitting process may fail when it produces negative fitted values.

However, the log link has a drawback of its own: when the rate is a probability, the fitted value may exceed 1.
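Both problems are easy to see numerically. In the sketch below the coefficients are hypothetical, picked only so that each link leaves its valid range at x = 1.

```python
import math

def identity_rate(x, alpha=0.002, beta=-0.003):
    """Identity link: rate = alpha + beta * x, not constrained to be >= 0."""
    return alpha + beta * x

def log_link_rate(x, alpha=-0.5, beta=0.8):
    """Log link: rate = exp(alpha + beta * x), always positive but unbounded above."""
    return math.exp(alpha + beta * x)

print(identity_rate(1))  # negative, so not a valid rate
print(log_link_rate(1))  # above 1, so not a valid probability if the index counts trials
```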


9.7.2 Modeling Death Rates for Heart Valve Operations

Laird and Olivier (1981) analyzed patient survival after heart valve replacement operations.

A sample of 109 patients was classified by type of heart valve (aortic, mitral) and by age (<55, 55+).

Follow-up observations occurred until the patient died or the study ended.

Operations occurred throughout the study period, and follow-up observations covered lengths of time varying from 3 to 97 months.

Response: death, together with the corresponding follow-up time


The time at risk for a subject is their follow-up time of observation.

For a given age and valve type, the total time at risk is the sum of the times at risk for all subjects in that cell (those who died and those censored).
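As a sketch of this bookkeeping, the snippet below collapses subject-level records for one cell into a death count and a total time at risk. The five records are invented for illustration; the study's subject-level data are not shown in these slides.

```python
# Hypothetical follow-up records for one age-by-valve cell:
# (months of follow-up, True if the patient died, False if censored).
cell_records = [(12, True), (97, False), (40, False), (3, True), (65, False)]

def summarize_cell(records):
    """Return (number of deaths, total months at risk) for one cell."""
    deaths = sum(1 for _, died in records if died)
    # Every subject contributes follow-up time, whether they died or were censored.
    time_at_risk = sum(months for months, _ in records)
    return deaths, time_at_risk

deaths, time_at_risk = summarize_cell(cell_records)
print(deaths, time_at_risk, deaths / time_at_risk)
```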


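We now model the effects of age and valve type on the death rate:

$$\log(\mu / t) = \alpha + \beta_1 a + \beta_2 v,$$

where a indicates age group and v indicates type of valve. Alternatively, with the identity link,

$$\mu / t = \alpha + \beta_1 a + \beta_2 v.$$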


SAS code

data table9_11;
  input age $ vtype $ death totaltime;
  logtime = log(totaltime);  /* offset: log of total months at risk */
  cards;
<55 aortic 4 1259
<55 mitral 1 2082
55+ aortic 7 1417
55+ mitral 9 1647
;
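Before any model is fit, the sample rates can be computed directly from these four cells. A short sketch, with the counts and times copied from the data step above; reporting the rate per 1,000 months is a choice made here for readability.

```python
# Deaths and total months at risk for each age-by-valve cell.
cells = {
    ("<55", "aortic"): (4, 1259),
    ("<55", "mitral"): (1, 2082),
    ("55+", "aortic"): (7, 1417),
    ("55+", "mitral"): (9, 1647),
}

# Sample rate = deaths / time at risk, in deaths per month.
sample_rates = {cell: deaths / time for cell, (deaths, time) in cells.items()}

for (age, valve), rate in sample_rates.items():
    print(f"{age:>3} {valve:<6} {1000 * rate:.2f} deaths per 1,000 months at risk")
```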


Model fit

/* main effects of age and valve type; log link with offset log(totaltime) */
proc genmod data=table9_11; class age vtype;
  model death = age vtype / dist=poi link=log offset=logtime lrci type3 obstats;
run;
/* age only */
proc genmod data=table9_11; class age vtype;
  model death = age / dist=poi link=log offset=logtime lrci type3 obstats;
run;
/* valve type only */
proc genmod data=table9_11; class age vtype;
  model death = vtype / dist=poi link=log offset=logtime lrci type3 obstats;
run;
/* identity link */
proc genmod data=table9_11; class age vtype;
  model death/totaltime = age vtype / dist=poi link=identity lrci type3 obstats;
  ods output obstats=obstats Modelfit=Modelfit;
run;
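For readers without SAS, the first model above (main effects of age and valve type, log link, offset log(totaltime)) can be fit by Newton-Raphson in plain Python. This is a sketch, not a reproduction of PROC GENMOD's algorithm or output. The 0/1 coding (55+ = 1, mitral = 1), the starting values, and the fixed number of iterations are assumptions made here, so the signs of the coefficients may differ from those SAS prints under its own reference coding.

```python
import math

# (age 55+ indicator, mitral indicator, deaths, months at risk)
DATA = [(0, 0, 4, 1259), (0, 1, 1, 2082), (1, 0, 7, 1417), (1, 1, 9, 1647)]

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fitted_counts(data, beta):
    """mu = t * exp(b0 + b1*age + b2*valve); the offset log(t) has coefficient 1."""
    return [t * math.exp(beta[0] + beta[1] * a + beta[2] * v) for a, v, _, t in data]

def fit(data, iterations=25):
    """Newton-Raphson for the Poisson log-likelihood with offset log(t)."""
    total_deaths = sum(d for _, _, d, _ in data)
    total_time = sum(t for _, _, _, t in data)
    beta = [math.log(total_deaths / total_time), 0.0, 0.0]  # start at the overall rate
    for _ in range(iterations):
        score = [0.0] * 3
        info = [[0.0] * 3 for _ in range(3)]
        for (a, v, d, _), mu in zip(data, fitted_counts(data, beta)):
            x = (1.0, float(a), float(v))
            for j in range(3):
                score[j] += x[j] * (d - mu)          # X'(y - mu)
                for k in range(3):
                    info[j][k] += x[j] * mu * x[k]   # X' diag(mu) X
        beta = [b + s for b, s in zip(beta, solve(info, score))]
    return beta

beta = fit(DATA)
mu = fitted_counts(DATA, beta)
print("coefficients (intercept, age 55+, mitral):", [round(b, 3) for b in beta])
print("fitted deaths:", [round(m, 2) for m in mu])
```

Under this coding, exp(beta[1]) is the estimated ratio of death rates, older versus younger, at a fixed valve type. A quick sanity check is that the fitted deaths reproduce the observed totals by age group and by valve type, which the likelihood equations of this model require.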


Under the identity-link model, the age coefficient is the estimated difference in death rates between the older and younger age groups for each valve type.


Another example

Data: 2004 Florida birth vital statistics merged with death records

The predictors: smoking, drinking, education, marital status, Medicaid.

The response: infant death

Purpose: to identify the maternal characteristics of Medicaid beneficiaries that are significantly associated with infant death, so that health care and related services can be focused on the risk factors that contribute to this adverse outcome


 #   Smoking  Drinking  Mother's education  Married?  Medicaid  Total    Infant deaths
 1   Yes      Yes       < HS                No        Yes       102      0
 2   Yes      Yes       < HS                No        No        15       0
 3   Yes      Yes       < HS                Yes       Yes       36       1
 4   Yes      Yes       < HS                Yes       No        12       0
 5   Yes      Yes       HS                  No        Yes       80       1
 6   Yes      Yes       HS                  No        No        21       0
 7   Yes      Yes       HS                  Yes       Yes       37       0
 8   Yes      Yes       HS                  Yes       No        35       0
 9   Yes      Yes       > HS                No        Yes       51       0
10   Yes      Yes       > HS                No        No        19       0
11   Yes      Yes       > HS                Yes       Yes       17       0
12   Yes      Yes       > HS                Yes       No        45       0
13   Yes      No        < HS                No        Yes       3,964    35
...
46   No       No        > HS                No        No        6,920    49
47   No       No        > HS                Yes       Yes       12,289   62
48   No       No        > HS                Yes       No        64,730   265


/* raw table: births and infant deaths by level of each predictor */
proc sql;
  create table rawtable as
  select 'smoking' as varlabel, smoking as varlevel, sum(total) as totalsumple, sum(infdth) as totalinfdth from birth2004 group by smoking
  union select 'drk' as varlabel, drk as varlevel, sum(total) as totalsumple, sum(infdth) as totalinfdth from birth2004 group by drk
  union select 'edu ' as varlabel, edu as varlevel, sum(total) as totalsumple, sum(infdth) as totalinfdth from birth2004 group by edu
  union select 'ms ' as varlabel, ms as varlevel, sum(total) as totalsumple, sum(infdth) as totalinfdth from birth2004 group by ms
  union select 'med ' as varlabel, med as varlevel, sum(total) as totalsumple, sum(infdth) as totalinfdth from birth2004 group by med;
quit;

/* infant deaths as a percentage of births */
data rawtable; set rawtable;
  percentage = totalinfdth/totalsumple*100;
run;
proc print; run;
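The same aggregation, sketched in Python for one grouping variable at a time. Only four of the 48 cells are entered here (rows 13 and 46 to 48 of the table), so the totals printed are not the full-data marginals. The dictionary keys reuse the SAS variable names, and the function name `marginal_table` is made up for this example.

```python
from collections import defaultdict

# Four of the 48 cells; sums over these rows are NOT the full-data totals.
cells = [
    {"smoking": "Yes", "med": "Yes", "total": 3964, "infdth": 35},
    {"smoking": "No", "med": "No", "total": 6920, "infdth": 49},
    {"smoking": "No", "med": "Yes", "total": 12289, "infdth": 62},
    {"smoking": "No", "med": "No", "total": 64730, "infdth": 265},
]

def marginal_table(rows, variable):
    """Sum births and infant deaths within each level of one variable."""
    sums = defaultdict(lambda: [0, 0])
    for row in rows:
        sums[row[variable]][0] += row["total"]
        sums[row[variable]][1] += row["infdth"]
    return {
        level: {"total": n, "infdth": d, "percentage": 100 * d / n}
        for level, (n, d) in sums.items()
    }

for variable in ("smoking", "med"):
    for level, summary in marginal_table(cells, variable).items():
        print(variable, level, summary["total"], summary["infdth"],
              round(summary["percentage"], 3))
```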


/* backward model selection, starting from main effects + two-factor interactions */
proc genmod data=birth2004; class smoking drk edu ms med;
  model infdth = smoking drk edu ms med
        smoking*drk smoking*edu smoking*ms smoking*med
        drk*edu drk*ms drk*med
        edu*ms edu*med ms*med
        / dist=poi link=log offset=logtotal lrci type3;
  ods output type3=type3;
run;
/* list the Type 3 tests in order of p-value */
proc sort data=type3; by ProbChiSq; run;
proc print data=type3; run;


Main effects + two-factor interactions

There is no evidence of lack of fit.

However, the model might be more complicated than necessary.


Backward model selection

Sort the Type 3 table by p-value and delete the least significant term, drk*ms.


Continue the backward procedure, but keep a main effect, even if it is not significant, as long as it is included in an interaction that remains in the model.

Terms deleted, in order: smoking*med, drk*med, smoking*edu, edu*med, edu*ms, smoking*drk, smoking*ms, drk*edu, drk
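The deletion rule can be written as a small helper that picks the next term to drop. This is a sketch of the rule only: the p-values below are invented, the 0.05 cutoff is an assumption (the slides do not state one), and in practice the model is refit and the Type 3 p-values recomputed after every deletion.

```python
def removable_terms(terms):
    """Terms that can be dropped without breaking the model hierarchy.

    Interactions are always removable; a main effect is protected while
    any remaining interaction contains it.
    """
    protected = {factor for t in terms if "*" in t for factor in t.split("*")}
    return [t for t in terms if "*" in t or t not in protected]

def next_to_drop(pvalues, alpha=0.05):
    """Return the removable term with the largest p-value above alpha, or None."""
    candidates = [t for t in removable_terms(list(pvalues)) if pvalues[t] > alpha]
    if not candidates:
        return None
    return max(candidates, key=lambda t: pvalues[t])

# Hypothetical Type 3 p-values, invented for illustration only.
pvalues = {"smoking": 0.002, "drk": 0.70, "edu": 0.001, "ms": 0.40,
           "med": 0.55, "ms*med": 0.01, "drk*edu": 0.30}

first = next_to_drop(pvalues)   # drk is protected by drk*edu, so drk*edu goes first
print(first)
```

Once `drk*edu` is removed, `drk` is no longer protected and becomes the next candidate, while `ms` and `med` stay in the model because `ms*med` does.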


Final model

proc genmod data=birth2004;
  class smoking drk edu ms med;
  model infdth = smoking edu ms med ms*med
        / dist=poi link=log offset=logtotal lrci type3;
  ods output type3=type3;
run;
proc sort data=type3; by ProbChiSq; run;
proc print data=type3; run;


lsmeans smoking edu ms med ms*med /diff;

Effect   Level            Compared with     Estimate  Std Err
smoking  No               Yes               -0.27     0.0875
edu      <HS              >HS                0.2856   0.0787
edu      <HS              HS                 0.0156   0.0691
edu      >HS              HS                -0.27     0.069
ms       No               Yes                0.4503   0.0637
med      No               Yes               -0.0169   0.0653
ms*med   ms=No, med=No    ms=No, med=Yes     0.1082   0.0921
ms*med   ms=No, med=No    ms=Yes, med=No     0.5754   0.0983
ms*med   ms=No, med=No    ms=Yes, med=Yes    0.4334   0.105
ms*med   ms=No, med=Yes   ms=Yes, med=No     0.4672   0.0751
ms*med   ms=No, med=Yes   ms=Yes, med=Yes    0.3252   0.0786
ms*med   ms=Yes, med=No   ms=Yes, med=Yes   -0.1419   0.0881


Relative Risks

Effect   Level            Compared with     RR     LB     UB
smoking  No               Yes               0.763  0.643  0.906
edu      <HS              >HS               1.331  1.14   1.552
edu      <HS              HS                1.016  0.887  1.163
edu      >HS              HS                0.763  0.667  0.874
ms       No               Yes               1.569  1.385  1.778
med      No               Yes               0.983  0.865  1.118
ms*med   ms=No, med=No    ms=No, med=Yes    1.114  0.93   1.335
ms*med   ms=No, med=No    ms=Yes, med=No    1.778  1.466  2.155
ms*med   ms=No, med=No    ms=Yes, med=Yes   1.543  1.256  1.895
ms*med   ms=No, med=Yes   ms=Yes, med=No    1.596  1.377  1.848
ms*med   ms=No, med=Yes   ms=Yes, med=Yes   1.384  1.187  1.615
ms*med   ms=Yes, med=No   ms=Yes, med=Yes   0.868  0.73   1.031
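The slides do not say how the relative risks were computed, but the numbers appear consistent with RR = exp(estimate) and Wald limits exp(estimate ± 1.96 × Std Err). The sketch below checks this against the printed values. The 1.96 multiplier (95% limits) is an assumption inferred from the numbers, and small gaps are expected because the printed estimates are rounded.

```python
import math

# (effect, estimate, std err, printed RR, printed LB, printed UB), in table order.
ROWS = [
    ("smoking", -0.27, 0.0875, 0.763, 0.643, 0.906),
    ("edu", 0.2856, 0.0787, 1.331, 1.14, 1.552),
    ("edu", 0.0156, 0.0691, 1.016, 0.887, 1.163),
    ("edu", -0.27, 0.069, 0.763, 0.667, 0.874),
    ("ms", 0.4503, 0.0637, 1.569, 1.385, 1.778),
    ("med", -0.0169, 0.0653, 0.983, 0.865, 1.118),
    ("ms*med", 0.1082, 0.0921, 1.114, 0.93, 1.335),
    ("ms*med", 0.5754, 0.0983, 1.778, 1.466, 2.155),
    ("ms*med", 0.4334, 0.105, 1.543, 1.256, 1.895),
    ("ms*med", 0.4672, 0.0751, 1.596, 1.377, 1.848),
    ("ms*med", 0.3252, 0.0786, 1.384, 1.187, 1.615),
    ("ms*med", -0.1419, 0.0881, 0.868, 0.73, 1.031),
]

Z = 1.96  # assumed normal quantile for 95% Wald limits

def relative_risk(estimate, std_err, z=Z):
    """Return (RR, lower, upper) on the rate-ratio scale."""
    return (math.exp(estimate),
            math.exp(estimate - z * std_err),
            math.exp(estimate + z * std_err))

max_gap = 0.0
for effect, est, se, rr, lb, ub in ROWS:
    computed = relative_risk(est, se)
    gap = max(abs(c - p) for c, p in zip(computed, (rr, lb, ub)))
    max_gap = max(max_gap, gap)
    print(effect, [round(c, 3) for c in computed], "printed:", (rr, lb, ub))

print("largest gap between computed and printed values:", round(max_gap, 4))
```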