The Clayton & Hills book Mean blood pressure in patients ...publicifsv.sund.ku.dk/~nk/epiE12/PKA...

Ph.D. course in Epidemiology: Fall 2012. Regression models, Clayton & Hills, Ch. 22-26. 23, 30 October, 6 November 2012. www.biostat.ku.dk/~nk/epiE12. Per Kragh Andersen and Henrik Ravn. The Clayton & Hills book: 1. Introductory concepts and methods, ch. 1-11. 2. Simple methods for cohort and case-control studies, ch. 13-19. 3. Regression models, ch. 22-27. In the first part a general method to estimate parameters in statistical models is introduced: LIKELIHOOD. In some examples we may be able to come up with a suggestion for the parameter estimate without this general tool: mean blood pressure in patients with heart disease, probability of heads in coin tossing, risk of death in cohort study, cancer rate among asbestos workers. But: 1. What about confidence intervals and significance tests? 2. What about examples like: Effect of drug on blood pressure for heart patients adjusted for sex, age and weight? Risk of death in cohort study and its dependence on age, cholesterol, stage of disease? Cancer rate adjusted for age, degree of exposure, duration of exposure? In these examples we are not able to suggest a simple, obvious method and we have to rely on our great tool! Regression models: an outcome variable is related to explanatory variables. In epidemiology, binary outcome: rate (Poisson, Cox), odds or probability (logistic). In other fields, sometimes quantitative outcome: mean value (Gaussian; C & H, Ch. 34).


Page 1: The Clayton & Hills book Mean blood pressure in patients ...publicifsv.sund.ku.dk/~nk/epiE12/PKA slides regression 4 pr. side.pdf · The Clayton & Hills book 1. Introductory concepts

Ph.D. course in Epidemiology: Fall 2012
Regression models
Clayton & Hills, Ch. 22-26
23, 30 October, 6 November 2012
www.biostat.ku.dk/~nk/epiE12
Per Kragh Andersen and Henrik Ravn

1

The Clayton & Hills book

1. Introductory concepts and methods, ch. 1-11.
2. Simple methods for cohort and case-control studies, ch. 13-19.
3. Regression models, ch. 22-27.

In the first part a general method to estimate parameters in statistical models is introduced: LIKELIHOOD

In some examples we may be able to come up with a suggestion for the parameter estimate without this general tool:
• Mean blood pressure in patients with heart disease,
• probability of heads in coin tossing,
• risk of death in cohort study,
• cancer rate among asbestos workers

2

But:

1. What about confidence intervals and significance tests?
2. What about examples like:
• Effect of drug on blood pressure for heart patients adjusted for sex, age and weight?
• Risk of death in cohort study and its dependence on age, cholesterol, stage of disease?
• Cancer rate adjusted for age, degree of exposure, duration of exposure?

In these examples we are not able to suggest a simple, obvious method and we have to rely on our great tool!

3

Regression models.
- Outcome variable is related to
- Explanatory variables.

In epidemiology: binary outcome
  rate                   Poisson, Cox
  odds or probability    logistic

In other fields, sometimes quantitative outcome:
  mean value             Gaussian        C & H, Ch. 34

4


Regression models.

In epidemiology: explanatory variables
  exposures
  confounders
technically treated identically in regression models (as opposed to stratified analysis):

WEAKNESS/STRENGTH?

5

Example: Age stratified comparison of two rates

Age-band, t = 0, 1, 2: $\lambda^t_1 = \theta \lambda^t_0$
$\lambda^t_1$ ∼ exposed, $\lambda^t_0$ ∼ not exposed.

Exposure is assumed to have the same multiplicative effect in all age-bands; $\theta$ (= $\theta_1$) is this effect (Table 22.1):

         Exposure
Age      0                         1
0        $\lambda^0_0$             $\lambda^0_0 \theta_1$
1        $\lambda^1_0$             $\lambda^1_0 \theta_1$
2        $\lambda^2_0$             $\lambda^2_0 \theta_1$

Similarly, the age effect is described using multiplicative effects relative to a baseline level (= Age 0) (∼ Table 22.2):

         Exposure
Age      0                         1
0        $\lambda^0_0$             $\lambda^0_0 \theta_1$
1        $\lambda^0_0 \phi^1$      $\lambda^0_0 \phi^1 \theta_1$
2        $\lambda^0_0 \phi^2$      $\lambda^0_0 \phi^2 \theta_1$

6

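Example: Age stratified comparison of two rates

Finally: a new name $\lambda_C$ instead of $\lambda^0_0$ for the rate in the reference group (Table 22.3).

         Exposure
Age      0                       1
0        $\lambda_C$             $\lambda_C \theta_1$
1        $\lambda_C \phi^1$      $\lambda_C \phi^1 \theta_1$
2        $\lambda_C \phi^2$      $\lambda_C \phi^2 \theta_1$

Exercise 22.1, p. 219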

7

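Exercise 22.1.

         Exposure
Age      0                              1
0        $\lambda_C$ = 5.0              $\lambda_C \theta_1$ = 15.0
1        $\lambda_C \phi^1$ = 12.0      $\lambda_C \phi^1 \theta_1$ = 36.0
2        $\lambda_C \phi^2$ = 30.0      $\lambda_C \phi^2 \theta_1$ = 90.0

What are the values of $\theta_1$, $\phi^1$, $\phi^2$, $\lambda_C$?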

8


Greek letters are suitable for doing the math, not as names in computer programs.
Instead, use names like
– Age
– Exposure

         Exposure
Age      0                  1
0        Corner             Corner × Exposure(1)
1        Corner × Age(1)    Corner × Age(1) × Exposure(1)
2        Corner × Age(2)    Corner × Age(2) × Exposure(1)

Rate = Corner × Exposure × Age

More often the parameters are written on a log-scale.

log(Rate) = Corner + Exposure + Age

9

An example:

Table 22.6. Energy intake and IHD incidence per 1000 person-years

             Exposed (<2750 kcal)      Unexposed (≥2750 kcal)
Current age  Cases  P-yrs.   Rate      Cases  P-yrs.   Rate     RR
40–49        2      311.9    6.41      4      607.9    6.58     0.97
50–59        12     878.1    13.67     5      1272.1   3.93     3.48
60–69        14     667.5    20.97     8      888.9    9.00     2.33
Total        28     1857.5   15.07     17     2768.9   6.14     2.45

RATE = CORNER × EXPOSURE × AGE

The regression model claims the rate ratios to be constant over age bands.

10

Estimates:

Table 22.7. Estimated values of the parameters for the IHD data

Parameter      Estimate
Corner         0.00444
Exposure(1)    ×2.39
Age(1)         ×1.14
Age(2)         ×2.00

obtained from the likelihood function.

Compare predicted rates with observed rates: Exercise 22.3

11

Exercise 22.3: solution.

Table 22.6. Energy intake and IHD incidence per 1000 person-years (predicted rates in parentheses)

             Exposed (<2750 kcal)            Unexposed (≥2750 kcal)
Current age  Cases  P-yrs.   Rate            Cases  P-yrs.   Rate
40–49        2      311.9    6.41 (10.61)    4      607.9    6.58 (4.44)
50–59        12     878.1    13.67 (12.10)   5      1272.1   3.93 (5.06)
60–69        14     667.5    20.97 (21.22)   8      888.9    9.00 (8.88)

12
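The predicted rates in parentheses follow directly from the multiplicative model Rate = Corner × Exposure × Age. A minimal sketch, assuming the rounded estimates of Table 22.7; the names `CORNER`, `EXPOSURE`, `AGE` and `predicted_rate_per_1000` are illustrative and not part of the course material:

```python
# Multiplicative model: Rate = Corner x Exposure x Age (rounded estimates, Table 22.7).
CORNER = 0.00444                      # rate per person-year in the reference cell
EXPOSURE = {0: 1.0, 1: 2.39}          # 1 = exposed (<2750 kcal)
AGE = {0: 1.0, 1: 1.14, 2: 2.00}      # 0 = 40-49, 1 = 50-59, 2 = 60-69


def predicted_rate_per_1000(age, exposure):
    """Predicted rate per 1000 person-years for one age band / exposure cell."""
    return 1000 * CORNER * AGE[age] * EXPOSURE[exposure]


predicted = {(a, e): round(predicted_rate_per_1000(a, e), 2)
             for a in AGE for e in EXPOSURE}

for (age, exposure), rate in sorted(predicted.items()):
    print(f"age band {age}, exposure {exposure}: {rate:.2f}")
```

Because only four parameters describe six cells, the predicted rates cannot reproduce all six observed rates exactly; the gap is largest in the youngest exposed group, which has only 2 cases.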


The estimates are frequently given on a log scale:

log(RATE) = CORNER + EXPOSURE + AGE

Table 22.8. Estimated parameters and SDs on a log scale

Parameter      Estimate (M)    SD (S)
Corner         -5.4180         0.4420
Exposure(1)    0.8697          0.3080
Age(1)         0.1290          0.4753
Age(2)         0.6920          0.4614

Here, SDs are obtained by Gaussian approximations to the log profile likelihood.

13

Approximate 90% confidence interval for effect of exposure:

exp(0.8697) ×/÷ exp(1.645 · 0.3080)

From 2.39/1.66 = 1.44 to 2.39 · 1.66 = 3.96

Exercise 22.4

14

Exercise 22.4: solution.

Approximate 90% confidence interval for effect of exposure:

exp(0.8697) ×/÷ exp(1.645 · 0.3080)

From 2.39/1.66 = 1.44 to 2.39 · 1.66 = 3.96

– and for the first age effect:

exp(0.1290) ×/÷ exp(1.645 · 0.4753).

From 1.14/2.19 = 0.52 to 1.14 · 2.19 = 2.49

15
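The "multiply and divide by an error factor" recipe can be checked numerically. A short sketch, assuming the log-scale estimates and SDs of Table 22.8; the function name `error_factor_interval` is made up for illustration:

```python
import math


def error_factor_interval(m, s, z=1.645):
    """Interval on the ratio scale for a log-scale estimate m with SD s.

    z = 1.645 gives an approximate 90% interval. Returns (estimate, lower, upper).
    """
    estimate = math.exp(m)
    error_factor = math.exp(z * s)
    return estimate, estimate / error_factor, estimate * error_factor


exposure = error_factor_interval(0.8697, 0.3080)   # Exposure(1)
age1 = error_factor_interval(0.1290, 0.4753)       # Age(1)

for name, (est, lower, upper) in [("Exposure(1)", exposure), ("Age(1)", age1)]:
    print(f"{name}: {est:.2f} (90% CI {lower:.2f} to {upper:.2f})")
```

Working on the log scale and exponentiating at the end keeps the interval inside the positive numbers, which a symmetric interval around the rate ratio itself would not guarantee.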

Poisson regression

Table 22.6 again

Poisson log likelihood for one “cell” with D cases and Y person-years:

$D \log(\lambda) - Y \lambda$

e.g., exposed = level 1, age band = level 2:

$14 \log(\lambda^2_1) - 667.5 \times \lambda^2_1$
= 14 {CORNER + EXPOSURE(1) + AGE(2)} – 667.5 exp{CORNER + EXPOSURE(1) + AGE(2)}

16


Poisson regression

Total log likelihood can be computed from the frequency records:

Table 23.1. The IHD data as frequency records

Cases    Person-years    Age    Exposure
4        607.9           0      0
2        311.9           0      1
5        1272.1          1      0
12       878.1           1      1
8        888.9           2      0
14       667.5           2      1

and maximised.

17
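To make "computed from the frequency records" concrete, the total log likelihood can be written as a sum over the six records. This is only a sketch: it evaluates the log likelihood at given parameter values and leaves the maximisation to software; the names `RECORDS` and `log_likelihood` are illustrative assumptions.

```python
import math

# Frequency records from Table 23.1: (cases D, person-years Y, age, exposure)
RECORDS = [
    (4, 607.9, 0, 0), (2, 311.9, 0, 1),
    (5, 1272.1, 1, 0), (12, 878.1, 1, 1),
    (8, 888.9, 2, 0), (14, 667.5, 2, 1),
]


def log_likelihood(corner, exposure1, age1, age2):
    """Sum of D*log(rate) - Y*rate, with log(rate) = Corner + Exposure + Age."""
    age_effect = {0: 0.0, 1: age1, 2: age2}
    total = 0.0
    for d, y, age, exposed in RECORDS:
        log_rate = corner + exposed * exposure1 + age_effect[age]
        total += d * log_rate - y * math.exp(log_rate)
    return total


# Evaluate at the estimates of Table 22.8 and at a deliberately worse exposure effect.
at_estimates = log_likelihood(-5.4180, 0.8697, 0.1290, 0.6920)
nearby = log_likelihood(-5.4180, 0.5, 0.1290, 0.6920)
print(round(at_estimates, 2), round(nearby, 2))
```

Moving any parameter away from its estimate lowers the log likelihood, which is what "maximised" means here. Statistical packages may add a constant that does not depend on the parameters, so their reported log likelihood can differ from this value without affecting estimates or tests.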

SAS program

/* ***** DATA STEP ******/
data ihd;
input eksp alder pyrs cases;
lpyrs=log(pyrs);
datalines; /* or, alternatively, read from www or from file */
0 2 311.9 2
0 1 878.1 12
0 0 667.5 14
1 2 607.9 4
1 1 1272.1 5
1 0 888.9 8
;
run;

/* ***** PROCedure STEP ******/
proc genmod data=ihd;
class eksp alder;
model cases=eksp alder/dist=poi offset=lpyrs type3;
run;

18

SAS output (edited)

The GENMOD Procedure

Model Information
Data Set              WORK.IHD
Distribution          Poisson
Link Function         Log
Dependent Variable    cases
Offset Variable       lpyrs
Observations Used     6

Class Level Information
Class    Levels    Values
eksp     2         0 1
alder    3         0 1 2

19

Criteria For Assessing Goodness Of Fit

Criterion             DF    Value      Value/DF
Deviance              2     1.6727     0.8364
Scaled Deviance       2     1.6727     0.8364
Pearson Chi-Square    2     1.6516     0.8258
Scaled Pearson X2     2     1.6516     0.8258
Log Likelihood              52.5435

20


Analysis Of Parameter Estimates

                            Standard    Wald 95%                Chi-
Parameter    DF  Estimate   Error       Confidence Limits       Square
Intercept    1   -5.4177    0.4421      -6.2841    -4.5513      150.20
eksp 0       1   0.8697     0.3080      0.2659     1.4734       7.97
eksp 1       0   0.0000     0.0000      0.0000     0.0000       .
alder 0      1   0.6920     0.4614      -0.2123    1.5964       2.25
alder 1      1   0.1290     0.4754      -0.8027    1.0607       0.07
alder 2      0   0.0000     0.0000      0.0000     0.0000       .

Parameter    Pr > ChiSq
Intercept    <.0001
eksp 0       0.0048
eksp 1       .
alder 0      0.1337
alder 1      0.7861
alder 2      .

21

LR Statistics For Type 3 Analysis

                 Chi-
Source    DF     Square    Pr > ChiSq
eksp      1      8.30      0.0040
alder     2      4.02      0.1342

22

Data as individual records: example

Person    date of    date of    date of    exit      energy    age at    age at
no.       birth      entry      exit       status    intake    entry     exit
···       010427     010171     010793     1         2600      43.75     66.25

[Figure: age axis marked at 40, 50, 60, 70, with the follow-up period split at the age-band limits]

Fig. 23.1. Splitting the follow-up record.

23

Contribution to log likelihood:

$0 \times \log(\lambda^0_1) - 6.25 \times \lambda^0_1$
$+ 0 \times \log(\lambda^1_1) - 10 \times \lambda^1_1$
$+ 1 \times \log(\lambda^2_1) - 6.25 \times \lambda^2_1$

Summing over individuals gives total log likelihood

24
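The splitting of one follow-up record into age bands can be sketched in a few lines. This is an illustration of the idea only, not the Lexis macro used in the course; the function name `split_follow_up` and its arguments are assumptions.

```python
def split_follow_up(age_entry, age_exit, status, breaks=(40, 50, 60, 70)):
    """Split one follow-up record into (band, years at risk, events) pieces.

    status is 1 if follow-up ends with the event and 0 otherwise.
    """
    pieces = []
    for band, (lower, upper) in enumerate(zip(breaks[:-1], breaks[1:])):
        start = max(age_entry, lower)
        stop = min(age_exit, upper)
        if stop <= start:
            continue  # no time at risk in this band
        # The event, if any, belongs to the band in which follow-up ends.
        event = status if stop == age_exit else 0
        pieces.append((band, stop - start, event))
    return pieces


# The example person: entry at age 43.75, exit with the event at age 66.25.
pieces = split_follow_up(43.75, 66.25, status=1)
print(pieces)
```

Each piece contributes events × log(rate) minus years × rate to the log likelihood, using the rate for that age band and the person's exposure group, which reproduces the three terms on the slide.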


SAS program - using the Lexis MACRO to the diet data.

data ihdindiv;
filename dietdata url "http://www.biostat.ku.dk/~pka/epidata/diet.txt";
infile dietdata firstobs=2;
input id doe dox chd dob job month energy height weight fat fibre;
informat doe dox dob mmddyy10.;
format doe dox dob entry exit ddmmyy10.;
exposure=energy<2.75;
fail=chd;
drop job month energy height weight fat fibre;
run;

/* We include the MACRO */
filename lexispr url "http://www.biostat.ku.dk/~pka/epidata/Lexis.sas";
%inc lexispr;

25

/* We compute cases and person-years in 10 year age-intervals using the MACRO */
%lexis(data=ihdindiv,out=ald,entry=doe,exit=dox,fail=fail,
       breaks=40 to 70 by 10,
       origin=dob,
       scale=365.25,
       left=ageinterval);

/* Finally, we apply GENMOD to the individual data */
proc genmod data=ald;
class exposure ageinterval;
model fail=exposure ageinterval/dist=poi offset=lrisk type3;
run;

26

Logistic regression

Table 23.2. Cases of leprosy and controls by age and BCG scar

         Leprosy cases    Healthy population    Odds ratio
BCG      −       +        −         +           estimate
Age
0–4      1       1        7593      11719       0.65
5–9      11      14       7143      10184       0.89
10–14    28      22       5611      7561        0.58
15–19    16      28       2208      8117        0.48
20–24    20      19       2438      5588        0.41
25–29    36      11       4356      1625        0.82
30–34    47      6        5245      1234        0.54
Total    159     101      34594     46028       0.48

27

ω = odds that a person (in the study) is a case

estimated by case/control ratios:

Table 23.3. Case/control ratio (×10^3) by age and BCG scar

         BCG scar
Age      Absent    Present    OR
0-4      0.13      0.08       0.65
5-9      1.54      1.37       0.89
10-14    4.99      2.91       0.58
15-19    7.25      3.45       0.48
20-24    8.20      3.40       0.41
25-29    8.26      6.77       0.82
30-34    8.96      4.86       0.54

28


Bernoulli log likelihood for one “cell” with D cases and H controls:

$D\,[\log(\omega) - \log(1+\omega)] - H \log(1+\omega) = D \log(\omega) - (D+H) \log(1+\omega)$

Model:

log(odds) = CORNER + BCG + AGE

e.g., Scar +, Age 10-14:

$22 \times \log(\omega) - (22 + 7561) \times \log(1+\omega)$
= 22 × {CORNER + BCG(1) + AGE(2)} − 7583 × log(1 + exp{CORNER + BCG(1) + AGE(2)})

29

Data as frequency records:

Table 23.4. The BCG data as frequency records

Cases    Total    Scar    Age
1        7594     0       0
1        11720    1       0
11       7154     0       1
14       10198    1       1
28       5639     0       2
22       7583     1       2
16       2224     0       3
28       8145     1       3
20       2458     0       4
19       5607     1       4
36       4392     0       5
11       1636     1       5
47       5292     0       6
6        1240     1       6

30

Maximum likelihood estimates of parameters in the model:

log(odds) = CORNER + AGE + BCG

Table 23.5

Parameter    Estimate    SD
Corner       -8.880      0.7093
Age(1)       2.624       0.7340
Age(2)       3.583       0.7203
Age(3)       3.824       0.7228
Age(4)       3.900       0.7244
Age(5)       4.156       0.7224
Age(6)       4.158       0.7213
BCG(1)       -0.547      0.1409

Interpretation: Exercise 23.1

31

Exercise 23.1: solution.

Table 23.3. Case/control ratio (×10^3) by age and BCG scar

         BCG scar
Age      Absent    Present    OR raw    OR MLE = exp(−0.547)    OR MH
0-4      0.13      0.08       0.65      0.579                   0.587
5-9      1.54      1.37       0.89      0.579                   0.587
10-14    4.99      2.91       0.58      0.579                   0.587
15-19    7.25      3.45       0.48      0.579                   0.587
20-24    8.20      3.40       0.41      0.579                   0.587
25-29    8.26      6.77       0.82      0.579                   0.587
30-34    8.96      4.86       0.54      0.579                   0.587

32
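The three odds-ratio columns can be recomputed from the counts of Table 23.2. The sketch below assumes the standard Mantel-Haenszel estimator for the "OR MH" column and takes the BCG(1) estimate −0.547 from Table 23.5; the names `TABLE_23_2`, `odds_ratio` and `mantel_haenszel` are illustrative.

```python
import math

# Table 23.2, per age band: (cases BCG-, cases BCG+, controls BCG-, controls BCG+)
TABLE_23_2 = {
    "0-4": (1, 1, 7593, 11719),
    "5-9": (11, 14, 7143, 10184),
    "10-14": (28, 22, 5611, 7561),
    "15-19": (16, 28, 2208, 8117),
    "20-24": (20, 19, 2438, 5588),
    "25-29": (36, 11, 4356, 1625),
    "30-34": (47, 6, 5245, 1234),
}


def odds_ratio(d0, d1, h0, h1):
    """Odds ratio for BCG scar present versus absent in one age band."""
    return (d1 / h1) / (d0 / h0)


def mantel_haenszel(table):
    """Mantel-Haenszel summary odds ratio over the age strata."""
    num = sum(d1 * h0 / (d0 + d1 + h0 + h1) for d0, d1, h0, h1 in table.values())
    den = sum(d0 * h1 / (d0 + d1 + h0 + h1) for d0, d1, h0, h1 in table.values())
    return num / den


raw_or = {age: odds_ratio(*cells) for age, cells in TABLE_23_2.items()}
or_mh = mantel_haenszel(TABLE_23_2)
or_mle = math.exp(-0.547)   # from the logistic regression estimate of BCG(1)

for age, value in raw_or.items():
    print(f"{age}: raw OR {value:.2f}")
print(f"MH {or_mh:.3f}, MLE {or_mle:.3f}")
```

The age-specific ratios scatter around a common value; the regression model and the Mantel-Haenszel estimator both summarise them with a single age-adjusted odds ratio, and the two summaries nearly coincide.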


We know that if

π = risk of failure in the study base

then

$\omega = K \times \frac{\pi}{1-\pi}$

where

K = Prob(a “failure” is included as case) / Prob(a “survivor” is included as control)

33

Then

$\log\frac{\pi}{1-\pi}$ = Corner + Age + BCG

⟹ $\log(\omega)$ = log(K) + Corner + Age + BCG

same odds ratios when K does not depend on Age and BCG.

But the estimated Corner parameter cannot be interpreted.

34

SAS program

data bcgdata;
filename bcgfile url 'http://www.biostat.ku.dk/~pka/epidata/bcgalldata.txt';
infile bcgfile firstobs=2;
input age scar status $ n;
run;

proc genmod data=bcgdata;
where status='case' or status='conall';
class age scar;
model status=age scar/dist=bin link=logit type3;
weight n;
run;

35

The GENMOD Procedure

Model Information
Data Set                       WORK.BCGDATA
Distribution                   Binomial
Link Function                  Logit
Dependent Variable             status
Scale Weight Variable          n
Number of Observations Read    28
Number of Observations Used    28
Sum of Weights                 80882
Number of Events               14
Number of Trials               28

Class Level Information
Class    Levels    Values
age      7         1 2 3 4 5 6 7
scar     2         0 1

36


Response Profile

Ordered               Total
Value      status     Frequency
1          case       260
2          conall     80622

PROC GENMOD is modeling the probability that status='case'. One way to change this to model the probability that status='conall' is to specify the DESCENDING option in the PROC statement.

Criteria For Assessing Goodness Of Fit

Criterion             DF    Value         Value/DF
Deviance              20    3288.0412     164.4021
Scaled Deviance       20    3288.0412     164.4021
Pearson Chi-Square    20    81842.1402    4092.1070
Scaled Pearson X2     20    81842.1402    4092.1070
Log Likelihood              -1644.0206

37

Analysis Of Parameter Estimates

                            Standard    Wald 95%                Chi-
Parameter    DF  Estimate   Error       Confidence Limits       Square
Intercept    1   -5.2695    0.1855      -5.6330    -4.9060      807.26
age 1        1   -4.1576    0.7222      -5.5731    -2.7422      33.14
age 2        1   -1.5341    0.2475      -2.0193    -1.0489      38.40
age 3        1   -0.5745    0.2028      -0.9720    -0.1771      8.03
age 4        1   -0.3335    0.2193      -0.7633    0.0963       2.31
age 5        1   -0.2575    0.2210      -0.6906    0.1756       1.36
age 6        1   -0.0020    0.2014      -0.3967    0.3927       0.00
age 7        0   0.0000     0.0000      0.0000     0.0000       .
scar 0       1   0.5471     0.1409      0.2709     0.8232       15.07
scar 1       0   0.0000     0.0000      0.0000     0.0000       .
Scale        0   1.0000     0.0000      1.0000     1.0000

38

Analysis Of Parameter Estimates

Parameter    Pr > ChiSq
Intercept    <.0001
age 1        <.0001
age 2        <.0001
age 3        0.0046
age 4        0.1283
age 5        0.2439
age 6        0.9920
age 7        .
scar 0       0.0001
scar 1       .
Scale

NOTE: The scale parameter was held fixed.

39

LR Statistics For Type 3 Analysis

                 Chi-
Source    DF     Square    Pr > ChiSq
age       6      181.18    <.0001
scar      1      15.30     <.0001

40


Data as individual records:

Log likelihood contribution for one person (D = 1 for a case, D = 0 for a control):

$D \log(\omega) - \log(1+\omega)$

Summing over persons gives total log likelihood.

41

SAS program - melanoma data.

data mel;
filename melfile url "http://www.biostat.ku.dk/~pka/epidata/melanom.txt";
infile melfile firstobs=2;
input casecon sex brevald agr hudfarve hair eyes fregner akutrea kronrea nvsmall nvlarge nvtot ant15;
run;

proc genmod data=mel descending;
class hudfarve;
model casecon = hudfarve/ dist=bin type3;
run;

42

Age matching (group matching)

Table 23.6. A simulated group-matched study

         Cases         Controls
BCG      −      +      −       +
Age
0–4      1      1      3       5
5–9      11     14     48      52
10–14    28     22     67      133
15–19    16     28     46      130
20–24    20     19     50      106
25–29    36     11     126     62
30–34    47     6      174     38

Here, K does depend on Age!

43

Case/control ratios:

         Absent    Present
0-4      0.33      0.20
5-9      0.23      0.27
10-14    0.42      0.17
15-19    0.35      0.22
20-24    0.40      0.18
25-29    0.29      0.18
30-34    0.27      0.16

44


We know that we should correct for age, though the Age estimates in the model:

log(ω) = Corner + Age + BCG

cannot be interpreted (related to the study base)

           Cases         Controls      Odds
Stratum    +      −      +      −      ratio
1          89     11     80     20     2.0
2          67     33     50     50     2.0
3          33     67     20     80     2.0
Total      189    111    150    150    1.7

Table 18.4. Bias due to ignoring matching

45
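The bias in Table 18.4 is easy to reproduce: within every stratum the odds ratio is about 2.0, yet the table of totals gives 1.7. A minimal sketch, assuming that "+" and "−" denote exposed and unexposed; the names `STRATA` and `odds_ratio` are illustrative.

```python
# Table 18.4, per stratum: (cases +, cases -, controls +, controls -)
STRATA = [(89, 11, 80, 20), (67, 33, 50, 50), (33, 67, 20, 80)]


def odds_ratio(case_pos, case_neg, ctrl_pos, ctrl_neg):
    """Exposure odds ratio from one 2x2 table."""
    return (case_pos * ctrl_neg) / (case_neg * ctrl_pos)


stratum_or = [odds_ratio(*s) for s in STRATA]

# Ignoring the matching means adding the strata up before computing the odds ratio.
totals = tuple(sum(column) for column in zip(*STRATA))
pooled_or = odds_ratio(*totals)

print([round(v, 1) for v in stratum_or], round(pooled_or, 1))
```

The pooled estimate is pulled towards 1 because matching makes the controls resemble the cases with respect to the stratifying variable, so the matching variable has to stay in the model even though its own coefficients carry no interpretation.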

Matched and unmatched analysis.

             Unmatched              Age-matched
Parameter    Estimate    SD         Estimate    SD
Corner       -8.880      0.7093     -1.0670     0.800
Age(1)       2.624       0.7340     -0.0421     0.827
Age(2)       3.583       0.7203     0.0119      0.812
Age(3)       3.824       0.7228     0.0713      0.814
Age(4)       3.900       0.7244     0.0244      0.816
Age(5)       4.156       0.7224     -0.1628     0.814
Age(6)       4.158       0.7213     -0.2380     0.813
BCG(1)       -0.547      0.1409     -0.5721     0.155

46

Can we ever interpret the Corner parameter?

Yes, in cumulative incidence or prevalence studies where the absolute risk/prevalence can be estimated.

47

Hypothesis tests, ch. 24.

Wald test for a single parameter:

$\left(\frac{M - 0}{S}\right)^2 \sim \chi^2_1$ (chi-square)

directly based on computer output:

Table 24.1. Program output for the diet data

Parameter      Estimate    SD        W
Corner         -5.4180     0.4420
Exposure(1)    0.8697      0.3080    7.97
Age(1)         0.1290      0.4753    0.07
Age(2)         0.6920      0.4614    2.25

48
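The W column is just (M/S)² for each parameter. A short sketch that recomputes it from Table 24.1 and adds the chi-square p-value on 1 degree of freedom; the helper names `wald` and `p_value_1df` are assumptions, and the p-value uses the identity P(χ²₁ > w) = erfc(√(w/2)).

```python
import math

# (estimate M, standard deviation S) on the log scale, from Table 24.1
ESTIMATES = {
    "Exposure(1)": (0.8697, 0.3080),
    "Age(1)": (0.1290, 0.4753),
    "Age(2)": (0.6920, 0.4614),
}


def wald(m, s, null_value=0.0):
    """Wald chi-square statistic for the hypothesis 'parameter = null_value'."""
    return ((m - null_value) / s) ** 2


def p_value_1df(w):
    """Upper tail probability of the chi-square distribution with 1 d.f."""
    return math.erfc(math.sqrt(w / 2))


wald_stats = {name: wald(m, s) for name, (m, s) in ESTIMATES.items()}
p_values = {name: p_value_1df(w) for name, w in wald_stats.items()}

for name in ESTIMATES:
    print(f"{name}: W = {wald_stats[name]:.2f}, p = {p_values[name]:.4f}")
```

These values agree with the Chi-Square and Pr > ChiSq columns of the GENMOD output for the same model.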


Likelihood ratio test: compare max. log likelihoods “under” and “outside” the hypothesis.

Test statistic = -2 × difference between max. log likelihoods

Model                       Max. log likelihood    # parameters
Corner + Age + Exposure     -247.03                4
Corner + Exposure           -249.04                2
Corner + Age                -251.18                3

# parameters removed = # d.f. in χ²

Exposure: 8.30, 1 d.f.
Age: 4.02, 2 d.f.

Models have to be “nested” - we cannot compare the last two models in the table.

49
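The two likelihood ratio tests can be reproduced from the table of maximised log likelihoods. A sketch under the assumption that only 1 or 2 degrees of freedom are needed, for which the chi-square tail probability has a simple closed form; `MAX_LOG_LIK`, `lr_test` and `chi2_sf` are illustrative names.

```python
import math

# model: (max. log likelihood, number of parameters)
MAX_LOG_LIK = {
    "Corner + Age + Exposure": (-247.03, 4),
    "Corner + Exposure": (-249.04, 2),
    "Corner + Age": (-251.18, 3),
}


def lr_test(full, reduced):
    """Return (statistic, d.f.) for a reduced model nested in the full model."""
    ll_full, k_full = MAX_LOG_LIK[full]
    ll_reduced, k_reduced = MAX_LOG_LIK[reduced]
    return -2 * (ll_reduced - ll_full), k_full - k_reduced


def chi2_sf(x, df):
    """Upper tail probability of the chi-square distribution, df = 1 or 2 only."""
    if df == 1:
        return math.erfc(math.sqrt(x / 2))
    if df == 2:
        return math.exp(-x / 2)
    raise ValueError("this sketch only handles 1 or 2 degrees of freedom")


# Removing Exposure leaves Corner + Age; removing Age leaves Corner + Exposure.
exposure_stat, exposure_df = lr_test("Corner + Age + Exposure", "Corner + Age")
age_stat, age_df = lr_test("Corner + Age + Exposure", "Corner + Exposure")

print(round(exposure_stat, 2), exposure_df, round(chi2_sf(exposure_stat, exposure_df), 4))
print(round(age_stat, 2), age_df, round(chi2_sf(age_stat, age_df), 4))
```

The statistics 8.30 and 4.02 match the LR Statistics For Type 3 Analysis in the SAS output; the p-values differ from SAS only in the last digit because the log likelihoods here are rounded to two decimals.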

Max. log likelihood: measure of goodness of fit of a model:

larger log likelihood ⟹ better fit

Interpretation of –247.03? No!

Sometimes, the deviance is introduced as a supplement to the max. log likelihood.

50

Interaction, sect. 24.3

We have assumed that the effect of exposure is constant over age bands (and vice versa). Is that reasonable?

Or is there interaction between age and exposure?

log(Rate) = Corner + Exposure + Age + Exposure·Age

Note the relationship with the Breslow-Day test for homogeneity over age strata.

However, we now:
• get a quantification of heterogeneity
• are able to adjust for other explanatory variables when examining interaction

51

Table 24.5. Estimates of parameters in the model with interaction

Parameter             Estimate    SD
Corner                -5.0237     0.500
Exposure(1)           -0.0258     0.866
Age(1)                -0.5153     0.671
Age(2)                0.3132      0.612
Age(1)·Exposure(1)    1.2720      1.020
Age(2)·Exposure(1)    0.8719      0.973

Test for no interaction: Max. log likelihood for Corner + Age + Exposure + Age.Exposure is -246.19 leading to the LR test 1.67 (2 d.f.)

52


Illustrative example without interaction

Table 22.4

         Exposure
Age      0       1
0        5.0     15.0
1        12.0    36.0
2        30.0    90.0

         Exposure
Age      0       1
0        5.0     5.0 × 3.0
1        12.0    12.0 × 3.0
2        30.0    30.0 × 3.0

         Exposure
Age      0            1
0        5.0          5.0 × 3.0
1        5.0 × 2.4    5.0 × 2.4 × 3.0
2        5.0 × 6.0    5.0 × 6.0 × 3.0

Corner = 5.0    Age(1) = 2.4
Exposure(1) = 3.0    Age(2) = 6.0

53

Example: Illustrative values of rates with interaction

Table 24.2. Definition of interactions in terms of exposure

         Exposure
Age      0       1
0        5.0     15.0
1        12.0    42.0
2        30.0    135.0

         Exposure
Age      0       1
0        5.0     5.0 × 3.0
1        12.0    12.0 × 3.5
2        30.0    30.0 × 4.5

         Exposure
Age      0       1
0        5.0     5.0 × 3.0
1        12.0    12.0 × 3.0 × 1.167
2        30.0    30.0 × 3.0 × 1.5

The last factors, 1.167 and 1.5, are the interaction parameters.

54

Example: Illustrative values of rates with interaction

Table 24.3. Definition of interactions in terms of age

         Exposure
Age      0       1
0        5.0     15.0
1        12.0    42.0
2        30.0    135.0

         Exposure
Age      0            1
0        5.0          15.0
1        5.0 × 2.4    15.0 × 2.8
2        5.0 × 6.0    15.0 × 9.0

         Exposure
Age      0            1
0        5.0          15.0
1        5.0 × 2.4    15.0 × 2.4 × 1.167
2        5.0 × 6.0    15.0 × 6.0 × 1.5

The last factors, 1.167 and 1.5, are the interaction parameters.

55

Table 24.4. Definition of interactions in terms of exposure and age

         Exposure
Age      0            1
0        5.0          5.0 × 3.0
1        5.0 × 2.4    5.0 × 3.0 × 2.4 × 1.167
2        5.0 × 6.0    5.0 × 3.0 × 6.0 × 1.5

Exercise 24.4.

56
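The decomposition in Tables 24.2 to 24.4 can be carried out mechanically for any table of rates. A sketch, assuming rates indexed by (age, exposure) with level 0 as reference for both; the function name `decompose` is made up for illustration.

```python
def decompose(rates):
    """Split a table of rates into Corner, Exposure(1), Age(a) and interaction factors.

    rates maps (age, exposure) to a rate; exposure is 0 or 1, age band 0 is the reference.
    """
    corner = rates[(0, 0)]
    exposure = rates[(0, 1)] / corner
    age_levels = sorted({a for a, _ in rates} - {0})
    age = {a: rates[(a, 0)] / corner for a in age_levels}
    # What is left after removing corner and both main effects is the interaction.
    interaction = {a: rates[(a, 1)] / (corner * exposure * age[a]) for a in age_levels}
    return corner, exposure, age, interaction


WITH_INTERACTION = {(0, 0): 5.0, (0, 1): 15.0, (1, 0): 12.0,
                    (1, 1): 42.0, (2, 0): 30.0, (2, 1): 135.0}
WITHOUT_INTERACTION = {(0, 0): 5.0, (0, 1): 15.0, (1, 0): 12.0,
                       (1, 1): 36.0, (2, 0): 30.0, (2, 1): 90.0}

corner, exposure, age, interaction = decompose(WITH_INTERACTION)
print(corner, exposure, age, {a: round(v, 3) for a, v in interaction.items()})
print(decompose(WITHOUT_INTERACTION)[3])
```

The interaction factors come out the same whether the table is read "in terms of exposure" or "in terms of age", and they all equal 1 when the rates follow the multiplicative model of Table 22.4.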


Exercise 24.4: solution.

Parameter             Estimate    SD
Corner                -5.0237     0.500
Exposure(1)           -0.0258     0.866
Age(1)                -0.5153     0.671
Age(2)                0.3132      0.612
Age(1)·Exposure(1)    1.2720      1.020
Age(2)·Exposure(1)    0.8719      0.973

$\log\frac{4}{607.9} = -5.0237$, $\log\frac{2/311.9}{4/607.9} = -0.0258$

(except for rounding errors)

57

SAS program

data ihd;
input eksp alder pyrs cases;
lpyrs=log(pyrs);
datalines; /* or, alternatively, read from www */
0 2 311.9 2
0 1 878.1 12
0 0 667.5 14
1 2 607.9 4
1 1 1272.1 5
1 0 888.9 8
;
run;

proc genmod data=ihd;
class eksp alder;
model cases=eksp alder eksp*alder/dist=poi offset=lpyrs type3;
run;

58

The GENMOD Procedure

Model Information
Data Set              WORK.IHD
Distribution          Poisson
Link Function         Log
Dependent Variable    cases
Offset Variable       lpyrs
Observations Used     6

Class Level Information
Class    Levels    Values
eksp     2         0 1
alder    3         0 1 2

59

Criteria For Assessing Goodness Of Fit

Criterion             DF    Value      Value/DF
Deviance              0     0.0000     .
Scaled Deviance       0     0.0000     .
Pearson Chi-Square    0     0.0000     .
Scaled Pearson X2     0     0.0000     .
Log Likelihood              53.3799

60


Analysis Of Parameter Estimates

                                 Standard    Wald 95%                Chi-
Parameter         DF  Estimate   Error       Confidence Limits       Square
Intercept         1   -5.0237    0.5000      -6.0037    -4.0437      100.95
eksp 0            1   -0.0258    0.8660      -1.7232    1.6716       0.00
eksp 1            0   0.0000     0.0000      0.0000     0.0000       .
alder 0           1   0.3132     0.6124      -0.8871    1.5134       0.26
alder 1           1   -0.5153    0.6708      -1.8301    0.7995       0.59
alder 2           0   0.0000     0.0000      0.0000     0.0000       .
eksp*alder 0 0    1   0.8719     0.9728      -1.0349    2.7786       0.80
eksp*alder 0 1    1   1.2720     1.0165      -0.7204    3.2643       1.57

61

Analysis Of Parameter Estimates

Parameter         Pr > ChiSq
Intercept         <.0001
eksp 0            0.9762
eksp 1            .
alder 0           0.6091
alder 1           0.4424
alder 2           .
eksp*alder 0 0    0.3701
eksp*alder 0 1    0.2108

62

LR Statistics For Type 3 Analysis

                     Chi-
Source        DF     Square    Pr > ChiSq
eksp          1      3.09      0.0790
alder         2      4.37      0.1125
eksp*alder    2      1.67      0.4333

63

Table 24.5. Reporting estimates from the model with interaction:

Reparametrize into separate effects of Exposure within each Age band.

Parameter             Estimate    SD       RR
Corner                -5.0237     0.500
Exposure(1)·Age(0)    -0.0258     0.866    0.97
Exposure(1)·Age(1)    1.2461      0.532    3.48
Exposure(1)·Age(2)    0.8461      0.443    2.33
Age(1)                -0.5153     0.671    0.60
Age(2)                0.3132      0.612    1.37

64
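The age-specific exposure effects in this reparametrization are the exposure main effect plus the relevant interaction term from the earlier fit. The sketch below reproduces only the estimates and rate ratios; the SDs require the covariance matrix of the estimates, which is not shown on the slides. The names `MAIN_EXPOSURE` and `INTERACTION` are illustrative.

```python
import math

# Log-scale estimates from the model with interaction
MAIN_EXPOSURE = -0.0258                         # Exposure(1), effect in age band 0
INTERACTION = {0: 0.0, 1: 1.2720, 2: 0.8719}    # Age(a)·Exposure(1), 0 for the reference

# Effect of exposure within each age band, on the log scale and as a rate ratio
effects = {age: MAIN_EXPOSURE + term for age, term in INTERACTION.items()}
rate_ratios = {age: math.exp(value) for age, value in effects.items()}

for age in effects:
    print(f"Exposure(1)·Age({age}): {effects[age]:.4f}, RR = {rate_ratios[age]:.2f}")
```

The resulting rate ratios are those of the RR column of Table 22.6, as they must be: the model with interaction has one parameter per cell and therefore reproduces the observed rates.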


Interactions: which to study?

When the model contains p covariates there are p(p − 1)/2 possible two-factor interactions (e.g., 45 for p = 10).

It is out of the question to study them all, so a general recommendation is to restrict attention to those that were pre-specified in the research protocol:

“Don’t ask a question if you are not interested in the reply!”

There will also be a type I error problem: “if you ask too many questions you will get too many wrong answers”.

65

Interaction is scale dependent.

Table of disease rates:

            Factor B
Factor A    Absent    Present
Absent      0.1       0.2
Present     0.3       λ

If λ = 0.6 then the rate ratio associated with the presence of factor A is 3 both when factor B is absent or present; and the rate ratio associated with the presence of factor B is 2 both when factor A is absent or present.

However, the rate difference associated with the presence of factor A is 0.2 when factor B is absent and 0.4 if it is present and the rate difference associated with the presence of factor B is 0.1 when factor A is absent and 0.3 if it is present

66

            Factor B
Factor A    Absent    Present
Absent      0.1       0.2
Present     0.3       λ

If λ = 0.4 then the rate difference associated with the presence of factor A is 0.2 both when factor B is absent or present; the rate difference associated with the presence of factor B is 0.1 both when factor A is absent or present.

However, the rate ratio associated with the presence of factor A is 3 when factor B is absent and 2 if it is present and the rate ratio associated with the presence of factor B is 2 when factor A is absent and 1.33 if it is present

67
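The two choices of λ can be compared side by side. A minimal sketch; the function name `effects_of_factor_a` and the 0/1 coding of absent/present are assumptions made for illustration.

```python
def effects_of_factor_a(lam):
    """Rate ratios and rate differences for factor A, by level of factor B."""
    # rates[(A, B)] with 0 = absent, 1 = present; lam is the rate with both present
    rates = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): lam}
    ratios = [rates[(1, b)] / rates[(0, b)] for b in (0, 1)]
    differences = [rates[(1, b)] - rates[(0, b)] for b in (0, 1)]
    return ratios, differences


for lam in (0.6, 0.4):
    ratios, differences = effects_of_factor_a(lam)
    print(f"lambda = {lam}: ratios {[round(r, 2) for r in ratios]}, "
          f"differences {[round(d, 2) for d in differences]}")
```

With λ = 0.6 there is no interaction on the multiplicative (ratio) scale but there is on the additive (difference) scale; with λ = 0.4 it is the other way round. Whether interaction is present therefore depends on the scale on which effects are measured.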

Interaction between 2 exposures

Table 24.6. Cases (controls) for oral cancer study.

                 Alcohol (oz/day, 1 drink ∼ 0.3 oz/day)
Tobacco          0          1          2          3
(cigs/day)       0          0.1-0.3    0.4-1.5    1.6+
0 (0)            10 (38)    7 (27)     4 (12)     5 (8)
1 (1-19)         11 (26)    16 (35)    18 (16)    21 (20)
2 (20-39)        13 (36)    50 (60)    60 (49)    125 (52)
3 (40+)          9 (8)      16 (19)    27 (14)    91 (27)

Table 24.7. Case/control ratios for the oral cancer data.

           Alcohol
Tobacco    0       1       2       3
0          0.26    0.26    0.33    0.63
1          0.42    0.46    1.13    1.05
2          0.36    0.83    1.22    2.40
3          1.12    0.84    1.93    3.37

68


Is the effect of tobacco the same for all levels of alcohol consumption?

SYNERGISM? = INTERACTION

But CORRELATION is something completely different

69

Fig. 24.2. Nesting of models.

5. Corner + Alcohol + Tobacco + Alcohol.Tobacco
4. Corner + Alcohol + Tobacco
2. Corner + Alcohol
3. Corner + Tobacco
1. Corner

[Figure: the five models connected by lines showing which models are nested in which]

Exercise 24.6

70

Exercise 24.6: Log-likelihoods

5. -577.65
4. -580.99
2. -596.62
3. -608.59
1. -643.93
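From these log-likelihoods, the likelihood ratio statistic for each pair of nested models follows by subtraction. The sketch below assumes parameter counts derived from the four alcohol and four tobacco levels of Table 24.6 (1, 4, 4, 7 and 16 parameters); those counts, and the names `MODELS`, `NESTED_PAIRS` and `lr_statistic`, are not stated on the slides.

```python
# model number: (description, max. log likelihood, number of parameters)
# Parameter counts are an assumption based on 4 alcohol and 4 tobacco levels.
MODELS = {
    1: ("Corner", -643.93, 1),
    2: ("Corner + Alcohol", -596.62, 4),
    3: ("Corner + Tobacco", -608.59, 4),
    4: ("Corner + Alcohol + Tobacco", -580.99, 7),
    5: ("Corner + Alcohol + Tobacco + Alcohol.Tobacco", -577.65, 16),
}

# (larger model, smaller model) pairs that are nested; models 2 and 3 are not.
NESTED_PAIRS = [(5, 4), (4, 2), (4, 3), (2, 1), (3, 1)]


def lr_statistic(larger, smaller):
    """Return (LR statistic, d.f.) for a smaller model nested in a larger one."""
    _, ll_large, k_large = MODELS[larger]
    _, ll_small, k_small = MODELS[smaller]
    return 2 * (ll_large - ll_small), k_large - k_small


results = {pair: lr_statistic(*pair) for pair in NESTED_PAIRS}
for (larger, smaller), (stat, df) in results.items():
    print(f"model {larger} vs model {smaller}: LR = {stat:.2f} on {df} d.f.")
```

The comparison of model 5 with model 4 is the test for no alcohol-tobacco interaction; the comparisons of model 4 with models 2 and 3 test tobacco adjusted for alcohol and alcohol adjusted for tobacco, respectively.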

71

Dose-response models

Explanatory variables with ordered categories.

Table 25.1. Alcohol and tobacco treated as categorical variables

Parameter     Estimate    SD
Corner        -1.6090     0.2654
Alcohol(1)    0.2897      0.2327
Alcohol(2)    0.8437      0.2383
Alcohol(3)    1.3780      0.2256
Tobacco(1)    0.5887      0.2844
Tobacco(2)    1.0260      0.2544
Tobacco(3)    1.4090      0.2823

72
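Under this model without interaction, the fitted log(Odds) in a cell is the corner plus the relevant alcohol and tobacco effects. Exponentiating gives a fitted case/control ratio that can be set beside the observed ratios of Table 24.7. The sketch below uses the estimates of Table 25.1; the function and variable names are assumptions made for illustration.

```python
import math

CORNER = -1.6090
ALCOHOL = [0.0, 0.2897, 0.8437, 1.3780]  # level 0 is the reference
TOBACCO = [0.0, 0.5887, 1.0260, 1.4090]  # level 0 is the reference


def fitted_odds(alcohol_level, tobacco_level):
    """Fitted odds (case/control ratio) for one cell of the 4 x 4 table."""
    log_odds = CORNER + ALCOHOL[alcohol_level] + TOBACCO[tobacco_level]
    return math.exp(log_odds)


# Odds ratio for the highest tobacco level relative to level 0.
odds_ratio_tobacco3 = math.exp(TOBACCO[3])

lowest_cell = fitted_odds(0, 0)
highest_cell = fitted_odds(3, 3)
print(f"fitted odds, cell (0, 0): {lowest_cell:.2f}")
print(f"fitted odds, cell (3, 3): {highest_cell:.2f}")
```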


Alternative: monotone effect of tobacco

Fig. 20.1. Log-linear trend.
[Figure: Log(odds) against Dose, z = 0, 1, 2, 3; each one-level increase in dose raises Log(odds) by β.]

Look at successive differences between effects:

Tobacco(1), Tobacco(2) - Tobacco(1), Tobacco(3) - Tobacco(2)

Exercise 25.1

Introduce a variable taking values 0, 1, 2 or 3 and denote its effect by [Tobacco]

Model: log(Odds) = Corner + Alcohol + [Tobacco]

Exercise 25.1: solution.

Table 25.1. Alcohol and tobacco treated as categorical variables

Parameter     Estimate    SD        Succ. diff.
Corner        -1.6090     0.2654
Alcohol(1)     0.2897     0.2327    0.2897
Alcohol(2)     0.8437     0.2383    0.5540
Alcohol(3)     1.3780     0.2256    0.5343
Tobacco(1)     0.5887     0.2844    0.5887
Tobacco(2)     1.0260     0.2544    0.4373
Tobacco(3)     1.4090     0.2823    0.3830
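The successive differences are simple subtractions of adjacent estimates, with the reference level counted as 0. A short sketch, using the estimates of Table 25.1 and a hypothetical helper name:

```python
def successive_differences(effects):
    """Differences between adjacent effects; the reference level has effect 0."""
    differences = []
    previous = 0.0
    for effect in effects:
        differences.append(effect - previous)
        previous = effect
    return differences


alcohol_effects = [0.2897, 0.8437, 1.3780]   # Alcohol(1), Alcohol(2), Alcohol(3)
tobacco_effects = [0.5887, 1.0260, 1.4090]   # Tobacco(1), Tobacco(2), Tobacco(3)

alcohol_diffs = successive_differences(alcohol_effects)
tobacco_diffs = successive_differences(tobacco_effects)
print([f"{d:.4f}" for d in alcohol_diffs])
print([f"{d:.4f}" for d in tobacco_diffs])
```

Roughly equal differences within a variable are what a linear effect per level would predict.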

Model: log(Odds) = Corner + Alcohol + [Tobacco]

Table 25.2. The linear effect of tobacco consumption

Here [Tobacco] is the effect per level, so tobacco level z contributes z × [Tobacco].

Alcohol   Tobacco   log(Odds) = Corner + ...
0         0         -
0         1         [Tobacco]
0         2         2 × [Tobacco]
0         3         3 × [Tobacco]
1         0         Alcohol(1)
1         1         Alcohol(1) + [Tobacco]
1         2         Alcohol(1) + 2 × [Tobacco]
1         3         Alcohol(1) + 3 × [Tobacco]
2         0         Alcohol(2)
2         1         Alcohol(2) + [Tobacco]
2         2         Alcohol(2) + 2 × [Tobacco]
2         3         Alcohol(2) + 3 × [Tobacco]
3         0         Alcohol(3)
3         1         Alcohol(3) + [Tobacco]
3         2         Alcohol(3) + 2 × [Tobacco]
3         3         Alcohol(3) + 3 × [Tobacco]


Table 25.3. Linear effect of tobacco per level

Parameter     Estimate    SD
Corner        -1.5250     0.219
Alcohol(1)     0.3020     0.232
Alcohol(2)     0.8579     0.237
Alcohol(3)     1.3880     0.225
[Tobacco]      0.4541     0.083

Similarly with alcohol consumption: introduce a variable with values 0, 1, 2 or 3 and denote its effect [Alcohol]

Table 25.4. Linear effects of alcohol and tobacco per level

Parameter     Estimate    SD
Corner        -1.6290     0.1860
[Alcohol]      0.4901     0.0676
[Tobacco]      0.4517     0.0833

Exercise 25.3

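Exercise 25.3: solution.

Categorical model (Table 25.1):  Tobacco(3) + Alcohol(3) = 1.4090 + 1.3780 = 2.7870

Linear model (Table 25.4):  3 × [Tobacco] + 3 × [Alcohol] = 3 × 0.4517 + 3 × 0.4901 = 2.8254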
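Both sums are the log odds ratio for the highest exposure cell relative to the unexposed cell: once with categorical effects (Table 25.1) and once with per-level linear effects (Table 25.4), where level 3 contributes 3 times the per-level effect. A small sketch with hypothetical variable names:

```python
import math

# Categorical coding (Table 25.1): effects at level 3.
TOBACCO_3 = 1.4090
ALCOHOL_3 = 1.3780

# Linear coding (Table 25.4): effects per level.
TOBACCO_PER_LEVEL = 0.4517
ALCOHOL_PER_LEVEL = 0.4901

categorical_log_or = TOBACCO_3 + ALCOHOL_3
linear_log_or = 3 * TOBACCO_PER_LEVEL + 3 * ALCOHOL_PER_LEVEL

print(f"categorical: {categorical_log_or:.4f}, odds ratio {math.exp(categorical_log_or):.1f}")
print(f"linear:      {linear_log_or:.4f}, odds ratio {math.exp(linear_log_or):.1f}")
```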

Alternative ways of scoring

Tobacco: cigarettes/day (0: 0, 1-19: 10, 20-39: 30, 40+: 50)
Alcohol: ounces/day (0.0: 0, 0.1-0.3: 0.2, 0.4-1.5: 1.0, 1.6+: 2.0)

Table 25.5. Alcohol in ounces/day and tobacco in cigarettes/day

Parameter     Estimate    SD
Corner        -1.2657     0.1539
[Alcohol]      0.6484     0.0881
[Tobacco]      0.0253     0.0046
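With these scores the parameters are effects per ounce/day and per cigarette/day, so a category's contribution is its score times the estimate. The sketch below uses the scores and the estimates of Table 25.5; the names are hypothetical.

```python
import math

ALCOHOL_PER_OUNCE = 0.6484
TOBACCO_PER_CIGARETTE = 0.0253

ALCOHOL_SCORES = [0.0, 0.2, 1.0, 2.0]   # ounces/day for levels 0-3
TOBACCO_SCORES = [0, 10, 30, 50]        # cigarettes/day for levels 0-3


def log_odds_ratio(alcohol_level, tobacco_level):
    """Log odds ratio of a cell relative to the unexposed cell (0, 0)."""
    return (ALCOHOL_PER_OUNCE * ALCOHOL_SCORES[alcohol_level]
            + TOBACCO_PER_CIGARETTE * TOBACCO_SCORES[tobacco_level])


top_cell = log_odds_ratio(3, 3)
or_per_10_cigarettes = math.exp(10 * TOBACCO_PER_CIGARETTE)
print(f"log odds ratio, cell (3, 3): {top_cell:.4f}")
print(f"odds ratio per 10 cigarettes/day: {or_per_10_cigarettes:.2f}")
```

The value for cell (3, 3) differs from the per-level scoring because the scores are no longer equally spaced.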


Testing

- Test for linearity:
  1) Comparing the "nested" models
       log(Odds) = Corner + Alcohol + Tobacco
     and
       log(Odds) = Corner + Alcohol + [Tobacco],
     here: LR test = 0.38, 2 d.f., or
  2) eliminating [Tobsq] (= 0, 1, 4, 9) from
       log(Odds) = Corner + Alcohol + [Tobacco] + [Tobsq],
     here: LR test = 0.02.

- Trend test (1 d.f.): Eliminating [Tobacco] from the model
    log(Odds) = Corner + Alcohol + [Tobacco],
  here: LR test = 30.88.
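The LR statistics quoted for the linearity tests (0.38 on 2 d.f., 0.02 on 1 d.f.) and the trend test (30.88 on 1 d.f.) are referred to chi-squared distributions. For 1 and 2 degrees of freedom the upper tail has a closed form, so a p-value can be sketched with the standard library alone. The function name is hypothetical, and other degrees of freedom are deliberately not handled.

```python
import math


def chi2_upper_tail(statistic, df):
    """P(chi-squared with df degrees of freedom > statistic), for df = 1 or 2 only."""
    if df == 1:
        # A chi-squared variable on 1 d.f. is a squared standard normal.
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        # A chi-squared variable on 2 d.f. is exponential with mean 2.
        return math.exp(-statistic / 2)
    raise ValueError("this sketch only covers 1 or 2 degrees of freedom")


p_linearity_nested = chi2_upper_tail(0.38, 2)
p_linearity_square = chi2_upper_tail(0.02, 1)
p_trend = chi2_upper_tail(30.88, 1)
print(f"linearity (nested models): p = {p_linearity_nested:.2f}")
print(f"linearity (squared term):  p = {p_linearity_square:.2f}")
print(f"trend:                     p = {p_trend:.1e}")
```

Large p-values for linearity together with a very small p-value for trend support a linear effect of tobacco level.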

Why not use individual levels, that is, a truly quantitative covariate and no categorization at all?

Pros and cons

• Information is lost by categorization
• Categories may be more robust (e.g., smoking)
• Few outliers may have large influence ("Casanova effect"!)
• Model with a linear effect is no longer "nested" in a categorical model ⇒ other alternatives are needed when testing linearity

Indicator variables

The way in which the categorical covariates are entered into the regression model.

Table 25.8. Indicator variables for the three alcohol parameters

Level   A1   A2   A3   log(Odds) = Corner + ···
0       0    0    0
1       1    0    0    Alcohol(1)
2       0    1    0    Alcohol(2)
3       0    0    1    Alcohol(3)
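The coding in Table 25.8 can be generated mechanically: one 0/1 variable per non-reference level. A minimal sketch, where the function name and the `reference` argument are assumptions made for illustration:

```python
def level_indicators(level, reference=0, n_levels=4):
    """Indicator variables for one observation: one per non-reference level."""
    return [1 if level == k else 0
            for k in range(n_levels) if k != reference]


# With level 0 as reference the columns are A1, A2, A3 as in Table 25.8.
table_25_8 = {level: level_indicators(level) for level in range(4)}
for level, row in table_25_8.items():
    print(level, row)
```

Changing `reference` drops a different level's indicator, so the corner then refers to that level instead.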

The use of indicator variables enables the programmer to choose his/her preferred reference level.

Interaction terms are simple products of indicator variables.

Table 25.10. Indicator variables for interaction parameters

A1   A2   A3   T    A1·T   A2·T   A3·T
0    0    0    0    0      0      0
0    0    0    1    0      0      0
1    0    0    0    0      0      0
1    0    0    1    1      0      0
0    1    0    0    0      0      0
0    1    0    1    0      1      0
0    0    1    0    0      0      0
0    0    1    1    0      0      1

NB: Tobacco is here on 2 levels only
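Because the interaction columns are products of the main-effect indicators, the rows of Table 25.10 can be rebuilt in a few lines. A sketch, assuming alcohol on 4 levels and tobacco on 2 levels as in the table; the function name is hypothetical.

```python
def design_row(alcohol_level, tobacco_level):
    """Return [A1, A2, A3, T, A1*T, A2*T, A3*T] for one combination of levels."""
    alcohol = [1 if alcohol_level == k else 0 for k in (1, 2, 3)]
    tobacco = 1 if tobacco_level == 1 else 0
    products = [a * tobacco for a in alcohol]
    return alcohol + [tobacco] + products


rows = [design_row(a, t) for a in range(4) for t in range(2)]
for row in rows:
    print(row)
```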


Treating the zero level differently

Fig. 25.1. Separating zero exposure from the dose-response.
[Figure: Log(odds) against Dose, z = 0, 1, 2, 3; labels: Corner, Corner + Non-smoker, Non-smoker.]

Corresponds to adding a new variable [Non-smoker]

Table 25.11. Separating zero exposure from the dose-response

Tobacco   Non-smoker   log(Odds) = Corner + ···
0         1            [Non-smoker]
1         0            [Tobacco]
2         0            2 × [Tobacco]
3         0            3 × [Tobacco]

(Alternative: include [Smoker] = 1 - [Non-smoker])

Indicator variables may be chosen in several ways. E.g., to model successive differences

Table 25.12. Indicators to compare each level with the one before

Tobacco   D1   D2   D3
0         0    0    0
1         1    0    0
2         1    1    0
3         1    1    1

Here,
D1 = indicator for Tobacco ≥ 1
D2 = indicator for Tobacco ≥ 2
D3 = indicator for Tobacco ≥ 3
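With this coding the parameter of each indicator is a successive difference, and the effect of a level is the sum of the differences up to that level. The sketch below rebuilds Table 25.12 and checks this against the tobacco estimates of Table 25.1; the names are hypothetical.

```python
def cumulative_indicators(level, n_levels=4):
    """Return [D1, D2, D3] where Dk = 1 if level >= k, else 0."""
    return [1 if level >= k else 0 for k in range(1, n_levels)]


# Successive differences between the tobacco effects of Table 25.1.
differences = [0.5887, 1.0260 - 0.5887, 1.4090 - 1.0260]


def level_effect(level):
    """Effect of a tobacco level relative to level 0 under the D1-D3 coding."""
    indicators = cumulative_indicators(level)
    return sum(d * coef for d, coef in zip(indicators, differences))


for level in range(4):
    print(level, cumulative_indicators(level), f"{level_effect(level):.4f}")
```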

Truly quantitative covariates, x

In a model like

log(Rate) = Corner + Exposure + [x]

the effect of x is assumed to be linear, i.e. [x] expresses the change in log(Rate) per 1 unit change of x.

To test for linearity, one may add [xsq] to the model, where xsq = x².

Another alternative is a linear spline.


Linear splines

An alternative to a straight line is a broken line.

Introduce break points for x, e.g., a1, a2, a3 and add the three linear splines

I1 × [x − a1],  I2 × [x − a2],  I3 × [x − a3]

to [x]. Here,
I1 = indicator for x ≥ a1
I2 = indicator for x ≥ a2
I3 = indicator for x ≥ a3

The parameter for the spline I1 × [x − a1] gives the change in slope at the break point a1. Similarly for a2, a3.

Splines are easy to program and parameters are easier to interpret than for quadratic terms (quadratic and cubic splines also exist).
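The spline terms described above are simple to construct: since Ik is 1 exactly when x ≥ ak, the term Ik × (x − ak) equals max(x − ak, 0). In the sketch below the break points and coefficients are invented purely for illustration and are not estimates from the slides.

```python
def linear_spline_terms(x, break_points):
    """Return [x, I1*(x - a1), I2*(x - a2), ...] with Ik = 1 when x >= ak."""
    return [x] + [max(x - a, 0.0) for a in break_points]


def slopes_by_segment(coefficients):
    """Slope before a1, between a1 and a2, and so on (running sums)."""
    slopes = []
    total = 0.0
    for coefficient in coefficients:
        total += coefficient
        slopes.append(total)
    return slopes


BREAK_POINTS = [4.0, 6.0, 8.0]                # hypothetical a1, a2, a3
COEFFICIENTS = [0.30, -0.20, -0.05, -0.10]    # hypothetical: [x], then changes in slope

terms = linear_spline_terms(7.0, BREAK_POINTS)
contribution = sum(c * t for c, t in zip(COEFFICIENTS, terms))
slopes = slopes_by_segment(COEFFICIENTS)
print(terms, contribution, slopes)
```

The running sums make the interpretation concrete: each spline parameter is the change in slope at its break point.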

[Figure: four panels showing the linear predictor (vertical axis, −2.0 to 0.5) against x (horizontal axis, 2 to 10).]

Interaction:

A researcher's attitude towards interaction depends on the kind of variables involved:

- a) 2 exposures
- b) 2 confounders
- c) 1 exposure and 1 confounder

a) For example the effect of alcohol and tobacco on oral cancer

b) Prevalence of monoclonal gammapathy by occupation

Table 26.1. Prevalence of monoclonal gammapathy

              Agricultural (0)           Non-agricultural (1)
Age           Male (0)     Female (1)    Male (0)     Female (1)
<40 (0)       1/1590       1/1926        2/1527       0/712
40-49 (1)     12/2345      7/2677        3/854        0/401
50-59 (2)     24/2787      15/2902       5/675        4/312
60-69 (3)     53/2489      38/3145       3/184        1/80
70+ (4)       95/2381      63/2918       2/75         0/20

How to control for sex and age when studying the effect of work?


Stratification (Mantel-Haenszel) ∼ logistic regression model:

log(Odds) = Corner + Age + Sex + Age·Sex + Work

Work: -0.134 (SD = 0.244)   (10 parameters + Work)

Simpler model:

log(Odds) = Corner + Age + Sex + Work

(i.e., no Age·Sex interaction):

Work: -0.136 (SD = 0.243)

Likelihood ratio test for no interaction: 0.878 with 4 degrees of freedom.

When the confounders have many levels there will be many d.f.'s.
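The Work estimates are on the log odds scale. Exponentiating the estimate, and the estimate plus or minus 1.96 SD, gives an odds ratio with an approximate 95% confidence interval. This sketch assumes the usual normal approximation; the function name is hypothetical.

```python
import math


def odds_ratio_with_ci(estimate, sd, z=1.96):
    """Odds ratio and approximate 95% confidence limits from a log odds ratio."""
    return (math.exp(estimate),
            math.exp(estimate - z * sd),
            math.exp(estimate + z * sd))


# Work effect with and without the Age.Sex interaction in the model.
with_interaction = odds_ratio_with_ci(-0.134, 0.244)
without_interaction = odds_ratio_with_ci(-0.136, 0.243)

for label, (odds_ratio, lower, upper) in [("with Age.Sex", with_interaction),
                                          ("without Age.Sex", without_interaction)]:
    print(f"{label}: OR = {odds_ratio:.2f} ({lower:.2f} to {upper:.2f})")
```

Both intervals include 1, and dropping the interaction barely changes the estimate.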

c) interaction between work and sex or work and age

Table 26.2. Testing for interaction

Model                                       -log L
Corner + Age + Sex + Work                   36.40
Corner + Age + Sex + Work + Work·Age        35.48
Corner + Age + Sex + Work + Work·Sex        36.20

Effect-modification: Exercise 26.2

Exercise 26.2: solution.

LR test for no Work·Age interaction: 1.84, 4 d.f.

LR test for no Work·Sex interaction: 0.41, 1 d.f.
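These statistics are twice the drop in −log L when the interaction is added to the main-effects model of Table 26.2. In the sketch below the tabulated values reproduce 1.84 exactly, while for Work·Sex they give 0.40 rather than the quoted 0.41; the difference presumably reflects rounding of the tabulated −log L values. The names are hypothetical.

```python
# -log L values from Table 26.2.
NEG_LOG_LIK_MAIN = 36.40        # Corner + Age + Sex + Work
NEG_LOG_LIK_WORK_AGE = 35.48    # ... + Work.Age
NEG_LOG_LIK_WORK_SEX = 36.20    # ... + Work.Sex


def lr_statistic(neg_log_lik_reduced, neg_log_lik_full):
    """Likelihood ratio statistic from two -log L values of nested models."""
    return 2 * (neg_log_lik_reduced - neg_log_lik_full)


lr_work_age = lr_statistic(NEG_LOG_LIK_MAIN, NEG_LOG_LIK_WORK_AGE)  # 4 d.f.
lr_work_sex = lr_statistic(NEG_LOG_LIK_MAIN, NEG_LOG_LIK_WORK_SEX)  # 1 d.f.
print(f"Work.Age: {lr_work_age:.2f}, Work.Sex: {lr_work_sex:.2f}")
```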

Treating Age as a quantitative variable

Fig. 26.1. Log prevalence odds by age: males, agricultural workers
[Figure: Log(odds), from -8 to -3, against Age, from 40 to 70.]

Exercise 26.3


Including [Age] (35, 45, 55, 65, 75) gives: Work = -0.186

Including [Age] and [Agesq] (35², 45², 55², 65², 75²) gives:

Table 26.3. A quadratic relationship with age

Parameter     Estimate    SD
Corner        -6.682      0.344
Work(1)       -0.148      0.243
[Age]          1.204      0.264
[Agesq]       -0.084      0.049
Sex(1)        -0.583      0.115

Test for [Agesq]: 3.13 (1 d.f.)

The estimated effect of exposure (Work) may depend on how the confounder (Age) is modelled.

Test for Age-Work interaction using [Age]:

Table 26.4. Interaction between age (quantitative) and work

Parameter         Estimate    SD
Corner            -6.211      0.201
Work(1)           -0.299      0.471
[Age]              0.763      0.058
Sex(1)            -0.584      0.115
[Age]·Work(1)      0.053      0.188

Exercises 26.4-5

Exercise 26.4: solution.

Wald test for no Work·[Age] interaction: (0.053/0.188)² = 0.079, 1 d.f.

Table 26.5. Interaction between [Age] and Work

Parameter         Estimate    SD
Corner            -7.064      0.553
Age(1)             1.666      0.567
Age(2)             2.394      0.562
Age(3)             3.239      0.562
Age(4)             3.860      0.559
Sex(1)            -0.585      0.115
Work(1)            0.046      0.544
[Age]·Work(1)     -0.083      0.220

Wald test for interaction: (−0.083/0.220)² = 0.14

Likelihood ratio test: 0.14
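A Wald statistic is the squared ratio of an estimate to its SD, referred to a chi-squared distribution on 1 d.f. The sketch below applies this to the [Age]·Work(1) rows of Tables 26.4 and 26.5; the function names are hypothetical, and the p-value uses the standard normal tail from the Python standard library.

```python
import math


def wald_statistic(estimate, sd):
    """Wald chi-squared statistic (1 d.f.) for a single parameter."""
    return (estimate / sd) ** 2


def wald_p_value(estimate, sd):
    """Two-sided p-value: upper tail of chi-squared on 1 d.f."""
    return math.erfc(math.sqrt(wald_statistic(estimate, sd) / 2))


wald_table_26_4 = wald_statistic(0.053, 0.188)    # age linear, main effect linear
wald_table_26_5 = wald_statistic(-0.083, 0.220)   # age categorical main effect
print(f"Table 26.4: {wald_table_26_4:.3f}, p = {wald_p_value(0.053, 0.188):.2f}")
print(f"Table 26.5: {wald_table_26_5:.2f}, p = {wald_p_value(-0.083, 0.220):.2f}")
```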


Conclusions:

For the example: no evidence whatsoever of any effect of work on the outcome

In general: there is often a LARGE number of possibilities available when analysing regression models:

- which variables to include
- how to include them (dose-response, interactions)

Keep this in mind when reading the published literature!

Keep in mind the purpose of the study when analysing the data!