Solutions to HW 3 PubH 7440 - people.vcu.edu

Solutions to HW 3 PubH 7440

[Figure: scatterplot of the raw data. x-axis: Centered Temp at baseline; y-axis: Change in Temp 1hr after Aspirin.]

11 (a) A plot of the raw data shows that a positive linear association between baseline

temperature and the change in temperature one hour after administration of Aspirin

is plausible.

(b) Using BRugs (or R2WinBUGS) we fit the simple linear regression (SLR) model using vague priors on all parameters, 1,000 samples of burn-in and 20,000 total samples. The resulting posterior expectation and 95% credible interval estimates for β1 are 0.8409 and (0.3151, 1.3590), respectively. These results confirm the investigators' suspicions that initial temperature is related to subsequent reduction in temperature following the administration of Aspirin. There is strong evidence that higher initial temperatures are related to larger temperature reductions. The model size is pD = 2.701 with DIC = 25.25.

PubH 7440 Spring 2015

R Code:

#I preprocessed the notepad file to contain a column
#titled 'X' and a column titled 'Y'
aspirin = read.table("data.txt",header=TRUE)
aspirin[13,] <- c(100,NA) # New patient
aspirin[,'Z'] <- aspirin[,'X'] - aspirin[,'Y'] # Change in temp

#Here is the Model File, which can also be written directly in R!

write("model{

for(i in 1:n) {

Z[i] ~ dnorm(mu[i] , tau)

mu[i] <- beta0 + beta1*(X[i] - mean(X[]))

# Approximate standardized residuals:

sresid_apr[i] <- (Z[i] - mu[i]) / sigma

# Flag observations with sresid_apr abs values > 1.5 as outliers:

outlier_apr[i] <- step(sresid_apr[i] - 1.5) + step(-(sresid_apr[i] + 1.5))

# Approximate CPO:



CPO_apr[i] <- sqrt(tau)* exp(-tau/2* (Z[i] - mu[i])* (Z[i] - mu[i]))

}

Z13gt0 <- step(Z[13]) #Indicator for Z[13]>0,

#i.e. new patient having positive temperature change

beta0 ~ dflat()

beta1 ~ dflat()

tau <- 1/(sigma*sigma) # Gelman prior on sigma

sigma ~ dunif(0.01, 100)

}","aspirin_model.txt")

#Using BRugs
library('BRugs')
fit.BRugs <- BRugsFit(modelFile="aspirin_model.txt",
    data=list(n=dim(aspirin)[1],X=aspirin[,'X'],Z=aspirin[,'Z']),
    inits=list(list(beta0=0,beta1=1,sigma=1, Z=c(rep(NA,12),0)),
               list(beta0=.5,beta1=.5, sigma=.5,Z=c(rep(NA,12),.5))),
    numChains=2, parametersToSave=c('beta0','beta1','mu','sresid_apr',
        'outlier_apr','CPO_apr','Z[13]','Z13gt0'),
    nBurnin = 1000, nIter=20000,DIC=TRUE)

#or using R2WinBUGS
library('R2WinBUGS')
inits <- function(){list(beta0=rnorm(1,0,3),beta1=rnorm(1,0,3),
    sigma=runif(1,0.01,100),Z=c(rep(NA,12),rnorm(1,0,3)))}
fit.R2BUGS <- openbugs(data=list(n=dim(aspirin)[1],X=aspirin[,'X'],
    Z=aspirin[,'Z']), inits, parameters.to.save = c('beta0','beta1',
    'mu','sresid_apr','outlier_apr','CPO_apr','Z[13]','Z13gt0'),
    "aspirin_model.txt", n.chains = 2, n.iter = 20000,
    n.burnin = 1000, DIC=TRUE)

(c) The posterior density is plotted in Figure 2, with a posterior expectation of

-0.1015 and a 45.33% chance this patient’s temperature will be reduced by Aspirin.

Thus, this model predicts the child’s temperature will increase following aspirin.

Clearly, extrapolation of this fitted model to an initial temperature outside the

range of the observed initial temps is problematic.

(d) All the approximate CPO values are > .45 and the posterior probabilities of being an outlier are all < .5 (using the criterion that the absolute value of the standardized residual is > 1.5). Thus, there do not appear to be any outliers and the linear model fits the data quite well. Observation 1 has approximate CPO 0.58, standardized residual 1.42 (0.42 probability of outlier). Observation 11 has approximate CPO 0.56, standardized residual -1.45 (0.45 probability of outlier).



[Figure 2: posterior density of Z[13]. x-axis: Z[13]; y-axis: Density.]



14 See solution in back of the book.

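15 (a)

$$p(Y \mid \theta) \propto e^{-Y/\theta}\,\theta^{-1} \quad\text{and}\quad \pi(\theta) \propto e^{-1/(\beta\theta)}\,\theta^{-(\alpha+1)}.$$

Thus,

$$q(\theta \mid Y) \propto e^{-\frac{\beta Y+1}{\beta\theta}}\,\theta^{-((\alpha+1)+1)} \propto e^{-1/\left(\frac{\beta}{\beta Y+1}\,\theta\right)}\,\theta^{-((\alpha+1)+1)},$$

which we recognize as the kernel of the InvGamma(α + 1, β/(βY + 1)) distribution.

(b) The mean and variance of an InvGamma(α, β) distribution are $[\beta(\alpha-1)]^{-1}$ and $[\beta^2(\alpha-1)^2(\alpha-2)]^{-1}$ for α > 2, respectively. Plugging in the relevant quantities we have

$$E[\theta \mid Y] = \frac{\beta Y+1}{\alpha\beta} \quad\text{and}\quad Var[\theta \mid Y] = \frac{(\beta Y+1)^2}{(\alpha\beta)^2(\alpha-1)} \text{ for } \alpha > 1.$$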

These sub-parts weren’t required, but here they are anyway:

(c) The mode is the posterior maximum, which is the solution to $\frac{d}{d\theta}\log q(\theta \mid Y) = 0$. So we have $\frac{\beta Y+1}{\beta}\,\theta^{-2} = (\alpha+2)\,\theta^{-1}$. Thus, $\hat\theta = \frac{\beta Y+1}{\beta(\alpha+2)}$ is the posterior mode.

(d) Finally, there are a few options, but $\int_L^{\infty} q(\theta \mid Y)\,d\theta = 0.975$ and $\int_0^{U} q(\theta \mid Y)\,d\theta = 0.975$ work, solving for L and U. Then (L, U) delivers the 95% equal-tailed posterior credible interval.
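The closed-form results in parts (a) through (d) can be checked numerically. The sketch below is only an illustration: the values of alpha, beta, and Y are hypothetical (they are not given in the assignment), and it relies on the fact that if θ follows the InvGamma(a, b) kernel used above, then 1/θ follows a Gamma distribution with shape a and scale b.

```r
# Hypothetical values, chosen only for illustration (not from the assignment)
alpha <- 5
beta  <- 0.5
Y     <- 2

# Posterior parameters from part (a): InvGamma(alpha + 1, beta / (beta * Y + 1)),
# in the parameterization with kernel exp(-1 / (b * theta)) * theta^-(a + 1)
a.post <- alpha + 1
b.post <- beta / (beta * Y + 1)

# Closed-form posterior mean (part b) and mode (part c)
post.mean <- (beta * Y + 1) / (alpha * beta)
post.mode <- (beta * Y + 1) / (beta * (alpha + 2))

# Equal-tailed 95% interval (part d): 1/theta ~ Gamma(shape = a, scale = b),
# so quantiles of theta are reciprocals of the opposite Gamma quantiles
L <- 1 / qgamma(0.975, shape = a.post, scale = b.post)
U <- 1 / qgamma(0.025, shape = a.post, scale = b.post)

# Monte Carlo check of the closed-form mean (seed fixed for reproducibility)
set.seed(1)
draws <- 1 / rgamma(1e5, shape = a.post, scale = b.post)
mc.mean <- mean(draws)
```

Comparing mc.mean with post.mean, and checking that (L, U) contains about 95% of the draws, gives a quick sanity check on the algebra.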



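16 (a)

$$p(X \mid \theta) \propto \theta^{nr}(1-\theta)^{\sum_{i=1}^{n} X_i} \quad\text{and}\quad \pi(\theta) \propto \theta^{\alpha-1}(1-\theta)^{\beta-1}.$$

Thus,

$$q(\theta \mid X) \propto \theta^{nr+\alpha-1}(1-\theta)^{\sum_{i=1}^{n} X_i+\beta-1},$$

which we recognize as the kernel of the Beta($nr + \alpha$, $\sum_{i=1}^{n} x_i + \beta$) distribution.

(b) Given r = 5, n = 10, $\sum_{i=1}^{n} x_i = 70$, α = 2, and β = 2, we have q(θ|x) = Beta(52, 72). Now $P(H_1 \mid x) = \int_0^{.5} q(\theta \mid x)\,d\theta = .965$, so P(H2|x) = 1 − P(H1|x) = .035 and thus H1 is a far more likely hypothesis a posteriori. (Note: Compute P(H1|x) using the R command pbeta(.5,52,72).)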

(c) BF = P(H1|x)/P(H2|x) = .965/.035 = 27.6. (Hatfield:) This uses the fact that the models are a priori equally probable [P(H1) = P(H2)], since the Beta(2, 2) prior is symmetric about 0.5, so the prior odds equal 1. This Bayes Factor again indicates strong evidence in favor of H1. Following Jeffreys, we consider log10(BF) in units of 0.5, where log10(27.6) ≈ 1.4.
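The calculations in parts (b) and (c) can be reproduced in a few lines of R. This is a minimal sketch using the values stated in the problem; the variable names are chosen here for illustration only.

```r
# Posterior parameters from part (a) with r = 5, n = 10, sum(x) = 70, alpha = beta = 2
a.post <- 5 * 10 + 2   # nr + alpha = 52
b.post <- 70 + 2       # sum(x) + beta = 72

# Posterior probabilities of H1 (theta <= 0.5) and H2 (theta > 0.5)
p.H1 <- pbeta(0.5, a.post, b.post)
p.H2 <- 1 - p.H1

# Prior odds under the symmetric Beta(2, 2) prior (equal to 1)
prior.odds <- pbeta(0.5, 2, 2) / (1 - pbeta(0.5, 2, 2))

# Bayes factor = posterior odds / prior odds, and its log10 for Jeffreys' scale
BF <- (p.H1 / p.H2) / prior.odds
log10.BF <- log10(BF)
```

Because the prior odds are 1, the Bayes factor here coincides with the posterior odds.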



19 (Groth) Hint: Let $X_w$ be the number of whites that are selected and $X_b$ the number of blacks selected. Then we know that $X_w \sim Bin(n_w, \theta_w)$ and $X_b \sim Bin(n_b, \theta_b)$.

Now, let's write out the joint likelihood. We will assume that we do not have any information on $\theta_w$ or $\theta_b$. Specifically, we will assume that $\pi(\theta_w)\pi(\theta_b) = 1$. We know that a uniform distribution is a special case of the Beta distribution, namely Unif(0, 1) = Beta(1, 1), and the Unif(0, 1) pdf is just 1, so this is equivalent. The joint posterior is then

$$q(\theta_w, \theta_b \mid X_w, X_b) \propto \theta_w^{X_w}(1-\theta_w)^{n_w-X_w}\,\theta_b^{X_b}(1-\theta_b)^{n_b-X_b} \times 1.$$

Then, we can quickly recognize that this has the form of two Beta distributions multiplied by each other. Specifically,

$$Beta(\theta_w \mid \alpha_w, \beta_w)\,Beta(\theta_b \mid \alpha_b, \beta_b),$$

where $\alpha_w = X_w + 1$, $\alpha_b = X_b + 1$, $\beta_w = n_w - X_w + 1$, etc. Now, we can fill in $X_b$, $X_w$, $n_w$, and $n_b$ based on the table. This will give you numbers for each α and β.

Now, we need to see if $\theta_w = \theta_b$. We know the posterior distributions of $\theta_w$ and $\theta_b$ from the above, namely $Beta(\theta_w \mid \alpha_w, \beta_w)$ and $Beta(\theta_b \mid \alpha_b, \beta_b)$.



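Now, in R we can sample a set of $\theta_w$ values and a set of $\theta_b$ values. Specifically, we can sample G values of $\theta_w$ using the rbeta function with parameters $\alpha_w$ and $\beta_w$. Similarly, we can sample G values of $\theta_b$. Then we can take these G values of $\theta_w$ and G values of $\theta_b$ and subtract them pairwise, computing $\theta_w - \theta_b$. Then we simply determine whether $\theta_w - \theta_b > 0$. If it is greater than 0, give it a 1. If not, give it a 0. Do this for each of the G draws and then calculate the average of this 0/1 indicator. The average estimates the posterior probability that $\theta_w > \theta_b$, which tells you roughly whether $\theta_w = \theta_b$ is plausible.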

Now, to formally test whether $\theta_w = \theta_b$ you should use a Bayes Factor. To do this you should follow the formula given in the textbook. In particular:

$$BF = \frac{p(y \mid M_1)}{p(y \mid M_2)} = \frac{\int \binom{n_w}{X_w}\binom{n_b}{X_b}\,\theta^{X_w+X_b}(1-\theta)^{n_b+n_w-X_b-X_w}\,d\theta}{\int\!\!\int \binom{n_w}{X_w}\binom{n_b}{X_b}\,\theta_w^{X_w}(1-\theta_w)^{n_w-X_w}\,\theta_b^{X_b}(1-\theta_b)^{n_b-X_b}\,d\theta_b\,d\theta_w}$$

Now, we know that if we integrate with respect to θ on top, we can pull a constant out because we can recognize the kernel of a Beta distribution. The same holds for the denominator. This will allow you to arrive at a fraction where you can fill in the values and obtain a numerical answer.



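You can also use

$$BF' = \frac{\int \binom{n_w+n_b}{X_w+X_b}\,\theta^{X_w+X_b}(1-\theta)^{n_b+n_w-X_b-X_w}\,d\theta}{\int\!\!\int \binom{n_w}{X_w}\binom{n_b}{X_b}\,\theta_w^{X_w}(1-\theta_w)^{n_w-X_w}\,\theta_b^{X_b}(1-\theta_b)^{n_b-X_b}\,d\theta_b\,d\theta_w}$$

This will have you arrive at something multiplied by the previous Bayes Factor. Please interpret what this means. This version takes as its null model a single binomial, $X_w + X_b \sim Bin(n_b + n_w, \theta)$, whereas the first Bayes Factor assumes only that $\theta_w = \theta_b = \theta$.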

Solution: (Hatfield:) There are several ways to approach this problem. Let us

first assume two independent binomial distributions, with XW ∼ Bin(nW , θW ) the

number of whites selected and XB ∼ Bin(nB, θB) the number of blacks selected.

If we put independent uninformative priors on the θ parameters, the joint prior is

π(θB)π(θW ) = 1. Then the joint posterior distribution is

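$$\begin{aligned}
q(\theta_B, \theta_W \mid X_B, X_W) &\propto \theta_B^{X_B}(1-\theta_B)^{n_B-X_B}\,\theta_W^{X_W}(1-\theta_W)^{n_W-X_W} \\
&= \theta_B^{14}(1-\theta_B)^{30}\,\theta_W^{41}(1-\theta_W)^{39} \\
&\propto Beta(\theta_B \mid 15, 31)\,Beta(\theta_W \mid 42, 40).
\end{aligned}$$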

We may wish to compute the posterior probability that the selection probability is lower for blacks than whites. This may be computed by integrating the joint distribution,

$$P(\theta_B < \theta_W \mid X_B, X_W) = \int_0^1 \int_0^{\theta_W} q(\theta_B, \theta_W \mid X_B, X_W)\,d\theta_B\,d\theta_W.$$

Analytically, this integral is difficult, but possible. We may instead compute the distribution of the difference, $\Delta = \theta_W - \theta_B$, and find the posterior probability that ∆ > 0. Analytically, this is again tractable but difficult. Taking a Monte Carlo approach is much simpler, as we can simply simulate from each of the Beta distributions that make up the posterior, take their difference, and compute quantiles of this distribution. That is, draw $\theta_B^{(1)}, \ldots, \theta_B^{(G)} \sim Beta(15, 31)$ and $\theta_W^{(1)}, \ldots, \theta_W^{(G)} \sim Beta(42, 40)$, then calculate $\Delta^{(g)} = \theta_W^{(g)} - \theta_B^{(g)}$ for g = 1, . . . , G and take

$$\hat{p}(\Delta > 0 \mid X_W, X_B) = \frac{1}{G}\sum_{g=1}^{G} I(\Delta^{(g)} > 0).$$

In R, we can use

G <- 10000

delta <- rbeta(G,42,40) - rbeta(G,15,31)

p.hat <- sum(delta >0)/G

to obtain an estimate of 0.98 for the probability that the selection probability is

greater for whites than blacks.
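As a cross-check on the Monte Carlo estimate, the double integral for P(θB < θW | XB, XW) can be reduced to a one-dimensional integral, because the inner integral over θB is just the Beta(15, 31) CDF. The sketch below is one possible way to do this with base R's integrate(); the variable names and the seed are illustrative assumptions.

```r
# P(theta_B < theta_W | data) as a one-dimensional integral:
# integrate P(theta_B < t) times the density of theta_W at t, over t in (0, 1)
integrand <- function(t) pbeta(t, 15, 31) * dbeta(t, 42, 40)
p.exact <- integrate(integrand, lower = 0, upper = 1)$value

# Monte Carlo estimate for comparison (seed fixed only for reproducibility)
set.seed(7440)
G <- 100000
delta <- rbeta(G, 42, 40) - rbeta(G, 15, 31)
p.hat <- mean(delta > 0)
```

The two numbers should agree to about two decimal places, with the difference reflecting Monte Carlo error in p.hat.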

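Finally, we might calculate a Bayes Factor for H0 : θW = θB = θ versus H1 : θW ≠ θB. If we assume uniform priors on these, the BF is simply the ratio of the marginals, where we can simplify the expression of the joint distribution under H0 in the numerator:

$$\begin{aligned}
BF &= \frac{M_0(X_B, X_W)}{M_1(X_B, X_W)} \\
&= \frac{\int \binom{n_B}{X_B}\binom{n_W}{X_W}\,\theta^{X_B+X_W}(1-\theta)^{(n_B-X_B)+(n_W-X_W)}\,d\theta}{\int\!\!\int \binom{n_B}{X_B}\binom{n_W}{X_W}\,\theta_B^{X_B}\theta_W^{X_W}(1-\theta_B)^{(n_B-X_B)}(1-\theta_W)^{(n_W-X_W)}\,d\theta_B\,d\theta_W} \\
&= \frac{\int_0^1 \theta^{55}(1-\theta)^{69}\,d\theta}{\int_0^1\!\!\int_0^1 \theta_B^{14}\theta_W^{41}(1-\theta_B)^{30}(1-\theta_W)^{39}\,d\theta_B\,d\theta_W} \\
&= \frac{\Gamma(56)\Gamma(70)/\Gamma(126)}{\left[\Gamma(15)\Gamma(31)/\Gamma(46)\right]\left[\Gamma(42)\Gamma(40)/\Gamma(82)\right]}.
\end{aligned}$$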

Using R,

BF <- (gamma(56)*gamma(70)/gamma(126))/

((gamma(15)*gamma(31)/gamma(46))*(gamma(42)*gamma(40)/gamma(82)))

we obtain a Bayes Factor of .507, meaning the evidence in favor of the model that allows for discrimination is only 1/.507 ≈ 1.97, which is not very impressive.

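Alternatively, we may posit a null model that is a single binomial distribution, that is, XB + XW ∼ Bin(nB + nW, θ). Then the binomial coefficients in the two models no longer cancel, so the Bayes Factor is

$$\begin{aligned}
BF' &= \frac{\int \binom{n_B+n_W}{X_B+X_W}\,\theta^{X_B+X_W}(1-\theta)^{(n_B-X_B)+(n_W-X_W)}\,d\theta}{\int\!\!\int \binom{n_B}{X_B}\binom{n_W}{X_W}\,\theta_B^{X_B}\theta_W^{X_W}(1-\theta_B)^{(n_B-X_B)}(1-\theta_W)^{(n_W-X_W)}\,d\theta_B\,d\theta_W} \\
&= \frac{\binom{124}{55}}{\binom{44}{14}\binom{80}{41}}\,(BF),
\end{aligned}$$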

which we can calculate in R using

BF.prime <- (choose(124,55)/(choose(44,14)*choose(80,41)))*BF



to yield 29.16, which is much more impressive than the previous answer, but in the opposite direction, that is, showing stronger evidence in favor of the two samples being drawn from the same binomial distribution, versus two independent binomials.
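Both Bayes Factors can also be computed on the log scale, which avoids forming very large intermediate values such as gamma(126). The sketch below uses base R's lbeta() and lchoose() with the counts from the solution (XB = 14 of nB = 44, XW = 41 of nW = 80); the variable names are chosen here for illustration.

```r
# log BF for H0: theta_W = theta_B versus two independent binomials.
# Each integral is a Beta function: B(56, 70) over B(15, 31) * B(42, 40)
log.BF <- lbeta(56, 70) - (lbeta(15, 31) + lbeta(42, 40))
BF <- exp(log.BF)

# Single-binomial null: the binomial coefficients no longer cancel
log.BF.prime <- lchoose(124, 55) - lchoose(44, 14) - lchoose(80, 41) + log.BF
BF.prime <- exp(log.BF.prime)

# Same Bayes Factor computed with gamma() directly, for comparison
BF.direct <- (gamma(56) * gamma(70) / gamma(126)) /
  ((gamma(15) * gamma(31) / gamma(46)) * (gamma(42) * gamma(40) / gamma(82)))
```

For these sample sizes gamma() still fits in double precision, so the two routes agree; the log-scale version is mainly useful when the counts are larger.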



20 (Groth)Perform a Bayesian analysis of this data and draw conclusions, assuming

each component of the likelihood to be:

a. Normal

b. Student’s t with 3 degrees of freedom

c. Cauchy (t1)

Exploratory Data Analysis: It may help to plot your responses. Make a histogram.

What do they look like? What distribution might it follow in truth?

Step 1. Enter the data into WinBUGS. Let's call the increased hours of sleep Y. Let n=10.

Step 2: Write the main steps of a basic Winbugs Code:

model{

for(i in 1:n){

#Likelihood containing parameters

}

#priors on parameters

}

a). Now, we will set up the likelihood for a normal distribution. As you may recall,

a normal distribution in Winbugs is coded with dnorm(mean, prec).

model{



for(i in 1:n){


Y[i]~dnorm(mu,tau) #mu is the mean, and tau is the precision

}


}

Now we will place priors on the parameters mu and tau because they are not defined

by our dataset. In our dataset, there is not a column called mu or tau. We want

to estimate mu and tau if Y really followed a normal distribution. We will set

weakly-informative priors on mu and tau.

model{
for(i in 1:n){
Y[i]~dnorm(mu,tau) #likelihood, as above
}
mu~dnorm(0.0, 0.0001) #mean 0, precision 0.0001
sigma~dunif(0.01, 100.0)
#Here we are placing the prior on the standard deviation,
#so we need to define its relationship to tau
tau<-1/(sigma*sigma)
}

We are clearly trying to test here if µ = 0. If a credible interval for the parameter

mu were to include 0, what would that tell us? What would that tell us about the

difference between the two treatments?

Remember you will have to set reasonable initial values and check convergence of

the chains. You may also assess DIC.

I would consider exporting the mu samples with coda and plotting them in R. This

will allow you to compare to the posterior distributions from part b and c.

You may also consider testing if the difference in sleep is really greater than a con-

stant such as 1.5. You can do this by simply adapting your code in this fashion:

model{
for(i in 1:n){
Y[i]~dnorm(mu,tau) #likelihood, as above
}
gt1_5 <- step(mu-1.5) #then monitor this parameter
mu~dnorm(0.0, 0.0001) #mean 0, precision 0.0001
sigma~dunif(0.01, 100.0)
#Here we are placing the prior on the standard deviation,
# so we need to define its relationship to tau
tau<-1/(sigma*sigma)
}

b. Student's t distribution:

I would go about this similarly. Here let

Y[i]~dt(mu,tau,3) #Likelihood with 3 degrees of freedom

As discussed in class, let your priors be:

mu~dnorm(0.0, 0.0001)
tau~dgamma(0.001,0.001)

These are already written in WinBUGS notation. Then, I would take a look at the back of your book to see what the mean is and what mu represents. I will let you code up the problem similarly to part a. Once again, you are answering the same question here. You are merely assuming that the data follow a different distribution. We still want to test µ = 0.

c. Cauchy Distribution



Here we must use the ones trick. As you may recall from stat theory, a Cauchy distribution is a special case of a t-distribution (one degree of freedom). However, WinBUGS does not allow the degrees of freedom to be set so low, nor does it allow a Cauchy to be coded directly. Therefore, we must use the ones trick.

Start by writing out a Likelihood L (up to a proportionality constant):

L[i] <- tau * 1/( 1 + ((y[i]-mu)*tau)*((y[i]-mu)*tau) )

This can be based on the back of the book.

Now define C to be a large number. Like C = 1000. Then, we use the trick:

ones[i] <- 1

p[i] <- L[i] / C

ones[i] ~ dbern(p[i])

Do this all inside the likelihood loop (for (i in 1:n)). C=1000 can be outside. Use

the priors:

mu ~ dnorm(0.0, 0.0001)

tau ~ dgamma(0.001,0.001)

Here we will still want to draw inference on mu. As described in the back of the book, mu is the median of the Cauchy; no mean exists. DIC values will not be accurate because of the trick performed here, so DO NOT use DIC here. Once again, extract the samples and plot the posterior of mu.
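To see why the hand-written likelihood above is a valid Cauchy kernel, it can be compared against R's built-in dcauchy(). The sketch below uses the sleep data from this problem; the values of mu and tau are hypothetical, picked only to evaluate the expressions, and the scale of the Cauchy is taken to be 1/tau, which is how the kernel above is written.

```r
# Increased hours of sleep (data from this problem)
y <- c(1.2, 2.4, 1.3, 1.3, 0.0, 1.0, 1.8, 0.8, 4.6, 1.4)

# Hypothetical parameter values, for illustration only
mu  <- 1.2
tau <- 2
C   <- 10000   # large constant used in the ones trick

# Hand-written Cauchy kernel, exactly as in the BUGS code
L <- tau * 1 / (1 + ((y - mu) * tau) * ((y - mu) * tau))

# Scaled values used as Bernoulli probabilities; each must lie in (0, 1)
p <- L / C

# The kernel equals pi times the Cauchy density with scale 1/tau
ratio <- L / dcauchy(y, location = mu, scale = 1 / tau)

# Log-likelihood from the trick versus the true Cauchy log-likelihood:
# they differ only by the constant n * (log(pi) - log(C))
loglik.trick <- sum(log(p))
loglik.true  <- sum(dcauchy(y, location = mu, scale = 1 / tau, log = TRUE))
offset <- loglik.trick - loglik.true
```

Since the offset does not depend on mu or tau, the posterior for those parameters is unaffected, but quantities such as the deviance are shifted, which is consistent with the warning about DIC.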

Solution:

(Hatfield:) Code to fit the first three models (Normal, t3, and Cauchy) is found below. Note that the Cauchy model uses the “ones trick” described in the WinBUGS manual for distributions not available in the software. WinBUGS requires the degrees of freedom for a t distribution to be ≥ 2, so a Cauchy is excluded. To use the trick, we write the likelihood of each observation by hand and divide by a large number to ensure that each scaled value lies in (0, 1). Then each scaled likelihood is used as the parameter, p[i], for a Bernoulli distribution on a success observation (the vector of ones), so that we obtain the correct likelihood contributions (up to a proportionality constant).

# Normal model

write("model{

for (i in 1:n) {

y[i] ~ dnorm(mu, tau) # Likelihood for normal

}

gt1_5 <- step(mu-1.5)

# Priors for Normal:

mu ~ dnorm(0.0, 0.0001)

sigma ~ dunif(0.01, 100.0)

tau <- 1/ (sigma*sigma)



}","sleep_nmodel.txt")

# t_3 model

write("model{

for (i in 1:n) {

y[i] ~ dt(mu, tau, 3) # Likelihood for t_3

}

gt1_5 <- step(mu-1.5)

# Priors for t_3:

mu ~ dnorm(0.0, 0.0001)

tau ~ dgamma(0.001,0.001)

}","sleep_t3model.txt")

# t_1 (Cauchy) model

write("model { # Using the ones trick

C <- 10000 # Some large constant

for (i in 1:n) {

# Compute the likelihood contribution of each observation

# (up to proportionality constant) by hand

L[i] <- tau * 1/( 1 + ((y[i]-mu)*tau)*((y[i]-mu)*tau) )

ones[i] <- 1

p[i] <- L[i] / C

ones[i] ~ dbern(p[i])

}



gt1_5 <- step(mu-1.5)

# Priors for t_1:

mu ~ dnorm(0.0, 0.0001)

tau ~ dgamma(0.001,0.001)

}","sleep_t1model.txt")

# Note: any arguments after parametersToSave (e.g., nBurnin, nIter) are
# missing from this copy, so each call below is closed as-is and
# BRugsFit's defaults apply.
sleep_norm <- BRugsFit(modelFile="sleep_nmodel.txt",
    data=list(y=c(1.2,2.4,1.3,1.3,0.0,1.0,1.8,0.8,4.6,1.4),
              n=10),
    inits=list(list(mu = 0.0, sigma = 0.5),
               list(mu = 1.0, sigma = 1.0)),
    numChains=2,
    parametersToSave=c('mu','sigma','gt1_5'))
samplesCoda("mu","sleep_n_")

sleep_t3 <- BRugsFit(modelFile="sleep_t3model.txt",
    data=list(y=c(1.2,2.4,1.3,1.3,0.0,1.0,1.8,0.8,4.6,1.4),
              n=10),
    inits=list(list(mu = 0.0, tau = 1),
               list(mu = 1.0, tau = 0.5)),
    numChains=2,
    parametersToSave=c('mu','tau','gt1_5'))
samplesCoda("mu","sleep_t3_")

sleep_t1 <- BRugsFit(modelFile="sleep_t1model.txt",
    data=list(y=c(1.2,2.4,1.3,1.3,0.0,1.0,1.8,0.8,4.6,1.4),
              n=10),
    inits=list(list(mu = 0.0, tau = 0.5),
               list(mu = 1.0, tau = 1.0)),
    numChains=2,
    parametersToSave=c('mu','tau','gt1_5'))
samplesCoda("mu","sleep_t1_")

In these analyses, the null hypothesis is that the drugs do not differ, that is, H0 : µ = 0, versus the alternative that one of the drugs is superior, i.e., H1 : µ ≠ 0. Since the posterior credible intervals using each of these distributions exclude zero, we reject the null and conclude that the drugs differ. Clearly the direction of the association is that soporific B is better, since nearly all patients received increased sleep on B versus A.

DIC comparisons are informative. For the Normal distribution, we have pD = 1.7 and DIC = 35.6; for the t3, pD = 2.2 and DIC = 32.6; but for the Cauchy, the DIC values will not be accurate due to the trick. However, we know that there is one potential outlier (observation 9) and that distributions with the lighter tails (Normal and t3) will have higher means to accommodate this value, while the heavier-tailed Cauchy need not increase the mean. This pattern is observed in the posterior means of µ shown in Table D.3, as well as in Figure D.8, which displays the original data and three posteriors (kernel density estimates from the posterior samples) for µ, each based on a different data likelihood. This plot was constructed using the following R code:

sleep_norm_samples <- rbind(read.table("sleep_n_CODAchain1.txt"),
    read.table("sleep_n_CODAchain2.txt"))
names(sleep_norm_samples)<- c('index','mu')
sleep_t3_samples <- rbind(read.table("sleep_t3_CODAchain1.txt"),
    read.table("sleep_t3_CODAchain2.txt"))
names(sleep_t3_samples)<- c('index','mu')
sleep_t1_samples <- rbind(read.table("sleep_t1_CODAchain1.txt"),
    read.table("sleep_t1_CODAchain2.txt"))
names(sleep_t1_samples)<- c('index','mu')

# Plot original data and the three mu posteriors
#postscript("Pics/sleep_posteriors.ps")
par(mar=c(5,4,1,1),lwd=2,cex=1.5)
plot(x=c(1.2,2.4,1.3,1.3,0.0,1.0,1.8,0.8,4.6,1.4),
    y=rep(0,10),ylim=c(0,3),xlim=c(-1,5),pch=19,
    ylab="Probability",xlab="Increased Hours of Sleep",type='n')

# Shade the region mu > 1.5 under each posterior density
n_x <- density(sleep_norm_samples[,'mu'])[['x']]
n_y <- density(sleep_norm_samples[,'mu'])[['y']]
polygon(
    c(n_x[n_x>1.5],rev(n_x[n_x>1.5])),
    c(n_y[n_x>1.5],rep(0,length(n_y[n_x>1.5]))),
    lty=1,col='grey',density=10,angle=90,border=NA)
t3_x <- density(sleep_t3_samples[,'mu'])[['x']]
t3_y <- density(sleep_t3_samples[,'mu'])[['y']]
polygon(
    c(t3_x[t3_x>1.5],rev(t3_x[t3_x>1.5])),
    c(t3_y[t3_x>1.5],rep(0,length(t3_y[t3_x>1.5]))),
    lty=1,col='grey',density=10,angle=-45,border=NA)
t1_x <- density(sleep_t1_samples[,'mu'])[['x']]
t1_y <- density(sleep_t1_samples[,'mu'])[['y']]
polygon(
    c(t1_x[t1_x>1.5],rev(t1_x[t1_x>1.5])),
    c(t1_y[t1_x>1.5],rep(0,length(t1_y[t1_x>1.5]))),
    lty=1,col='grey',density=10,angle=45,border=NA)

# Density curves, labels, and the raw data points
text(2,.75,"Normal",pos=4)
lines(density(sleep_norm_samples[,'mu']),lty=1)
text(1.5,1.25,expression(t[3]),pos=4)
lines(density(sleep_t3_samples[,'mu']),lty=2)
text(1.5,2.5,"Cauchy",pos=4)
lines(density(sleep_t1_samples[,'mu']),lty=3)
points(x=c(1.2,2.4,1.3,1.3,0.0,1.0,1.8,0.8,4.6,1.4),
    y=rep(0,10),pch=19)



# [Figure D.8: posterior densities of mu under the Normal, t3, and Cauchy likelihoods, with the raw data as points. x-axis: Increased Hours of Sleep; y-axis: Probability.]

abline(h=0)

#dev.off()

We use the function density() to compute and plot smooth kernel estimates of

the posteriors for µ, as WinBUGS does when plotting the posterior densities.

Suppose we had an existing sleep aid that could achieve 90 minutes (1.5 hrs) of increased sleep, so that the new outcome of interest is H1 : µ > 1.5. In this case, the difference among the three likelihood choices becomes very important. We modify our BUGS code to add the node gt1_5 <- step(mu-1.5), which will return 1 when µ − 1.5 ≥ 0 and 0 otherwise. Then we can directly compute the posterior probability that the sleep aid being considered is superior by examining the posterior mean of this node. The results are 0.09 for the Cauchy, 0.25 for the t3, and 0.58 for the Normal. We can also see these probabilities represented as the shaded regions under the curves. Clearly, the normal distribution's movement toward the outlier greatly increases the probability of concluding that the new drug is better than the old one. However, we may be uncomfortable making such a conclusion based primarily on the results of just one patient.

For part c, we can compute the posterior by hand, since we know a conjugate prior is Beta. Then the data are yi ∼ iid Bin(10, p) and we can use an uninformative Beta(1, 1) = U(0, 1) prior on p. The posterior is Beta(x + 1, n − x + 1). If we treat the zero value as a “failure”, i.e., a Bernoulli 0, then the posterior is p|y ∼ Beta(10, 2), making the posterior mean E(p|y) = 0.83. If we decide to ignore the zero value, then we have all successes and the posterior is p|y ∼ Beta(10, 1) with E(p|y) = 0.91. Computing posterior credible intervals is easily done in R using

qbeta(c(0.025,0.975),10,2)

qbeta(c(0.025,0.975),10,1)

which yields (0.587, 0.977) and (0.692, 0.997).

In this scenario, the null hypothesis that the drugs do not differ is formulated as H0 : p = 0.5 versus the alternative H1 : p ≠ 0.5. With either handling of the zero observation, we have evidence to reject the null, again showing that patients are more likely to have increased sleep on drug B.
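A complementary summary, sketched below under the same two Beta posteriors, is the posterior probability that p exceeds 0.5. This quantity is not reported in the solution itself; it is included only as an illustration of how pbeta() can quantify the evidence, and the variable names are chosen here.

```r
# Zero treated as a failure: 9 successes in 10 trials -> Beta(10, 2)
# Zero ignored:              9 successes in 9 trials  -> Beta(10, 1)
p.gt.half.with0 <- 1 - pbeta(0.5, 10, 2)
p.gt.half.drop0 <- 1 - pbeta(0.5, 10, 1)

# Posterior means, matching E(p|y) = 0.83 and 0.91 in the text
post.mean.with0 <- 10 / (10 + 2)
post.mean.drop0 <- 10 / (10 + 1)

# 95% equal-tailed credible intervals, as in the solution
ci.with0 <- qbeta(c(0.025, 0.975), 10, 2)
ci.drop0 <- qbeta(c(0.025, 0.975), 10, 1)
```

Both intervals lie entirely above 0.5, which is the basis for rejecting H0 : p = 0.5 under either handling of the zero observation.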

