MATH 38061/MATH48061/MATH68061: MULTIVARIATE STATISTICS
Solutions to Problems on Multivariate Normal Distribution
1. Let X and Y have the joint pdf
\[
f(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\}
\]
for $-\infty < x < \infty$, $-\infty < y < \infty$ and $-1 < \rho < 1$.
First we find the marginal pdfs. The marginal pdf of Y can be obtained as
\[
\begin{aligned}
f_Y(y) &= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\} dx \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp\left\{-\frac{(x-\rho y)^2 + y^2(1-\rho^2)}{2(1-\rho^2)}\right\} dx \\
&= \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{y^2}{2}\right) \left[\frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp\left\{-\frac{(x-\rho y)^2}{2(1-\rho^2)}\right\} dx\right] \\
&= \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{y^2}{2}\right),
\end{aligned}
\]
the standard normal pdf, since the bracketed term is the integral of the $N(\rho y, 1-\rho^2)$ pdf and so equals one. By symmetry, the marginal pdf of X is also standard normal. So it follows that $E(X) = E(Y) = 0$ and $Var(X) = Var(Y) = 1$.
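As a numerical cross-check (not part of the original solution; function names and integration limits are my own choices), a simple midpoint rule confirms that integrating the joint pdf over $x$ recovers the standard normal pdf:

```python
import math

def f(x, y, rho):
    # joint pdf of the bivariate normal with standard margins and correlation rho
    norm = 2 * math.pi * math.sqrt(1 - rho ** 2)
    return math.exp(-(x * x + y * y - 2 * rho * x * y) / (2 * (1 - rho ** 2))) / norm

def marginal_y(y, rho, lo=-10.0, hi=10.0, n=20000):
    # midpoint-rule approximation of the integral of f(x, y) over x
    h = (hi - lo) / n
    return h * sum(f(lo + (i + 0.5) * h, y, rho) for i in range(n))

def phi(y):
    # standard normal pdf
    return math.exp(-y * y / 2) / math.sqrt(2 * math.pi)

for y in (-1.3, 0.0, 0.7):
    assert abs(marginal_y(y, rho=0.6) - phi(y)) < 1e-5
```

The truncation at $\pm 10$ is harmless because the integrand is negligible there.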
Now consider deriving E(XY ). We have
\[
\begin{aligned}
E(XY) &= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\} dx\,dy \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} xy \exp\left\{-\frac{(x-\rho y)^2 + y^2(1-\rho^2)}{2(1-\rho^2)}\right\} dx\,dy \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} y \exp\left(-\frac{y^2}{2}\right) \int_{-\infty}^{\infty} x \exp\left\{-\frac{(x-\rho y)^2}{2(1-\rho^2)}\right\} dx\,dy \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} y \exp\left(-\frac{y^2}{2}\right) \int_{-\infty}^{\infty} (x - \rho y + \rho y) \exp\left\{-\frac{(x-\rho y)^2}{2(1-\rho^2)}\right\} dx\,dy \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} y \exp\left(-\frac{y^2}{2}\right) \int_{-\infty}^{\infty} (z + \rho y) \exp\left\{-\frac{z^2}{2(1-\rho^2)}\right\} dz\,dy \\
&= \frac{\rho}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} y^2 \exp\left(-\frac{y^2}{2}\right) \int_{-\infty}^{\infty} \exp\left\{-\frac{z^2}{2(1-\rho^2)}\right\} dz\,dy \\
&= \frac{\rho}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 \exp\left(-\frac{y^2}{2}\right) \left[\frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \int_{-\infty}^{\infty} \exp\left\{-\frac{z^2}{2(1-\rho^2)}\right\} dz\right] dy \\
&= \frac{\rho}{\sqrt{2\pi}} \int_{-\infty}^{\infty} y^2 \exp\left(-\frac{y^2}{2}\right) dy \\
&= \frac{2\rho}{\sqrt{2\pi}} \int_{0}^{\infty} y^2 \exp\left(-\frac{y^2}{2}\right) dy \\
&= \frac{2\sqrt{2}\rho}{\sqrt{2\pi}} \int_{0}^{\infty} w^{1/2} \exp(-w)\, dw \\
&= \frac{2\sqrt{2}\rho}{\sqrt{2\pi}} \Gamma\left(\frac{3}{2}\right) = \frac{2\sqrt{2}\rho}{\sqrt{2\pi}} \cdot \frac{\sqrt{\pi}}{2} = \rho.
\end{aligned}
\]
So, Cov(X,Y ) = E(XY )− E(X)E(Y ) = ρ and Corr(X,Y ) = ρ.
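The value $Corr(X, Y) = \rho$ is easy to confirm by simulation. The construction $Y = \rho X + \sqrt{1-\rho^2}\,Z$ with independent standard normals $X, Z$ is a standard way to generate this joint distribution (it is my own choice here, not taken from the solution):

```python
import math
import random

random.seed(0)
rho = 0.8
n = 200_000
total = 0.0
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    # Y = rho*X + sqrt(1 - rho^2)*Z has the joint pdf in question
    y = rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0.0, 1.0)
    total += x * y
exy = total / n  # estimates E(XY) = Cov(X, Y) = Corr(X, Y) = rho
assert abs(exy - rho) < 0.02
```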
Finally, consider proving that X and Y are independent if and only if Cov(X, Y) = 0. If X and Y are independent then by definition Cov(X, Y) = 0. Conversely, if Cov(X, Y) = 0 then $\rho = 0$, so substituting into the joint pdf we see that it factorizes into the product of two standard normal pdfs, so X and Y are independent.
2. The conditional pdf of X given Y = y is
\[
\begin{aligned}
f(x \mid y) &= \frac{f(x, y)}{f_Y(y)} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)} + \frac{y^2}{2}\right\} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + \rho^2 y^2 - 2\rho xy}{2(1-\rho^2)}\right\} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{(x-\rho y)^2}{2(1-\rho^2)}\right\},
\end{aligned}
\]
the normal pdf with mean $\rho y$ and variance $1-\rho^2$. Similarly, the conditional pdf of Y given X = x is
\[
\begin{aligned}
f(y \mid x) &= \frac{f(x, y)}{f_X(x)} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)} + \frac{x^2}{2}\right\} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{\rho^2 x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\} \\
&= \frac{1}{\sqrt{2\pi}\sqrt{1-\rho^2}} \exp\left\{-\frac{(y-\rho x)^2}{2(1-\rho^2)}\right\},
\end{aligned}
\]
the normal pdf with mean $\rho x$ and variance $1-\rho^2$.
3. Let X and Y have the joint pdf
\[
f(x, y) = \exp\left(c + 4x + 4y - \frac{x^2}{2} - \frac{y^2}{2} - \frac{x^2 y^2}{2}\right)
\]
for $-\infty < x < \infty$ and $-\infty < y < \infty$, where $c$ is a constant.
First, we determine the marginal pdfs. The marginal pdf of Y can be obtained as
\[
\begin{aligned}
f_Y(y) &= \int_{-\infty}^{\infty} \exp\left(c + 4x + 4y - \frac{x^2}{2} - \frac{y^2}{2} - \frac{x^2 y^2}{2}\right) dx \\
&= \exp\left(c + 4y - \frac{y^2}{2}\right) \int_{-\infty}^{\infty} \exp\left\{4x - \frac{(1 + y^2) x^2}{2}\right\} dx \\
&= \exp\left(c + 4y - \frac{y^2}{2}\right) \int_{-\infty}^{\infty} \exp\left\{-\frac{1 + y^2}{2}\left(x^2 - \frac{8x}{1 + y^2}\right)\right\} dx \\
&= \exp\left(c + 4y - \frac{y^2}{2} + \frac{8}{1 + y^2}\right) \int_{-\infty}^{\infty} \exp\left\{-\frac{1 + y^2}{2}\left(x - \frac{4}{1 + y^2}\right)^2\right\} dx \\
&= \frac{\sqrt{2\pi}}{\sqrt{1 + y^2}} \exp\left(c + 4y - \frac{y^2}{2} + \frac{8}{1 + y^2}\right) \\
&\qquad \times \frac{1}{\sqrt{2\pi}\left(1/\sqrt{1 + y^2}\right)} \int_{-\infty}^{\infty} \exp\left\{-\frac{1 + y^2}{2}\left(x - \frac{4}{1 + y^2}\right)^2\right\} dx \\
&= \frac{\sqrt{2\pi}}{\sqrt{1 + y^2}} \exp\left(c + 4y - \frac{y^2}{2} + \frac{8}{1 + y^2}\right).
\end{aligned}
\]
By symmetry, the marginal pdf of X is
\[
f_X(x) = \frac{\sqrt{2\pi}}{\sqrt{1 + x^2}} \exp\left(c + 4x - \frac{x^2}{2} + \frac{8}{1 + x^2}\right).
\]
So, the conditional pdf of X given Y = y is
\[
\begin{aligned}
f(x \mid y) &= \frac{f(x, y)}{f_Y(y)} \\
&= \frac{\sqrt{1 + y^2}}{\sqrt{2\pi}} \exp\left(4x - \frac{x^2}{2} - \frac{x^2 y^2}{2} - \frac{8}{1 + y^2}\right) \\
&= \frac{\sqrt{1 + y^2}}{\sqrt{2\pi}} \exp\left(-\frac{8}{1 + y^2}\right) \exp\left\{4x - \frac{(1 + y^2) x^2}{2}\right\} \\
&= \frac{\sqrt{1 + y^2}}{\sqrt{2\pi}} \exp\left(-\frac{8}{1 + y^2}\right) \exp\left\{-\frac{1 + y^2}{2}\left(x^2 - \frac{8x}{1 + y^2}\right)\right\} \\
&= \frac{\sqrt{1 + y^2}}{\sqrt{2\pi}} \exp\left(-\frac{8}{1 + y^2}\right) \exp\left\{-\frac{1 + y^2}{2}\left(x - \frac{4}{1 + y^2}\right)^2 + \frac{8}{1 + y^2}\right\} \\
&= \frac{\sqrt{1 + y^2}}{\sqrt{2\pi}} \exp\left\{-\frac{1 + y^2}{2}\left(x - \frac{4}{1 + y^2}\right)^2\right\},
\end{aligned}
\]
the normal pdf with mean $4/(1 + y^2)$ and variance $1/(1 + y^2)$. By symmetry, the conditional pdf of Y given X = x is normal with mean $4/(1 + x^2)$ and variance $1/(1 + x^2)$. Note however that the joint pdf of X and Y is not bivariate normal.
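The conditional-mean claim can be checked numerically without knowing $c$, since the factor $\exp(c)$ cancels in conditional moments; the grid and integration limits below are my own arbitrary choices:

```python
import math

def g(x, y):
    # joint pdf up to the constant exp(c), which cancels in conditional moments
    return math.exp(4 * x + 4 * y - x ** 2 / 2 - y ** 2 / 2 - x ** 2 * y ** 2 / 2)

def cond_mean_x(y, lo=-15.0, hi=15.0, n=20000):
    # midpoint-rule estimate of E(X | Y = y) = int x g dx / int g dx
    h = (hi - lo) / n
    num = den = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * h
        w = g(x, y)
        num += x * w
        den += w
    return num / den

for y in (0.0, 1.0, -2.0):
    assert abs(cond_mean_x(y) - 4 / (1 + y ** 2)) < 1e-5
```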
This is an example of a distribution whose joint pdf is not normal but whose conditionals are normal. For other examples of this kind, see Arnold, Castillo, and Sarabia (2001), Statistical Science, Volume 16, Issue 3, 249-274.
4. Let X and Y have the bivariate normal pdf
\[
f(x, y) = \frac{1}{2\pi\sqrt{1-\rho^2}} \exp\left\{-\frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\}
\]
for $-\infty < x < \infty$, $-\infty < y < \infty$ and $-1 < \rho < 1$.
The joint moment generating function of X and Y can be written as
\[
\begin{aligned}
M(s, t) &= E\left[\exp(sX + tY)\right] \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\left\{sx + ty - \frac{x^2 + y^2 - 2\rho xy}{2(1-\rho^2)}\right\} dy\,dx \\
&= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\left\{-\frac{x^2 + y^2 - 2(1-\rho^2)sx - 2(1-\rho^2)ty - 2\rho xy}{2(1-\rho^2)}\right\} dy\,dx. \quad (1)
\end{aligned}
\]
We want to rewrite the numerator of the fraction within the exponential term in the form $(x-a)^2 + (y-b)^2 - 2\rho(x-a)(y-b) - a^2 - b^2 + 2\rho ab$ for some constants $a$ and $b$. To determine these constants, we equate the coefficients of $x$ and $y$. Equating the coefficients of $x$, we obtain the equation $-2a + 2\rho b = -2(1-\rho^2)s$. Equating the coefficients of $y$, we obtain the equation $-2b + 2\rho a = -2(1-\rho^2)t$. Solving these two equations simultaneously, we obtain $a = \rho t + s$ and $b = \rho s + t$. So, we can rewrite (1) as
\[
\begin{aligned}
M(s, t) &= \frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\left\{-\frac{(x-a)^2 + (y-b)^2 - 2\rho(x-a)(y-b) - a^2 - b^2 + 2\rho ab}{2(1-\rho^2)}\right\} dy\,dx \\
&= \exp\left\{\frac{a^2 + b^2 - 2\rho ab}{2(1-\rho^2)}\right\} \left[\frac{1}{2\pi\sqrt{1-\rho^2}} \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} \exp\left\{-\frac{(x-a)^2 + (y-b)^2 - 2\rho(x-a)(y-b)}{2(1-\rho^2)}\right\} dy\,dx\right] \\
&= \exp\left\{\frac{a^2 + b^2 - 2\rho ab}{2(1-\rho^2)}\right\},
\end{aligned}
\]
since the bracketed term is the integral of a shifted bivariate normal pdf and so equals one. Substituting $a = s + \rho t$ and $b = t + \rho s$, this simplifies to $M(s, t) = \exp\left\{(s^2 + 2\rho st + t^2)/2\right\}$.
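A Monte Carlo estimate of $E[\exp(sX + tY)]$ matches the closed form; the sampling construction and parameter values below are my own choices:

```python
import math
import random

random.seed(1)
rho, s, t = 0.5, 0.2, 0.3
n = 200_000
acc = 0.0
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    # generate (X, Y) with the stated joint pdf
    y = rho * x + math.sqrt(1 - rho ** 2) * random.gauss(0.0, 1.0)
    acc += math.exp(s * x + t * y)
mgf_mc = acc / n

a, b = s + rho * t, t + rho * s
mgf_exact = math.exp((a ** 2 + b ** 2 - 2 * rho * a * b) / (2 * (1 - rho ** 2)))
# the closed form simplifies to exp((s^2 + 2*rho*s*t + t^2)/2)
assert abs(mgf_exact - math.exp((s ** 2 + 2 * rho * s * t + t ** 2) / 2)) < 1e-12
assert abs(mgf_mc - mgf_exact) < 0.01
```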
5. Let a p× 1 random vector X = (X1, . . . , Xp)T have the p-variate normal pdf
\[
f(x) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}} \exp\left\{-\frac{1}{2} x^T \Sigma^{-1} x\right\}
\]
for $-\infty < x_i < \infty$, $i = 1, \ldots, p$.
The joint moment generating function of X can be written as
\[
\begin{aligned}
M(t) &= E\left[\exp\left(t^T X\right)\right] = \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \int_{x \in \mathbb{R}^p} \exp\left\{t^T x - \frac{1}{2} x^T \Sigma^{-1} x\right\} dx \\
&= \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \int_{x \in \mathbb{R}^p} \exp\left\{-\frac{1}{2}\left[x^T \Sigma^{-1} x - 2 t^T x\right]\right\} dx. \quad (2)
\end{aligned}
\]
We want to rewrite the term within square brackets in the form $(x-\mu)^T \Sigma^{-1} (x-\mu) - \mu^T \Sigma^{-1} \mu$ for some constant vector $\mu$. To determine it, we equate the coefficients of $x$, obtaining $-2\mu^T \Sigma^{-1} = -2 t^T$. Solving this equation, we obtain $\mu = \Sigma t$. So, we can rewrite (2) as
\[
\begin{aligned}
M(t) &= \frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \int_{x \in \mathbb{R}^p} \exp\left\{-\frac{1}{2}\left[(x - \Sigma t)^T \Sigma^{-1} (x - \Sigma t) - t^T \Sigma t\right]\right\} dx \\
&= \exp\left(t^T \Sigma t / 2\right) \left[\frac{1}{(2\pi)^{p/2}|\Sigma|^{1/2}} \int_{x \in \mathbb{R}^p} \exp\left\{-\frac{1}{2}(x - \Sigma t)^T \Sigma^{-1} (x - \Sigma t)\right\} dx\right] \\
&= \exp\left(t^T \Sigma t / 2\right),
\end{aligned}
\]
since the bracketed term is the integral of the $N_p(\Sigma t, \Sigma)$ pdf and so equals one. (Note the positive sign in the exponent: $-\frac{1}{2}(-t^T \Sigma t) = +t^T \Sigma t / 2$.)
6. Let X be a standard normal random variable. Let W = 1 or −1, each with probability 1/2, and assume W is independent of X. Let Y = WX. Then
(i) we have
\[
\begin{aligned}
Cov(X, Y) &= E(XY) - E(X)E(Y) \\
&= E(XY) - E(X)E(WX) \\
&= E(XY) - E(X)E(W)E(X) \\
&= E(XY) \\
&= E\left(E(XY \mid W)\right) \\
&= E(X^2)\Pr(W = 1) + E(-X^2)\Pr(W = -1) \\
&= 1 \times \frac{1}{2} + (-1) \times \frac{1}{2} = 0,
\end{aligned}
\]
so X and Y are uncorrelated;
(ii) we have
\[
\begin{aligned}
\Pr(Y \le x) &= E\left(\Pr(Y \le x \mid W)\right) \\
&= \Pr(X \le x)\Pr(W = 1) + \Pr(-X \le x)\Pr(W = -1) \\
&= \Phi(x) \times \frac{1}{2} + \Phi(x) \times \frac{1}{2} = \Phi(x),
\end{aligned}
\]
so X and Y have the same normal distribution (where $\Phi(\cdot)$ denotes the standard normal distribution function);
(iii) we have $|Y| = |X|$, so $\Pr(Y > 1 \mid X = 1/2) = 0$ while $\Pr(Y > 1) > 0$; hence X and Y are not independent.
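A short simulation (the construction mirrors the problem statement; sample size and seed are my own choices) illustrates both facts: the sample covariance is near zero, yet $|Y| = |X|$ always:

```python
import random

random.seed(2)
n = 200_000
pairs = []
for _ in range(n):
    x = random.gauss(0.0, 1.0)
    w = random.choice((-1, 1))   # W independent of X
    pairs.append((x, w * x))     # Y = W X

# E(X) = E(Y) = 0, so the mean of X*Y estimates Cov(X, Y)
cov = sum(x * y for x, y in pairs) / n
assert abs(cov) < 0.02

# dependence: |Y| = |X| always, so {|Y| > 1 and |X| <= 1} never occurs
assert all(not (abs(y) > 1 and abs(x) <= 1) for x, y in pairs)
```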
7. Let X be a standard normal random variable. Let
\[
Y = \begin{cases} -X, & |X| < c, \\ X, & \text{otherwise}, \end{cases}
\]
where $c$ is the root of the equation
\[
\int_0^c x^2 \phi(x)\, dx = 1/4,
\]
where $\phi(\cdot)$ denotes the standard normal pdf. Then
(i) we have
\[
\begin{aligned}
\Pr(Y \le x) &= \Pr\left(\{|X| < c \text{ and } -X \le x\} \text{ or } \{|X| > c \text{ and } X \le x\}\right) \\
&= \Pr(|X| < c \text{ and } -X \le x) + \Pr(|X| > c \text{ and } X \le x) \\
&= \Pr(|X| < c \text{ and } X \le x) + \Pr(|X| > c \text{ and } X \le x) \\
&= \Pr(X \le x),
\end{aligned}
\]
so X and Y have the same normal distribution;
(ii) we have
\[
\begin{aligned}
Cov(X, Y) &= E(XY) - E(X)E(Y) \\
&= E(XY) - E(X)E(X) \\
&= E(XY) \\
&= \int_{-\infty}^{-c} x^2 \phi(x)\, dx + \int_{c}^{\infty} x^2 \phi(x)\, dx - \int_{-c}^{c} x^2 \phi(x)\, dx \\
&= E(X^2) - 2\int_{-c}^{c} x^2 \phi(x)\, dx \\
&= 1 - 2\int_{-c}^{c} x^2 \phi(x)\, dx \\
&= 1 - 4\int_{0}^{c} x^2 \phi(x)\, dx \\
&= 0,
\end{aligned}
\]
so X and Y are uncorrelated;
(iii) X and Y are clearly not independent since X completely determines Y .
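Integration by parts gives $\int_0^c x^2 \phi(x)\, dx = \Phi(c) - \tfrac{1}{2} - c\,\phi(c)$, so $c$ can be found by bisection and the zero covariance confirmed; the search bracket $[0, 10]$ is an assumption about where the root lies:

```python
import math

def phi(x):
    # standard normal pdf
    return math.exp(-x ** 2 / 2) / math.sqrt(2 * math.pi)

def Phi(x):
    # standard normal cdf via erf
    return 0.5 * (1 + math.erf(x / math.sqrt(2)))

def integral(c):
    # closed form of int_0^c x^2 phi(x) dx (integration by parts)
    return Phi(c) - 0.5 - c * phi(c)

# bisection for the root of integral(c) = 1/4 (integral is increasing in c)
lo, hi = 0.0, 10.0
for _ in range(100):
    mid = (lo + hi) / 2
    if integral(mid) < 0.25:
        lo = mid
    else:
        hi = mid
c = (lo + hi) / 2

# Cov(X, Y) = 1 - 4 * int_0^c x^2 phi(x) dx = 0 at this c
cov = 1 - 4 * integral(c)
assert abs(cov) < 1e-9
assert 1.0 < c < 2.0   # sanity range for the root
```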
8. Let the random vector $X \sim N_2(\mu, \Sigma)$ where $\mu = 0$ and
\[
\Sigma = \begin{bmatrix} 2 & -1 \\ -1 & 4 \end{bmatrix}.
\]
Note that we can write
\[
Y = \begin{bmatrix} Y_1 \\ Y_2 \end{bmatrix} = \begin{bmatrix} X_1 - X_2 \\ X_2 \end{bmatrix} = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \end{bmatrix}.
\]
So,
\[
E Y = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 0 \end{bmatrix},
\qquad
Cov\, Y = \begin{bmatrix} 1 & -1 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & -1 \\ -1 & 4 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ -1 & 1 \end{bmatrix} = \begin{bmatrix} 8 & -5 \\ -5 & 4 \end{bmatrix}
\]
and
\[
Y \sim N_2\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 8 & -5 \\ -5 & 4 \end{bmatrix}\right).
\]
Hence, Cov(Y1, Y2) = −5 implying that Y1 and Y2 are not independently distributed.
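The product $A \Sigma A^T$ is quick to verify mechanically (the helper name is my own):

```python
def matmul(A, B):
    # product of two 2x2 matrices given as nested lists
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

A = [[1, -1], [0, 1]]            # Y = A X
Sigma = [[2, -1], [-1, 4]]
At = [[1, 0], [-1, 1]]           # transpose of A
cov_Y = matmul(matmul(A, Sigma), At)
assert cov_Y == [[8, -5], [-5, 4]]
```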
9. Let X have a N3(µ,Σ) distribution where µ = (0, 0, 0)T and
\[
\Sigma = \begin{bmatrix} 1 & -2 & 0 \\ -2 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}.
\]
(i) For $X_1$ and $X_2$, we have
\[
\begin{bmatrix} X_1 \\ X_2 \end{bmatrix} \sim N_2\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 & -2 \\ -2 & 5 \end{bmatrix}\right),
\]
so Cov(X1, X2) = −2 implying that X1 and X2 are not independently distributed.
(ii) For $X_2$ and $X_3$, we have
\[
\begin{bmatrix} X_2 \\ X_3 \end{bmatrix} \sim N_2\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 5 & 0 \\ 0 & 2 \end{bmatrix}\right),
\]
so Cov(X2, X3) = 0 implying that X2 and X3 are independently distributed.
(iii) For $(X_1, X_2)$ and $X_3$, $Cov((X_1, X_2), X_3) = (0, 0)$, so $(X_1, X_2)$ and $X_3$ are independently distributed.
(iv) For $(X_1 + X_2)/2$ and $X_3$, note that
\[
\begin{bmatrix} (X_1 + X_2)/2 \\ X_3 \end{bmatrix} = \begin{bmatrix} 1/2 & 1/2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}.
\]
So,
\[
\begin{bmatrix} (X_1 + X_2)/2 \\ X_3 \end{bmatrix} \sim N_2\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1/2 & 0 \\ 0 & 2 \end{bmatrix}\right),
\]
so $Cov((X_1 + X_2)/2, X_3) = 0$ implying that $(X_1 + X_2)/2$ and $X_3$ are independently distributed.
(v) For $X_2$ and $-\frac{5}{2}X_1 + X_2 - X_3$, note that
\[
\begin{bmatrix} -\frac{5}{2}X_1 + X_2 - X_3 \\ X_2 \end{bmatrix} = \begin{bmatrix} -5/2 & 1 & -1 \\ 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X_1 \\ X_2 \\ X_3 \end{bmatrix}.
\]
So,
\[
\begin{bmatrix} -\frac{5}{2}X_1 + X_2 - X_3 \\ X_2 \end{bmatrix} \sim N_2\left(\begin{bmatrix} 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 93/4 & 10 \\ 10 & 5 \end{bmatrix}\right),
\]
so $Cov(-\frac{5}{2}X_1 + X_2 - X_3, X_2) = 10$ implying that $-\frac{5}{2}X_1 + X_2 - X_3$ and $X_2$ are not independently distributed.
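The covariance matrix in part (v) can be checked by computing $A \Sigma A^T$ directly (note $93/4 = 23.25$):

```python
Sigma = [[1, -2, 0], [-2, 5, 0], [0, 0, 2]]
A = [[-2.5, 1, -1],   # -5/2 X1 + X2 - X3
     [0, 1, 0]]       # X2

# entry (i, j) of A Sigma A^T
cov = [[sum(A[i][k] * Sigma[k][l] * A[j][l] for k in range(3) for l in range(3))
        for j in range(2)] for i in range(2)]
assert cov[0][0] == 23.25 and cov[0][1] == 10.0
assert cov[1][0] == 10.0 and cov[1][1] == 5
```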
10. Let X have a N2(µ,Σ) distribution where µ = 0 and Σ = I2. Let
\[
Y = C X + d \sim N_2\left(\begin{bmatrix} 3 \\ 2 \end{bmatrix}, \begin{bmatrix} 1 & -1.5 \\ -1.5 & 4 \end{bmatrix}\right).
\]
It follows that $(3, 2)^T = E(Y) = C E(X) + d = C 0 + d = d$. Also
\[
\begin{bmatrix} 1 & -1.5 \\ -1.5 & 4 \end{bmatrix} = Cov\, Y = C\, Cov(X)\, C^T = C I_2 C^T = C C^T,
\]
so we may take $C$ to be the symmetric square root, computed from the spectral decomposition (eigenvalues 4.621 and 0.379, with unit eigenvectors $(0.383, -0.924)^T$ and $(0.924, 0.383)^T$):
\[
\begin{aligned}
C &= \begin{bmatrix} 1 & -1.5 \\ -1.5 & 4 \end{bmatrix}^{1/2} \\
&= \begin{bmatrix} 0.383 & 0.924 \\ -0.924 & 0.383 \end{bmatrix} \begin{bmatrix} \sqrt{4.621} & 0 \\ 0 & \sqrt{0.379} \end{bmatrix} \begin{bmatrix} 0.383 & -0.924 \\ 0.924 & 0.383 \end{bmatrix} \\
&= \begin{bmatrix} 0.840 & -0.542 \\ -0.542 & 1.925 \end{bmatrix}.
\end{aligned}
\]
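The symmetric square root can be reproduced from the closed-form eigendecomposition of a $2 \times 2$ symmetric matrix; this sketch assumes the off-diagonal entry is nonzero:

```python
import math

Sigma = [[1.0, -1.5], [-1.5, 4.0]]

# eigenvalues of a symmetric 2x2 matrix from trace and determinant
a, b, d = Sigma[0][0], Sigma[0][1], Sigma[1][1]
tr, det = a + d, a * d - b * b
disc = math.sqrt(tr ** 2 - 4 * det)
lam1, lam2 = (tr + disc) / 2, (tr - disc) / 2

# unit eigenvector for lam1 (valid since b != 0); the other is its rotation
v1 = (b, lam1 - a)
nrm = math.hypot(v1[0], v1[1])
v1 = (v1[0] / nrm, v1[1] / nrm)
v2 = (-v1[1], v1[0])

# C = sqrt(lam1) v1 v1^T + sqrt(lam2) v2 v2^T
s1, s2 = math.sqrt(lam1), math.sqrt(lam2)
C = [[s1 * v1[i] * v1[j] + s2 * v2[i] * v2[j] for j in range(2)] for i in range(2)]

# check C C^T = C C = Sigma (C is symmetric)
CC = [[sum(C[i][k] * C[k][j] for k in range(2)) for j in range(2)] for i in range(2)]
for i in range(2):
    for j in range(2):
        assert abs(CC[i][j] - Sigma[i][j]) < 1e-12
assert abs(C[0][0] - 0.840) < 1e-3 and abs(C[0][1] + 0.542) < 1e-3
```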
11. Let $X_1, \ldots, X_n$ be a random sample from the $N_p(\mu, \Sigma)$ distribution where $\mu$ and $\Sigma$ are unknown but we know that, in this case, $\Sigma = \operatorname{diag}(\sigma_{11}, \ldots, \sigma_{pp})$. The log-likelihood function can be written as
\[
\begin{aligned}
\log L(\mu, \Sigma) &= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\log|\Sigma| - \frac{1}{2}\sum_{i=1}^{n}(X_i - \mu)^T \Sigma^{-1} (X_i - \mu) \\
&= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\sum_{j=1}^{p}\log\sigma_{jj} - \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{p}\frac{(X_{ij} - \mu_j)^2}{\sigma_{jj}} \\
&= -\frac{np}{2}\log(2\pi) - \frac{n}{2}\sum_{j=1}^{p}\log\sigma_{jj} - \frac{1}{2}\sum_{j=1}^{p}\frac{1}{\sigma_{jj}}\sum_{i=1}^{n}(X_{ij} - \mu_j)^2.
\end{aligned}
\]
Suppose the estimate of $\mu_j$ is known to be $\bar{x}_j$. Then the unknown parameters are $\sigma_{jj}$, $j = 1, 2, \ldots, p$. The first derivative of the log-likelihood function with respect to $\sigma_{jj}$ is
\[
\frac{\partial \log L(\mu, \Sigma)}{\partial \sigma_{jj}} = -\frac{n}{2\sigma_{jj}} + \frac{1}{2\sigma_{jj}^2}\sum_{i=1}^{n}(X_{ij} - \bar{x}_j)^2 = -\frac{n}{2\sigma_{jj}} + \frac{(n-1)s_{jj}}{2\sigma_{jj}^2}.
\]
Setting this to zero and solving, we obtain $\hat{\sigma}_{jj} = (n-1)s_{jj}/n$, that is, $\hat{\Sigma} = \frac{n-1}{n} D$ where $D = \operatorname{diag}(s_{11}, \ldots, s_{pp})$.
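A small simulation confirms the relation $\hat{\sigma}_{jj} = (n-1)s_{jj}/n$ between the MLE and the usual divisor-$(n-1)$ sample variance; the data-generating values below are arbitrary choices of mine:

```python
import random
import statistics

random.seed(3)
n, p = 5000, 3
sd = [1.0, 2.0, 0.5]
# n observations of a p-variate normal with independent coordinates
data = [[random.gauss(0.0, sd[j]) for j in range(p)] for _ in range(n)]

for j in range(p):
    col = [row[j] for row in data]
    xbar = sum(col) / n
    s_jj = statistics.variance(col)                # divisor n - 1
    mle = sum((x - xbar) ** 2 for x in col) / n    # sigma-hat_jj from the derivation
    assert abs(mle - (n - 1) * s_jj / n) < 1e-6
    assert abs(mle - sd[j] ** 2) < 0.4             # loose consistency check
```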
12. Let X1,X2,X3,X4 be independent Np(µ,Σ) random vectors.
(i) Then the marginal distribution of $V_1 = (X_1 - X_2 + X_3 - X_4)/4$ is normal with mean $(\mu - \mu + \mu - \mu)/4 = 0$ and covariance matrix $\frac{1}{16}\left[1^2 + (-1)^2 + 1^2 + (-1)^2\right]\Sigma = \frac{1}{4}\Sigma$.
The marginal distribution of $V_2 = (X_1 + X_2 - X_3 - X_4)/4$ is normal with mean $(\mu + \mu - \mu - \mu)/4 = 0$ and covariance matrix $\frac{1}{16}\left[1^2 + 1^2 + (-1)^2 + (-1)^2\right]\Sigma = \frac{1}{4}\Sigma$.
(ii) Since $Cov(V_1, V_2) = \frac{1}{16}\left[(1)(1) + (-1)(1) + (1)(-1) + (-1)(-1)\right]\Sigma = 0$, the joint pdf of $V_1$ and $V_2$ is that of the normal distribution with mean vector and covariance matrix
\[
\begin{bmatrix} 0 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} \Sigma/4 & 0 \\ 0 & \Sigma/4 \end{bmatrix},
\]
respectively.
13. Let X be N3(µ,Σ) with µT = [−3, 1, 4] and
\[
\Sigma = \begin{bmatrix} 1 & -2 & 0 \\ -2 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}.
\]
Then $Cov(X_1, X_2) = -2$, so $X_1$ and $X_2$ are not independent.
We have $Cov(X_2, X_3) = 0$, so $X_2$ and $X_3$ are independent.
We have Cov((X1, X2), X3) = (0, 0), so (X1, X2) and X3 are independent.
We have $Cov((X_1 + X_2)/2, X_3) = (1/2) \times 0 + (1/2) \times 0 = 0$, so $(X_1 + X_2)/2$ and $X_3$ are independent.
14. Let X be N3(µ,Σ) with µT = [−3, 1, 4] and
\[
\Sigma = \begin{bmatrix} 1 & -2 & 0 \\ -2 & 5 & 0 \\ 0 & 0 & 2 \end{bmatrix}.
\]
The conditional distribution of $X_1$ given $(X_2, X_3) = (x_2, x_3)$ is normal with mean equal to
\[
\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}\begin{bmatrix} x_2 - \mu_2 \\ x_3 - \mu_3 \end{bmatrix} = -3 + [-2, 0]\begin{bmatrix} 1/5 & 0 \\ 0 & 1/2 \end{bmatrix}\begin{bmatrix} x_2 - 1 \\ x_3 - 4 \end{bmatrix} = -\frac{13}{5} - \frac{2x_2}{5}
\]
and variance equal to
\[
\Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21} = 1 - [-2, 0]\begin{bmatrix} 1/5 & 0 \\ 0 & 1/2 \end{bmatrix}\begin{bmatrix} -2 \\ 0 \end{bmatrix} = \frac{1}{5}.
\]
The conditional distribution of X2 given (X1, X3) = (x1, x3) is normal with mean equal to
\[
\mu_2 + [-2, 0]\begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}\begin{bmatrix} x_1 + 3 \\ x_3 - 4 \end{bmatrix} = 1 - 2(x_1 + 3) = -5 - 2x_1
\]
and variance equal to
\[
5 - [-2, 0]\begin{bmatrix} 1 & 0 \\ 0 & 1/2 \end{bmatrix}\begin{bmatrix} -2 \\ 0 \end{bmatrix} = 5 - 4 = 1,
\]
where the partition is now with respect to $X_2$ and $(X_1, X_3)$.
The conditional distribution of $X_3$ given $(X_1, X_2) = (x_1, x_2)$ is the same as that of $X_3$, normal with mean 4 and variance 2.
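The partitioned-covariance formula used above is easy to evaluate mechanically; the helper below (names are mine) reproduces the mean and variance for $X_1$ given $(X_2, X_3)$:

```python
mu = [-3.0, 1.0, 4.0]
S11 = 1.0
S12 = [-2.0, 0.0]
S22 = [[5.0, 0.0], [0.0, 2.0]]

# inverse of the diagonal 2x2 block
S22inv = [[1 / S22[0][0], 0.0], [0.0, 1 / S22[1][1]]]

def cond_mean(x2, x3):
    # mu1 + S12 S22^{-1} (x - mu_2:3)
    d = [x2 - mu[1], x3 - mu[2]]
    w = [sum(S12[k] * S22inv[k][j] for k in range(2)) for j in range(2)]
    return mu[0] + w[0] * d[0] + w[1] * d[1]

# S11 - S12 S22^{-1} S21
cond_var = S11 - sum(S12[j] * S22inv[j][j] * S12[j] for j in range(2))

assert abs(cond_var - 0.2) < 1e-12                        # 1/5
assert abs(cond_mean(2.0, 7.0) - (-13/5 - 2 * 2.0/5)) < 1e-12
```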
15. Let X1 be N(0, 1) and let
\[
X_2 = \begin{cases} -X_1 & \text{if } -1 \le X_1 \le 1, \\ X_1 & \text{otherwise.} \end{cases}
\]
Then we have the following.
(i) the distribution of X2 is
\[
\begin{aligned}
\Pr(X_2 \le x) &= \Pr(X_2 \le x \mid |X_1| \le 1)\Pr(|X_1| \le 1) + \Pr(X_2 \le x \mid |X_1| > 1)\Pr(|X_1| > 1) \\
&= \Pr(-X_1 \le x \mid |X_1| \le 1)\Pr(|X_1| \le 1) + \Pr(X_1 \le x \mid |X_1| > 1)\Pr(|X_1| > 1) \\
&= \Pr(X_1 \le x \mid |X_1| \le 1)\Pr(|X_1| \le 1) + \Pr(X_1 \le x \mid |X_1| > 1)\Pr(|X_1| > 1) \\
&= \Pr(X_1 \le x),
\end{aligned}
\]
the distribution function of the N(0, 1) distribution, where the second step uses the symmetry of the conditional distribution of $X_1$ given $|X_1| \le 1$.
(ii) the joint distribution of $X_1$ and $X_2$ is not bivariate normal. Indeed, $X_2 = -X_1$ whenever $|X_1| \le 1$, so $\Pr(X_1 + X_2 = 0) \ge \Pr(-1 \le X_1 \le 1) = \Phi(1) - \Phi(-1) > 0$, where $\Phi(\cdot)$ is the standard normal distribution function; hence $X_1 + X_2$ has an atom at zero and is not normally distributed, whereas joint normality of $(X_1, X_2)$ would make every linear combination normal.
16. If $X$ is distributed as $N_p(\mu, \Sigma)$ and $G$ is an orthogonal matrix, then $GX$ is distributed as $N_p(G\mu, G\Sigma G^T)$. If $\mu = 0$ then $G\mu = \mu = 0$. If $\Sigma = \sigma^2 I$ then $G\Sigma G^T = \sigma^2 G I G^T = \sigma^2 G G^T = \sigma^2 I = \Sigma$, using $GG^T = I$.
17. If $X$ is distributed as $N_p(\mu, \Sigma)$ and $a$ is any fixed vector with $a^T \Sigma a > 0$ then
\[
X - \mu \sim N_p(0, \Sigma)
\;\Rightarrow\; a^T(X - \mu) \sim N\left(0, a^T \Sigma a\right)
\;\Rightarrow\; \frac{a^T(X - \mu)}{\sqrt{a^T \Sigma a}} \sim N(0, 1),
\]
as required.
18. Let
\[
\Sigma = \begin{bmatrix} 1 & \rho & \rho^2 \\ \rho & 1 & 0 \\ \rho^2 & 0 & 1 \end{bmatrix}.
\]
The conditional distribution of $(X_1, X_2)$ given $X_3 = x_3$ has mean vector
\[
\begin{bmatrix} \mu_1 \\ \mu_2 \end{bmatrix} + \begin{bmatrix} \rho^2 \\ 0 \end{bmatrix}[1]^{-1}(x_3 - \mu_3) = \begin{bmatrix} \mu_1 + \rho^2(x_3 - \mu_3) \\ \mu_2 \end{bmatrix}
\]
and covariance matrix
\[
\begin{bmatrix} 1 & \rho \\ \rho & 1 \end{bmatrix} - \begin{bmatrix} \rho^2 \\ 0 \end{bmatrix}[1]^{-1}\begin{bmatrix} \rho^2 & 0 \end{bmatrix} = \begin{bmatrix} 1 - \rho^4 & \rho \\ \rho & 1 \end{bmatrix}.
\]
19. Suppose that $x \sim N_p(\mu, \Sigma)$ and $a$ is a fixed vector. Let $r_i$ be the correlation between $x_i$ and $a^T x$. Write $x_i = e_i^T x$, where $e_i$ is a vector of zeros except for a one at the $i$th position. Then $Cov(x_i, a^T x) = Cov(e_i^T x, a^T x) = e_i^T \Sigma a = (\Sigma a)_i$, $Var(x_i) = \sigma_{ii}$ and $Var(a^T x) = a^T \Sigma a$. So, $r = (cD)^{-1/2}\Sigma a$, where $c = a^T \Sigma a$ and $D = \operatorname{diag}(\Sigma)$. In particular, $r = \Sigma a$ whenever $cD = I$; for example, when $\Sigma = I/\sqrt{a^T a}$.
20. Suppose $x_1, x_2, x_3$ are iid $N_p(\mu, \Sigma)$ random vectors, and $y_1 = x_1 + x_2$, $y_2 = x_2 + x_3$ and $y_3 = x_1 + x_3$.
Let $I$ denote the $p \times p$ identity matrix and $0$ the $p \times p$ zero matrix. Then we can write
\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} = \begin{bmatrix} I & I & 0 \\ 0 & I & I \\ I & 0 & I \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}.
\]
Since $x_1, x_2, x_3$ are independent, the stacked vector $(x_1^T, x_2^T, x_3^T)^T$ has covariance matrix $\operatorname{diag}(\Sigma, \Sigma, \Sigma)$, so
\[
\begin{bmatrix} y_1 \\ y_2 \\ y_3 \end{bmatrix} \sim N_{3p}\left(\begin{bmatrix} 2\mu \\ 2\mu \\ 2\mu \end{bmatrix}, \begin{bmatrix} I & I & 0 \\ 0 & I & I \\ I & 0 & I \end{bmatrix} \begin{bmatrix} \Sigma & 0 & 0 \\ 0 & \Sigma & 0 \\ 0 & 0 & \Sigma \end{bmatrix} \begin{bmatrix} I & 0 & I \\ I & I & 0 \\ 0 & I & I \end{bmatrix}\right)
\equiv N_{3p}\left(\begin{bmatrix} 2\mu \\ 2\mu \\ 2\mu \end{bmatrix}, \begin{bmatrix} 2\Sigma & \Sigma & \Sigma \\ \Sigma & 2\Sigma & \Sigma \\ \Sigma & \Sigma & 2\Sigma \end{bmatrix}\right)
\]
and
\[
\begin{bmatrix} y_1 \\ y_2 \end{bmatrix} \sim N_{2p}\left(\begin{bmatrix} 2\mu \\ 2\mu \end{bmatrix}, \begin{bmatrix} 2\Sigma & \Sigma \\ \Sigma & 2\Sigma \end{bmatrix}\right).
\]
So, the conditional distribution of y1 given y2 is
\[
y_1 \mid y_2 \sim N_p\left(2\mu + \Sigma(2\Sigma)^{-1}(y_2 - 2\mu),\; 2\Sigma - \Sigma(2\Sigma)^{-1}\Sigma\right) \equiv N_p\left(\mu + \tfrac{1}{2} y_2,\; \tfrac{3}{2}\Sigma\right).
\]
The conditional distribution of y1 given y2 and y3 is
\[
\begin{aligned}
y_1 \mid y_2, y_3 &\sim N_p\left(2\mu + [\Sigma, \Sigma]\begin{bmatrix} 2\Sigma & \Sigma \\ \Sigma & 2\Sigma \end{bmatrix}^{-1}\begin{bmatrix} y_2 - 2\mu \\ y_3 - 2\mu \end{bmatrix},\; 2\Sigma - [\Sigma, \Sigma]\begin{bmatrix} 2\Sigma & \Sigma \\ \Sigma & 2\Sigma \end{bmatrix}^{-1}\begin{bmatrix} \Sigma \\ \Sigma \end{bmatrix}\right) \\
&\equiv N_p\left(2\mu + [\Sigma, \Sigma]\begin{bmatrix} \frac{2}{3}\Sigma^{-1} & -\frac{1}{3}\Sigma^{-1} \\ -\frac{1}{3}\Sigma^{-1} & \frac{2}{3}\Sigma^{-1} \end{bmatrix}\begin{bmatrix} y_2 - 2\mu \\ y_3 - 2\mu \end{bmatrix},\; 2\Sigma - [\Sigma, \Sigma]\begin{bmatrix} \frac{2}{3}\Sigma^{-1} & -\frac{1}{3}\Sigma^{-1} \\ -\frac{1}{3}\Sigma^{-1} & \frac{2}{3}\Sigma^{-1} \end{bmatrix}\begin{bmatrix} \Sigma \\ \Sigma \end{bmatrix}\right) \\
&\equiv N_p\left(2\mu + \left[\tfrac{1}{3}I, \tfrac{1}{3}I\right]\begin{bmatrix} y_2 - 2\mu \\ y_3 - 2\mu \end{bmatrix},\; 2\Sigma - \left[\tfrac{1}{3}I, \tfrac{1}{3}I\right]\begin{bmatrix} \Sigma \\ \Sigma \end{bmatrix}\right) \\
&\equiv N_p\left(\tfrac{2}{3}\mu + \tfrac{1}{3}(y_2 + y_3),\; \tfrac{4}{3}\Sigma\right).
\end{aligned}
\]
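As a sanity check on the block-matrix algebra, take the scalar case $p = 1$, $\Sigma = 1$ (exact rational arithmetic): the covariance of $(y_1, y_2, y_3)$ is $[[2,1,1],[1,2,1],[1,1,2]]$, the regression coefficients of $y_1$ on $(y_2, y_3)$ come out to $1/3$ each, and the conditional variance to $4/3$:

```python
from fractions import Fraction as F

# blocks of the covariance of ((y2, y3)) and its cross-covariance with y1
S12 = [F(1), F(1)]
S22 = [[F(2), F(1)], [F(1), F(2)]]
det = S22[0][0] * S22[1][1] - S22[0][1] * S22[1][0]
S22inv = [[ S22[1][1] / det, -S22[0][1] / det],
          [-S22[1][0] / det,  S22[0][0] / det]]

# regression coefficients Sigma12 Sigma22^{-1}
w = [sum(S12[k] * S22inv[k][j] for k in range(2)) for j in range(2)]
assert w == [F(1, 3), F(1, 3)]

# conditional variance 2 - Sigma12 Sigma22^{-1} Sigma21
cvar = F(2) - sum(w[j] * S12[j] for j in range(2))
assert cvar == F(4, 3)
```

The conditional mean is then $2\mu + \tfrac{1}{3}(y_2 - 2\mu) + \tfrac{1}{3}(y_3 - 2\mu) = \tfrac{2}{3}\mu + \tfrac{1}{3}(y_2 + y_3)$.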