
The Burmeister-McElroy Sliced Normal Theorem

Conditional forecasting with probabilistic scenarios

by

Edwin Burmeister and

Marjorie B. McElroy

Department of Economics
Box 90097
Duke University
Durham, NC 27708-0097

September 24, 1997

Revised April 19, 1999
Revised October 3, 2002
Revised March 18, 2012
Revised July 11, 2017

Copyright 1997-2017 by Edwin Burmeister and Marjorie B. McElroy. All rights reserved. Do not quote without permission of the authors.

1. The problem

A conditional forecast produces the mean and variance of return, conditional on the forecasted variables that impact return via a linear factor model. These conditional forecasts are easy to compute because the historical factor realizations are approximately multivariate normal. However, this is not the way in which many people think about forecasting. Rather, they want to say, “I think there is a 50% chance bond yields will rise by 50 b.p., a 30% chance they will stay the same, and a 20% chance they will fall by 10 b.p.” We will call such forecasts probabilistic scenarios to distinguish them from a pure conditional forecast. We need to compute the joint multivariate distribution for all the factors impacting return when the user forecasts some of them with a probabilistic scenario. To do so we need a new result, the Sliced Normal Theorem proved below.


2. Technical background information and notation

In this section we state well-known results and establish notation. One reference for these results is Introduction to the Theory of Statistics by Alexander M. Mood and Franklin A. Graybill (New York: McGraw-Hill Book Company, Inc., Second Edition, 1963), Chapter 9, pages 198-219.

The random vector $Y$ is distributed as the p-variate normal if the joint density of $y_1, y_2, \ldots, y_p$ is

$$h(Y) = h(y_1, y_2, \ldots, y_p) = \frac{|R|^{1/2}}{(2\pi)^{p/2}} \, e^{-\frac{1}{2}(Y-\mu)' R (Y-\mu)} \quad \text{for } -\infty < y_i < +\infty,\ i = 1, 2, \ldots, p,$$

and where (a) $R$ is a positive definite matrix whose elements $r_{ij}$ are constants, and (b) $\mu$ is a $p \times 1$ vector whose elements $\mu_i$ are constants.

The univariate case with $p = 1$ is obtained by setting $r_{11} = 1/\sigma^2$. The quantity $Q = (Y - \mu)' R (Y - \mu)$ is called the quadratic form of the p-variate normal. It is a theorem that

$$\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} e^{-\frac{1}{2}(Y-\mu)' R (Y-\mu)} \, dy_1 \cdots dy_p = (2\pi)^{p/2} |R|^{-1/2},$$

and the value of this integral does not depend on the vector $\mu$.

The $p \times p$ covariance matrix of the $y$'s is

$$V = \begin{bmatrix} \sigma_{11} & \cdots & \sigma_{1p} \\ \vdots & \ddots & \vdots \\ \sigma_{p1} & \cdots & \sigma_{pp} \end{bmatrix}.$$

It is also a theorem (Theorem 9.9, page 211) that $V = R^{-1}$.
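As a quick numerical illustration of these background facts, a minimal sketch (Python with NumPy; the numbers are hypothetical, not from the paper) checks that the density can be written equivalently with the precision matrix $R$ or with the covariance matrix $V = R^{-1}$:

```python
import numpy as np

# Hypothetical 2-variate example: precision matrix R and mean vector mu.
R = np.array([[2.0, 0.5],
              [0.5, 1.0]])
mu = np.array([1.0, -1.0])
V = np.linalg.inv(R)  # Theorem 9.9: V = R^{-1}

def h(y):
    """p-variate normal density written with the precision matrix R."""
    q = (y - mu) @ R @ (y - mu)  # the quadratic form Q
    p = len(mu)
    return np.sqrt(np.linalg.det(R)) / (2 * np.pi) ** (p / 2) * np.exp(-q / 2)

# The same density via V, using |V|^{-1/2} = |R|^{1/2} (here p = 2).
y = np.array([0.3, 0.7])
alt = np.exp(-0.5 * (y - mu) @ np.linalg.inv(V) @ (y - mu)) / \
      np.sqrt((2 * np.pi) ** 2 * np.linalg.det(V))
assert np.isclose(h(y), alt)
```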


We now define the following partitions:

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix}, \quad \mu = \begin{pmatrix} U_1 \\ U_2 \end{pmatrix}, \quad R = \begin{pmatrix} R_{11} & R_{12} \\ R_{21} & R_{22} \end{pmatrix}, \quad V = \begin{pmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{pmatrix}$$

where

$$Y_1 = \begin{pmatrix} y_1 \\ \vdots \\ y_k \end{pmatrix}, \quad U_1 = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_k \end{pmatrix}, \quad Y_2 = \begin{pmatrix} y_{k+1} \\ \vdots \\ y_p \end{pmatrix}, \quad U_2 = \begin{pmatrix} \mu_{k+1} \\ \vdots \\ \mu_p \end{pmatrix}.$$

Note that $R_{11}$ and $V_{11}$ are $k \times k$.

The following theorem (Theorem 9.11 on page 213) is critical for what follows:

The conditional distribution of $Y_1$ given $Y_2$ is the k-variate normal with mean

$$U_1 + V_{12}V_{22}^{-1}(Y_2 - U_2)$$

and covariance matrix

$$R_{11}^{-1} = V_{11} - V_{12}V_{22}^{-1}V_{21}.$$
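As an illustration of Theorem 9.11, the conditional mean and covariance can be computed directly from a partitioned covariance matrix. The sketch below (Python with NumPy; the numbers are hypothetical, not taken from the paper) does this for a three-variable example with $k = 1$:

```python
import numpy as np

# Hypothetical 3-variate normal, partitioned with k = 1.
mu = np.array([0.5, 1.0, -0.2])         # (U1, U2)
V = np.array([[4.0, 1.2, 0.8],
              [1.2, 2.0, 0.5],
              [0.8, 0.5, 1.5]])
k = 1
V11, V12 = V[:k, :k], V[:k, k:]
V21, V22 = V[k:, :k], V[k:, k:]
U1, U2 = mu[:k], mu[k:]

y2 = np.array([1.5, 0.0])               # a realization of Y2

# Theorem 9.11: conditional moments of Y1 given Y2 = y2.
cond_mean = U1 + V12 @ np.linalg.solve(V22, y2 - U2)
cond_cov = V11 - V12 @ np.linalg.solve(V22, V21)

print(cond_mean)   # shifts with the value of y2
print(cond_cov)    # the same for every value of y2
```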

Note that the covariance matrix of $Y_1$ given $Y_2$ does not depend on the value of $Y_2$. This fact will be very important. A probabilistic scenario arises when, instead of forecasting a single realization of the vector $Y_2$, the user forecasts a probability distribution for the vector $Y_2$. There is, however, a consistency issue because the true marginal distribution of $Y_2$ is given by


$$g(Y_2) = \frac{|V_{22}|^{-1/2}}{(2\pi)^{(p-k)/2}} \, e^{-\frac{1}{2}(Y_2 - U_2)' V_{22}^{-1} (Y_2 - U_2)}.$$

Also, by definition, the conditional distribution of $y_1, y_2, \ldots, y_k$ given $y_{k+1}, y_{k+2}, \ldots, y_p$ is

$$f(Y_1 \,|\, Y_2) = \frac{h(Y)}{g(Y_2)}.$$

Therefore, any forecast by the user other than the probability distribution $g(Y_2)$ is inconsistent with the underlying joint probability distribution $h(Y)$ stated above. Nevertheless, an economic forecast often entails the belief, perhaps mistaken, that special information can be used to infer that the future will be different from the past.

3. The user-supplied distribution for a probabilistic scenario

Some additional notation is required. As above,

$$Y = \begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} \quad \text{where} \quad Y_1 = \begin{pmatrix} y_1 \\ \vdots \\ y_k \end{pmatrix} \quad \text{and} \quad Y_2 = \begin{pmatrix} y_{k+1} \\ \vdots \\ y_p \end{pmatrix}.$$

A particular realization of the vector $Y_2$, the $i$th, will be denoted by

$$y_2^i \equiv \begin{pmatrix} y_{k+1}(i) \\ \vdots \\ y_p(i) \end{pmatrix}.$$

We take as given a user forecast for the vectors $y_2^i$ and their associated probabilities $p_i$. However this user forecast is generated, we take as given the following (marginal) distribution of $Y_2$:


$$\Pr\left[Y_2 = y_2^i\right] \equiv \Pr\left[Y_2 = \begin{pmatrix} y_{k+1}(i) \\ \vdots \\ y_p(i) \end{pmatrix}\right] = p_i, \quad p_i \ge 0, \quad \sum_i p_i = 1. \tag{1}$$

This discrete distribution will be denoted by $\xi(Y_2)$.

4. The probabilistic scenario expected value and variance

Given the user forecast, the expected value of $Y_2$ is

$$E(Y_2; \xi) = \sum_i p_i \, y_2^i \equiv U_2^C \tag{2}$$

where the notation $E(Y_2; \xi)$ denotes that the expectation is taken with respect to the user-provided distribution $\xi(Y_2)$ and where the superscript $C$ denotes "conditional on the user forecast." Similarly, the $(p-k) \times (p-k)$ covariance matrix of $Y_2$ is given by

$$\operatorname{cov}(Y_2; \xi) = E\left\{ \left[Y_2 - E(Y_2)\right]\left[Y_2 - E(Y_2)\right]'; \xi \right\} = \sum_i p_i \left[y_2^i - \sum_j p_j \, y_2^j\right]\left[y_2^i - \sum_j p_j \, y_2^j\right]' \equiv V_{22}^C. \tag{3}$$

Equations (2) and (3) follow directly from the definitions of expected value and covariance. We require the following Sliced Normal Theorem to infer anything about the distribution of $Y_1$ given the user forecast $\xi(Y_2)$. Note also that the user forecast alone tells us nothing about the covariance of $Y_1$ and $Y_2$.
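A minimal sketch of (2) and (3), assuming a hypothetical scenario with three events for a two-dimensional $Y_2$ (the first row echoes the bond-yield example of Section 1; all numbers are illustrative):

```python
import numpy as np

# Hypothetical scenario: columns are the events y2^i, p[i] their probabilities.
y2 = np.array([[0.50, 0.00, -0.10],   # e.g., change in bond yields
               [0.20, 0.10,  0.05]])  # e.g., a second forecasted factor
p = np.array([0.5, 0.3, 0.2])
assert np.isclose(p.sum(), 1.0) and (p >= 0).all()

# Equation (2): U2^C = sum_i p_i y2^i.
U2C = y2 @ p

# Equation (3): V22^C = sum_i p_i (y2^i - U2^C)(y2^i - U2^C)'.
dev = y2 - U2C[:, None]
V22C = (dev * p) @ dev.T

print(U2C)
print(V22C)
```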


5. The Sliced Normal Theorem

From the well-known results stated above in Section 2, for each realization of the random vector $Y_2 = y_2^i$, the conditional distribution $f(Y_1 \,|\, Y_2 = y_2^i)$ is multivariate normal and has a known mean and covariance matrix:

$$Y_1 \,|\, y_2^i \sim N\!\left[\, U_1 + V_{12}V_{22}^{-1}\left(y_2^i - U_2\right),\; V_{11} - V_{12}V_{22}^{-1}V_{21} \,\right]. \tag{4}$$

For each $i$ these conditional distributions are slices of the multivariate normal distribution, scaled so that the volume under the density function is one. That is, using the results from Section 2,

$$f(Y_1 \,|\, y_2^i) = \frac{h(Y_1, y_2^i)}{g(y_2^i)}$$

where $1/g(y_2^i)$ is the scale factor that makes the volume one. Here $g(Y_2)$ is the true marginal distribution of $Y_2$, not the user forecast $\xi(Y_2)$.

The sliced normal distribution for the $p \times 1$ vector $Y$ is defined by these slices together with the user-supplied distribution $\xi(Y_2)$, which has mean $U_2^C \equiv \sum_i p_i \, y_2^i$ and covariance matrix $V_{22}^C$. Note that $Y$ is not multivariate normal. By Theorem 9.11 of Mood and Graybill stated above, each slice is multivariate normal, so the marginal distribution of $Y_1$ is a mixture of multivariate normals, while the discrete distribution for $Y_2$ is $\xi(Y_2)$ defined above. Therefore the joint distribution of $Y = (Y_1, Y_2)$ is complicated.

More formally, from (4) we know that the conditional density of the $k \times 1$ vector $Y_1$ for each given $(p-k) \times 1$ vector $Y_2 = y_2^i$ is

$$f(Y_1 \,|\, Y_2 = y_2^i) = \frac{1}{(2\pi)^{k/2}\left|V_{11} - V_{12}V_{22}^{-1}V_{21}\right|^{1/2}} \, e^{-\frac{1}{2}\left[Y_1 - \left(U_1 + V_{12}V_{22}^{-1}(y_2^i - U_2)\right)\right]'\left[V_{11} - V_{12}V_{22}^{-1}V_{21}\right]^{-1}\left[Y_1 - \left(U_1 + V_{12}V_{22}^{-1}(y_2^i - U_2)\right)\right]}. \tag{5}$$

Multiplying (5) by the probability $p_i$ gives the equation for the $i$th slice of the sliced joint normal:


$$\varphi(Y_1, y_2^i) \equiv p_i \, f(Y_1 \,|\, y_2^i) \quad \text{for } -\infty < y_j < \infty,\ j = 1, \ldots, k;\ Y_2 = y_2^i.$$

Hence, given the user's forecast for the $n$ realizations $y_2^i$, $i = 1, 2, \ldots, n$, the joint probability density function for the p-variate sliced normal is

$$\varphi(Y_1, Y_2) \equiv \begin{cases} p_1 \, f(Y_1 \,|\, y_2^1) & \text{for } -\infty < y_j < \infty,\ j = 1, \ldots, k;\ Y_2 = y_2^1 \\ \quad\vdots & \\ p_i \, f(Y_1 \,|\, y_2^i) & \text{for } -\infty < y_j < \infty,\ j = 1, \ldots, k;\ Y_2 = y_2^i \\ \quad\vdots & \\ p_n \, f(Y_1 \,|\, y_2^n) & \text{for } -\infty < y_j < \infty,\ j = 1, \ldots, k;\ Y_2 = y_2^n \end{cases} \tag{6}$$

Of course, the conditional distribution of $Y_1$ given $Y_2 = y_2^i$ for the sliced normal distribution is $f(Y_1 \,|\, Y_2 = y_2^i)$, while the marginal distribution of $Y_2$ is $\Pr[Y_2 = y_2^i] = p_i$. The marginal distribution of $Y_1$ is the mixture

$$\sum_{i=1}^{n} p_i \, f(Y_1 \,|\, y_2^i) \quad \text{for } -\infty < y_j < \infty,\ j = 1, \ldots, k.$$
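One way to make the structure of (6) concrete is to sample from it: first draw an event index $i$ with probability $p_i$, then draw $Y_1$ from the conditional normal (4). A minimal sketch (the function and its inputs are hypothetical, following the paper's partition):

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_sliced_normal(U1, U2, V11, V12, V22, y2_events, p, size):
    """Draw (Y1, Y2) from the sliced normal: Y2 ~ xi, Y1 ~ f(Y1 | Y2 = y2^i)."""
    V21 = V12.T
    cond_cov = V11 - V12 @ np.linalg.solve(V22, V21)  # same for every slice
    idx = rng.choice(len(p), size=size, p=p)          # event i with prob p_i
    draws = []
    for i in idx:
        cond_mean = U1 + V12 @ np.linalg.solve(V22, y2_events[i] - U2)
        y1 = rng.multivariate_normal(cond_mean, cond_cov)
        draws.append(np.concatenate([y1, y2_events[i]]))
    return np.array(draws)
```

The sample mean and covariance of many such draws converge to the moments derived next.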

The expected value of the vector $Y_1$ for the sliced normal distribution is

$$\begin{aligned}
U_1^C \equiv E(Y_1) &= \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} Y_1 \, f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \, E\!\left(Y_1 \,|\, y_2^i\right) \\
&= \sum_i p_i \left[U_1 + V_{12}V_{22}^{-1}\left(y_2^i - U_2\right)\right] = U_1 + V_{12}V_{22}^{-1} \sum_i p_i \left(y_2^i - U_2\right) \\
&= U_1 + V_{12}V_{22}^{-1}\left(U_2^C - U_2\right).
\end{aligned} \tag{7}$$

Similarly, the expected value of the vector $Y_2$ for the sliced normal distribution is


$$\begin{aligned}
U_2^C \equiv E(Y_2) &= \sum_i p_i \, y_2^i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \, y_2^i \cdot (1) = \sum_i p_i \, y_2^i.
\end{aligned} \tag{8}$$

We now state the formal result:

Sliced Normal Theorem. The p-variate sliced normal distribution defined by (6) has mean

$$\begin{bmatrix} U_1^C \\ U_2^C \end{bmatrix} = \begin{bmatrix} U_1 + V_{12}V_{22}^{-1}\left(U_2^C - U_2\right) \\ \sum_i p_i \, y_2^i \end{bmatrix} \tag{9}$$

and covariance matrix

$$V^* = \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}\left(V_{22} - V_{22}^C\right)V_{22}^{-1}V_{21} & V_{12}V_{22}^{-1}V_{22}^C \\ V_{22}^C V_{22}^{-1}V_{21} & V_{22}^C \end{bmatrix}. \tag{10}$$

Equations (7) and (8) establish (9). We now must prove (10). Note that $V_{22}^* = V_{22}^C$ by definition. The difficulty involves computing the covariance matrices $V_{12}^* = \left(V_{21}^*\right)'$ and $V_{11}^*$.
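The theorem is straightforward to verify numerically. The sketch below (hypothetical numbers with $k = 2$, two forecasted variables, and $n = 3$ events) computes the moments (9) and (10) and cross-checks the $V_{11}^*$ block against a direct mixture computation:

```python
import numpy as np

# Hypothetical historical moments, partitioned with k = 2.
U1 = np.array([0.0, 0.0])
U2 = np.array([0.0, 0.0])
V = np.array([[2.0, 0.3, 0.5, 0.2],
              [0.3, 1.5, 0.1, 0.4],
              [0.5, 0.1, 1.0, 0.3],
              [0.2, 0.4, 0.3, 2.0]])
V11, V12 = V[:2, :2], V[:2, 2:]
V21, V22 = V[2:, :2], V[2:, 2:]

# Hypothetical user scenario: n = 3 events (rows) for the 2 forecasted variables.
y2 = np.array([[0.5, 0.2], [0.0, 0.1], [-0.1, 0.05]])
p = np.array([0.5, 0.3, 0.2])

U2C = p @ y2                                         # (8)
dev = y2 - U2C
V22C = (dev.T * p) @ dev                             # (3)

iV22 = np.linalg.inv(V22)
U1C = U1 + V12 @ iV22 @ (U2C - U2)                   # (9), top block
V11s = V11 - V12 @ iV22 @ (V22 - V22C) @ iV22 @ V21  # (10), V11* block
V21s = V22C @ iV22 @ V21                             # (10), V21* block

# Direct mixture check of V11*: average conditional covariance plus the
# covariance of the conditional means across events.
cond_cov = V11 - V12 @ iV22 @ V21
cond_means = U1 + (V12 @ iV22 @ (y2 - U2).T).T       # E(Y1 | y2^i), one row per i
md = cond_means - U1C
assert np.allclose(V11s, cond_cov + (md.T * p) @ md)
```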


6. Computation of $V_{11}^*$

By definition

$$V_{11}^* = \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left(Y_1 - U_1^C\right)\left(Y_1 - U_1^C\right)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k. \tag{11}$$

To ease notation we define

$$b \equiv E(Y_1 \,|\, y_2^i) = U_1 + V_{12}V_{22}^{-1}\left(y_2^i - U_2\right); \tag{12}$$

see equation (7). Then (11) may be written as

$$\begin{aligned}
V_{11}^* &= \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left\{ \left[Y_1 - E(Y_1 \,|\, y_2^i)\right] + \left[E(Y_1 \,|\, y_2^i) - U_1^C\right] \right\} \left\{ \left[Y_1 - E(Y_1 \,|\, y_2^i)\right] + \left[E(Y_1 \,|\, y_2^i) - U_1^C\right] \right\}' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left[(Y_1 - b) + (b - U_1^C)\right] \left[(Y_1 - b) + (b - U_1^C)\right]' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k.
\end{aligned}$$

Performing the multiplication in the integrand gives

$$\sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left[ (Y_1 - b)(Y_1 - b)' + (b - U_1^C)(Y_1 - b)' + (Y_1 - b)(b - U_1^C)' + (b - U_1^C)(b - U_1^C)' \right] f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k. \tag{13}$$

We now proceed to evaluate the four terms in (13).

$$\sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} (Y_1 - b)(Y_1 - b)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k = \sum_i p_i \operatorname{var}\!\left(Y_1 \,|\, y_2^i\right) = \sum_i p_i \left(V_{11} - V_{12}V_{22}^{-1}V_{21}\right) = V_{11} - V_{12}V_{22}^{-1}V_{21}. \tag{14}$$


Because $b$ is constant with respect to the integration over $Y_1$ and $\int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left[Y_1 - E(Y_1 \,|\, y_2^i)\right] f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k = 0$, both cross terms vanish:

$$\sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} (b - U_1^C)(Y_1 - b)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k = \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} (Y_1 - b)(b - U_1^C)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k = 0. \tag{15}$$

We will use the following result:

$$b - U_1^C = \left[U_1 + V_{12}V_{22}^{-1}\left(y_2^i - U_2\right)\right] - \left[U_1 + V_{12}V_{22}^{-1}\left(U_2^C - U_2\right)\right] = V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right). \tag{16}$$

Then, using (16), the fourth term in (13) is

$$\begin{aligned}
\sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} (b - U_1^C)(b - U_1^C)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k
&= \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left[V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right)\right]\left[V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right)\right]' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \left[V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right)\left(y_2^i - U_2^C\right)' V_{22}^{-1}V_{21}\right] \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \left[V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right)\left(y_2^i - U_2^C\right)' V_{22}^{-1}V_{21}\right] \cdot 1 \\
&= V_{12}V_{22}^{-1}V_{22}^C V_{22}^{-1}V_{21}.
\end{aligned} \tag{17}$$

Substituting (14), (15), and (17) into (13) gives

$$\begin{aligned}
V_{11}^* &= V_{11} - V_{12}V_{22}^{-1}V_{21} + V_{12}V_{22}^{-1}V_{22}^C V_{22}^{-1}V_{21} \\
&= V_{11} - V_{12}V_{22}^{-1}\left(V_{22} - V_{22}^C\right)V_{22}^{-1}V_{21}.
\end{aligned} \tag{18}$$


7. Computation of $V_{21}^*$

By definition

$$\begin{aligned}
V_{21}^* &= \sum_i p_i \int_{-\infty}^{+\infty} \cdots \int_{-\infty}^{+\infty} \left(y_2^i - U_2^C\right)\left(Y_1 - U_1^C\right)' f(Y_1 \,|\, y_2^i) \, dy_1 \cdots dy_k \\
&= \sum_i p_i \left(y_2^i - U_2^C\right)\left[E(Y_1 \,|\, y_2^i) - U_1^C\right]' \\
&= \sum_i p_i \left(y_2^i - U_2^C\right)\left[V_{12}V_{22}^{-1}\left(y_2^i - U_2^C\right)\right]' \quad \text{using (12) and (16)} \\
&= \left[\sum_i p_i \left(y_2^i - U_2^C\right)\left(y_2^i - U_2^C\right)'\right] V_{22}^{-1}V_{21} \\
&= V_{22}^C V_{22}^{-1}V_{21}.
\end{aligned} \tag{19}$$

Then

$$V_{12}^* = \left(V_{21}^*\right)' = V_{12}V_{22}^{-1}V_{22}^C. \tag{20}$$

Comparison of (18), (19), and (20) with (10) completes the proof of the Sliced Normal Theorem.

8. Relationship to ordinary least squares

As is well known, much of the above is closely related to ordinary least squares regression. To illustrate this fact, consider the multiple regression equation

$$Y_1 = U_1 + \beta\left(Y_2 - U_2\right) + \varepsilon \tag{21}$$

where $\beta \equiv V_{12}V_{22}^{-1}$ is a $k \times (p-k)$ matrix. Taking conditional expectations of (21) gives

$$E(Y_1 \,|\, Y_2) = U_1 + \beta\left(Y_2 - U_2\right). \tag{22}$$


Moreover, the covariance matrix of the error term is

$$\begin{aligned}
E(\varepsilon\varepsilon') &= E\left\{ \left[Y_1 - \left(U_1 + \beta(Y_2 - U_2)\right)\right]\left[Y_1 - \left(U_1 + \beta(Y_2 - U_2)\right)\right]' \right\} \\
&= E\left[(Y_1 - U_1)(Y_1 - U_1)'\right] + E\left[\beta(Y_2 - U_2)(Y_2 - U_2)'\beta'\right] - E\left[(Y_1 - U_1)(Y_2 - U_2)'\beta'\right] - E\left[\beta(Y_2 - U_2)(Y_1 - U_1)'\right] \\
&= V_{11} + \beta V_{22}\beta' - V_{12}\beta' - \beta V_{21} \\
&= V_{11} + V_{12}V_{22}^{-1}V_{21} - V_{12}V_{22}^{-1}V_{21} - V_{12}V_{22}^{-1}V_{21} \\
&= V_{11} - V_{12}V_{22}^{-1}V_{21},
\end{aligned} \tag{23}$$

which is $\operatorname{cov}(Y_1 \,|\, Y_2)$. Also,

$$\begin{aligned}
E\left[(Y_2 - U_2)\varepsilon'\right] &= E\left\{ (Y_2 - U_2)\left[Y_1 - \left(U_1 + \beta(Y_2 - U_2)\right)\right]' \right\} \\
&= E\left[(Y_2 - U_2)(Y_1 - U_1)'\right] - E\left[(Y_2 - U_2)(Y_2 - U_2)'\beta'\right] \\
&= V_{21} - V_{22}V_{22}^{-1}V_{21} \\
&= 0,
\end{aligned} \tag{24}$$

so the error term is orthogonal to the right-hand-side variables.
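The population regression coefficient $\beta = V_{12}V_{22}^{-1}$ can be recovered, approximately, by running OLS on draws from the multivariate normal. A minimal simulation sketch with hypothetical numbers:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 3-variate normal with k = 1: regress y1 on (y2, y3).
mu = np.array([1.0, 0.0, 2.0])
V = np.array([[2.0, 0.6, 0.4],
              [0.6, 1.0, 0.2],
              [0.4, 0.2, 1.5]])
beta = V[:1, 1:] @ np.linalg.inv(V[1:, 1:])       # beta = V12 V22^{-1}, as in (21)

Y = rng.multivariate_normal(mu, V, size=200_000)
X = np.column_stack([np.ones(len(Y)), Y[:, 1:]])  # intercept plus Y2
coef, *_ = np.linalg.lstsq(X, Y[:, 0], rcond=None)

print(beta.ravel())   # population slopes
print(coef[1:])       # OLS slopes: close to beta in a large sample
```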

9. Positive definiteness of the covariance matrix $V^*$

Of course, the covariance matrix $V^*$ for the sliced normal distribution given by (10) must be positive definite by definition. Nevertheless, it is an instructive check to verify this fact directly.

We will use the following results:


Let the $p \times p$ matrix $A$ be partitioned into

$$A = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix}$$

where $A_{11}$ is $k \times k$, $A_{12}$ is $k \times (p-k)$, $A_{21}$ is $(p-k) \times k$, and $A_{22}$ is $(p-k) \times (p-k)$. Then

$$\begin{vmatrix} A_{11} & A_{12} \\ 0 & A_{22} \end{vmatrix} = \begin{vmatrix} I & 0 \\ 0 & A_{22} \end{vmatrix} \cdot \begin{vmatrix} A_{11} & A_{12} \\ 0 & I \end{vmatrix} = \left|A_{11}\right| \cdot \left|A_{22}\right|. \tag{25}$$

A reference for (25) is T. W. Anderson, An Introduction to Multivariate Statistical Analysis, 2nd Edition, New York: John Wiley & Sons, 1984, pages 592-593.

If $P$ is a nonsingular matrix and if $A$ is positive definite (semidefinite), then $P'AP$ is positive definite (semidefinite). (26)

A reference for (26) is Henri Theil, Principles of Econometrics, New York: John Wiley & Sons, 1971, Proposition F3, page 22.

We now prove:

The $k \times k$ matrix $\left[V_{11} - V_{12}V_{22}^{-1}V_{21}\right]$ is positive definite. (27)

Proof:

Let $x$ be any $k \times 1$ vector, $x \neq 0$. Then

$$\begin{aligned}
\begin{bmatrix} x' & -x'V_{12}V_{22}^{-1} \end{bmatrix}
\begin{bmatrix} V_{11} & V_{12} \\ V_{21} & V_{22} \end{bmatrix}
\begin{bmatrix} x \\ -V_{22}^{-1}V_{21}x \end{bmatrix}
&= \begin{bmatrix} x'V_{11} - x'V_{12}V_{22}^{-1}V_{21} & x'V_{12} - x'V_{12}V_{22}^{-1}V_{22} \end{bmatrix}
\begin{bmatrix} x \\ -V_{22}^{-1}V_{21}x \end{bmatrix} \\
&= \begin{bmatrix} x'\left(V_{11} - V_{12}V_{22}^{-1}V_{21}\right) & x' \cdot (0) \end{bmatrix}
\begin{bmatrix} x \\ -V_{22}^{-1}V_{21}x \end{bmatrix} \\
&= x'\left(V_{11} - V_{12}V_{22}^{-1}V_{21}\right)x > 0
\end{aligned}$$

because $V$ is positive definite and the vector $\begin{pmatrix} x \\ -V_{22}^{-1}V_{21}x \end{pmatrix}$ is nonzero whenever $x \neq 0$.


Proposition. $V^*$ is positive definite.

Proof:

Let

$$P' = \begin{bmatrix} I & -V_{12}V_{22}^{-1} \\ 0 & I \end{bmatrix}, \qquad P = \begin{bmatrix} I & 0 \\ -V_{22}^{-1}V_{21} & I \end{bmatrix}.$$

Then

$$\begin{aligned}
P'V^*P &= \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}\left(V_{22} - V_{22}^C\right)V_{22}^{-1}V_{21} - V_{12}V_{22}^{-1}V_{22}^C V_{22}^{-1}V_{21} & V_{12}V_{22}^{-1}V_{22}^C - V_{12}V_{22}^{-1}V_{22}^C \\ V_{22}^C V_{22}^{-1}V_{21} & V_{22}^C \end{bmatrix} P \\
&= \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}V_{21} & 0 \\ V_{22}^C V_{22}^{-1}V_{21} & V_{22}^C \end{bmatrix} P \\
&= \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}V_{21} & 0 \\ V_{22}^C V_{22}^{-1}V_{21} - V_{22}^C V_{22}^{-1}V_{21} & V_{22}^C \end{bmatrix} \\
&= \begin{bmatrix} V_{11} - V_{12}V_{22}^{-1}V_{21} & 0 \\ 0 & V_{22}^C \end{bmatrix} \equiv V^{**}.
\end{aligned}$$

Moreover, by property (25) of determinants stated above, $|P| = |P'| = 1$. Hence $P^{-1}$ and $(P')^{-1}$ both exist, and we may write $V^* = (P')^{-1}V^{**}P^{-1}$. Therefore, using (25), (26), and (27) above and the fact that $V_{22}^C$ is positive definite by construction, we conclude that $V^*$ is positive definite.


Alternative proof:

For any $p \times 1$ vector $x \neq 0$, let $y = P^{-1}x$. Then

$$x'V^*x = x'\left[(P')^{-1}P'\right] V^* \left[PP^{-1}\right] x = x'(P')^{-1}\left(P'V^*P\right)P^{-1}x = y'V^{**}y > 0$$

because $V^{**}$ is positive definite.
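Both proofs are easy to check numerically: $P'V^*P$ should be block diagonal with blocks $V_{11} - V_{12}V_{22}^{-1}V_{21}$ and $V_{22}^C$, and a Cholesky factorization of $V^*$ should succeed. A sketch with hypothetical $V$ and $V_{22}^C$, both assumed positive definite:

```python
import numpy as np

# Hypothetical historical covariance V (k = 1) and scenario covariance V22^C.
V = np.array([[2.0, 0.6, 0.4],
              [0.6, 1.0, 0.2],
              [0.4, 0.2, 1.5]])
V11, V12 = V[:1, :1], V[:1, 1:]
V21, V22 = V[1:, :1], V[1:, 1:]
V22C = np.array([[0.8, 0.1],
                 [0.1, 0.9]])

iV22 = np.linalg.inv(V22)
Vstar = np.block([[V11 - V12 @ iV22 @ (V22 - V22C) @ iV22 @ V21, V12 @ iV22 @ V22C],
                  [V22C @ iV22 @ V21,                            V22C]])

P = np.block([[np.eye(1),   np.zeros((1, 2))],
              [-iV22 @ V21, np.eye(2)]])
Vss = P.T @ Vstar @ P   # should equal diag(V11 - V12 V22^{-1} V21, V22^C)

assert np.allclose(Vss[:1, 1:], 0) and np.allclose(Vss[1:, :1], 0)
np.linalg.cholesky(Vstar)   # succeeds only if V* is positive definite
```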

Corollary. Suppose the forecasted covariance matrix is given by

$$V_{22}^C = \alpha' V_{22} \alpha$$

where $\alpha$ is a $(p-k) \times (p-k)$ diagonal matrix with positive diagonal elements. Then the forecasted standard deviation of variable $k+i$ is equal to $\alpha_{k+i}$ times its historical standard deviation, while the forecasted correlations between any two forecasted variables are the same as their historical correlations. The covariance matrix $V^*$ is positive definite in this case. If some diagonal elements of the matrix $\alpha$ are allowed to be zero (a point forecast with probability one), then $V^*$ is positive semidefinite.
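A sketch of the corollary's construction (hypothetical numbers): scaling by a diagonal $\alpha$ rescales standard deviations while leaving correlations untouched:

```python
import numpy as np

# Hypothetical historical covariance of the forecasted block, and a diagonal alpha.
V22 = np.array([[1.0, 0.3],
                [0.3, 2.0]])
alpha = np.diag([0.5, 2.0])        # positive scale factors

V22C = alpha.T @ V22 @ alpha       # the corollary's construction

def corr(M):
    d = np.sqrt(np.diag(M))
    return M / np.outer(d, d)

# Standard deviations scale by the alphas; correlations are unchanged.
assert np.allclose(np.sqrt(np.diag(V22C)),
                   np.diag(alpha) * np.sqrt(np.diag(V22)))
assert np.allclose(corr(V22C), corr(V22))
```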


Appendix: Computation of the Covariance Matrix $V_{22}^C$

The user provides a forecast for the variables $y_{k+1}, \ldots, y_{k+m}$ where $k + m = p$. This forecast is given by

$$\Pr\left[Y_2 = y_2^i\right] \equiv \Pr\left[Y_2 = \begin{pmatrix} y_{k+1}(i) \\ \vdots \\ y_p(i) \end{pmatrix}\right] = p_i, \quad p_i \ge 0, \quad \sum_{i=1}^{n} p_i = 1; \tag{1.1}$$

see equation (1) above. It is important to note that the $p_i$'s are probabilities while $p$ without a subscript is the dimension of the vector $Y = (Y_1, Y_2)$. Also note that both $k$ and $n$ are selected by the user as part of her forecast. To simplify notation we will write

$$\begin{bmatrix} y_{k+1}(1) & y_{k+1}(2) & \cdots & y_{k+1}(n) \\ y_{k+2}(1) & y_{k+2}(2) & \cdots & y_{k+2}(n) \\ \vdots & \vdots & & \vdots \\ y_{k+m}(1) & y_{k+m}(2) & \cdots & y_{k+m}(n) \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix} = A. \tag{1.2}$$

Here rows denote the variables that the user has chosen to forecast, and columns denote "events." That is, the $i$th column of the matrix $A$ consists of the forecasted realizations $a_{1i}$ for variable 1, $a_{2i}$ for variable 2, ..., $a_{mi}$ for variable $m$. The forecasted probability for this event, consisting of $m$ forecasted realizations, is $p_i$. Now define the diagonal probability matrix

$$P = \begin{bmatrix} p_1 & 0 & \cdots & 0 \\ 0 & \ddots & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & p_n \end{bmatrix}, \tag{1.3}$$

an $n \times 1$ unit vector

$$\iota = \begin{bmatrix} 1 \\ \vdots \\ 1 \end{bmatrix}, \tag{1.4}$$

and an $n \times n$ matrix of ones

$$Z = \begin{bmatrix} 1 & \cdots & 1 \\ \vdots & & \vdots \\ 1 & \cdots & 1 \end{bmatrix} = \iota\iota^T. \tag{1.5}$$


It follows that

$$E(Y_2) = E\begin{bmatrix} y_{k+1} \\ \vdots \\ y_{k+m} \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} p_i a_{1i} \\ \vdots \\ \sum_{i=1}^{n} p_i a_{mi} \end{bmatrix} = AP\iota. \tag{1.6}$$

Therefore

$$E(Y_2)\left[E(Y_2)\right]^T = AP\iota\left(AP\iota\right)^T = AP\iota\iota^T P^T A^T = APZPA^T \quad \text{(since } P = P^T\text{)}. \tag{1.7}$$

Similarly

$$E\left[Y_2 (Y_2)^T\right] = E\begin{bmatrix} (y_{k+1})^2 & \cdots & y_{k+1}y_{k+m} \\ \vdots & \ddots & \vdots \\ y_{k+m}y_{k+1} & \cdots & (y_{k+m})^2 \end{bmatrix} = \begin{bmatrix} \sum_{i=1}^{n} p_i (a_{1i})^2 & \cdots & \sum_{i=1}^{n} p_i a_{1i}a_{mi} \\ \vdots & \ddots & \vdots \\ \sum_{i=1}^{n} p_i a_{mi}a_{1i} & \cdots & \sum_{i=1}^{n} p_i (a_{mi})^2 \end{bmatrix}. \tag{1.8}$$

We then see that

$$E\left[Y_2 (Y_2)^T\right] = APA^T. \tag{1.9}$$

We also know that the $m \times m$ covariance matrix of the vector $Y_2$ (given the discrete user forecast $\xi$) is

$$V_{22}^C \equiv \operatorname{cov}(Y_2; \xi) = E\left[Y_2 (Y_2)^T\right] - E(Y_2)\left[E(Y_2)\right]^T. \tag{1.10}$$

Then from the above we have

$$V_{22}^C = APA^T - APZPA^T = AP\left[I - ZP\right]A^T. \tag{1.11}$$

Here $A$ is $m \times n$, $P$ is $n \times n$, and $\left[I - ZP\right]$ is $n \times n$. Moreover, since the probabilities must sum to one, the rank of $\left[I - ZP\right]$ is at most $n - 1$. We conclude that $m \le n - 1$ is required if $V_{22}^C$ is to be positive definite with full rank $m$.


In words, the number of variables being forecasted by the user ($m$) must be less than the number of forecasted "events" ($n$), where each event has probability $p_i$ and $\sum_{i=1}^{n} p_i = 1$.
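A minimal sketch of the appendix computation (a hypothetical forecast with $m = 2$ variables and $n = 3$ events), checking (1.11) against the definition (1.10) and illustrating the rank bound:

```python
import numpy as np

# Hypothetical forecast: m = 2 variables, n = 3 events (columns of A).
A = np.array([[0.50, 0.00, -0.10],
              [0.20, 0.10,  0.05]])
p = np.array([0.5, 0.3, 0.2])

n = len(p)
P = np.diag(p)
Z = np.ones((n, n))                          # Z = iota iota^T

V22C = A @ P @ (np.eye(n) - Z @ P) @ A.T     # equation (1.11)

# Cross-check against the definition (1.10).
EY2 = A @ P @ np.ones(n)                     # (1.6): E(Y2) = A P iota
assert np.allclose(V22C, A @ P @ A.T - np.outer(EY2, EY2))

# The rank of [I - ZP] is at most n - 1, so m <= n - 1 is needed for V22^C
# to have full rank m.
print(np.linalg.matrix_rank(np.eye(n) - Z @ P))   # prints 2 (= n - 1)
```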