Midterm 05 Solution



    2005 Spring Open Book Midterm for Information Theory

1. Assume that the alphabets for random variables X and Y are both {1, 2, 3, 4, 5}. Let x̂ = g(y) be an estimate of X from observing Y. Define the probability of estimation error as Pe = Pr{g(Y) ≠ X}. Then Fano's inequality bounds Pe as

\[
H_b(P_e) + 2P_e \ge H(X \mid Y),
\]

where

\[
H_b(p) = p\log_2\frac{1}{p} + (1-p)\log_2\frac{1}{1-p}
\]

is the binary entropy function. The curve for Hb(Pe) + 2Pe = H(X|Y) is plotted below.

[Figure: the curve Hb(Pe) + 2Pe = H(X|Y) for |X| = 5, with Pe on the horizontal axis and H(X|Y) (in bits, from 0 to 2.5) on the vertical axis. The points A, B, C, and D referred to in parts (a)-(d) are marked on the figure.]

(a) (8 pt) Point A on the above figure shows that if H(X|Y) = 0, zero estimation error, namely, Pe = 0, can be achieved. In this case, characterize the distribution PX|Y. Also, give an estimator g(·) that achieves Pe = 0. (Hint: Think of what kind of statistical relation between X and Y can render H(X|Y) = 0.)

(b) (8 pt) Point B on the above figure indicates that when H(X|Y) = log2(5), the estimation error can only be equal to 0.8. In this case, characterize the distributions PX|Y and PX. Prove that at H(X|Y) = log2(5), all estimators yield Pe = 0.8. (Hint: Think of what kind of statistical relation between X and Y can render H(X|Y) = log2(5).)

(c) (4 pt) Point C on the above figure hints that when H(X|Y) = 2, the estimation error can be as bad as 1. Give an estimator g(·) that leads to Pe = 1, if PX|Y(x|y) = 1/4 for x ≠ y, and PX|Y(x|y) = 0 for x = y. (Hint: The answer is apparent, isn't it?)



(d) (4 pt) Similarly, point D on the above figure hints that when H(X|Y) = 0, the estimation error can be as bad as 1. Give an estimator g(·) that leads to Pe = 1 at H(X|Y) = 0. (Hint: The answer is apparent, isn't it?)

    Answer:

(a) H(X|Y) = 0 means that X is deterministic given Y, namely, PX|Y(x|y) is either 1 or 0. Therefore, choosing g(y) to be the x for which PX|Y(x|y) = 1 achieves Pe = 0.

(b) Since log2(5) = H(X|Y) ≤ H(X) ≤ log2(5), we have H(X|Y) = H(X), which implies that X and Y are independent. Hence, PX|Y(x|y) = PX(x). In addition, H(X) = log2(5) implies that X is uniformly distributed, namely, PX(x) = 1/5. This concludes that PX|Y(x|y) = 1/5. Finally, any estimator g(·) gives Pe = 0.8, because

\[
\begin{aligned}
\Pr\{g(Y)\neq X\}
&= 1 - \sum_{x\in\mathcal{X}} P_X(x)\Pr\{g(Y)=x \mid X=x\} \\
&= 1 - \sum_{x\in\mathcal{X}} P_X(x)\Pr\{g(Y)=x\} \\
&= 1 - \frac{1}{5}\sum_{x\in\mathcal{X}} \Pr\{g(Y)=x\} \\
&= 1 - \frac{1}{5} = \frac{4}{5}.
\end{aligned}
\]

(c) Let g(y) = y. Then Pr{g(Y) = X} = Σ_{y∈Y} PY(y) PX|Y(y|y) = 0, and hence Pe = 1.

(d) Choosing g(y) = x for any x with PX|Y(x|y) = 0 satisfies the requirement, since then g(Y) ≠ X with probability 1.
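As a quick numerical sanity check (an addition to the solution, not part of the original; the estimators tried below are arbitrary), the following Python sketch confirms that Pe = 0.8 satisfies Hb(Pe) + 2Pe = log2(5), and that when X is uniform on {1, ..., 5} and independent of Y, every deterministic estimator g yields Pr{g(Y) ≠ X} = 4/5.

```python
import math

def hb(p):
    """Binary entropy function Hb(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Point B: Pe = 0.8 should satisfy Hb(Pe) + 2*Pe = H(X|Y) = log2(5).
pe = 0.8
print(hb(pe) + 2 * pe, math.log2(5))            # both are about 2.3219

# If X is uniform on {1,...,5} and independent of Y, then for any estimator g,
# Pr{g(Y) = X} = sum_x P_X(x) Pr{g(Y) = x} = 1/5, so Pe = 4/5 regardless of g.
# Brute-force check with Y uniform and a few arbitrary deterministic estimators.
alphabet = range(1, 6)
for g in [lambda y: y, lambda y: 1, lambda y: 6 - y]:
    p_correct = sum((1 / 5) * (1 / 5)            # P_X(x) * P_Y(y), independent
                    for x in alphabet for y in alphabet if g(y) == x)
    print(1 - p_correct)                         # 0.8 every time
```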

2. (12 pt) In the second part of Theorem 3.22 (slide I:3-49), it is shown that there exists a prefix code with

\[
\bar{\ell} = \sum_{x\in\mathcal{X}} P_X(x)\,\ell(c_x) \le H(X) + 1,
\]

where c_x is the codeword for the source symbol x and ℓ(c_x) is the length of codeword c_x. Show that the upper bound can be improved to

\[
\bar{\ell} < H(X) + 1.
\]

(Hint: Replace ℓ(c_x) = ⌊log2(1/PX(x))⌋ + 1 by a new assignment.)

    Answer: We can modify the proof of Theorem 3.22 as follows.

Choose the codeword length for source symbol x as

\[
\ell(c_x) = \left\lceil \log_2 \frac{1}{P_X(x)} \right\rceil. \tag{1}
\]



Then 2^{−ℓ(c_x)} ≤ PX(x). Summing both sides over all source symbols, we obtain

\[
\sum_{x\in\mathcal{X}} 2^{-\ell(c_x)} \le 1,
\]

which is exactly the Kraft inequality, so a prefix code with these lengths exists. On the other hand, (1) implies

\[
\ell(c_x) < \log_2 \frac{1}{P_X(x)} + 1,
\]

which in turn implies

\[
\bar{\ell} = \sum_{x\in\mathcal{X}} P_X(x)\,\ell(c_x) < H(X) + 1.
\]
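As an added illustration (the four-symbol pmf below is an arbitrary example, not from the exam), the sketch assigns the lengths in (1) and checks both the Kraft inequality and that the average codeword length is strictly below H(X) + 1.

```python
import math

# Example pmf (any strictly positive pmf will do).
pmf = [0.4, 0.3, 0.2, 0.1]

lengths = [math.ceil(-math.log2(p)) for p in pmf]    # l(c_x) = ceil(log2(1/P_X(x)))
kraft   = sum(2 ** (-l) for l in lengths)            # must be <= 1
avg_len = sum(p * l for p, l in zip(pmf, lengths))   # average codeword length
entropy = -sum(p * math.log2(p) for p in pmf)        # H(X) in bits

print(kraft <= 1)                # True: the Kraft inequality holds
print(avg_len < entropy + 1)     # True: the strict bound of Problem 2
print(avg_len, entropy)
```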

3. (a) Consider a source X with alphabet {x1, x2, . . .} and PX(xi) > 0 for every i.

i. (8 pt) Prove that the average codeword length of the single-letter binary Huffman code is equal to H(X) if, and only if, PX(xi) is equal to 2^{−ni} for every i, where {ni} is a sequence of positive integers. (Hint: The if-part can be proved by the new bound in Problem 2, and the only-if-part can be proved by modifying the proof of Theorem 3.18 or slide I:3-39.)

ii. (6 pt) What is the necessary and sufficient condition under which the average codeword length of the single-letter ternary Huffman code equals H(X)? (Hint: You only need to write down the condition. No proof is necessary.)

iii. (4 pt) Prove that the average codeword length of the two-letter Huffman code cannot be equal to H(X) + 1/2 bits. (Hint: Use the new bound in Problem 2.)

(b) (4 pt) Can the channel capacity between channel input X and channel output Z be strictly larger than the channel capacity between channel input X and channel output Y? Which lemma or theorem is your answer based on?

[Block diagram: X → channel PY|X → Y → post-processing by a deterministic mapping g(·) → Z = g(Y).]

    Answer:


(a) i. If PX(xi) is 2^{−ni} for every i, then the source entropy is equal to

\[
H(X) = \sum_{x\in\mathcal{X}} P_X(x)\log_2\frac{1}{P_X(x)} = \sum_{j} n_j\,2^{-n_j}.
\]

Problem 2 gives that we can take ℓ(c_{xj}) = ⌈log2(1/PX(xj))⌉ = nj to form an optimal variable-length code (for which there exists a Huffman code that performs at least as well). Hence,

\[
\bar{\ell} = \sum_{j} P_X(x_j)\,n_j = \sum_{j} n_j\,2^{-n_j} = H(X).
\]

Conversely, if the average codeword length equals H(X), then the proof of Theorem 3.18 indicates that this can hold only when

\[
P_X(x_j) = 2^{-\ell(c_{x_j})}
\quad\text{and}\quad
\sum_{j} 2^{-\ell(c_{x_j})} = 1, \tag{2}
\]

for which (2) is always true since the probability mass sums to 1, namely,

\[
\sum_{j} 2^{-\ell(c_{x_j})} = \sum_{j} P_X(x_j) = 1.
\]

(A numerical check of this claim is sketched after part (b) below.)

ii. PX(xi) is equal to 3^{−ni} for every i, where {ni} is a sequence of positive integers.

iii. From Problem 2, the average codeword length of the two-letter Huffman code is strictly less than H(X²) + 1 = 2H(X) + 1 bits. Hence, the average codeword length per source letter must be strictly less than H(X) + 1/2 bits.

    (b) The answer is no by the data processing lemma.
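The following sketch is an added illustration of part (a)i (the dyadic and non-dyadic pmfs are arbitrary examples): it builds a binary Huffman code with Python's heapq and compares its average codeword length with H(X).

```python
import heapq, itertools, math

def huffman_lengths(pmf):
    """Return the codeword length of each symbol in a binary Huffman code."""
    counter = itertools.count()                 # tie-breaker so heapq never compares lists
    heap = [(p, next(counter), [i]) for i, p in enumerate(pmf)]
    heapq.heapify(heap)
    lengths = [0] * len(pmf)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        for i in s1 + s2:                       # every merge adds one bit to these symbols
            lengths[i] += 1
        heapq.heappush(heap, (p1 + p2, next(counter), s1 + s2))
    return lengths

def avg_len(pmf):
    return sum(p * l for p, l in zip(pmf, huffman_lengths(pmf)))

def entropy(pmf):
    return -sum(p * math.log2(p) for p in pmf)

dyadic     = [1/2, 1/4, 1/8, 1/8]               # every P_X(x_i) is 2^{-n_i}
non_dyadic = [0.4, 0.3, 0.2, 0.1]

print(avg_len(dyadic),     entropy(dyadic))      # equal: 1.75 and 1.75
print(avg_len(non_dyadic), entropy(non_dyadic))  # average length strictly larger than H(X)
```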

4. Let the single-letter channel transition probability PY|X of the discrete memoryless channel be defined as in the following figure, where 0 <


(a) Let pY(·) be a pdf with mean μ and continuous support [0, ∞), and let pX(x) = (1/μ)e^{−x/μ} be the exponential pdf with the same mean, so that log(1/pX(x)) = log(μ) + x/μ. Then,

\[
\begin{aligned}
h(p_X) - h(p_Y)
&= \int_0^\infty p_X(x)\log\frac{1}{p_X(x)}\,dx - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^\infty p_X(x)\left[\log(\mu) + \frac{x}{\mu}\right]dx - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \log(\mu) + \frac{1}{\mu}E[X] - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \log(\mu) + \frac{1}{\mu}E[Y] - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^\infty p_Y(y)\left[\log(\mu) + \frac{y}{\mu}\right]dy - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^\infty p_Y(y)\log\frac{1}{p_X(y)}\,dy - \int_0^\infty p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^\infty p_Y(y)\log\frac{p_Y(y)}{p_X(y)}\,dy \\
&= D(p_Y \,\|\, p_X) \ge 0,
\end{aligned}
\]

with equality if, and only if, pY ≡ pX. Hence h(pY) ≤ h(pX), i.e., the exponential pdf maximizes the differential entropy among all pdfs on [0, ∞) with mean μ.
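As an added numerical check of part (a) (the integration grid, the mean μ = 2, and the competing densities below are arbitrary choices), the sketch integrates −p log p for the exponential density and two other mean-μ densities on [0, ∞); the exponential attains the largest value, 1 + ln(μ) nats.

```python
import numpy as np

mu = 2.0                                       # common mean (arbitrary choice)
x = np.linspace(1e-9, 100.0, 400_000)          # fine grid covering the support

def diff_entropy(pdf):
    """h(p) = -integral p*log(p) in nats, by the trapezoidal rule."""
    p = pdf(x)
    safe = np.where(p > 0, p, 1.0)             # avoid log(0); those terms contribute 0
    return np.trapz(np.where(p > 0, -p * np.log(safe), 0.0), x)

sigma = mu * np.sqrt(np.pi / 2)                # half-Gaussian scale chosen so its mean is mu
candidates = {
    "exponential, mean mu":  lambda t: np.exp(-t / mu) / mu,
    "uniform on [0, 2*mu]":  lambda t: np.where(t <= 2 * mu, 1 / (2 * mu), 0.0),
    "half-Gaussian, mean mu": lambda t: np.sqrt(2 / np.pi) / sigma
                                        * np.exp(-t**2 / (2 * sigma**2)),
}

for name, pdf in candidates.items():
    print(name, diff_entropy(pdf))             # the exponential gives the largest value
print("closed form for the exponential:", 1 + np.log(mu))
```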

(b) One choice of pX that makes E[−log pX(X)] = E[−log pX(Y)] is the uniform distribution over [0, K], for which pX(x) = 1/K for x ∈ [0, K]. Then, for any pdf pY with support [0, K],

\[
\begin{aligned}
h(p_X) - h(p_Y)
&= \int_0^K p_X(x)\log\frac{1}{p_X(x)}\,dx - \int_0^K p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^K p_X(x)\log(K)\,dx - \int_0^K p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \log(K) - \int_0^K p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^K p_Y(y)\log(K)\,dy - \int_0^K p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^K p_Y(y)\log\frac{1}{p_X(y)}\,dy - \int_0^K p_Y(y)\log\frac{1}{p_Y(y)}\,dy \\
&= \int_0^K p_Y(y)\log\frac{p_Y(y)}{p_X(y)}\,dy \\
&= D(p_Y \,\|\, p_X) \ge 0,
\end{aligned}
\]

with equality if, and only if, pY ≡ pX.
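Similarly, for part (b), here is an added check with an arbitrary density pY on [0, K]: it verifies numerically that log(K) − h(pY) equals D(pY‖pX) ≥ 0 when pX is uniform on [0, K].

```python
import numpy as np

K = 3.0                                        # arbitrary interval length
y = np.linspace(1e-9, K, 400_000)

p_x = np.full_like(y, 1 / K)                   # uniform density on [0, K]
p_y = 2 * y / K**2                             # an arbitrary (triangular) density on [0, K]

h_y = np.trapz(-p_y * np.log(p_y), y)          # differential entropy of p_Y in nats
kl  = np.trapz(p_y * np.log(p_y / p_x), y)     # D(p_Y || p_X)

print(np.log(K), h_y)                          # h(p_Y) <= log(K) = h(p_X)
print(np.log(K) - h_y, kl)                     # the gap is exactly D(p_Y || p_X) >= 0
```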

    6. Consider the 3-input 3-output memoryless additive Gaussian channel

Y = X + Z,



where X = [X1, X2, X3], Y = [Y1, Y2, Y3] and Z = [Z1, Z2, Z3] are all 3-dimensional real vectors. Assume that X is independent of Z, and the input power constraint is S (i.e., E[X1² + X2² + X3²] ≤ S). Also, assume that Z is Gaussian distributed with zero mean and covariance matrix K, where

\[
\mathbf{K} =
\begin{pmatrix}
1 & 0 & 0 \\
0 & 1 & \rho \\
0 & \rho & 1
\end{pmatrix}.
\]

(a) (6 pt) Determine the capacity-cost function of the channel, if ρ = 0. (Hint: Directly apply Theorem 6.31 or slide I:6-55.)

(b) (8 pt) Determine the capacity-cost function of the channel, if 0 < ρ < 1. (Hint: Directly apply Theorem 6.33 or slide I:6-60.)

    Answer:

(a)

\[
C(S) = \max_{P_{X^3}:\; E[X_1^2]+E[X_2^2]+E[X_3^2]\le S} I(X^3; Y^3)
= \sum_{i=1}^{3} \frac{1}{2}\log\!\left(1 + \frac{S}{3}\right)
= \frac{3}{2}\log\!\left(1 + \frac{S}{3}\right).
\]

(b) The eigenvalues of K are λ1 = 1, λ2 = 1 − ρ and λ3 = 1 + ρ. Hence,

\[
C(S) = \sum_{i=1}^{3} \frac{1}{2}\log\!\left(1 + \frac{S_i}{\lambda_i}\right),
\]

where the water-filling powers are

\[
\begin{aligned}
&S_1 = 0, && S_2 = S, && S_3 = 0, && \text{if } 0 \le S < \rho;\\
&S_1 = \frac{S-\rho}{2}, && S_2 = \frac{S+\rho}{2}, && S_3 = 0, && \text{if } \rho \le S < 3\rho;\\
&S_1 = \frac{S}{3}, && S_2 = \frac{S}{3}+\rho, && S_3 = \frac{S}{3}-\rho, && \text{if } S \ge 3\rho.
\end{aligned}
\]
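To complement parts (a) and (b), here is an added water-filling sketch (the helper names water_fill and capacity and the values of ρ and S are ad hoc choices): it solves for the water level over the noise eigenvalues 1, 1 − ρ, 1 + ρ numerically and reproduces the closed-form allocations above; with ρ = 0 it recovers the equal split S/3 of part (a).

```python
import math

def water_fill(eigs, S, tol=1e-12):
    """Water-filling: powers S_i = max(0, theta - lambda_i) that sum to S."""
    lo, hi = min(eigs), min(eigs) + S           # the water level theta lies in this range
    while hi - lo > tol:
        theta = (lo + hi) / 2
        if sum(max(0.0, theta - lam) for lam in eigs) > S:
            hi = theta
        else:
            lo = theta
    return [max(0.0, (lo + hi) / 2 - lam) for lam in eigs]

def capacity(eigs, S):
    """C(S) = sum_i (1/2) log2(1 + S_i / lambda_i) in bits per channel use."""
    return sum(0.5 * math.log2(1 + Si / lam)
               for Si, lam in zip(water_fill(eigs, S), eigs))

rho = 0.4
eigs = [1.0, 1.0 - rho, 1.0 + rho]              # noise eigenvalues lambda_1, lambda_2, lambda_3

print(water_fill(eigs, 0.2))    # 0 <= S < rho      -> [0, S, 0]
print(water_fill(eigs, 0.8))    # rho <= S < 3*rho  -> [(S-rho)/2, (S+rho)/2, 0]
print(water_fill(eigs, 3.0))    # S >= 3*rho        -> [S/3, S/3 + rho, S/3 - rho]
print(capacity([1.0, 1.0, 1.0], 3.0), 1.5 * math.log2(1 + 3.0 / 3))   # rho = 0, part (a)
```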