5 Channel Capacity and Coding

CODING & MODULATION Prof. Ing. Anton Čižmár, PhD. also from Digital Communications 4th Ed., J. G. Proakis, McGraw-Hill Int. Ed. 2001


7 Channel Capacity and Coding

In Chapter 5, we considered the problem of digital modulation by means of M = 2^k signal waveforms, where each waveform conveys k bits of information. We observed that some modulation methods provide better performance than others. Orthogonal signals allow us to make the probability of error arbitrarily small by letting the number of waveforms M → ∞, provided that the SNR per bit satisfies γ_b ≥ −1.6 dB. Thus we can operate at the capacity of the AWGN channel in the limit as the bandwidth expansion factor B_e = W/R → ∞. This is a heavy price to pay, because B_e grows exponentially with the block length k. Such inefficient use of channel bandwidth is highly undesirable.

In this and the following chapter, we consider signal waveforms generated from either binary or nonbinary sequences. The resulting waveforms are generally characterized by a bandwidth expansion factor that grows only linearly with k. Coded waveforms therefore offer the potential for greater bandwidth efficiency than orthogonal M-ary waveforms.

7.1 CHANNEL MODELS AND CHANNEL CAPACITY

• Code word (n, k)
• Code rate (k/n)
• Hard decision – the detector, which follows the demodulator, decides whether the transmitted bit is 0 or 1. It corresponds to binary quantization of the demodulator output.
• Soft decision – a detector that quantizes to Q > 2 levels, i.e., a Q-ary detector. If M-ary signals are used, then Q ≥ M. In the extreme case when no quantization is performed, Q = ∞ (see the sketch below).
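To make the hard/soft distinction concrete, here is a small illustrative sketch (not from the text; it assumes antipodal signalling with amplitudes ±1, noise standard deviation 0.5, and a hypothetical 8-level uniform quantizer):

```python
import numpy as np

rng = np.random.default_rng(0)

# Antipodal transmission: bit 0 -> +1, bit 1 -> -1, plus Gaussian noise.
bits = rng.integers(0, 2, size=10)
y = (1 - 2 * bits) + 0.5 * rng.standard_normal(10)   # demodulator output samples

# Hard decision (Q = 2): keep only the sign of each sample.
hard = (y < 0).astype(int)

# Soft decision (Q = 8): uniform quantizer with 8 output levels.
edges = np.linspace(-1.5, 1.5, 7)    # 7 thresholds -> 8 quantization regions
soft = np.digitize(y, edges)         # integer label 0..7 per sample

print("tx bits:", bits)
print("hard   :", hard)
print("soft   :", soft)
```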

7.1.1 Channel Models

In this section we describe channel models that will be useful in the design of codes. The simplest is the binary symmetric channel, which corresponds to the case M = 2 with hard decisions at the detector.

Binary symmetric channel (BSC)

Consider an additive noise channel and let the modulator and the demodulator/detector be included as parts of the channel. Such a composite channel is characterized by the set X = {0, 1} of possible inputs, the set Y = {0, 1} of possible outputs, and a set of conditional probabilities that relate the possible outputs to the possible inputs. If the channel noise and other disturbances cause statistically independent errors in the transmitted binary sequence with average probability p, then

P(Y = 0 | X = 1) = P(Y = 1 | X = 0) = p
P(Y = 1 | X = 1) = P(Y = 0 | X = 0) = 1 - p        (7.1-1)

Fig.7.1-2 Binary symmetric channel - memoryless
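A minimal simulation of the memoryless BSC of Eq. 7.1-1 (an illustrative sketch, not part of the text; the crossover probability p = 0.1 and the block length are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(1)
p = 0.1                                  # crossover probability of the BSC
x = rng.integers(0, 2, size=100_000)     # input bits
flips = rng.random(x.size) < p           # each bit flips independently with probability p
y = x ^ flips                            # channel output

print("empirical P(Y != X):", np.mean(y != x))   # should be close to p
```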

Thus, we have reduced the cascade of the binary modulator, the waveform channel, and the binary demodulator/detector to an equivalent discrete-time channel, represented by Fig. 7.1-2.

Discrete memoryless channels (DMC)

The BSC is a special case of a more general discrete-input, discrete-output channel. Suppose that the output symbols from the channel encoder are q-ary symbols, i.e.,

X = {x_0, x_1, ..., x_{q-1}}, and the output of the detector consists of Q-ary symbols, where Q ≥ M = 2^q. If the channel and the modulation are memoryless, then the input-output characteristics of the composite channel are described by a set of qQ conditional probabilities

P(Y = y_i | X = x_j) ≡ P(y_i | x_j)        (7.1-2)

where i = 0, 1, ..., Q-1 and j = 0, 1, ..., q-1.


If the input to a DMC is a sequence of n symbols u_1, u_2, ..., u_n selected from the alphabet X and the corresponding output is the sequence v_1, v_2, ..., v_n of symbols from the alphabet Y, the joint conditional probability is

P(Y_1 = v_1, Y_2 = v_2, ..., Y_n = v_n | X_1 = u_1, ..., X_n = u_n) = \prod_{k=1}^{n} P(Y_k = v_k | X_k = u_k)        (7.1-3)
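Eq. 7.1-3 says that, for a memoryless channel, the likelihood of a received sequence factors into per-symbol transition probabilities. A small sketch for the BSC (the sequences and p = 0.1 are hypothetical choices):

```python
import numpy as np

p = 0.1
# Transition matrix of the BSC, with P[j, i] = P(y_i | x_j).
P = np.array([[1 - p, p],
              [p, 1 - p]])

u = np.array([0, 1, 1, 0, 1])    # transmitted sequence
v = np.array([0, 1, 0, 0, 1])    # received sequence (one error, in the third position)

# Eq. 7.1-3: the joint conditional probability is the product of per-symbol terms.
joint = np.prod(P[u, v])
print(joint)                     # (1 - p)^4 * p = 0.9**4 * 0.1
```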

In general, the conditional probabilities {P(y_i | x_j)} that characterize a DMC can be arranged in the matrix form P = [p_{ji}], where, by definition, p_{ji} ≡ P(y_i | x_j). P is called the probability transition matrix for the channel.

Discrete-input, continuous-output channel

Suppose that the input to the modulator comprises symbols selected from a finite and discrete input alphabet X = {x_0, x_1, ..., x_{q-1}} and the output of the detector is unquantized

(Q = ∞). Then the input to the channel decoder can assume any value on the real line, i.e., Y = (-∞, ∞). This leads us to define a composite discrete-time memoryless channel that is characterized by the discrete input X, the continuous output Y, and the set of conditional probability density functions

p(y | X = x_k),  k = 0, 1, ..., q-1

The most important channel of this type is the additive white Gaussian noise (AWGN) channel, for which

Y = X + G        (7.1-4)

where G is a zero-mean Gaussian random variable with variance σ² and X = x_k, k = 0, 1, ..., q-1. For a given X, it follows that Y is Gaussian with mean x_k and variance σ². That is,

p(y | X = x_k) = \frac{1}{\sqrt{2\pi}\,\sigma} e^{-(y - x_k)^2 / 2\sigma^2}        (7.1-5)

For any given input sequence X_i, i = 1, 2, ..., n, there is a corresponding output sequence

Y_i = X_i + G_i,   i = 1, 2, ..., n        (7.1-6)

The condition that the channel is memoryless may be expressed as


p(y_1, y_2, ..., y_n | X_1 = u_1, X_2 = u_2, ..., X_n = u_n) = \prod_{i=1}^{n} p(y_i | X_i = u_i)        (7.1-7)
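A sketch of the discrete-input AWGN channel of Eqs. 7.1-4 to 7.1-7 (assumptions: binary antipodal inputs ±A with A = 1 and σ = 0.5; the helper name pdf is mine):

```python
import numpy as np

rng = np.random.default_rng(2)
A, sigma = 1.0, 0.5
alphabet = np.array([-A, A])                 # q = 2 input letters

u = rng.integers(0, 2, size=5)               # input symbol indices
x = alphabet[u]
y = x + sigma * rng.standard_normal(x.size)  # Eq. 7.1-6: Y_i = X_i + G_i

def pdf(y, xk, sigma):
    """Eq. 7.1-5: Gaussian transition density p(y | X = x_k)."""
    return np.exp(-(y - xk) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

# Eq. 7.1-7: for a memoryless channel the sequence likelihood is a product of Eq. 7.1-5 terms.
likelihood = np.prod(pdf(y, x, sigma))
print(likelihood)
```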

Waveform channel

We may separate the modulator and demodulator from the physical channel and consider a channel model in which the inputs and the outputs are waveforms. Let us assume that such a channel has a given bandwidth W, with ideal frequency response C(f) = 1 within the bandwidth W, and that the signal at its output is corrupted by additive white Gaussian noise. Suppose that x(t) is a band-limited input to such a channel and y(t) is the corresponding output. Then,

y(t) = x(t) + n(t)        (7.1-8)

where n(t) represents a sample function of the additive noise process. A suitable method for defining a set of probabilities that characterize the channel is to expand x(t), y(t), and n(t) into a complete set of orthonormal functions. That is, we express x(t), y(t), and n(t) in the form

y(t) = \sum_i y_i f_i(t)
x(t) = \sum_i x_i f_i(t)        (7.1-9)
n(t) = \sum_i n_i f_i(t)

where {yi}, {xi}, and {ni} are the sets of coefficients in the corresponding expansions, e.g.,

y_i = \int_0^T y(t) f_i^*(t)\, dt = \int_0^T [x(t) + n(t)] f_i^*(t)\, dt = x_i + n_i        (7.1-10)

Since n_i is Gaussian, it follows that

p(y_i | x_i) = \frac{1}{\sqrt{2\pi}\,\sigma_i} e^{-(y_i - x_i)^2 / 2\sigma_i^2}        (7.1-12)

Since the functions {fi(t)} in the expansion are orthonormal, it follows that the {ni} are uncorrelated. Since they are Gaussian, they are also statistically independent. Hence,

p(y_1, y_2, ..., y_N | x_1, x_2, ..., x_N) = \prod_{i=1}^{N} p(y_i | x_i)        (7.1-13)


for any N. In this manner, the waveform channel is reduced to an equivalent discrete-time channel characterized by the conditional PDF given in Eq. 7.1-12.
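The statement that projections of Gaussian noise onto orthonormal functions give uncorrelated (hence independent) Gaussian coefficients can be checked numerically. The sketch below is a discrete-time stand-in for Eq. 7.1-10: a random orthonormal basis replaces the functions {f_i(t)}, and the estimated covariance of the coefficients should be close to σ² I:

```python
import numpy as np

rng = np.random.default_rng(3)
dim, trials, sigma = 4, 200_000, 1.0

# Columns of Q form a random orthonormal basis, standing in for the f_i(t).
Q, _ = np.linalg.qr(rng.standard_normal((dim, dim)))

noise = sigma * rng.standard_normal((trials, dim))   # white Gaussian noise samples
coeff = noise @ Q                                    # n_i = <n, f_i>, as in Eq. 7.1-10

# Covariance of the coefficients: approximately sigma^2 * I (uncorrelated, equal variance).
print(np.round(np.cov(coeff, rowvar=False), 3))
```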

When the additive noise is white and Gaussian with spectral density ½N_0, the variances σ_i² = ½N_0 for all i in Eq. 7.1-12. In this case, samples of x(t) and y(t) may be taken at the Nyquist rate of 2W samples/s, so that x_i = x(i/2W) and y_i = y(i/2W). Since the noise is white, the noise samples are statistically independent. Thus, Eqs. 7.1-12 and 7.1-13 describe the statistics of the sampled signal. We note that in a time interval of length T, there are N = 2WT samples. This parameter is used below in obtaining the capacity of the band-limited AWGN waveform channel.

The choice of which channel model to use at any one time depends on our objectives. If we are interested in the design and analysis of the performance of the discrete channel encoder and decoder, it is appropriate to consider channel models in which the modulator and demodulator are part of the composite channel. On the other hand, if our intent is to design and analyze the performance of the digital modulator and demodulator, we use a channel model for the waveform channel.

7.1.2 Channel Capacity

Now let us consider a DMC with input alphabet X = {x_0, x_1, ..., x_{q-1}}, output alphabet Y = {y_0, y_1, ..., y_{Q-1}}, and the set of transition probabilities P(y_i | x_j) as defined in Eq. 7.1-2. Suppose that the symbol x_j is transmitted and the symbol y_i is received. The mutual information provided about the event X = x_j by the occurrence of the event Y = y_i is

\log \frac{P(y_i | x_j)}{P(y_i)}

where

P(y_i) ≡ P(Y = y_i) = \sum_{k=0}^{q-1} P(x_k) P(y_i | x_k)        (7.1-14)

Hence, the average mutual information provided by the output Y about the input X is

I(X; Y) = \sum_{j=0}^{q-1} \sum_{i=0}^{Q-1} P(x_j) P(y_i | x_j) \log \frac{P(y_i | x_j)}{P(y_i)}        (7.1-15)
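Eq. 7.1-15 can be evaluated directly once the input probabilities and the transition matrix are given. A minimal sketch (the BSC with p = 0.1 and equiprobable inputs is just an example; the function name is mine):

```python
import numpy as np

def mutual_information(Px, P):
    """Eq. 7.1-15: I(X;Y) in bits, with Px[j] = P(x_j) and P[j, i] = P(y_i | x_j)."""
    Py = Px @ P                # Eq. 7.1-14: P(y_i) = sum_k P(x_k) P(y_i | x_k)
    I = 0.0
    for j in range(P.shape[0]):
        for i in range(P.shape[1]):
            if Px[j] > 0 and P[j, i] > 0:
                I += Px[j] * P[j, i] * np.log2(P[j, i] / Py[i])
    return I

p = 0.1
P_bsc = np.array([[1 - p, p],
                  [p, 1 - p]])
print(mutual_information(np.array([0.5, 0.5]), P_bsc))   # about 0.531 bit = 1 - H(0.1)
```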

The channel characteristics determine the transition probabilities P(y_i | x_j), but the probabilities of the input symbols are under the control of the discrete channel encoder. The value of I(X; Y) maximized over the set of input symbol probabilities P(x_j) is a


quantity that depends only on the characteristics of the DMC through the conditional probabilities P(y_i | x_j). This quantity is called the capacity of the channel and is denoted as

C = \max_{P(x_j)} I(X; Y) = \max_{P(x_j)} \sum_{j=0}^{q-1} \sum_{i=0}^{Q-1} P(x_j) P(y_i | x_j) \log \frac{P(y_i | x_j)}{P(y_i)}        (7.1-16)

The maximization of I(X; Y) is performed under the constraints that P(x_j) ≥ 0 and

\sum_{j=0}^{q-1} P(x_j) = 1

The units of C are bits per input symbol into the channel (bits per channel use) when the log is base 2. If a symbol enters the channel every τ_s seconds, the channel capacity in bits/s is C/τ_s.

Example 7.1-1

For the BSC with transition probabilities P(0|1) = P(1|0) = p, the average mutual information is maximized when the input probabilities are P(0) = P(1) = 1/2. Thus the capacity of the BSC is

C = p \log_2 2p + (1 - p) \log_2 2(1 - p) = 1 - H(p)        (7.1-17)

where H(p) is the binary entropy function.
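A numerical check of Example 7.1-1 (a sketch under the same BSC model; it uses the identity I(X;Y) = H(Y) - H(Y|X) rather than Eq. 7.1-15 directly, and the sweep over P(0) = α is mine): the maximum occurs at α = 1/2 and equals 1 - H(p).

```python
import numpy as np

def binary_entropy(p):
    return -p * np.log2(p) - (1 - p) * np.log2(1 - p)

def bsc_mutual_information(alpha, p):
    """I(X;Y) for a BSC with crossover p and input probabilities P(0) = alpha, P(1) = 1 - alpha."""
    Py0 = alpha * (1 - p) + (1 - alpha) * p          # P(Y = 0)
    return binary_entropy(Py0) - binary_entropy(p)   # H(Y) - H(Y|X)

p = 0.1
alphas = np.linspace(0.01, 0.99, 99)
I = bsc_mutual_information(alphas, p)

print("maximizing P(0):", alphas[np.argmax(I)])      # 0.5
print("capacity       :", I.max())                   # 0.531... = 1 - H(0.1), Eq. 7.1-17
```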

A plot of C versus p is illustrated in Fig. 7.1-4.

Next, let us consider the discrete-time AWGN memoryless channel described by the transition probability density functions defined in Eq. 7.1-5. The capacity of this channel in bits per channel use is the maximum average mutual information between the discrete input X = {x_0, x_1, ..., x_{q-1}} and the output Y = (-∞, ∞). That is,


C = \max_{P(x_i)} \sum_{i=0}^{q-1} \int_{-\infty}^{\infty} P(x_i)\, p(y | x_i) \log_2 \frac{p(y | x_i)}{p(y)}\, dy        (7.1-18)

where

p(y) = \sum_{k=0}^{q-1} p(y | x_k) P(x_k)        (7.1-19)

Example 7.1-2

Let us consider a binary-input AWGN memoryless channel with possible inputs X = A and X = -A. The average mutual information I(X; Y) is maximized when the input probabilities are P(X = A) = P(X = -A) = 1/2. Hence, the capacity of this channel in bits per channel use is

C = \frac{1}{2} \int_{-\infty}^{\infty} p(y | A) \log_2 \frac{p(y | A)}{p(y)}\, dy + \frac{1}{2} \int_{-\infty}^{\infty} p(y | -A) \log_2 \frac{p(y | -A)}{p(y)}\, dy        (7.1-20)

Fig. 7.1-6 illustrates C as a function of the ratio A²/2σ². Note that C increases monotonically from 0 to 1 bit per symbol as this ratio increases.
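Eq. 7.1-20 can be evaluated by numerical integration. The sketch below (assumptions: A = 1, a simple Riemann sum over a wide grid, and my own choice of A²/2σ² values) reproduces the behaviour of Fig. 7.1-6, with C rising from near 0 toward 1 bit per channel use as the ratio grows:

```python
import numpy as np

def binary_awgn_capacity(A, sigma):
    """Eq. 7.1-20: capacity of the binary-input AWGN channel with inputs +/-A and noise std sigma."""
    y = np.linspace(-A - 10 * sigma, A + 10 * sigma, 20001)
    g = lambda m: np.exp(-(y - m) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)
    p_pos, p_neg = g(A), g(-A)                # p(y | +A) and p(y | -A), Eq. 7.1-5
    p_y = 0.5 * p_pos + 0.5 * p_neg           # Eq. 7.1-19 with equiprobable inputs
    integrand = 0.5 * p_pos * np.log2(p_pos / p_y) + 0.5 * p_neg * np.log2(p_neg / p_y)
    return np.sum(integrand) * (y[1] - y[0])  # numerical integration of Eq. 7.1-20

for ratio in [0.1, 1.0, 10.0]:                # ratio = A^2 / (2 sigma^2)
    sigma = 1.0 / np.sqrt(2 * ratio)          # with A = 1
    print(ratio, round(binary_awgn_capacity(1.0, sigma), 3))
```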

It is interesting to note that in the two channel models described above, the choice of equally probable input symbols maximizes the average mutual information. Thus, the capacity of the channel is obtained when the input symbols are equally probable. This is not always the solution for the capacity formulas given in Eqs. 7.1-16 and 7.1-18, however; in general, nothing can be said about the input probability assignment that maximizes the average mutual information. However, in the two channel models considered above, the channel transition probabilities exhibit a form of symmetry that results in the maximum of I(X; Y) being obtained when the input symbols are equally probable. The symmetry condition can be expressed in terms of the elements of the probability transition matrix P of the channel: when each row of this matrix is a permutation of any other row and each column is a permutation of any other column, the probability transition matrix is symmetric, and input symbols with equal probability maximize I(X; Y).


In general, necessary and sufficient conditions for the set of input probabilities {P(xj)} to maximize I(X;Y) and, thus to achieve capacity on a DMC are that (Problem 7.1)

I(x_j; Y) = C   for all j with P(x_j) > 0
I(x_j; Y) ≤ C   for all j with P(x_j) = 0        (7.1-21)

where C is the capacity of the channel and

I(x_j; Y) = \sum_{i=0}^{Q-1} P(y_i | x_j) \log \frac{P(y_i | x_j)}{P(y_i)}        (7.1-22)
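The condition in Eqs. 7.1-21 and 7.1-22 is easy to check numerically for a candidate input distribution. A sketch for the BSC with equiprobable inputs (p = 0.1 is arbitrary): both per-input informations I(x_j; Y) come out equal to the capacity 1 - H(p), so Eq. 7.1-21 holds.

```python
import numpy as np

p = 0.1
P = np.array([[1 - p, p],          # P[j, i] = P(y_i | x_j) for the BSC
              [p, 1 - p]])
Px = np.array([0.5, 0.5])          # candidate capacity-achieving input distribution
Py = Px @ P                        # output probabilities, Eq. 7.1-14

# Eq. 7.1-22: I(x_j; Y) = sum_i P(y_i | x_j) log2( P(y_i | x_j) / P(y_i) )
I_per_input = np.sum(P * np.log2(P / Py), axis=1)

C = 1 + p * np.log2(p) + (1 - p) * np.log2(1 - p)   # 1 - H(p), Eq. 7.1-17
print(I_per_input)                 # both entries equal ...
print(C)                           # ... the capacity, so Eq. 7.1-21 is satisfied
```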

Now let us consider a band-limited waveform channel with AWGN. The capacity of the channel per unit time has been defined by Shannon (1948) as

C = \lim_{T \to \infty} \max_{p(x)} \frac{1}{T} I(X; Y)        (7.1-23)

Alternatively, we may use the samples or the coefficients {y_i}, {x_i}, and {n_i} in the series expansions of y(t), x(t), and n(t), respectively, to determine the average mutual information between x_N = [x_1 x_2 ... x_N] and y_N = [y_1 y_2 ... y_N], where N = 2WT. The average mutual information between x_N and y_N for the AWGN channel is

I(X_N; Y_N) = \sum_{i=1}^{N} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(y_i | x_i)\, p(x_i) \log_2 \frac{p(y_i | x_i)}{p(y_i)}\, dx_i\, dy_i        (7.1-24)

where

p(y_i | x_i) = \frac{1}{\sqrt{\pi N_0}} e^{-(y_i - x_i)^2 / N_0}        (7.1-25)

The maximum of I(X; Y) over the input PDFs p(x_i) is obtained when the {x_i} are statistically independent zero-mean Gaussian random variables, i.e.,

p(x_i) = \frac{1}{\sqrt{2\pi}\,\sigma_x} e^{-x_i^2 / 2\sigma_x^2}        (7.1-26)

where σ_x² is the variance of each x_i. Then it follows from Eq. 7.1-24 that


\max_{p(x)} I(X_N; Y_N) = \sum_{i=1}^{N} \frac{1}{2} \log_2 \left(1 + \frac{2\sigma_x^2}{N_0}\right) = \frac{N}{2} \log_2 \left(1 + \frac{2\sigma_x^2}{N_0}\right) = WT \log_2 \left(1 + \frac{2\sigma_x^2}{N_0}\right)        (7.1-27)

Suppose that we put a constraint on the average power in x(t). That is,

P_{av} = \frac{1}{T} E\left[\int_0^T x^2(t)\, dt\right] = \frac{1}{T} \sum_{i=1}^{N} E(x_i^2) = \frac{N \sigma_x^2}{T}        (7.1-28)

Hence

\sigma_x^2 = \frac{P_{av} T}{N} = \frac{P_{av}}{2W}        (7.1-29)

Substitution of this result for σ_x² into Eq. 7.1-27 yields

\max_{p(x)} I(X_N; Y_N) = WT \log_2 \left(1 + \frac{P_{av}}{W N_0}\right)        (7.1-30)

Finally, the channel capacity per unit time is obtained by dividing the result in Eq. 7.1-30 by T. Thus

C = W \log_2 \left(1 + \frac{P_{av}}{W N_0}\right)        (7.1-31)

This is the basic formula for the capacity of the band-limited AWGN waveform channel with a band-limited and average-power-limited input. The capacity in bits/s, normalized by the bandwidth W, is plotted in Fig. 7.1-7 as a function of the ratio of the signal power P_av to the noise power W N_0.


Note that the capacity increases monotonically with increasing SNR. Thus, for a fixed bandwidth, the capacity of the waveform channel increases with an increase in the transmitted signal power. On the other hand, if Pav is fixed, the capacity can be increased by increasing the bandwidth W. Fig. 7.1-8 illustrates a graph of C versus W. Note that as W approaches infinity, the capacity of the channel approaches the asymptotic value

C_\infty = \frac{P_{av}}{N_0} \log_2 e = \frac{P_{av}}{N_0 \ln 2}  \text{ bits/s}        (7.1-32)
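A quick numerical illustration of Eqs. 7.1-31 and 7.1-32 (the power and noise-density values below are arbitrary choices): the capacity grows with W but saturates at the asymptote P_av/(N_0 ln 2).

```python
import numpy as np

P_av = 1e-3          # average signal power in W (arbitrary choice)
N0 = 1e-9            # noise power spectral density in W/Hz (arbitrary choice)

def capacity(W):
    """Eq. 7.1-31: C = W log2(1 + P_av / (W N0)), in bits/s."""
    return W * np.log2(1 + P_av / (W * N0))

for W in [1e3, 1e4, 1e5, 1e6, 1e7]:
    print(f"W = {W:10.0f} Hz   C = {capacity(W):12.0f} bit/s")

print("C_inf =", P_av / (N0 * np.log(2)), "bit/s")   # Eq. 7.1-32 asymptote as W -> infinity
```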

It is instructive to express the normalized channel capacity C/W as a function of the SNR per bit. Since Pav represents the average transmitted power and C is the rate in bits/s, it follows that

P_{av} = C E_b        (7.1-33)

where Eb is the energy per bit.

We may express Eq. 7.1-31 as

\frac{C}{W} = \log_2 \left(1 + \frac{C E_b}{W N_0}\right)        (7.1-34)

Consequently

\frac{E_b}{N_0} = \frac{2^{C/W} - 1}{C/W}        (7.1-35)
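Eq. 7.1-35 can be tabulated directly (a sketch; the grid of C/W values is mine). The required E_b/N_0 grows rapidly for large C/W and approaches ln 2 ≈ 0.693, i.e. about -1.6 dB, as C/W → 0, anticipating Eqs. 7.1-36 and 7.1-37 below.

```python
import numpy as np

def ebn0_required(cw):
    """Eq. 7.1-35: required Eb/N0 (linear) at normalized capacity C/W = cw bit/s per Hz."""
    return (2.0 ** cw - 1.0) / cw

for cw in [0.01, 0.1, 1.0, 2.0, 5.0, 10.0]:
    val = ebn0_required(cw)
    print(f"C/W = {cw:5.2f}   Eb/N0 = {val:10.3f}  ({10 * np.log10(val):6.2f} dB)")
```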

When C/W = 1, E_b/N_0 = 1 (0 dB). As C/W → ∞,

\frac{E_b}{N_0} \approx \frac{2^{C/W}}{C/W} = \exp\left[\frac{C}{W} \ln 2 - \ln \frac{C}{W}\right]        (7.1-36)


Thus, E_b/N_0 increases exponentially as C/W → ∞. On the other hand, as C/W → 0,

\lim_{C/W \to 0} \frac{E_b}{N_0} = \lim_{C/W \to 0} \frac{2^{C/W} - 1}{C/W} = \ln 2        (7.1-37)

which is -1.6 dB. A plot of C/W versus E_b/N_0 is shown in Fig. 5.2-17.

We have derived the channel capacities of three important channel models that are considered in the lectures. The first is the discrete-input, discrete-output channel, of which the BSC is a special case. The second is the discrete-input, continuous-output memoryless additive white Gaussian noise channel. From these two channel models, we can obtain benchmarks for the coded performance with hard- and soft-decision decoding in digital communication systems. The third channel model focuses on the capacity in bits/s of a waveform channel. In this case, we assumed that we have a bandwidth limitation on the channel, additive Gaussian noise that corrupts the signal, and an average power constraint at the transmitter.

The major significance of the channel capacities given above is that they serve as upper limits on the transmission rate for reliable communication over a noisy channel.

Noisy Channel Coding Theorem. There exist channel codes (and decoders) that make it possible to achieve reliable communication, with as small an error probability as desired, if the transmission rate R < C, where C is the channel capacity. If R > C, it is not possible to make the probability of error tend toward zero with any code.

The design of coded modulation for efficient transmission may be divided into two basic approaches:

1. algebraic approach (specific classes of codes – cyclic codes, Hamming codes, convolutional codes, …)

2. probabilistic approach (analysis of the performance of a general class of coded signals)

Problems

7.1 Show that the following two relations are necessary and sufficient conditions for the set of input probabilities {P(x_j)} to maximize I(X; Y) and, thus, to achieve capacity for a DMC:

I(x_j; Y) = C   for all j with P(x_j) > 0
I(x_j; Y) ≤ C   for all j with P(x_j) = 0

where C is the capacity of the channel and

I(x_j; Y) = \sum_{i=0}^{Q-1} P(y_i | x_j) \log \frac{P(y_i | x_j)}{P(y_i)}


7.2 Figure P7.2 illustrates an M-ary symmetric DMC with transition probabilities

P(y | x) = 1 - p when x = y = k, for k = 0, 1, ..., M-1, and P(y | x) = p/(M-1) when x ≠ y.

a) Show that this channel satisfies the condition given in Problem 7.1 when P(x_k) = 1/M.
b) Determine and plot the channel capacity as a function of p.

Figure P7.2  (M-ary symmetric DMC: input X and output Y each take the values 0, 1, ..., M-1; the transition probabilities are 1-p on the diagonal and p/(M-1) elsewhere.)


7.3 Determine the capacities of the channels shown in Figure P7.3.

Figure P7.3


(Three channels, (a), (b), and (c), with transition probabilities of 1/2, 1/3, and 1/6 as indicated in the figure.)


7.4 Consider the two channels with the transition probabilities as shown in Figure P7.4. Determine whether equally probable input symbols maximize the information rate through the channel.

Figure P7.4  (Two channels, (a) and (b), each with inputs x1, x2 and outputs y1, y2, y3; the transition probabilities take the values 0.6, 0.3, and 0.1 as indicated.)


7.5 A telephone channel has a bandwidth W = 3000 Hz and a signal-to-noise power ratio of 400 (26 dB). Suppose we characterize the channel as a band-limited AWGN waveform channel with P_av/(W N_0) = 400.

a) Determine the capacity of the channel in bits/s.
b) Is the capacity of the channel sufficient to support the transmission of a speech signal that has been sampled and coded by means of logarithmic PCM?
c) Usually, channel impairments other than additive noise limit the transmission rate over the telephone channel to less than the channel capacity of the equivalent band-limited AWGN channel considered in (a). Suppose that a transmission rate of 0.7C is achievable in practice without channel encoding. Which of the speech source encoding methods described in Section 3.5 provide sufficient compression to fit within the bandwidth restrictions of the telephone channel?


7.6 Consider the binary-input, quaternary-output DMC shown in Figure P7.6.

a) Determine the capacity of the channel.
b) Show that this channel is equivalent to a BSC.

Figure P7.6

(Binary-input, quaternary-output channel: inputs x1, x2, outputs y1, y2, y3, y4, with transition probabilities ½(1-p) and ½p as indicated.)


7.7 Determine the capacity for the channel shown in Figure P7.7.

Figure P7.7

(Ternary-input, ternary-output channel: inputs x1, x2, x3, outputs y1, y2, y3, with transition probabilities 1-p and p as indicated.)


7.10 Figure P7.10 illustrates a binary erasure channel with transition probabilities P(0|0) = P(1|1) = 1 - p and P(e|0) = P(e|1) = p. The probabilities for the input symbols are P(X = 0) = α and P(X = 1) = 1 - α.

a) Determine the average mutual information I(X; Y) in bits.
b) Determine the value of α that maximizes I(X; Y), i.e., the channel capacity C in bits per channel use, and plot C as a function of p for the optimum value of α.
c) For the value of α found in (b), determine the mutual informations I(0; 0), I(1; 1), I(0; e), and I(1; e).

Figure P7.10  (Binary erasure channel: inputs 0, 1; outputs 0, e, 1; each input is received correctly with probability 1-p and erased with probability p.)


7.12 Find the capacity of the cascade connection of n binary-symmetric channels with the same crossover probability ε. What is the capacity when the number of channels goes to infinity?


7.13 Channels 1, 2, and 3 are shown in Figure P7.13.

a) Find the capacity of channel 1. What input distribution achieves capacity?
b) Find the capacity of channel 2. What input distribution achieves capacity?
c) Let C denote the capacity of the third channel, and let C1 and C2 represent the capacities of the first and second channels. Which of the following relations holds true, and why?

(i)   C < ½(C1 + C2)
(ii)  C = ½(C1 + C2)
(iii) C > ½(C1 + C2)

Figure P7.13

(Channels 1, 2, and 3; the inputs and outputs are labeled 0, 1 and a, b, c, d, with transition probabilities of 1, 0.5, and 0.25 as indicated.)


7.15 The channel C (known as the Z channel) is shown in Figure P7.15.

a) Find the input probability distribution that achieves capacity.
b) What is the input distribution and capacity for the special cases ε = 0, ε = 1, and ε = 0.5?
c) Show that if n such channels are cascaded, the resulting channel will be equivalent to a Z channel with ε_1 = ε^n.
d) What is the capacity of the equivalent Z channel when n → ∞?

Figure P7.15  (Z channel: inputs 0, 1; outputs 0, 1; one transition has probability 1-ε and the corresponding crossover has probability ε.)


7.16 Find the capacity of an additive white Gaussian noise channel with a bandwidth of 1 MHz, power of 10 W, and noise power spectral density ½N_0 = 10^{-9} W/Hz.

7.17 Channel C1 is an additive white Gaussian noise channel with a bandwidth W, average transmitter power P, and noise power spectral density ½N_0. Channel C2 is an additive Gaussian noise channel with the same bandwidth and power as channel C1, but with noise power spectral density Φ_n(f). It is further assumed that the total noise power for both channels is the same; that is,

\int_{-W}^{W} \Phi_n(f)\, df = \int_{-W}^{W} \frac{1}{2} N_0\, df = W N_0

Which channel do you think has a larger capacity? Give an intuitive reasoning.
