An Introduction to Digital Communications - Part 2


    INFORMATION THEORY

    Information theory deals with mathematical modelling and analysis of a communications

    system rather than with physical sources and physical channels.

Specifically, given an information source and a noisy channel, information theory provides limits on:

1- The minimum number of bits per symbol required to fully represent the source (i.e. the efficiency with which information from a given source can be represented).

2- The maximum rate at which reliable (error-free) communications can take place over the noisy channel.

    Since the whole purpose of a communications system is to transport information from a

    source to a destination, the question arises as to how much information can be transmitted in

a given time. (Normally the goal would be to transmit as much information as possible in as small a time as possible such that this information can be correctly interpreted at the destination.) This of course leads to the next question, which is:

    How can information be measured and how do we measure the rate at which information is

    emitted from a source ?

Suppose that we observe the output emitted by a discrete source (every unit interval or signalling interval).

    The source output can be considered as a set, S, of discrete random events ( or outcomes).

    These events are symbols from a fixed finite alphabet.

(For example, the set or alphabet can be the numbers 1 to 6 on a die, and each roll of the die outputs a symbol, being the number on the die's upper face when the die comes to rest.

Another example is a digital binary source, where the alphabet is the digits "0" and "1", and the source outputs a symbol of either "0" or "1" at random.)

If, in general, we consider a discrete random source which outputs symbols from a fixed finite alphabet of k symbols, then the set S contains all k symbols and we can write

S = { s0, s1, s2, ......., sk-1 }, where each symbol si is emitted with probability p(si), such that

0 ≤ p(si) ≤ 1  and  Σ(i=0 to k-1) p(si) = 1        (3.1)

In addition we assume that the symbols emitted by the source during successive signalling intervals are statistically independent, i.e. the probability of any symbol being emitted in any signalling interval does not depend on the previously emitted symbols. In other words, we have what is called a discrete memoryless source.

    Can we find a measure of how much "information" is produced by this source ?

    The idea of information is closely related to that of "uncertainty" and "surprise".


    If the source emits an output si, which has a probability of occurrence p(si) = 1, then all other

    symbols of the alphabet have a zero probability of occurrence and there is really no

    "uncertainty", "surprise", or information since we already know before hand ( a priori) what

    the output symbol will be.

If on the other hand the source symbols occur with different probabilities, and the probability p(si) is low, then there is more "uncertainty", "surprise", and therefore "information" when the symbol si is emitted by the source, rather than another one with higher probability.

    Thus the words "uncertainty", "surprise", and "information" are all closely related.

- Before the output si occurs, there is an amount of "uncertainty".

- When the output si occurs, there is an amount of "surprise".

- After the occurrence of the output si, there is a gain in the amount of "information".

All three amounts are really the same, and we can see that the amount of information is related to the inverse of the probability of occurrence of the symbol.

    Definition:

The amount of information gained after observing the event si, which occurs with probability p(si), is

I(si) = log2[ 1 / p(si) ]  bits,  for i = 0, 1, 2, ..., (k-1)        (3.2)

The unit of information is called the "bit", a contraction of "binary digit".

    This definition exhibits the following important properties that are intuitively satisfying:

1- I(si) = 0 for p(si) = 1

i.e. if we are absolutely certain of the output of the source even before it occurs (a priori), then there is no information gained.

2- I(si) ≥ 0 because 0 ≤ p(si) ≤ 1 for symbols of the alphabet.

i.e. the occurrence of an output si either provides some information or no information, but never brings about a loss of information (unless it is a severe blow to the head, which is highly unlikely from the discrete source!)

3- I(sj) > I(si) for p(sj) < p(si)

i.e. the lower the probability of occurrence of an output, the more information we gain when it occurs.

4- I(sj si) = I(sj) + I(si) if the outputs sj and si are statistically independent.


    The use of the logarithm to the base 2 ( instead of to the base 10 or to the base e ) has been

    adopted in the measure of information because usually we are dealing with digital binary

    sources, (however it is useful to remember that log2(a) = 3.322 log10(a)). Thus if the source

    alphabet was the binary set of symbols, i.e. "0" or "1" , and each symbol was equally likely to

    occur i.e. s0 having p(s0) = 1/2 and s1 having p(s1) = 1/2

we have:

I(si) = log2[ 1 / p(si) ] = log2[ 1 / (1/2) ] = log2(2) = 1 bit

Hence "one bit" is the amount of information that is gained when one of two possible and equally likely (equiprobable) outputs occurs.

    [Note that a "bit" is also used to refer to a binary digit when dealing with the transmission of

    a sequence of 1's and 0's].
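As a quick numerical check of equation (3.2), the short Python sketch below (an illustration, not part of the original notes; the helper name info_bits is an assumed label) computes I(si) for a couple of probabilities.

import math

def info_bits(p):
    # Amount of information I(s) = log2(1/p) gained when a symbol of probability p occurs.
    return math.log2(1.0 / p)

print(info_bits(0.5))    # 1.0 -- an equiprobable binary symbol carries exactly 1 bit
print(info_bits(0.125))  # 3.0 -- a rarer symbol (p = 1/8) carries more information/"surprise"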

    Entropy

The amount of information, I(si), associated with the symbol si emitted by the source during a signalling interval depends on the symbol's probability of occurrence. In general, each source symbol has a different probability of occurrence. Since the source can emit any one of the symbols of its alphabet, a measure for the average information content per source symbol was defined and called the entropy of the discrete source, H (i.e. taking all the discrete source symbols into account).

Definition:
The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ......., sk-1 }, is a measure of the average information content per source symbol, and is given by:

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]   bits/symbol        (3.3)

We note that the entropy, H, of a discrete memoryless source with an alphabet of k symbols is bounded as follows:

0 ≤ H ≤ log2 k , where k is the number of source symbols.

Furthermore, we may state that:

1- H = 0, if and only if the probability p(si) = 1 for some symbol si, and the remaining source symbols' probabilities are all zero. This lower bound on entropy corresponds to no uncertainty and no information.

2- H = log2 k bits/symbol, if and only if p(si) = 1/k for all the k source symbols (i.e. they are all equiprobable). This upper bound on entropy corresponds to maximum uncertainty and maximum information.


    Example:

Calculate the entropy of a discrete memoryless source with source alphabet S = { s0, s1, s2 } with probabilities p(s0) = 1/4, p(s1) = 1/4, p(s2) = 1/2.

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]

H = p(s0) log2[ 1 / p(s0) ] + p(s1) log2[ 1 / p(s1) ] + p(s2) log2[ 1 / p(s2) ]

  = (1/4) log2(4) + (1/4) log2(4) + (1/2) log2(2)

  = 0.5 + 0.5 + 0.5 = 1.5 bits/symbol
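The entropy of equation (3.3) is easy to check numerically. The Python sketch below is a minimal illustration (the function name entropy is an assumed label); it reproduces the 1.5 bits/symbol result of the example above.

import math

def entropy(probs):
    # H = sum of p(si) * log2(1/p(si)) in bits/symbol, ignoring zero-probability symbols
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([1/4, 1/4, 1/2]))  # 1.5 bits/symbol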

Information Rate

If we consider that the symbols are emitted from the source at a fixed rate (one symbol per signalling interval), denoted by rs symbols/second, we can define the average source information rate R in bits per second as the product of the average information content per symbol, H, and the symbol rate rs:

R = rs H   bits/sec        (3.4)

Example:
A discrete source emits one of five symbols once every millisecond. The source symbol probabilities are 1/2, 1/4, 1/8, 1/16, and 1/16 respectively.

Find the source entropy and information rate.

H = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]  bits/symbol,  where in this case k = 5

H = p(s0) log2[ 1/p(s0) ] + p(s1) log2[ 1/p(s1) ] + p(s2) log2[ 1/p(s2) ] + p(s3) log2[ 1/p(s3) ] + p(s4) log2[ 1/p(s4) ]

  = (1/2) log2(2) + (1/4) log2(4) + (1/8) log2(8) + (1/16) log2(16) + (1/16) log2(16)

  = 0.5 + 0.5 + 0.375 + 0.25 + 0.25 = 1.875 bits/symbol

R = rs H  bits/sec

The information rate R = (1/10^-3) x 1.875 = 1875 bits/second.
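Assuming a helper like the entropy() function sketched earlier, the information rate example can be verified as follows (rs and the symbol probabilities are taken directly from the example).

import math

def entropy(probs):
    # H = sum of p * log2(1/p) in bits/symbol
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/16]
rs = 1 / 1e-3                 # one symbol every millisecond -> 1000 symbols/sec
H = entropy(probs)            # 1.875 bits/symbol
R = rs * H                    # information rate, equation (3.4)
print(H, R)                   # 1.875  1875.0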


    Entropy of a Binary Memoryless Source:

To illustrate the properties of H, let us consider a memoryless digital binary source for which symbol 0 occurs with probability p0 and symbol 1 with probability p1 = (1 - p0).

The entropy of such a source equals:

H = p0 log2[ 1 / p0 ] + p1 log2[ 1 / p1 ]
  = p0 log2[ 1 / p0 ] + (1 - p0) log2[ 1 / (1 - p0) ]   bits

We note that:

1- When p0 = 0, the entropy H = 0. This follows from the fact that x log x → 0 as x → 0.

2- When p0 = 1, the entropy H = 0.

3- The entropy H attains its maximum value, Hmax = 1 bit, when p0 = p1 = 1/2, that is when symbols 0 and 1 are equally probable. (i.e. H = log2 k = log2 2 = 1)

(Hmax = 1 can be verified by differentiating H with respect to p0 and equating the derivative to zero.)
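A small Python sketch of the binary entropy function (the name binary_entropy is an assumed label); it confirms numerically that H = 0 at p0 = 0 or 1 and that the maximum of 1 bit occurs at p0 = 1/2.

import math

def binary_entropy(p0):
    # H(p0) = p0*log2(1/p0) + (1-p0)*log2(1/(1-p0)), with the convention 0*log(0) = 0
    H = 0.0
    for p in (p0, 1.0 - p0):
        if p > 0:
            H += p * math.log2(1.0 / p)
    return H

for p0 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p0, round(binary_entropy(p0), 3))   # maximum (1.0 bit) at p0 = 0.5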


    CHANNEL CAPACITY

    In Information Theory, the transmission medium is treated as an abstract and noisy filter

    called the channel. The maximum rate of information transmission through a channel is

    called the channel capacity, C.

Channel Coding Theorem

Shannon showed that, if the information rate R [remember that R = rs H bits/sec] is equal to or less than C, i.e. R ≤ C, then there exists a coding technique which enables transmission over the noisy channel with an arbitrarily small frequency of errors.

[A converse to this theorem states that it is not possible to transmit messages without error if R > C.]

    Thus the channel capacity is defined as the maximum rate of reliable (error-free) information

    transmission through the channel.

    Now consider a binary source with an available alphabet of k discrete messages (or symbols)

    which are equiprobable and statistically independent (these messages could be either single

    digit symbols or could be composed of several digits each depending on the situation). We

    assume that each message sent can be identified at the receiver; therefore this case is often

called the discrete noiseless channel. The maximum entropy of the source is log2 k bits, and if T is the transmission time of each message (i.e. rs = 1/T symbols/sec), the channel capacity is

C = R = rs H = rs log2 k   bits per second.

To attain this maximum the messages must be equiprobable and statistically independent. These conditions form a basis for the coding of the information to be transmitted over the channel.

In the presence of noise, the capacity of this discrete channel decreases as a result of the errors made in transmission.

    In making comparisons between various types of communications systems, it is convenient to

    consider a channel which is described in terms of bandwidth and signal-to-noise ratio.


Review of Signal to Noise Ratio

The analysis of the effect of noise on digital transmission will be covered later on in this course but, before proceeding, we will review the definition of signal to noise ratio. It is defined as the ratio of signal power to noise power at the same point in a system. It is normally measured in decibels.

Signal to Noise Ratio (dB) = 10 log10( S / N )  dB

    Noise is any unwanted signal. In electrical terms it is any unwanted introduction of energy

    tending to interfere with the proper reception and reproduction of transmitted signals.

Channel Capacity Theorem

Bit errors and signal bandwidths are of prime importance when designing a communications system. In digital transmission systems noise may change the value of the transmitted digit during transmission (e.g. change a high voltage to a low voltage or vice versa).

This raises the question: Is it possible to invent a system with no bit errors at the output even when noise is introduced into the channel? Shannon's Channel Capacity Theorem (also called the Shannon-Hartley Theorem) answers this question:

C = B log2(1 + S/N)   bits per second,

where C is the channel capacity, B is the channel bandwidth in hertz and S/N is the signal-to-noise power ratio (watts/watts, not dB).

    Although this formula is restricted to certain cases (in particular certain types of random

    noise), the result is of widespread importance to communication systems because many

    channels can be modelled by random noise.

    From the formula, we can see that the channel capacity, C, decreases as the available

    bandwidth decreases. C is also proportional to the log of (1+S/N), so as the signal to noise

    level decreases C also decreases.

    The channel capacity theorem is one of the most remarkable results of information theory. In

    a single formula, it highlights the interplay between three key system parameters: Channel

    bandwidth, average transmitted power (or, equivalently, average received power), and noise

    at the channel output.

The theorem implies that, for a given average transmitted power S and channel bandwidth B, we can transmit information at the rate of C bits per second with arbitrarily small probability of error by employing sufficiently complex encoding systems. It is not possible to transmit at a rate higher than C bits per second by any encoding system without a definite probability of error.


    Hence, the channel capacity theorem defines the fundamental limit on the rate of error-free

    transmission for a power-limited, band-limited Gaussian channel. To approach this limit,

    however, the noise must have statistical properties approximating those of white Gaussian

    noise.
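The Shannon-Hartley formula is straightforward to evaluate numerically; the Python sketch below (an illustrative helper with an assumed name, including the dB-to-ratio conversion) applies it to the 3.4 kHz, 30 dB voice-grade telephone channel used in problem 1(a) below.

import math

def channel_capacity(bandwidth_hz, snr_db):
    # Shannon-Hartley: C = B * log2(1 + S/N), with S/N given in dB and converted to a power ratio
    snr_ratio = 10 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr_ratio)

# Voice-grade telephone channel: B = 3.4 kHz, S/N = 30 dB (a power ratio of 1000).
print(channel_capacity(3400, 30))   # roughly 33.9 kbits/sec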

    Problems:

    1. A voice-grade channel of the telephone network has a bandwidth of 3.4 kHz.

    (a) Calculate the channel capacity of the telephone channel for a signal-to-noise ratio of 30

    dB.

    (b) Calculate the minimum signal-to-noise ratio required to support information transmission

    through the telephone channel at the rate of 4800 bits/sec.

    (c) Calculate the minimum signal-to-noise ratio required to support information transmission

    through the telephone channel at the rate of 9600 bits/sec.

    2. Alphanumeric data are entered into a computer from a remote terminal through a voice-

    grade telephone channel. The channel has a bandwidth of 3.4 kHz, and output signal-to-noise

ratio of 20 dB. The terminal has a total of 128 symbols. Assume that the symbols are equiprobable, and that successive transmissions are statistically independent.

    (a) Calculate the channel capacity.

    (b) Calculate the maximum symbol rate for which error-free transmission over the channel is

    possible.

3. A black-and-white television picture may be viewed as consisting of approximately 3 x 10^5 elements, each one of which may occupy one of 10 distinct brightness levels with equal probability. Assume (a) the rate of transmission is 30 picture frames per second, and (b) the

    signal-to-noise ratio is 30 dB.

    Using the channel capacity theorem, calculate the minimum bandwidth required to support

    the transmission of the resultant video signal.

4. What is the minimum time required for the facsimile transmission of one picture over a standard telephone circuit?

There are about 2.25 x 10^6 picture elements to be transmitted and 12 brightness levels are to be used for good reproduction. Assume all brightness levels are equiprobable. The telephone circuit has a 3-kHz bandwidth and a 30-dB signal-to-noise ratio (these are typical parameters).


    THE BINARY SYMMETRIC CHANNEL

    Usually when a "1" or a "0" is sent it is received as a "1" or a "0", but occasionally a "1" will

    be received as a "0" or a "0" will be received as a "1".

Let's say that on average 1 out of 100 digits will be received in error, i.e. there is a probability p = 1/100 that the channel will introduce an error. This is called a Binary Symmetric Channel (BSC), and is represented by the following diagram.

[Figure: Representation of the Binary Symmetric Channel with an error probability of p. Input 0 is received as 0 with probability (1-p) and as 1 with probability p; input 1 is received as 1 with probability (1-p) and as 0 with probability p.]

Now let us consider the use of this BSC model.

Say we transmit one information digit coded with a single even parity bit. This means that if the information digit is 0 then the codeword will be 00, and if the information digit is a 1 then the codeword will be 11.

As the codeword is transmitted through the channel, the channel may (or may not) introduce an error according to the following error patterns:

    E = 00 i.e. no errors

    E = 01 i.e. a single error in the last digit

    E = 10 i.e. a single error in the first digit

    E = 11 i.e. a double error

    The probability of no error , is the probability of receiving the second transmitted digit

    correctly on condition that the first transmitted digit was received correctly.

    Here we have to remember our discussion on joint probability:

    p(AB) = p(A) p(B/A) = p(A) p(B) when the occurrence of any of the two outcomes is

    independent of the occurrence of the other.

Thus the probability of no error is equal to the probability of receiving each digit correctly.

    This probability, according to the BSC model, is equal to (1 - p), where p is the probability of

    one digit being received incorrectly.

Thus the probability of no error = (1 - p)(1 - p) = (1 - p)^2.

Similarly, the probability of a single error in the first digit = p (1 - p)

and the probability of a single error in the second digit = (1 - p) p,

i.e. the probability of a single error is equal to the sum of the above two probabilities (since the two events are mutually exclusive), i.e.


the probability of a single error (when a code with block length n = 2 is used, as in this case) is equal to 2 p (1 - p).

Similarly, the probability of a double error in the above example (i.e. the error pattern E = 11) is equal to p^2.

In summary these probabilities would be

p(E = 00) = (1 - p)^2
p(E = 01) = (1 - p) p
p(E = 10) = p (1 - p)
p(E = 11) = p^2

and if we substitute p = 0.01 (given in the above example) we find that

p(E = 00) = (1 - p)^2 = 0.9801
p(E = 01) = (1 - p) p = 0.0099
p(E = 10) = p (1 - p) = 0.0099
p(E = 11) = p^2 = 0.0001

Thus the probability of a single error per codeword = (1 - p) p + p (1 - p) = 2 p (1 - p) = 0.0198.

This shows that if p < 1/2, then the probability of no error is higher than the probability of a single error occurring, which in turn is higher than the probability of a double error.

Again, if we consider a block code with block length n = 3, then the

probability of no error p(E = 000) = (1 - p)^3,
probability of an error in the first digit p(E = 100) = p (1 - p)^2,
probability of a single error per codeword p(1e) = 3 p (1 - p)^2,
probability of a double error per codeword p(2e) = (3 choose 2) p^2 (1 - p) = 3 p^2 (1 - p),
probability of a triple error per codeword p(3e) = p^3.

And again, if we have a code with block length n = 4, then the

probability of no error p(E = 0000) = (1 - p)^4,
probability of an error in the first digit p(E = 1000) = p (1 - p)^3,
probability of a single error per codeword p(1e) = 4 p (1 - p)^3,
probability of a double error per codeword p(2e) = (4 choose 2) p^2 (1 - p)^2 = 6 p^2 (1 - p)^2,
probability of a triple error per codeword p(3e) = (4 choose 3) p^3 (1 - p) = 4 p^3 (1 - p),
probability of four errors per codeword p(4e) = p^4.

And again, if we have a code with block length n = 5, then the

probability of no error p(E = 00000) = (1 - p)^5,
probability of an error in the first digit p(E = 10000) = p (1 - p)^4,
probability of a single error per codeword p(1e) = 5 p (1 - p)^4,
probability of a double error per codeword p(2e) = (5 choose 2) p^2 (1 - p)^3 = 10 p^2 (1 - p)^3,
probability of a triple error per codeword p(3e) = (5 choose 3) p^3 (1 - p)^2 = 10 p^3 (1 - p)^2,
probability of four errors per codeword p(4e) = (5 choose 4) p^4 (1 - p) = 5 p^4 (1 - p),
probability of five errors per codeword p(5e) = p^5.


From all of this discussion, we realise that if the error pattern (of length n) has weight e, then the probability of occurrence of e errors in a codeword with block length n is

(n choose e) p^e (1 - p)^(n-e).

We also realise that, since p < 1/2, we have (1 - p) > p, and

(1 - p)^n > p (1 - p)^(n-1) > p^2 (1 - p)^(n-2) > ...

Therefore an error pattern of weight 1 is more likely to occur than an error pattern of weight 2, and so on.
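The general expression above can be evaluated directly; the Python sketch below (assumed helper name prob_errors) reproduces the n = 2, p = 0.01 numbers worked out earlier.

from math import comb

def prob_errors(n, e, p):
    # Probability of exactly e errors in a codeword of block length n over a BSC with error probability p
    return comb(n, e) * (p ** e) * ((1 - p) ** (n - e))

p = 0.01
for e in range(3):
    print(e, prob_errors(2, e, p))   # 0.9801, 0.0198 (= 2p(1-p)), 0.0001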


The Communications System from the Channel Coding Theorem point of view

[Figure: block diagram — source, Encoder, Decoder, user]


    Information Theory Summary

1- A discrete memoryless source (DMS) is one that outputs symbols taken from a fixed finite alphabet which has k symbols. These symbols form a set S = { s0, s1, s2, ..., sk-1 }, where each symbol (si) at the output of the source has a probability of occurrence p(si). (The probabilities of occurrence of the symbols are called the source statistics.)

and  Σ(i=0 to k-1) p(si) = 1

2- The amount of information gained after observing the output symbol (si), which occurs with probability p(si), is

I(si) = log2[ 1 / p(si) ]   for i = 0, 1, 2, ..., (k-1)

3- The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ..., sk-1 }, is a measure of the average information content per source symbol, and is given by:

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]   bits/symbol

4- Information rate (bit rate) = symbol rate * entropy:  R = rs H  bits/sec

5- Channel capacity:  C = BW log2(1 + S/N)  bits/sec

6- BSC = Binary Symmetric Channel

7- Probability of e errors in n digits = (n choose e) p^e (1 - p)^(n-e).


    CHANNEL CODING

    Suppose that we wish to transmit a sequence of binary digits across a noisy channel. If we

    send a one, a one will probably be received; if we send a zero, a zero will probably be

received. Occasionally, however, the channel noise will cause a transmitted one to be mistakenly interpreted as a zero or a transmitted zero to be mistakenly interpreted as a one. Although we are unable to prevent the channel from causing such errors, we can reduce their undesirable effects with the use of coding.

The basic idea is simple. We take a set of k information digits which we wish to transmit, annex to them r check digits, and transmit the entire block of n = k + r channel digits.

    Assuming that the channel noise changes sufficiently few of these transmitted channel digits,

    the r check digits may provide the receiver with sufficient information to enable it to detect

    and/or correct the channel errors.

    (The detection and/or correction capability of a channel code will be discussed at some length

    in the following pages.)

Given any particular sequence of k message digits, the transmitter must have some rule for selecting the r check digits. This is called channel encoding. Any particular sequence of n digits which the encoder might transmit is called a codeword.

Although there are 2^n different binary sequences of length n, only 2^k of these sequences are codewords, because the r check digits within any codeword are completely determined by the k information digits. The set consisting of these 2^k codewords, of length n each, is called a code (sometimes referred to as a code book).

No matter which codeword is transmitted, any of the 2^n possible binary sequences of length n may be received if the channel is sufficiently noisy. Given the n received digits, the decoder must attempt to decide which of the 2^k possible codewords was transmitted.

    Repetition codes and single-parity-check codes

    Among the simplest examples of binary codes are the repetition codes, with k = 1, r arbitrary,

    and n = k + r = 1 + r . The code contains two codewords, the sequence of n zeros and the

    sequence of n ones.

    We may call the first digit the information digit; the other r digits, check digits. The value of

    each check digit (each 0 or 1) in a repetition code is identical to the value of the information

    digit. The decoder might use the following rule:

Count the number of zeros and the number of ones in the received bits. If there are more received zeros than ones, decide that the all-zero codeword was sent; if there are more ones than zeros, decide that the all-one codeword was sent. If the number of ones equals the number of zeros, do not decide (just flag the error).

This decoding rule will decode correctly in all cases when the channel noise changes less than half the digits in any one block. If the channel noise changes exactly half of the digits in any one block, the decoder will be faced with a decoding failure (i.e. it will not decode the received word into any of the possible transmitted codewords), which could result in an ARQ (automatic request to repeat the message). If the channel noise changes more than half of the digits in any one block, the decoder will commit a decoding error; i.e. it will decode the received word into the wrong codeword.

If channel errors occur infrequently, the probability of a decoding failure or a decoding error for a repetition code of long block length is very small indeed. However, repetition codes are not very useful. They have only two codewords and a very low information rate R = k/n (also called the code rate); all but one of the digits are check digits. We are usually more interested in codes which have a higher information rate. A minimal simulation of repetition encoding and majority-vote decoding is sketched below.
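This short Python sketch of repetition encoding and majority-vote decoding uses assumed helper names; ties are flagged as a decoding failure by returning None.

def repetition_encode(bit, n):
    # Repeat the single information bit n times (k = 1, r = n - 1)
    return [bit] * n

def repetition_decode(received):
    # Majority vote; return None on a tie (decoding failure / ARQ)
    ones = sum(received)
    zeros = len(received) - ones
    if ones == zeros:
        return None
    return 1 if ones > zeros else 0

codeword = repetition_encode(1, 5)      # [1, 1, 1, 1, 1]
corrupted = [1, 0, 1, 1, 0]             # two channel errors
print(repetition_decode(corrupted))     # 1 -- still decoded correctly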


Extreme examples of such very high rate codes are those which use a single parity-check digit. This check digit is taken to be the modulo-2 sum (Exclusive-OR) of the codeword's (n - 1) information digits. (The information digits are added according to the exclusive-OR binary operation: 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0.) If the number of ones in the information word is even, the modulo-2 sum of all the information digits will be equal to zero; if the number of ones in the information word is odd, their modulo-2 sum will be equal to one.

Even parity means that the total number of ones in the codeword is even; odd parity means that the total number of ones in the codeword is odd. Accordingly the parity bit (or digit) is calculated and appended to the information digits to form the codeword.

This type of code can only detect errors. A single digit error (or any number of odd digit errors) will be detected, but any combination of two digit errors (or any number of even digit errors) will cause a decoding error. Thus the single-parity-check type of code cannot correct errors.
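A matching Python sketch of an even-parity encoder and checker (assumed helper names); note that it detects a single error but, as stated above, cannot say where it occurred.

def even_parity_encode(info_bits):
    # Append a check bit equal to the modulo-2 sum (XOR) of the information bits
    parity = 0
    for b in info_bits:
        parity ^= b
    return info_bits + [parity]

def parity_check_ok(codeword):
    # True if the total number of ones is even (no error detected)
    return sum(codeword) % 2 == 0

cw = even_parity_encode([1, 0, 1, 1])   # [1, 0, 1, 1, 1]
print(parity_check_ok(cw))              # True
cw[2] ^= 1                              # introduce a single error
print(parity_check_ok(cw))              # False -- error detected, but not located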

These two examples, the repetition codes and the single-parity-check codes, provide the extreme, relatively trivial, cases of binary block codes. (Although relatively trivial, single-parity-checks are used quite often because they are simple to implement.)

The repetition codes have enormous error-correction capability but only one information bit per block. The single-parity-check codes have a very high information rate but, since they contain only one check digit per block, they are unable to do more than detect an odd number of channel errors.

There are other codes which have moderate information rate and moderate error-correction/detection capability, and we will study a few of them.

    These codes are classified into two major categories:

Block codes and Convolutional codes.

    In block codes, a block of k information digits is encoded to a codeword of n digits

    (n > k). For each sequence of k information digits there is a distinct codeword of n digits.

    In convolutional codes, the coded sequence of n digits depends not only on the k

    information digits but also on the previous N - 1 information digits (N > 1). Hence the coded

    sequence for a certain k information digits is not unique but depends on N - 1 earlier

    information digits.

In block codes, k information digits are accumulated and then encoded into an n-digit codeword. In convolutional codes, the coding is done on a continuous, or running, basis rather than by accumulating k information digits.

We will start by studying block codes (and, if there is time, we might come back to study convolutional codes).

    BLOCK CODES


The block encoder input is a stream of information digits. The encoder segments the input information digit stream into blocks of k information digits and, for each block, it calculates r check digits and outputs a codeword of n digits, where n = k + r (or r = n - k).

The code efficiency (also known as the code rate) is k/n.

Such a block code is denoted as an (n,k) code. Block codes in which the k information digits are transmitted unaltered first, followed by the transmission of the r check digits, are called systematic codes, as shown in figure 1 below.

Since systematic block codes simplify implementation of the decoder and are always used in practice, we will consider only systematic codes in our studies.

(A non-systematic block code is one which has the check digits interspersed between the information digits. For linear block codes it can be shown that a non-systematic block code can always be transformed into a systematic one.)

[ C1  C2  ..........  Ck | Ck+1  ..........  Cn-1  Cn ]
  <-- k information digits -->  <-- r check digits -->

Figure 1: an (n,k) block codeword in systematic form

    LINEAR BLOCK CODES

    Linear block codes are a class of parity check codes that can be characterized by the (n, k)notation described earlier.

    The encoder transforms a block of k information digits (an information word) into a longer

    block of n codeword digits, constructed from a given alphabet of elements. When the

    alphabet consists of two elements (0 and 1), the code is a binary code comprised of binary

    digits (bits). Our discussion of linear block codes is restricted to binary codes.

    Again, the k-bit information words form 2k distinct information sequences referred to as

    k-tuples(sequences of k digits).

    An n-bit block can form as many as 2ndistinct sequences, referred to as n-tuples.

    The encoding procedure assigns to each of the 2kinformation k-tuples one of the 2nn-tuples.

    A block code represents a one-to-one assignment, whereby the 2kinformation k-tuples are

    uniquely mapped into a new set of 2kcodeword n-tuples; the mapping can be accomplished

    via a look-up table, or via some encoding rules that we will study shortly.

    Definition:

An (n, k) binary block code is said to be linear if, and only if, the modulo-2 addition (Ci ⊕ Cj) of any two codewords, Ci and Cj, is also a codeword. This property thus means that (for a linear block code) the all-zero n-tuple must be a member of the code book (because the modulo-2 addition of a codeword with itself results in the all-zero n-tuple).

A linear block code, then, is one in which n-tuples outside the code book cannot be created by the modulo-2 addition of legitimate codewords (members of the code book).

For example, the set of all 2^4 = 16 4-tuples (or 4-bit sequences) is shown below:


    0000 0001 0010 0011 0100 0101 0110 0111

    1000 1001 1010 1011 1100 1101 1110 1111

an example of a block code (which is really a subset of the above set) that forms a linear code is

0000 0101 1010 1111

It is easy to verify that the modulo-2 addition of any two of these 4 codewords in the code book can only yield one of the other members of the code book and, since the all-zero n-tuple is a codeword, this code is a linear binary block code.
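The closure property is easy to test by brute force; the Python sketch below (assumed helper name is_linear) checks the four-codeword example above.

def xor_words(a, b):
    # Digit-by-digit modulo-2 addition of two equal-length binary strings
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

def is_linear(code):
    # A binary block code is linear if the XOR of every pair of codewords is again a codeword
    return all(xor_words(a, b) in code for a in code for b in code)

code = {"0000", "0101", "1010", "1111"}
print(is_linear(code))   # True (note that "0000" is necessarily a member)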

Figure 5.13 illustrates, with a simple geometric analogy, the structure behind linear block codes. We can imagine the total set comprised of 2^n n-tuples. Within this set (also called the vector space) there exists a subset of 2^k n-tuples comprising the code book. These 2^k codewords or points, shown in bold "sprinkled" among the more numerous 2^n points, represent the legitimate or allowable codeword assignments.

An information sequence is encoded into one of the 2^k allowable codewords and then transmitted. Because of noise in the channel, a corrupted version of the sent codeword


(one of the other 2^n n-tuples in the total n-tuple set) may be received.

The objective of coding is that the decoder would be able to decide whether the received word is a valid codeword, or whether it is a codeword which has been corrupted by noise (i.e. detect the occurrence of one or more errors). Ideally, of course, the decoder should be able to decide which codeword was sent even if this transmitted codeword was corrupted by noise, and this process is called error-correction.

Thinking about it, if one is going to attempt to correct errors in a received word represented by a sequence of n binary symbols, then it is absolutely essential not to allow the use of all 2^n n-tuples as legitimate codewords.

    If, in fact, every possible sequence of n binary symbols were a legitimate codeword, then in

    the presence of noise one or more binary symbols could be changed, and one would have no

    possible basis for determining if a received sequence was any more valid than any other

    sequence.

Carrying this thought a little further, if one wished the coding system to correct the occurrence of a single error, then it is both necessary and sufficient that each codeword sequence differs from every other codeword in at least 3 positions.

In fact, if one wished the coding system to correct the occurrence of e errors, then it is both necessary and sufficient that each codeword sequence differs from every other codeword in at least (2e + 1) positions.

DEFINITION
The number of positions in which any two codewords differ from each other is called the Hamming distance, and is normally denoted by d.

    For example:

    Looking at the (n,k) = (4,2) binary linear block code, mentioned earlier, which has the

    following codewords:

    C1 0000

    C2 0101

    C3 1010

    C4 1111

we see that the Hamming distance, d:

between C2 and C3 is equal to 4
between C2 and C4 is equal to 2
between C3 and C4 is equal to 2

We also observe that the Hamming distance between C1 and any of the other codewords is equal to the "weight", that is the number of ones, in each of the other codewords.

We can also see that the minimum Hamming distance (i.e. the smallest Hamming distance between any pair of the codewords), denoted by dmin, of this code is equal to 2.

(The minimum Hamming distance of a binary linear block code is simply equal to the minimum weight of its non-zero codewords. This is due to the fact that the code is linear, meaning that if any two codewords are added together modulo-2 the result will be another codeword. Thus


to find the minimum Hamming distance of a linear block code all we need to do is find the minimum-weight non-zero codeword.)
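Both definitions can be checked with a few lines of Python (assumed helper names); for the (4,2) code above, the minimum distance, and equivalently the minimum non-zero codeword weight, is 2.

from itertools import combinations

def hamming_distance(a, b):
    # Number of positions in which two equal-length words differ
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    # Smallest Hamming distance between any pair of distinct codewords
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

code = ["0000", "0101", "1010", "1111"]
print(minimum_distance(code))                            # 2
print(min(w.count("1") for w in code if w != "0000"))    # 2 -- minimum non-zero weight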

Let us look at the above code again, keeping in mind what we said earlier about the "Hamming distance" property the codewords must have for a code to correct a single error.

We said that, to correct a single error, this code must have each of its codewords differing from every other codeword in at least (2e + 1) positions, where e in our case is 1 (i.e. a single error). That is, the minimum Hamming distance of the code must be at least 3. Therefore the above mentioned code cannot correct the occurrence of a single error (since its dmin = 2), but it can detect it.

To explain this further let us consider the following diagram in figure 2.

[FIGURE 2:
a) Hamming sphere of radius e = 1 around each codeword; Hamming distance between codewords = 3. The code can correct a single error since d = 2e + 1.
b) Hamming sphere of radius e = 1 around each codeword; Hamming distance between codewords = 2. The code can only detect e = 1 error but cannot correct it, because d = e + 1 (i.e. d < 2e + 1).]

Imagine that we draw a sphere (called a Hamming sphere) of radius e = 1 around each codeword. This sphere will contain all n-tuples which are at a distance of 1 away from that codeword (i.e. all n-tuples which differ from this codeword in one position).

If the minimum Hamming distance of the code is dmin < 2e + 1 (as in figure 2b, where d = 2), the occurrence of a single error will result in changing the codeword to the next n-tuple and the decoder does not have enough information to decide whether codeword C1 or C2 was transmitted. The decoder however can detect that an error has occurred.

If we look at figure 2a we see that the code has dmin = 2e + 1 and that the occurrence of a single error results in the next n-tuple being received; in this case the decoder can make an unambiguous decision, based on what is called the nearest neighbour decoding rule, as to which of the two codewords was transmitted.


If the corrupted received n-tuple is not too unlike (not too distant from) the valid codeword, the decoder could decide that the transmitted codeword was the codeword "nearest in distance" to the received word.

Thus, in general, we can say that a binary linear code will correct e errors

if dmin = 2e + 1 (for odd dmin), or
if dmin = 2e + 2 (for even dmin).

A (6, 3) Linear Block Code Example

Examine the following coding assignment that describes a (6, 3) code. There are 2^k = 2^3 = 8 information words, and therefore eight codewords.

There are 2^n = 2^6 = sixty-four 6-tuples in the total 6-tuple set (or vector space).

Information digits      Codewords
C1 C2 C3                C1 C2 C3 C4 C5 C6

000                     000000
001                     001011
010                     010110
011                     011101
100                     100101
101                     101110
110                     110011
111                     111000

The parity check equations for this code are

C4 = C1 ⊕ C2
C5 = C2 ⊕ C3
C6 = C1 ⊕ C3

and its H matrix is

H = | 1 1 0 1 0 0 |
    | 0 1 1 0 1 0 |
    | 1 0 1 0 0 1 |

It is easy to check that the eight codewords shown above form a linear code (the all-zeros codeword is present, and the modulo-2 sum of any two codewords is another codeword, a member of the code). Therefore, these codewords represent a linear binary block code.

It is also easy enough to check that the minimum Hamming distance of the code is dmin = 3; thus we conclude that this code is a single-error-correcting code, since dmin = 2e + 1 (for odd dmin). A short sketch verifying these properties follows.
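A short Python sketch of the (6,3) encoder defined by the parity equations above, together with a brute-force check of dmin (helper names are assumed).

from itertools import product, combinations

def encode_63(c1, c2, c3):
    # Systematic (6,3) encoder: c4 = c1^c2, c5 = c2^c3, c6 = c1^c3
    return [c1, c2, c3, c1 ^ c2, c2 ^ c3, c1 ^ c3]

codebook = [encode_63(*info) for info in product([0, 1], repeat=3)]
for cw in codebook:
    print("".join(map(str, cw)))          # reproduces the code table above

dmin = min(sum(x != y for x, y in zip(a, b))
           for a, b in combinations(codebook, 2))
print(dmin)                                # 3 -> single-error-correcting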


    In the simple case of single-parity-check codes, the single parity was chosen to be the

    modulo-2 sum of all the information digits.

Linear block codes contain several check digits, and each check digit is the modulo-2 sum of some (or all) of the information digits.

Let us consider the (6, 3) code, i.e. n = 6, k = 3, and there are r = n - k = 3 check digits.

We shall label the three information digits C1, C2, C3 and the three check digits C4, C5 and C6.

Let's choose to calculate the check digits from the information digits according to the following rules (each one of these equations must be independent of any or all of the others):

C4 = C1 ⊕ C2
C5 = C2 ⊕ C3
C6 = C1 ⊕ C3

    or in matrix notation

| C4 |   | 1 1 0 |   | C1 |
| C5 | = | 0 1 1 | . | C2 |
| C6 |   | 1 0 1 |   | C3 |

The full codeword consists of the digits C1, C2, C3, C4, C5, C6.

Generally the n-tuple codeword is denoted as C = [C1, C2, C3, C4, C5, C6].

Every codeword must satisfy the parity-check equations

C1 ⊕ C2 ⊕ C4 = 0
C2 ⊕ C3 ⊕ C5 = 0
C1 ⊕ C3 ⊕ C6 = 0

or in matrix notation

| 1 1 0 1 0 0 |   | C1 |   | 0 |
| 0 1 1 0 1 0 | . | C2 | = | 0 |
| 1 0 1 0 0 1 |   | C3 |   | 0 |
                  | C4 |
                  | C5 |
                  | C6 |

which can be written a little more compactly as

H C^t = 0

Here C^t denotes the column vector which is the transpose of the codeword, and H is the parity-check matrix

H = | 1 1 0 1 0 0 |
    | 0 1 1 0 1 0 |
    | 1 0 1 0 0 1 |


We can say that the error pattern was E = [ 000100 ].

If we multiply the transpose of the received word by the parity-check matrix H, what do we get?

H R^t = H (C ⊕ E)^t = H C^t ⊕ H E^t = 0 ⊕ H E^t = S^t

The r-tuple S = [ S1, S2, S3 ] is called the syndrome.

This shows that the syndrome test, whether performed on the corrupted received word or on the error pattern that caused it, yields the same syndrome.

    Since the syndrome digits are defined by the same equations as the parity-check equations,

    the syndrome digits reveal the parity check failures on the received codeword. (This happens

    because the code is linear. An important property of linear block codes, fundamental to the

    decoding process, is that the mapping between correctable error patterns and syndromes is

    one-to-one and this means that we not only can detect an error but we can also correct it.)

For example, using the received word given above, R = [ 110111 ]:

        | 1 1 0 1 0 0 |   | 1 |   | 1 |
H R^t = | 0 1 1 0 1 0 | . | 1 | = | 0 | = S^t,
        | 1 0 1 0 0 1 |   | 0 |   | 0 |
                          | 1 |
                          | 1 |
                          | 1 |

where S = [ S1, S2, S3 ] = [ 100 ]

and, as we can see, the syndrome equals the fourth column of H, which points to the fourth bit being in error.

Now all the decoder has to do (after calculating the syndrome) is to invert the fourth bit position in the received word to produce the codeword that was sent, i.e. C = [ 110011 ].
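The syndrome computation and single-error correction for the (6,3) code can be sketched in Python as follows (assumed helper names); running it on R = 110111 reproduces the syndrome [1, 0, 0] and the corrected codeword 110011.

H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(received):
    # S^t = H R^t over GF(2): each syndrome bit is a parity check on the received word
    return [sum(h * r for h, r in zip(row, received)) % 2 for row in H]

def correct_single_error(received):
    # If the syndrome matches a column of H, flip the corresponding received bit
    s = syndrome(received)
    corrected = received[:]
    if s == [0, 0, 0]:
        return corrected                         # no error detected
    for j in range(len(received)):
        if [row[j] for row in H] == s:
            corrected[j] ^= 1                    # invert the bit the syndrome points to
            break
    return corrected

R = [1, 1, 0, 1, 1, 1]
print(syndrome(R))                # [1, 0, 0]
print(correct_single_error(R))    # [1, 1, 0, 0, 1, 1]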

Having obtained a feel for what channel coding and decoding is about, let's apply this knowledge to a particular type of linear binary block codes called the Hamming codes.


    HAMMING CODES

These are linear binary single-error-correcting codes having the property that the columns of the parity-check matrix, H, consist of all the distinct non-zero r-digit sequences of binary numbers. Thus a Hamming code has as many parity-check matrix columns as there are single-error sequences. These codes will correct all patterns of single errors in any transmitted codeword.

These codes have n = k + r, where n = 2^r - 1 and k = 2^r - 1 - r.

These codes have a guaranteed minimum Hamming distance dmin = 3.

For example, the parity-check matrix for the (7,4) Hamming code is

H = | 1 1 1 0 1 0 0 |
    | 1 1 0 1 0 1 0 |
    | 1 0 1 1 0 0 1 |

    a) Determine the codeword for the information sequence 0011

    b) If the received word, R, is 1000010, determine if an error has occurred. If it has, find the

    correct codeword.

    Solution:

a) Since H C^t = 0, we can use this equation to calculate the parity digits for the given information sequence as follows:

| 1 1 1 0 1 0 0 |   | C1 |   | 0 |
| 1 1 0 1 0 1 0 | . | C2 | = | 0 |
| 1 0 1 1 0 0 1 |   | C3 |   | 0 |
                    | C4 |
                    | C5 |
                    | C6 |
                    | C7 |

i.e., substituting the information digits C1 C2 C3 C4 = 0 0 1 1,

| 1 1 1 0 1 0 0 |   | 0  |   | 0 |
| 1 1 0 1 0 1 0 | . | 0  | = | 0 |
| 1 0 1 1 0 0 1 |   | 1  |   | 0 |
                    | 1  |
                    | C5 |
                    | C6 |
                    | C7 |

By multiplying out the first row of the left hand side we get

(1.0) ⊕ (1.0) ⊕ (1.1) ⊕ (0.1) ⊕ (1.C5) ⊕ (0.C6) ⊕ (0.C7) = 0
0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C5 ⊕ 0 ⊕ 0 = 0
i.e. 1 ⊕ C5 = 0 and C5 = 1

Similarly, by multiplying out the second row of the H matrix by the transpose of the codeword we obtain

(1.0) ⊕ (1.0) ⊕ (0.1) ⊕ (1.1) ⊕ (0.C5) ⊕ (1.C6) ⊕ (0.C7) = 0
0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C6 ⊕ 0 = 0
i.e. 1 ⊕ C6 = 0 and C6 = 1

Similarly, by multiplying out the third row of the H matrix by the transpose of the codeword we obtain

(1.0) ⊕ (0.0) ⊕ (1.1) ⊕ (1.1) ⊕ (0.C5) ⊕ (0.C6) ⊕ (1.C7) = 0
0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ C7 = 0
i.e. 1 ⊕ 1 ⊕ C7 = 0 and C7 = 0


so that the codeword is

C = [C1, C2, C3, C4, C5, C6, C7] = 0011110

b) To find whether an error has occurred or not, we use the equation H R^t = S^t. If the syndrome is zero then no error has occurred; if not, an error has occurred and is pinpointed by the syndrome.

Thus to compute the syndrome we multiply out the rows of H by the transpose of the received word:

| 1 1 1 0 1 0 0 |   | 1 |   | 1 |
| 1 1 0 1 0 1 0 | . | 0 | = | 0 |
| 1 0 1 1 0 0 1 |   | 0 |   | 1 |
                    | 0 |
                    | 0 |
                    | 1 |
                    | 0 |

Because the syndrome is the third column of the parity-check matrix, the third position of the received word is in error and the correct codeword is 1010010.
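The same syndrome procedure applies to the (7,4) Hamming code; a minimal Python sketch (assumed names) reproducing part (b) is shown below.

H74 = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def hamming74_decode(received):
    # Compute S^t = H R^t (mod 2) and, if non-zero, flip the bit whose H column equals the syndrome
    s = [sum(h * r for h, r in zip(row, received)) % 2 for row in H74]
    corrected = received[:]
    if any(s):
        for j in range(7):
            if [row[j] for row in H74] == s:
                corrected[j] ^= 1
                break
    return s, corrected

R = [1, 0, 0, 0, 0, 1, 0]
syn, cw = hamming74_decode(R)
print(syn)   # [1, 0, 1] -> the third column of H
print(cw)    # [1, 0, 1, 0, 0, 1, 0] -> corrected codeword 1010010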


The Generator Matrix of a linear binary block code

We saw above that the parity-check matrix of a systematic linear binary block code can be written in the following (n-k) by n matrix form

H = [ h  I(n-k) ]

The Generator Matrix of this same code is written in the following k by n matrix form

G = [ I(k)  h^t ]

The generator matrix is useful in obtaining the codeword from the information sequence according to the following formula

C = m G

where

C is the codeword [C1, C2, ........., Cn-1, Cn],
m is the information digit sequence [m1, m2, ....., mk], and
G is the generator matrix of the code as given by the formula for G above.

Thus if we consider the single-error-correcting (n,k) = (7,4) Hamming code discussed previously, its parity-check matrix was

H = | 1 1 1 0 1 0 0 |
    | 1 1 0 1 0 1 0 |
    | 1 0 1 1 0 0 1 |

and thus its generator matrix would be

G = | 1 0 0 0 1 1 1 |
    | 0 1 0 0 1 1 0 |
    | 0 0 1 0 1 0 1 |
    | 0 0 0 1 0 1 1 |

Now if we had an information sequence given by the digits 0011, the codeword would be given by C = m G, i.e.

                  | 1 0 0 0 1 1 1 |
C = [ 0 0 1 1 ] . | 0 1 0 0 1 1 0 | = [ 0 0 1 1 1 1 0 ]
                  | 0 0 1 0 1 0 1 |
                  | 0 0 0 1 0 1 1 |
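Encoding with the generator matrix is a binary matrix product, m G (mod 2); the Python sketch below (assumed helper name encode_with_G) reproduces the 0011 → 0011110 example and can also regenerate the whole code book.

from itertools import product

G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode_with_G(m):
    # C = m G over GF(2): each codeword digit is the modulo-2 sum of the selected rows of G
    return [sum(m[i] * G[i][j] for i in range(len(m))) % 2 for j in range(len(G[0]))]

print(encode_with_G([0, 0, 1, 1]))          # [0, 0, 1, 1, 1, 1, 0]

# Regenerating the full (7,4) code book (16 codewords):
for m in product([0, 1], repeat=4):
    print("".join(map(str, encode_with_G(list(m)))))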

Thus, for the (n,k) = (7,4) Hamming code, the generator matrix and the code book are:


H = [ h  I(n-k) ]
G = [ I(k)  h^t ]

    | 1 0 0 0 1 1 1 |   row 1
G = | 0 1 0 0 1 1 0 |   row 2
    | 0 0 1 0 1 0 1 |   row 3
    | 0 0 0 1 0 1 1 |   row 4

     combination of rows                          codeword
 1   row1                                         1000111
 2   row2                                         0100110
 3   row3                                         0010101
 4   row4                                         0001011
 5   row1 ⊕ row2                                  1100001
 6   row1 ⊕ row3                                  1010010
 7   row1 ⊕ row4                                  1001100
 8   row2 ⊕ row3                                  0110011
 9   row2 ⊕ row4                                  0101101
10   row3 ⊕ row4                                  0011110
11   row1 ⊕ row2 ⊕ row3                           1110100
12   row1 ⊕ row2 ⊕ row4                           1101010
13   row1 ⊕ row3 ⊕ row4                           1011001
14   row2 ⊕ row3 ⊕ row4                           0111000
15   row1 ⊕ row2 ⊕ row3 ⊕ row4                    1111111
16   row1 ⊕ row1 (or 2 ⊕ 2, or 3 ⊕ 3, or 4 ⊕ 4)   0000000
