An Introduction to Digital Communications - Part 2
8/13/2019 An Introduction to Digital Communications - Part 2
1/38
INFORMATION THEORY
Information theory deals with the mathematical modelling and analysis of a communications system rather than with physical sources and physical channels.
Specifically, given an information source and a noisy channel, information theory provides limits on:
1- The minimum number of bits per symbol required to fully represent the source (i.e. the efficiency with which information from a given source can be represented).
2- The maximum rate at which reliable (error-free) communications can take place over the noisy channel.
Since the whole purpose of a communications system is to transport information from a source to a destination, the question arises as to how much information can be transmitted in a given time. (Normally the goal would be to transmit as much information as possible in as small a time as possible, such that this information can be correctly interpreted at the destination.) This of course leads to the next question:
How can information be measured, and how do we measure the rate at which information is emitted from a source?
Suppose that we observe the output emitted by a discrete source (every unit interval or signalling interval).
The source output can be considered as a set, S, of discrete random events (or outcomes). These events are symbols from a fixed finite alphabet.
(For example, the set or alphabet can be the numbers 1 to 6 on a die, and each roll of the die outputs a symbol, being the number on the die's upper face when the die comes to rest. Another example is a digital binary source, where the alphabet is the digits "0" and "1", and the source outputs a symbol of either "0" or "1" at random.)
In general, consider a discrete random source which outputs symbols from a fixed finite alphabet of k symbols. The set S then contains all k symbols and we can write
S = { s0, s1, s2, ......., sk-1 }
where each symbol si occurs with probability p(si), and
Σ p(si) = 1 , summed over i = 0 to (k-1)   (3.1)
In addition, we assume that the symbols emitted by the source during successive signalling intervals are statistically independent, i.e. the probability of any symbol being emitted at any signalling interval does not depend on the probability of occurrence of previous symbols; i.e. we have what is called a discrete memoryless source.
Can we find a measure of how much "information" is produced by this source?
The idea of information is closely related to that of "uncertainty" and "surprise".
If the source emits an output si which has a probability of occurrence p(si) = 1, then all other symbols of the alphabet have a zero probability of occurrence and there is really no "uncertainty", "surprise", or information, since we already know beforehand (a priori) what the output symbol will be.
If, on the other hand, the source symbols occur with different probabilities, and the probability p(si) is low, then there is more "uncertainty", "surprise", and therefore "information" when the symbol si is emitted by the source, rather than another one with higher probability.
Thus the words "uncertainty", "surprise", and "information" are all closely related.
- Before the output si occurs, there is an amount of "uncertainty".
- When the output si occurs, there is an amount of "surprise".
- After the occurrence of the output si, there is a gain in the amount of "information".
All three amounts are really the same, and we can see that the amount of information is related to the inverse of the probability of occurrence of the symbol.
Definition:
The amount of information gained after observing the event si, which occurs with probability p(si), is
I(si) = log2[ 1 / p(si) ] bits, for i = 0, 1, 2, ..., (k-1)   (3.2)
The unit of information is called the "bit", a contraction of "binary digit".
This definition exhibits the following important properties that are intuitively satisfying:
1- I(si) = 0 for p(si) = 1
i.e. if we are absolutely certain of the output of the source even before it occurs (a priori), then there is no information gained.
2- I(si) ≥ 0 because 0 ≤ p(si) ≤ 1 for symbols of the alphabet.
i.e. the occurrence of an output si either provides some information or no information, but never brings about a loss of information (unless it is a severe blow to the head, which is highly unlikely from the discrete source!)
3- I(sj) > I(si) for p(sj) < p(si)
i.e. the less probable an output is, the more information we gain when it occurs.
4- I(sj si) = I(sj) + I(si) if the outputs sj and si are statistically independent.
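The four properties above can be checked numerically. The short Python sketch below (an illustration; the function name is our own choosing, not part of the notes) implements eq. (3.2) and verifies properties 1 and 4:

```python
import math

def info(p):
    """Amount of information I(s) = log2(1/p) in bits, for a symbol
    with probability of occurrence p (eq. 3.2)."""
    return math.log2(1 / p)

# Property 1: a certain event (p = 1) carries no information.
assert info(1.0) == 0.0

# One of two equiprobable outcomes carries exactly one bit.
assert info(0.5) == 1.0

# Property 4: for statistically independent outputs, p(sj si) = p(sj) p(si),
# so I(sj si) = I(sj) + I(si).
p_j, p_i = 0.25, 0.125
assert math.isclose(info(p_j * p_i), info(p_j) + info(p_i))
```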
The use of the logarithm to base 2 (instead of base 10 or base e) has been adopted in the measure of information because we are usually dealing with digital binary sources (however, it is useful to remember that log2(a) = 3.322 log10(a)). Thus if the source alphabet is the binary set of symbols "0" and "1", and each symbol is equally likely to occur, i.e. s0 with p(s0) = 1/2 and s1 with p(s1) = 1/2, we have:
I(si) = log2[ 1 / p(si) ] = log2[ 1 / (1/2) ] = log2(2) = 1 bit
Hence "one bit" is the amount of information that is gained when one of two possible and equally likely (equiprobable) outputs occurs.
[Note that a "bit" is also used to refer to a binary digit when dealing with the transmission of a sequence of 1's and 0's.]
Entropy
The amount of information, I(si), associated with the symbol si emitted by the source during a signalling interval depends on the symbol's probability of occurrence. In general, each source symbol has a different probability of occurrence. Since the source can emit any one of the symbols of its alphabet, a measure for the average information content per source symbol was defined and called the entropy of the discrete source, H (i.e. taking all the discrete source symbols into account).
Definition:
The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ......., sk-1 }, is a measure of the average information content per source symbol, and is given by:
H = Σ p(si) I(si) = Σ p(si) log2[ 1 / p(si) ] bits/symbol, summed over i = 0 to (k-1)   (3.3)
We note that the entropy, H, of a discrete memoryless source is bounded as follows:
0 ≤ H ≤ log2 k , where k is the number of source symbols.
Furthermore, we may state that:
1- H = 0, if and only if the probability p(si) = 1 for some symbol si, and the remaining source symbols' probabilities are all zero. This lower bound on entropy corresponds to no uncertainty and no information.
2- H = log2 k bits/symbol, if and only if p(si) = 1/k for all the k source symbols (i.e. they are all equiprobable). This upper bound on entropy corresponds to maximum uncertainty and maximum information.
Example:
Calculate the entropy of a discrete memoryless source with source alphabet S = { s0, s1, s2 } with probabilities p(s0) = 1/4, p(s1) = 1/4, p(s2) = 1/2.
H = Σ p(si) log2[ 1 / p(si) ]
  = p(s0) log2[ 1 / p(s0) ] + p(s1) log2[ 1 / p(s1) ] + p(s2) log2[ 1 / p(s2) ]
  = (1/4) log2(4) + (1/4) log2(4) + (1/2) log2(2)
  = 0.5 + 0.5 + 0.5 = 1.5 bits/symbol
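The entropy computation in eq. (3.3) can be expressed as a short Python sketch (illustrative; the function name is our own). Applied to the example above, it reproduces H = 1.5 bits/symbol:

```python
import math

def entropy(probs):
    """Entropy H = sum of p(si) * log2(1/p(si)) in bits/symbol (eq. 3.3).
    Terms with p = 0 contribute nothing (x log x -> 0 as x -> 0)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

H = entropy([1/4, 1/4, 1/2])
assert math.isclose(H, 1.5)

# The bounds 0 <= H <= log2(k) for k = 3 symbols:
assert 0 <= H <= math.log2(3)
# Equiprobable symbols attain the upper bound:
assert math.isclose(entropy([1/3, 1/3, 1/3]), math.log2(3))
```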
Information Rate
If the symbols are emitted from the source at a fixed time rate (one per signalling interval), denoted by rs symbols/second, we can define the average source information rate R, in bits per second, as the product of the average information content per symbol, H, and the symbol rate rs:
R = rs H bits/sec (3.4)
Example:
A discrete source emits one of five symbols once every millisecond. The source symbol probabilities are 1/2, 1/4, 1/8, 1/16, and 1/16 respectively. Find the source entropy and information rate.
H = Σ p(si) log2[ 1 / p(si) ] , where in this case k = 5
  = (1/2) log2(2) + (1/4) log2(4) + (1/8) log2(8) + (1/16) log2(16) + (1/16) log2(16)
  = 0.5 + 0.5 + 0.375 + 0.25 + 0.25 = 1.875 bits/symbol
R = rs H bits/sec
The information rate R = (1/10^-3) x 1.875 = 1875 bits/second.
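The same calculation, eq. (3.3) followed by eq. (3.4), can be sketched in Python (illustrative; names are our own):

```python
import math

def entropy(probs):
    """H = sum of p * log2(1/p) in bits/symbol (eq. 3.3)."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/16]
H = entropy(probs)        # average information content per symbol
rs = 1 / 1e-3             # one symbol per millisecond -> 1000 symbols/sec
R = rs * H                # information rate, eq. (3.4)

assert math.isclose(H, 1.875)
assert math.isclose(R, 1875.0)
```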
Entropy of a Binary Memoryless Source:
To illustrate the properties of H, let us consider a memoryless digital binary source for which symbol 0 occurs with probability p0 and symbol 1 with probability p1 = (1 - p0).
The entropy of such a source equals:
H = p0 log2[ 1 / p0 ] + p1 log2[ 1 / p1 ]
  = p0 log2[ 1 / p0 ] + (1 - p0) log2[ 1 / (1 - p0) ] bits
We note that:
1- When p0 = 0, the entropy H = 0. This follows from the fact that x log x → 0 as x → 0.
2- When p0 = 1, the entropy H = 0.
3- The entropy H attains its maximum value, Hmax = 1 bit, when p0 = p1 = 1/2, that is, symbols 0 and 1 are equally probable (i.e. H = log2 k = log2 2 = 1).
(Hmax = 1 can be verified by differentiating H with respect to p0 and equating to zero.)
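Points 1-3 can be checked with a small Python sketch of the binary entropy function (illustrative; the function name is our own):

```python
import math

def binary_entropy(p0):
    """H = p0*log2(1/p0) + (1-p0)*log2(1/(1-p0)) bits,
    with the limits H(0) = H(1) = 0 handled explicitly."""
    if p0 in (0.0, 1.0):
        return 0.0
    p1 = 1 - p0
    return p0 * math.log2(1 / p0) + p1 * math.log2(1 / p1)

assert binary_entropy(0.0) == 0.0           # point 1
assert binary_entropy(1.0) == 0.0           # point 2
assert math.isclose(binary_entropy(0.5), 1.0)  # point 3: Hmax = 1 bit
# H stays below its maximum for any other p0:
assert binary_entropy(0.1) < 1.0 and binary_entropy(0.9) < 1.0
```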
CHANNEL CAPACITY
In Information Theory, the transmission medium is treated as an abstract and noisy filter
called the channel. The maximum rate of information transmission through a channel is
called the channel capacity, C.
Channel Coding Theorem
Shannon showed that if the information rate R [remember that R = rs H bits/sec] is equal to or less than C, i.e. R ≤ C, then there exists a coding technique which enables transmission over the noisy channel with an arbitrarily small frequency of errors.
[A converse to this theorem states that it is not possible to transmit messages without error if R > C.]
Thus the channel capacity is defined as the maximum rate of reliable (error-free) information transmission through the channel.
Now consider a binary source with an available alphabet of k discrete messages (or symbols) which are equiprobable and statistically independent (these messages could be either single-digit symbols or could be composed of several digits each, depending on the situation). We assume that each message sent can be identified at the receiver; therefore this case is often called the discrete noiseless channel. The maximum entropy of the source is log2 k bits, and if T is the transmission time of each message (i.e. rs = 1/T symbols/sec), the channel capacity is
C = R = rs H = rs log2 k bits per second.
To attain this maximum, the messages must be equiprobable and statistically independent. These conditions form a basis for the coding of the information to be transmitted over the channel.
In the presence of noise, the capacity of this discrete channel decreases as a result of the errors made in transmission.
In making comparisons between various types of communications systems, it is convenient to consider a channel which is described in terms of bandwidth and signal-to-noise ratio.
Review of Signal-to-Noise Ratio
The analysis of the effect of noise on digital transmission will be covered later in this course, but before proceeding we will review the definition of signal-to-noise ratio. It is defined as the ratio of signal power to noise power at the same point in a system. It is normally measured in decibels:
Signal-to-Noise Ratio (dB) = 10 log10( S / N ) dB
Noise is any unwanted signal. In electrical terms it is any unwanted introduction of energy
tending to interfere with the proper reception and reproduction of transmitted signals.
Channel Capacity Theorem
Bit errors and signal bandwidths are of prime importance when designing a communications system. In digital transmission systems, noise may change the value of a transmitted digit during transmission (e.g. change a high voltage to a low voltage or vice versa).
This raises the question: is it possible to invent a system with no bit errors at the output even when noise is introduced into the channel? Shannon's Channel Capacity Theorem (also called the Shannon-Hartley Theorem) answers this question:
C = B log2(1 + S/N) bits per second,
where C is the channel capacity, B is the channel bandwidth in hertz, and S/N is the signal-to-noise power ratio (watts/watts, not dB).
Although this formula is restricted to certain cases (in particular certain types of random
noise), the result is of widespread importance to communication systems because many
channels can be modelled by random noise.
From the formula, we can see that the channel capacity, C, decreases as the available
bandwidth decreases. C is also proportional to the log of (1+S/N), so as the signal to noise
level decreases C also decreases.
The channel capacity theorem is one of the most remarkable results of information theory. In
a single formula, it highlights the interplay between three key system parameters: Channel
bandwidth, average transmitted power (or, equivalently, average received power), and noise
at the channel output.
The theorem implies that, for a given average transmitted power S and channel bandwidth B, we can transmit information at the rate C bits per second, with arbitrarily small probability of error, by employing sufficiently complex encoding systems. It is not possible to transmit at a rate higher than C bits per second by any encoding system without a definite probability of error.
Hence, the channel capacity theorem defines the fundamental limit on the rate of error-free
transmission for a power-limited, band-limited Gaussian channel. To approach this limit,
however, the noise must have statistical properties approximating those of white Gaussian
noise.
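The Shannon-Hartley formula is straightforward to evaluate; the Python sketch below (illustrative, with parameter values chosen only as an example, not as an answer to the problems that follow) shows the dB-to-ratio conversion that the formula requires:

```python
import math

def channel_capacity(bandwidth_hz, snr_db):
    """Shannon-Hartley: C = B log2(1 + S/N), where S/N must be a
    power ratio (watts/watts), so the dB figure is converted first."""
    snr = 10 ** (snr_db / 10)          # e.g. 30 dB -> ratio of 1000
    return bandwidth_hz * math.log2(1 + snr)

# A voice-grade line with B = 3.4 kHz and a 30 dB signal-to-noise ratio:
C = channel_capacity(3400, 30)
assert math.isclose(C, 3400 * math.log2(1001))
# Capacity falls as either bandwidth or S/N falls:
assert channel_capacity(1700, 30) < C
assert channel_capacity(3400, 20) < C
```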
Problems:
1. A voice-grade channel of the telephone network has a bandwidth of 3.4 kHz.
(a) Calculate the channel capacity of the telephone channel for a signal-to-noise ratio of 30
dB.
(b) Calculate the minimum signal-to-noise ratio required to support information transmission
through the telephone channel at the rate of 4800 bits/sec.
(c) Calculate the minimum signal-to-noise ratio required to support information transmission
through the telephone channel at the rate of 9600 bits/sec.
2. Alphanumeric data are entered into a computer from a remote terminal through a voice-grade telephone channel. The channel has a bandwidth of 3.4 kHz and an output signal-to-noise ratio of 20 dB. The terminal has a total of 128 symbols. Assume that the symbols are equiprobable, and that successive transmissions are statistically independent.
(a) Calculate the channel capacity.
(b) Calculate the maximum symbol rate for which error-free transmission over the channel is
possible.
3. A black-and-white television picture may be viewed as consisting of approximately 3 x 10^5 elements, each one of which may occupy one of 10 distinct brightness levels with equal probability. Assume (a) the rate of transmission is 30 picture frames per second, and (b) the signal-to-noise ratio is 30 dB.
Using the channel capacity theorem, calculate the minimum bandwidth required to support
the transmission of the resultant video signal.
4. What is the minimum time required for the facsimile transmission of one picture over a standard telephone circuit?
There are about 2.25 x 10^6 picture elements to be transmitted, and 12 brightness levels are to be used for good reproduction. Assume all brightness levels are equiprobable. The telephone circuit has a 3-kHz bandwidth and a 30-dB signal-to-noise ratio (these are typical parameters).
THE BINARY SYMMETRIC CHANNEL
Usually when a "1" or a "0" is sent it is received as a "1" or a "0", but occasionally a "1" will
be received as a "0" or a "0" will be received as a "1".
Let's say that, on average, 1 out of 100 digits will be received in error, i.e. there is a probability p = 1/100 that the channel will introduce an error. This is called a Binary Symmetric Channel (BSC), and is represented by the following diagram: a transmitted 0 is received as 0 with probability (1 - p) and as 1 with probability p, and likewise a transmitted 1 is received as 1 with probability (1 - p) and as 0 with probability p.
Representation of the Binary Symmetric Channel with an error probability of p
Now let us consider the use of this BSC model.
Say we transmit one information digit coded with a single even parity bit. This means that if the information digit is 0 then the codeword will be 00, and if the information digit is 1 then the codeword will be 11.
As the codeword is transmitted through the channel, the channel may (or may not) introduce an error according to the following error patterns:
E = 00 i.e. no errors
E = 01 i.e. a single error in the last digit
E = 10 i.e. a single error in the first digit
E = 11 i.e. a double error
The probability of no error is the probability of receiving both transmitted digits correctly.
Here we have to remember our discussion on joint probability:
p(AB) = p(A) p(B/A) = p(A) p(B) when the occurrence of either of the two outcomes is independent of the occurrence of the other.
Thus the probability of no error is equal to the probability of receiving each digit correctly. This probability, according to the BSC model, is equal to (1 - p), where p is the probability of one digit being received incorrectly.
Thus the probability of no error = (1 - p)(1 - p) = (1 - p)^2.
Similarly, the probability of a single error in the first digit = p(1 - p),
and the probability of a single error in the second digit = (1 - p) p,
i.e. the probability of a single error is equal to the sum of the above two probabilities (since the two events are mutually exclusive), i.e.
the probability of a single error (when a code with block length n = 2 is used, as in this case) is equal to 2p(1 - p).
Similarly, the probability of a double error in the above example (i.e. the error pattern E = 11) is equal to p^2.
In summary, these probabilities are:
p(E = 00) = (1 - p)^2
p(E = 01) = (1 - p) p
p(E = 10) = p (1 - p)
p(E = 11) = p^2
and if we substitute p = 0.01 (given in the above example) we find that
p(E = 00) = (1 - p)^2 = 0.9801
p(E = 01) = (1 - p) p = 0.0099
p(E = 10) = p (1 - p) = 0.0099
p(E = 11) = p^2 = 0.0001
Thus the probability of a single error per codeword = (1 - p) p + p (1 - p) = 2p(1 - p) = 0.0198.
This shows that if p < 1/2, then the probability of no error is higher than the probability of a single error occurring, which in turn is higher than the probability of a double error.
Again, if we consider a block code with block length n = 3, then the
probability of no error p(E = 000) = (1 - p)^3,
probability of an error in the first digit p(E = 100) = p (1 - p)^2,
probability of a single error per codeword p(1e) = 3 p (1 - p)^2,
probability of a double error per codeword p(2e) = C(3,2) p^2 (1 - p) = 3 p^2 (1 - p), where C(3,2) is the binomial coefficient "3 choose 2",
probability of a triple error per codeword p(3e) = p^3.
And again, if we have a code with block length n = 4, then the
probability of no error p(E = 0000) = (1 - p)^4,
probability of an error in the first digit p(E = 1000) = p (1 - p)^3,
probability of a single error per codeword p(1e) = 4 p (1 - p)^3,
probability of a double error per codeword p(2e) = C(4,2) p^2 (1 - p)^2 = 6 p^2 (1 - p)^2,
probability of a triple error per codeword p(3e) = C(4,3) p^3 (1 - p) = 4 p^3 (1 - p),
probability of four errors per codeword p(4e) = p^4.
And again, if we have a code with block length n = 5, then the
probability of no error p(E = 00000) = (1 - p)^5,
probability of an error in the first digit p(E = 10000) = p (1 - p)^4,
probability of a single error per codeword p(1e) = 5 p (1 - p)^4,
probability of a double error per codeword p(2e) = C(5,2) p^2 (1 - p)^3 = 10 p^2 (1 - p)^3,
probability of a triple error per codeword p(3e) = C(5,3) p^3 (1 - p)^2 = 10 p^3 (1 - p)^2,
probability of four errors per codeword p(4e) = C(5,4) p^4 (1 - p) = 5 p^4 (1 - p),
probability of five errors per codeword p(5e) = p^5.
From all of this discussion, we realise that if the error pattern (of length n) has weight e (i.e. contains e ones), then the probability of occurrence of e errors in a codeword with block length n is
C(n,e) p^e (1 - p)^(n-e) , where C(n,e) is the binomial coefficient "n choose e".
We also realise that, since p < 1/2, we have (1 - p) > p, and
(1 - p)^n > p (1 - p)^(n-1) > p^2 (1 - p)^(n-2) > ...............
Therefore an error pattern of weight 1 is more likely to occur than an error pattern of weight 2, and so on.
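The binomial error-probability formula can be sketched directly in Python (illustrative; the function name is our own). It reproduces the n = 2 numbers worked out above:

```python
from math import comb, isclose

def p_errors(n, e, p):
    """Probability of exactly e errors in a block of n digits on a BSC
    with digit-error probability p: C(n,e) * p^e * (1-p)^(n-e)."""
    return comb(n, e) * p**e * (1 - p)**(n - e)

p = 0.01
# The n = 2 example: (1-p)^2, 2p(1-p), p^2
assert isclose(p_errors(2, 0, p), 0.9801)
assert isclose(p_errors(2, 1, p), 0.0198)
assert isclose(p_errors(2, 2, p), 0.0001)

# Sanity check: the probabilities over all error weights sum to 1.
assert isclose(sum(p_errors(5, e, p) for e in range(6)), 1.0)

# Lower-weight error patterns are more likely when p < 1/2.
assert p_errors(5, 1, p) > p_errors(5, 2, p) > p_errors(5, 3, p)
```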
The Communications System from the channel Coding Theorem point of view
source → Encoder → channel → Decoder → user
Information Theory Summary
1- A discrete memoryless source (DMS) is one that outputs symbols taken from a fixed finite alphabet which has k symbols. These symbols form a set S = { s0, s1, s2, ....., sk-1 }, where the occurrence of each symbol si at the output of the source has a probability of occurrence p(si). (The probabilities of occurrence of the symbols are called the source statistics.) The probabilities satisfy
Σ p(si) = 1 , summed over i = 0 to (k-1).
2- The amount of information gained after observing the output symbol si, which occurs with probability p(si), is
I(si) = log2[ 1 / p(si) ] , i = 0, 1, 2, ....., (k-1).
3- The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ....., sk-1 }, is a measure of the average information content per source symbol, and is given by:
H = Σ p(si) I(si) = Σ p(si) log2[ 1 / p(si) ] bits/symbol , summed over i = 0 to (k-1).
4- Information rate (bit rate) = symbol rate x entropy:
R = rs H bits/sec.
5- C = BW log2( 1 + S/N ) bits/sec.
6- BSC = Binary Symmetric Channel.
7- Probability of e errors in n digits = C(n,e) p^e (1 - p)^(n-e), where C(n,e) is the binomial coefficient "n choose e".
CHANNEL CODING
Suppose that we wish to transmit a sequence of binary digits across a noisy channel. If we send a one, a one will probably be received; if we send a zero, a zero will probably be received. Occasionally, however, the channel noise will cause a transmitted one to be mistakenly interpreted as a zero, or a transmitted zero to be mistakenly interpreted as a one. Although we are unable to prevent the channel from causing such errors, we can reduce their undesirable effects with the use of coding.
The basic idea is simple. We take a set of k information digits which we wish to transmit, annex to them r check digits, and transmit the entire block of n = k + r channel digits. Assuming that the channel noise changes sufficiently few of these transmitted channel digits, the r check digits may provide the receiver with sufficient information to enable it to detect and/or correct the channel errors.
(The detection and/or correction capability of a channel code will be discussed at some length in the following pages.)
Given any particular sequence of k message digits, the transmitter must have some rule for selecting the r check digits. This is called channel encoding.
Any particular sequence of n digits which the encoder might transmit is called a codeword. Although there are 2^n different binary sequences of length n, only 2^k of these sequences are codewords, because the r check digits within any codeword are completely determined by the k information digits. The set consisting of these 2^k codewords, of length n each, is called a code (sometimes referred to as a code book).
No matter which codeword is transmitted, any of the 2^n possible binary sequences of length n may be received if the channel is sufficiently noisy. Given the n received digits, the decoder must attempt to decide which of the 2^k possible codewords was transmitted.
Repetition codes and single-parity-check codes
Among the simplest examples of binary codes are the repetition codes, with k = 1, r arbitrary, and n = k + r = 1 + r. The code contains two codewords: the sequence of n zeros and the sequence of n ones.
We may call the first digit the information digit and the other r digits check digits. The value of each check digit (each 0 or 1) in a repetition code is identical to the value of the information digit. The decoder might use the following rule:
Count the number of zeros and the number of ones in the received bits. If there are more received zeros than ones, decide that the all-zero codeword was sent; if there are more ones than zeros, decide that the all-one codeword was sent. If the number of ones equals the number of zeros, do not decide (just flag the error).
This decoding rule will decode correctly in all cases where the channel noise changes fewer than half the digits in any one block. If the channel noise changes exactly half of the digits in any one block, the decoder will be faced with a decoding failure (i.e. it will not decode the received word into any of the possible transmitted codewords), which could result in an ARQ (automatic request to repeat the message). If the channel noise changes more than half of the digits in any one block, the decoder will commit a decoding error, i.e. it will decode the received word into the wrong codeword.
If channel errors occur infrequently, the probability of a decoding failure or a decoding error for a repetition code of long block length is very small indeed. However, repetition codes are not very useful. They have only two codewords and a very low information rate R = k/n (also called the code rate); all but one of the digits are check digits. We are usually more interested in codes which have a higher information rate.
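The majority-vote decoding rule for a repetition code can be sketched as a few lines of Python (illustrative; names are our own):

```python
def decode_repetition(received):
    """Majority-vote decoder for an n-digit repetition code.
    Returns the decoded information digit (0 or 1), or None on a
    tie (decoding failure, which could trigger an ARQ)."""
    ones = sum(received)
    zeros = len(received) - ones
    if ones > zeros:
        return 1
    if zeros > ones:
        return 0
    return None  # equal counts: do not decide, just flag the error

# n = 3: a single channel error is corrected.
assert decode_repetition([0, 0, 1]) == 0
assert decode_repetition([1, 1, 0, 1, 1]) == 1
# n = 4 with two errors: a tie, hence a decoding failure.
assert decode_repetition([0, 1, 0, 1]) is None
```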
At the other extreme are very high rate codes which use a single parity-check digit. This check digit is taken to be the modulo-2 sum (exclusive-OR) of the (n - 1) information digits of the codeword. (The information digits are added according to the exclusive-OR binary operation: 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0.) If the number of ones in the information word is even, the modulo-2 sum of all the information digits will be equal to zero; if the number of ones in the information word is odd, their modulo-2 sum will be equal to one.
Even parity means that the total number of ones in the codeword is even; odd parity means that the total number of ones in the codeword is odd. Accordingly, the parity bit (or digit) is calculated and appended to the information digits to form the codeword.
This type of code can only detect errors. A single-digit error (or any odd number of digit errors) will be detected, but any combination of two digit errors (or any even number of digit errors) will cause a decoding error. Thus the single-parity-check type of code cannot correct errors.
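The even-parity encoding and checking described above can be sketched in Python (illustrative; names are our own). Note how a single error is detected but a second error cancels the first:

```python
def add_even_parity(info_bits):
    """Append a check digit equal to the modulo-2 (XOR) sum of the
    information digits, so the codeword has an even number of ones."""
    parity = 0
    for b in info_bits:
        parity ^= b
    return info_bits + [parity]

def parity_ok(codeword):
    """Even parity holds iff the modulo-2 sum of all digits is zero."""
    s = 0
    for b in codeword:
        s ^= b
    return s == 0

cw = add_even_parity([1, 0, 1, 1])    # three ones -> check digit is 1
assert parity_ok(cw)
cw[2] ^= 1                            # a single digit error is detected...
assert not parity_ok(cw)
cw[0] ^= 1                            # ...but a second error goes undetected
assert parity_ok(cw)
```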
These two examples, the repetition codes and the single-parity-check codes, provide the extreme, relatively trivial, cases of binary block codes. (Although relatively trivial, single parity checks are used quite often because they are simple to implement.)
The repetition codes have enormous error-correction capability but only one information bit per block. The single-parity-check codes have a very high information rate but, since they contain only one check digit per block, they are unable to do more than detect an odd number of channel errors.
There are other codes which have moderate information rate and moderate error-correction/detection capability, and we will study a few of them.
These codes are classified into two major categories: block codes and convolutional codes.
In block codes, a block of k information digits is encoded into a codeword of n digits (n > k). For each sequence of k information digits there is a distinct codeword of n digits.
In convolutional codes, the coded sequence of n digits depends not only on the k information digits but also on the previous N - 1 information digits (N > 1). Hence the coded sequence for a certain k information digits is not unique but depends on the N - 1 earlier information digits.
In block codes, k information digits are accumulated and then encoded into an n-digit codeword. In convolutional codes, the coding is done on a continuous, or running, basis rather than by accumulating k information digits.
We will start by studying block codes (and if there is time we might come back to study convolutional codes).
BLOCK CODES
The block encoder input is a stream of information digits. The encoder segments the input information digit stream into blocks of k information digits; for each block it calculates r check digits and outputs a codeword of n digits, where n = k + r (or r = n - k).
The code efficiency (also known as the code rate) is k/n.
Such a block code is denoted as an (n,k) code.
Block codes in which the k information digits are transmitted unaltered first, followed by the transmission of the r check digits, are called systematic codes, as shown in figure 1 below.
Since systematic block codes simplify implementation of the decoder and are always used in practice, we will consider only systematic codes in our studies.
(A non-systematic block code is one which has the check digits interspersed between the information digits. For linear block codes it can be shown that a non-systematic block code can always be transformed into a systematic one.)
C1 C2 ..... Ck | Ck+1 ..... Cn-1 Cn
(k information digits followed by r check digits)
Figure 1: an (n,k) block codeword in systematic form
LINEAR BLOCK CODES
Linear block codes are a class of parity-check codes that can be characterized by the (n, k) notation described earlier.
The encoder transforms a block of k information digits (an information word) into a longer block of n codeword digits, constructed from a given alphabet of elements. When the alphabet consists of two elements (0 and 1), the code is a binary code comprised of binary digits (bits). Our discussion of linear block codes is restricted to binary codes.
Again, the k-bit information words form 2^k distinct information sequences, referred to as k-tuples (sequences of k digits). An n-bit block can form as many as 2^n distinct sequences, referred to as n-tuples.
The encoding procedure assigns to each of the 2^k information k-tuples one of the 2^n n-tuples. A block code represents a one-to-one assignment, whereby the 2^k information k-tuples are uniquely mapped into a new set of 2^k codeword n-tuples; the mapping can be accomplished via a look-up table, or via some encoding rules that we will study shortly.
Definition:
An (n, k) binary block code is said to be linear if, and only if, the modulo-2 addition (Ci ⊕ Cj) of any two codewords, Ci and Cj, is also a codeword. This property means that (for a linear block code) the all-zero n-tuple must be a member of the code book (because the modulo-2 addition of a codeword with itself results in the all-zero n-tuple).
A linear block code, then, is one in which n-tuples outside the code book cannot be created by the modulo-2 addition of legitimate codewords (members of the code book).
For example, the set of all 2^4 = 16 4-tuples (or 4-bit sequences) is shown below:
0000 0001 0010 0011 0100 0101 0110 0111
1000 1001 1010 1011 1100 1101 1110 1111
An example of a block code (which is really a subset of the above set) that forms a linear
code is
0000 0101 1010 1111
It is easy to verify that the modulo-2 addition of any two of these 4 codewords in the code book can
only yield one of the other members of the code book, and since the all-zero n-tuple is a
codeword, this code is a linear binary block code.
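This linearity check can be carried out mechanically. Here is a minimal Python sketch (not part of the original notes; representing codewords as bit strings is just a convenience):

```python
# Check linearity of the small (4,2) code book from the text.

def xor_words(a, b):
    """Modulo-2 (bitwise XOR) addition of two equal-length binary strings."""
    return "".join("1" if x != y else "0" for x, y in zip(a, b))

code_book = {"0000", "0101", "1010", "1111"}

# A code is linear iff the XOR of every pair of codewords is again a codeword.
is_linear = all(xor_words(a, b) in code_book
                for a in code_book for b in code_book)
print(is_linear)            # True
print("0000" in code_book)  # the all-zero word must be present: True
```

Note that the all-zero word falls out automatically: each codeword XOR-ed with itself gives 0000.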
Figure 5.13 illustrates, with a simple geometric analogy, the structure behind linear block
codes. We can imagine the total set comprised of 2^n n-tuples. Within this set (also called a
vector space) there exists a subset of 2^k n-tuples comprising the code book. These 2^k
codewords or points, shown in bold, "sprinkled" among the more numerous 2^n points,
represent the legitimate or allowable codeword assignments.
An information sequence is encoded into one of the 2^k allowable codewords and then
transmitted. Because of noise in the channel, a corrupted version of the sent codeword
(one of the other 2^n n-tuples in the total n-tuple set) may be received.
The objective of coding is that the decoder should be able to decide whether the received
word is a valid codeword, or whether it is a codeword that has been corrupted by noise
(i.e. detect the occurrence of one or more errors). Ideally, of course, the decoder should be able
to decide which codeword was sent even if this transmitted codeword was corrupted by noise,
and this process is called error correction.
On reflection, if one is going to attempt to correct errors in a received word
represented by a sequence of n binary symbols, then it is absolutely essential not to allow
all 2^n n-tuples to be legitimate codewords.
If, in fact, every possible sequence of n binary symbols were a legitimate codeword, then in
the presence of noise one or more binary symbols could be changed, and one would have no
possible basis for determining whether a received sequence was any more valid than any other
sequence.
Carrying this thought a little further, if one wished the coding system to correct the
occurrence of a single error, then it is both necessary and sufficient that each codeword
sequence differs from every other codeword in at least 3 positions.
In fact, if one wished the coding system to correct the occurrence of e errors, then it
is both necessary and sufficient that each codeword sequence differs from every other
codeword in at least (2e + 1) positions.
DEFINITION
The number of positions in which any two codewords differ from each other is called the
Hamming distance, and is normally denoted by d.
For example:
Looking at the (n,k) = (4,2) binary linear block code, mentioned earlier, which has the
following codewords:
C1 0000
C2 0101
C3 1010
C4 1111
we see that the Hamming distance, d:
between C2 and C3 is equal to 4
between C2 and C4 is equal to 2
between C3 and C4 is equal to 2
We also observe that the Hamming distance between C1 and any of the other codewords is
equal to the "weight", that is, the number of ones in each of the other codewords.
We can also see that the minimum Hamming distance (i.e. the smallest Hamming distance
between any pair of the codewords), denoted by dmin, of this code is equal to 2.
(The minimum Hamming distance of a binary linear block code is simply equal to the
minimum weight of its non-zero codewords. This is due to the fact that the code is linear, meaning that
if any two codewords are added together modulo-2 the result will be another codeword; thus
to find the minimum Hamming distance of a linear block code all we need to do is find the
minimum-weight non-zero codeword).
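Both the pairwise distances and the minimum-weight shortcut can be checked with a few lines of Python (an illustrative sketch, not part of the original notes):

```python
# Hamming distance between codewords, and the minimum distance of a linear
# code obtained via its minimum-weight non-zero codeword.

def hamming_distance(a, b):
    """Count the positions in which two equal-length binary strings differ."""
    return sum(x != y for x, y in zip(a, b))

code_book = ["0000", "0101", "1010", "1111"]

print(hamming_distance("0101", "1010"))  # 4
print(hamming_distance("0101", "1111"))  # 2

# For a linear code, d_min equals the smallest weight (number of ones)
# among the non-zero codewords -- no pairwise search needed.
d_min = min(c.count("1") for c in code_book if c != "0000")
print(d_min)  # 2
```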
Look at the above code again, keeping in mind what we said earlier about the
Hamming-distance property required of the codewords for a code to correct a single error.
We said that, to correct a single error, this code must have each of its codewords differing
from every other codeword in at least (2e + 1) positions, where e in our case is 1 (i.e. a single
error). That is, the minimum Hamming distance of the code must be at least 3. Therefore the
above-mentioned code cannot correct the occurrence of a single error (since its dmin = 2),
but it can detect it.
To explain this further, consider the two cases illustrated in Figure 2:

FIGURE 2
a) A Hamming sphere of radius e = 1 is drawn around each codeword, C1 and C2.
   Hamming distance between codewords = 3.
   The code can correct a single error, since d = 2e + 1.
b) A Hamming sphere of radius e = 1 is drawn around each codeword, C1 and C2.
   Hamming distance between codewords = 2.
   The code can only detect e = 1 error but cannot correct it,
   because d = e + 1 (i.e. d < 2e + 1).
Imagine that we draw a sphere (called a Hamming sphere) of radius e = 1 around each
codeword. This sphere will contain all n-tuples which are at a distance 1 away from that
codeword (i.e. all n-tuples which differ from this codeword in one position).
If the minimum Hamming distance of the code is dmin < 2e + 1 (as in figure 2b, where d = 2),
the occurrence of a single error will result in changing the codeword to the next n-tuple, and
the decoder does not have enough information to decide whether codeword C1 or C2 was
transmitted. The decoder, however, can detect that an error has occurred.
If we look at figure 2a we see that the code has dmin = 2e + 1, and that the occurrence of a
single error results in the next n-tuple being received; in this case the decoder can make
an unambiguous decision, based on what is called the nearest-neighbour decoding rule, as to
which of the two codewords was transmitted.
If the corrupted received n-tuple is not too unlike (not too distant from) a valid codeword,
the decoder can decide that the transmitted codeword was the codeword "nearest
in distance" to the received word.
Thus, in general, we can say that a binary linear code will correct e errors
if dmin = 2e + 1 (for odd dmin) or if dmin = 2e + 2 (for even dmin).
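Both cases collapse to the single formula e = ⌊(dmin − 1)/2⌋, which can be sketched in one line of Python (the helper name is illustrative, not from the notes):

```python
# Guaranteed error-correcting capability e of a code with minimum distance
# d_min: d_min = 2e + 1 (odd) or d_min = 2e + 2 (even) both give
# e = (d_min - 1) // 2 under integer division.

def correctable_errors(d_min):
    return (d_min - 1) // 2

print(correctable_errors(2))  # 0 -> detect a single error only
print(correctable_errors(3))  # 1
print(correctable_errors(4))  # 1
```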
A (6, 3) Linear Block Code Example
Examine the following coding assignment that describes a (6, 3) code. There are 2^k = 2^3 = 8
information words, and therefore eight codewords.
There are 2^n = 2^6 = 64 6-tuples in the total 6-tuple set (or vector space).
Information digits      Codewords
C1 C2 C3                C1 C2 C3 C4 C5 C6
000                     000000
001                     001011
010                     010110
011                     011101
100                     100101
101                     101110
110                     110011
111                     111000

The parity-check equations for this code are
c4 = c1 ⊕ c2
c5 = c2 ⊕ c3
c6 = c1 ⊕ c3

and its H matrix is

H = [1 1 0 1 0 0]
    [0 1 1 0 1 0]
    [1 0 1 0 0 1]
It is easy to check that the eight codewords shown above form a linear code (the all-zeros
codeword is present, and the modulo-2 sum of any two codewords is another codeword, a
member of the code). Therefore, these codewords represent a linear binary block code.
It is also easy enough to check that the minimum Hamming distance of the code is dmin = 3;
thus we conclude that this code is a single-error-correcting code, since
dmin = 2e + 1 (for odd dmin).
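The table above can be reproduced directly from the parity-check equations; here is a short Python sketch (function name illustrative, not from the notes):

```python
from itertools import product

# Encoder for the (6,3) code of the text: the check digits follow the
# parity equations c4 = c1 + c2, c5 = c2 + c3, c6 = c1 + c3 (mod 2).
def encode_6_3(c1, c2, c3):
    return (c1, c2, c3, c1 ^ c2, c2 ^ c3, c1 ^ c3)

# Generate the full code book by encoding all eight 3-bit information words.
code_book = [encode_6_3(*m) for m in product((0, 1), repeat=3)]
print(encode_6_3(1, 1, 0))  # (1, 1, 0, 0, 1, 1)  i.e. codeword 110011

# d_min = minimum weight among the non-zero codewords.
d_min = min(sum(c) for c in code_book if any(c))
print(d_min)  # 3
```

The printed minimum distance of 3 confirms the single-error-correcting capability claimed above.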
In the simple case of single-parity-check codes, the single parity digit was chosen to be the
modulo-2 sum of all the information digits.
Linear block codes contain several check digits, and each check digit is the
modulo-2 sum of some (or all) of the information digits.
Let us consider the (6, 3) code, i.e. n = 6, k = 3, and there are r = n − k = 3 check digits.
We shall label the three information digits C1, C2, C3 and the three check digits C4, C5
and C6.
Let us choose to calculate the check digits from the information digits according to the
following rules (each of these equations must be independent of any or all of the others):
C4 = C1 ⊕ C2
C5 = C2 ⊕ C3
C6 = C1 ⊕ C3
or in matrix notation

[C4]   [1 1 0] [C1]
[C5] = [0 1 1] [C2]
[C6]   [1 0 1] [C3]
The full codeword consists of the digits C1, C2, C3, C4, C5, C6.
Generally the n-tuple codeword is denoted as C = [C1, C2, C3, C4, C5, C6].
Every codeword must satisfy the parity-check equations
C1 ⊕ C2 ⊕ C4 = 0
C2 ⊕ C3 ⊕ C5 = 0
C1 ⊕ C3 ⊕ C6 = 0
or in matrix notation

[1 1 0 1 0 0]   [C1]   [0]
[0 1 1 0 1 0] · [C2] = [0]
[1 0 1 0 0 1]   [C3]   [0]
                [C4]
                [C5]
                [C6]

which can be written a little more compactly as

H Ct = 0,  where  H = [1 1 0 1 0 0]
                      [0 1 1 0 1 0]
                      [1 0 1 0 0 1]

Here Ct denotes the column vector which is the transpose of the codeword C.
Suppose the codeword C = [110011] was transmitted, but the word R = [110111] was
received, the fourth bit having been inverted by noise.
We can say that the error pattern was E = [000100], so that R = C ⊕ E.
If we multiply the transpose of the received word by the parity-check matrix H,
what do we get?
H Rt = H (C ⊕ E)t = H Ct ⊕ H Et = H Et = St     (since H Ct = 0)
The r-tuple S = [S1, S2, S3] is called the syndrome.
This shows that the syndrome test, whether performed on the corrupted received word
or on the error pattern that caused it, yields the same syndrome.
Since the syndrome digits are defined by the same equations as the parity-check equations,
the syndrome digits reveal the parity-check failures on the received codeword. (This happens
because the code is linear. An important property of linear block codes, fundamental to the
decoding process, is that the mapping between correctable error patterns and syndromes is
one-to-one, and this means that we can not only detect an error but also correct it.)
For example, using the received word given above, R = [110111]:

H Rt = [1 1 0 1 0 0]   [1]   [1]
       [0 1 1 0 1 0] · [1] = [0] = St
       [1 0 1 0 0 1]   [0]   [0]
                       [1]
                       [1]
                       [1]

where S = [S1, S2, S3] = [100]
and, as we can see, this points to the fourth bit being in error (the syndrome equals the
fourth column of H).
Now all the decoder has to do (after calculating the syndrome) is to invert the fourth bit
position in the received word to produce the codeword that was sent, i.e. C = [110011].
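The whole procedure (compute the syndrome, match it against a column of H, invert that bit) can be sketched in a few lines of Python; the H matrix is the one derived for the (6,3) code above, and the function names are illustrative only:

```python
# Syndrome decoding for the (6,3) code: S = H Rt; if S matches column j of H,
# bit j of the received word is inverted.

H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(r):
    """Compute S = H Rt over GF(2)."""
    return [sum(h * b for h, b in zip(row, r)) % 2 for row in H]

def correct(r):
    """Correct a single error located by matching S against a column of H."""
    s = syndrome(r)
    if s == [0, 0, 0]:
        return list(r)            # no error detected
    cols = list(zip(*H))          # columns of H
    j = cols.index(tuple(s))      # position of the (single) error
    fixed = list(r)
    fixed[j] ^= 1                 # invert the erroneous bit
    return fixed

r = [1, 1, 0, 1, 1, 1]            # received word from the text
print(syndrome(r))                # [1, 0, 0] -> fourth column of H
print(correct(r))                 # [1, 1, 0, 0, 1, 1] i.e. codeword 110011
```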
Having obtained a feel for what channel coding and decoding are about, let us apply this
knowledge to a particular type of linear binary block code: the Hamming codes.
HAMMING CODES
These are linear binary single-error-correcting codes having the property that the columns of
the parity-check matrix, H, consist of all the distinct non-zero r-tuples of binary digits.
Thus a Hamming code has as many parity-check matrix columns as there are single-error
patterns; these codes will correct all patterns of single errors in any transmitted codeword.
These codes have n = k + r, where n = 2^r − 1 and k = 2^r − 1 − r.
These codes have a guaranteed minimum Hamming distance dmin = 3.
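Since the columns of H are exactly the 2^r − 1 non-zero r-tuples, a parity-check matrix for any r can be built programmatically. A Python sketch (the binary-counting column order used here is an arbitrary choice; the text's H uses a systematic arrangement):

```python
# Build a Hamming parity-check matrix for a given r.  Its columns are all
# 2^r - 1 distinct non-zero r-bit patterns, so n = 2^r - 1 and k = n - r.

def hamming_parity_check(r):
    # Column j (for j = 1 .. 2^r - 1) is the r-bit binary representation of j.
    cols = [[(j >> (r - 1 - i)) & 1 for i in range(r)]
            for j in range(1, 2 ** r)]
    return [list(row) for row in zip(*cols)]   # r rows, n columns

H = hamming_parity_check(3)
n = len(H[0])
k = n - len(H)
print(n, k)  # 7 4  -> the (7,4) Hamming code
```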
For example, the parity-check matrix for the (7,4) Hamming code is

H = [1 1 1 0 1 0 0]
    [1 1 0 1 0 1 0]
    [1 0 1 1 0 0 1]

a) Determine the codeword for the information sequence 0011.
b) If the received word, R, is 1000010, determine if an error has occurred. If it has, find the
correct codeword.
Solution:
a) Since H Ct = 0, we can use this equation to calculate the parity digits for the given
information sequence as follows:

[1 1 1 0 1 0 0]   [C1]   [0]
[1 1 0 1 0 1 0] · [C2] = [0]
[1 0 1 1 0 0 1]   [C3]   [0]
                  [C4]
                  [C5]
                  [C6]
                  [C7]

Substituting the information digits C1 C2 C3 C4 = 0011:

[1 1 1 0 1 0 0]   [0]    [0]
[1 1 0 1 0 1 0] · [0]  = [0]
[1 0 1 1 0 0 1]   [1]    [0]
                  [1]
                  [C5]
                  [C6]
                  [C7]
By multiplying out the first row of H against the transpose of the codeword we get
1·0 ⊕ 1·0 ⊕ 1·1 ⊕ 0·1 ⊕ 1·C5 ⊕ 0·C6 ⊕ 0·C7 = 0
0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C5 ⊕ 0 ⊕ 0 = 0
i.e. 1 ⊕ C5 = 0, and C5 = 1.
Similarly, by multiplying out the second row of the H matrix by the transpose of the
codeword, we obtain
1·0 ⊕ 1·0 ⊕ 0·1 ⊕ 1·1 ⊕ 0·C5 ⊕ 1·C6 ⊕ 0·C7 = 0
0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C6 ⊕ 0 = 0
i.e. 1 ⊕ C6 = 0, and C6 = 1.
Similarly, by multiplying out the third row of the H matrix by the transpose of the codeword,
we obtain
1·0 ⊕ 0·0 ⊕ 1·1 ⊕ 1·1 ⊕ 0·C5 ⊕ 0·C6 ⊕ 1·C7 = 0
0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ C7 = 0
i.e. 1 ⊕ 1 ⊕ C7 = 0, and C7 = 0.
so that the codeword is
C = [C1,C2,C3, C4,C5,C6, C7] = 0011110
b) To find whether an error has occurred or not we use the following equation:
H Rt = St. If the syndrome is zero then no error has occurred; if not, an error
has occurred and is pinpointed by the syndrome.
Thus, to compute the syndrome, we multiply out the rows of H by the transpose of the
received word:
H Rt = [1 1 1 0 1 0 0]   [1]   [1]
       [1 1 0 1 0 1 0] · [0] = [0] = St
       [1 0 1 1 0 0 1]   [0]   [1]
                         [0]
                         [0]
                         [1]
                         [0]

Because the syndrome is the third column of the parity-check matrix, the third position of the
received word is in error and the correct codeword is 1010010.
The Generator Matrix of a linear binary block code
We saw above that the parity-check matrix of a systematic linear binary block code can be
written in the following (n − k) by n matrix form:
H = [h  In-k]
where In-k is the (n − k) × (n − k) identity matrix.
The generator matrix of this same code is written in the following k by n matrix form:
G = [Ik  ht]
where Ik is the k × k identity matrix and ht is the transpose of h.
The generator matrix is useful in obtaining the codeword from the information sequence
according to the following formula:
C = mG
where
C is the codeword [C1, C2, ..., Cn-1, Cn],
m is the information digit sequence [m1, m2, ..., mk], and
G is the generator matrix of the code as given by the formula for G above.
Thus, if we consider the single-error-correcting (n,k) = (7,4) Hamming code discussed
previously, its parity-check matrix was

H = [1 1 1 0 1 0 0]
    [1 1 0 1 0 1 0]
    [1 0 1 1 0 0 1]

and thus its generator matrix would be

G = [1 0 0 0 1 1 1]
    [0 1 0 0 1 1 0]
    [0 0 1 0 1 0 1]
    [0 0 0 1 0 1 1]
Now, if we had an information sequence given by the digits 0011, the codeword
would be given by C = mG, i.e.

C = [0 0 1 1] · [1 0 0 0 1 1 1]
                [0 1 0 0 1 1 0]
                [0 0 1 0 1 0 1]
                [0 0 0 1 0 1 1]

  = [0 0 1 1 1 1 0]
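Over GF(2), the product C = mG amounts to XOR-ing together the rows of G selected by the 1s of m. A small Python sketch (function name illustrative, not from the notes):

```python
# Encoding C = m G (mod 2): each codeword is the modulo-2 sum of the rows
# of G selected by the 1s of the information sequence m.

G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode(m, G):
    n = len(G[0])
    c = [0] * n
    for bit, row in zip(m, G):
        if bit:                                  # row selected by a 1 in m
            c = [x ^ y for x, y in zip(c, row)]  # modulo-2 (XOR) addition
    return c

print(encode([0, 0, 1, 1], G))  # [0, 0, 1, 1, 1, 1, 0]
```

The result is the same codeword 0011110 obtained from the parity-check equations in part a) of the worked example.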
Thus the (n,k) = (7,4) Hamming code generator matrix and the code book:

H = [h  In-k]
G = [Ik  ht]

G = [1 0 0 0 1 1 1]   row 1
    [0 1 0 0 1 1 0]   row 2
    [0 0 1 0 1 0 1]   row 3
    [0 0 0 1 0 1 1]   row 4

#    rows combined (modulo-2)      codeword
1    row1                          1000111
2    row2                          0100110
3    row3                          0010101
4    row4                          0001011
5    row1 ⊕ row2                   1100001
6    row1 ⊕ row3                   1010010
7    row1 ⊕ row4                   1001100
8    row2 ⊕ row3                   0110011
9    row2 ⊕ row4                   0101101
10   row3 ⊕ row4                   0011110
11   row1 ⊕ row2 ⊕ row3            1110100
12   row1 ⊕ row2 ⊕ row4            1101010
13   row1 ⊕ row3 ⊕ row4            1011001
14   row2 ⊕ row3 ⊕ row4            0111000
15   row1 ⊕ row2 ⊕ row3 ⊕ row4     1111111
16   any row ⊕ itself              0000000
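This code book can be regenerated by encoding all 2^4 = 16 information sequences; a Python sketch (names illustrative) that also reconfirms dmin = 3 from the minimum non-zero weight:

```python
from itertools import product

# Generate the complete (7,4) Hamming code book via C = m G (mod 2),
# then confirm d_min = 3 from the minimum-weight non-zero codeword.

G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode(m):
    c = [0] * 7
    for bit, row in zip(m, G):
        if bit:
            c = [x ^ y for x, y in zip(c, row)]
    return tuple(c)

code_book = [encode(m) for m in product((0, 1), repeat=4)]
print(len(code_book))                             # 16 codewords
d_min = min(sum(c) for c in code_book if any(c))  # minimum non-zero weight
print(d_min)                                      # 3
```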