An Introduction to Digital Communications - Part 2


    INFORMATION THEORY

    Information theory deals with mathematical modelling and analysis of a communications

    system rather than with physical sources and physical channels.

Specifically, given an information source and a noisy channel, information theory provides limits on:

1- The minimum number of bits per symbol required to fully represent the source (i.e. the efficiency with which information from a given source can be represented).

2- The maximum rate at which reliable (error-free) communications can take place over the noisy channel.

    Since the whole purpose of a communications system is to transport information from a

    source to a destination, the question arises as to how much information can be transmitted in

a given time. (Normally the goal would be to transmit as much information as possible in as small a time as possible such that this information can be correctly interpreted at the destination.) This of course leads to the next question, which is:

    How can information be measured and how do we measure the rate at which information is

    emitted from a source ?

Suppose that we observe the output emitted by a discrete source (every unit interval or signalling interval).

    The source output can be considered as a set, S, of discrete random events ( or outcomes).

    These events are symbols from a fixed finite alphabet.

(For example, the set or alphabet can be the numbers 1 to 6 on a die, and each roll of the die outputs a symbol, being the number on the die's upper face when the die comes to rest.

Another example is a digital binary source, where the alphabet is the digits "0" and "1", and the source outputs a symbol of either "0" or "1" at random.)

If, in general, we consider a discrete random source which outputs symbols from a fixed finite alphabet of k symbols, then the set S contains all k symbols and we can write

S = { s0, s1, s2, ......., sk-1 }, where each symbol si is emitted with probability p(si), such that

0 ≤ p(si) ≤ 1  and  Σ(i=0 to k-1) p(si) = 1        (3.1)

In addition we assume that the symbols emitted by the source during successive signalling intervals are statistically independent, i.e. the probability of any symbol being emitted in any signalling interval does not depend on the previously emitted symbols. In other words, we have what is called a discrete memoryless source.

    Can we find a measure of how much "information" is produced by this source ?

    The idea of information is closely related to that of "uncertainty" and "surprise".


    If the source emits an output si, which has a probability of occurrence p(si) = 1, then all other

    symbols of the alphabet have a zero probability of occurrence and there is really no

    "uncertainty", "surprise", or information since we already know before hand ( a priori) what

    the output symbol will be.

If on the other hand the source symbols occur with different probabilities, and the probability p(si) is low, then there is more "uncertainty", "surprise", and therefore "information" when the symbol si is emitted by the source, rather than another one with higher probability.

    Thus the words "uncertainty", "surprise", and "information" are all closely related.

- Before the output si occurs, there is an amount of "uncertainty".

- When the output si occurs, there is an amount of "surprise".

- After the occurrence of the output si, there is a gain in the amount of "information".

All three amounts are really the same, and we can see that the amount of information is related to the inverse of the probability of occurrence of the symbol.

    Definition:

The amount of information gained after observing the event si, which occurs with probability p(si), is

I(si) = log2[ 1 / p(si) ]  bits,  for i = 0, 1, 2, ..., (k-1)        (3.2)

The unit of information is called the "bit", a contraction of "binary digit".

    This definition exhibits the following important properties that are intuitively satisfying:

1- I(si) = 0 for p(si) = 1

i.e. if we are absolutely certain of the output of the source even before it occurs (a priori), then there is no information gained.

2- I(si) ≥ 0 because 0 ≤ p(si) ≤ 1 for symbols of the alphabet.

i.e. the occurrence of an output si either provides some information or no information, but never brings about a loss of information (unless it is a severe blow to the head, which is highly unlikely from the discrete source!)

3- I(sj) > I(si) for p(sj) < p(si)

i.e. the lower the probability of occurrence of an output, the more information we gain when it occurs.

4- I(sj si) = I(sj) + I(si) if the outputs sj and si are statistically independent.


    The use of the logarithm to the base 2 ( instead of to the base 10 or to the base e ) has been

    adopted in the measure of information because usually we are dealing with digital binary

    sources, (however it is useful to remember that log2(a) = 3.322 log10(a)). Thus if the source

    alphabet was the binary set of symbols, i.e. "0" or "1" , and each symbol was equally likely to

    occur i.e. s0 having p(s0) = 1/2 and s1 having p(s1) = 1/2

we have:

I(si) = log2[ 1 / p(si) ] = log2[ 1 / (1/2) ] = log2(2) = 1 bit

Hence "one bit" is the amount of information that is gained when one of two possible and equally likely (equiprobable) outputs occurs.

    [Note that a "bit" is also used to refer to a binary digit when dealing with the transmission of

    a sequence of 1's and 0's].
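As a quick numerical check of equation (3.2), the short Python sketch below (an illustration, not part of the original notes; the helper name info_bits is an assumed label) computes I(si) for a couple of probabilities.

import math

def info_bits(p):
    # Amount of information I(s) = log2(1/p) gained when a symbol of probability p occurs.
    return math.log2(1.0 / p)

print(info_bits(0.5))    # 1.0 -- an equiprobable binary symbol carries exactly 1 bit
print(info_bits(0.125))  # 3.0 -- a rarer symbol (p = 1/8) carries more information/"surprise"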

    Entropy

The amount of information, I(si), associated with the symbol si emitted by the source during a signalling interval depends on the symbol's probability of occurrence. In general, each source symbol has a different probability of occurrence. Since the source can emit any one of the symbols of its alphabet, a measure for the average information content per source symbol was defined and called the entropy of the discrete source, H (i.e. taking all the discrete source symbols into account).

Definition:
The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ......., sk-1 }, is a measure of the average information content per source symbol, and is given by:

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]   bits/symbol        (3.3)

We note that the entropy, H, of a discrete memoryless source with an alphabet of k symbols is bounded as follows:

0 ≤ H ≤ log2 k , where k is the number of source symbols.

Furthermore, we may state that:

1- H = 0, if and only if the probability p(si) = 1 for some symbol si, and the remaining source symbols' probabilities are all zero. This lower bound on entropy corresponds to no uncertainty and no information.

2- H = log2 k bits/symbol, if and only if p(si) = 1/k for all the k source symbols (i.e. they are all equiprobable). This upper bound on entropy corresponds to maximum uncertainty and maximum information.


    Example:

Calculate the entropy of a discrete memoryless source with source alphabet S = { s0, s1, s2 } with probabilities p(s0) = 1/4, p(s1) = 1/4, p(s2) = 1/2.

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]

H = p(s0) log2[ 1 / p(s0) ] + p(s1) log2[ 1 / p(s1) ] + p(s2) log2[ 1 / p(s2) ]

  = (1/4) log2(4) + (1/4) log2(4) + (1/2) log2(2)

  = 0.5 + 0.5 + 0.5 = 1.5 bits/symbol
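The entropy of equation (3.3) is easy to check numerically. The Python sketch below is a minimal illustration (the function name entropy is an assumed label); it reproduces the 1.5 bits/symbol result of the example above.

import math

def entropy(probs):
    # H = sum of p(si) * log2(1/p(si)) in bits/symbol, ignoring zero-probability symbols
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

print(entropy([1/4, 1/4, 1/2]))  # 1.5 bits/symbol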

Information Rate

If we consider that the symbols are emitted from the source at a fixed rate (one symbol per signalling interval), denoted by rs symbols/second, we can define the average source information rate R in bits per second as the product of the average information content per symbol, H, and the symbol rate rs:

R = rs H   bits/sec        (3.4)

Example:
A discrete source emits one of five symbols once every millisecond. The source symbol probabilities are 1/2, 1/4, 1/8, 1/16, and 1/16 respectively.

Find the source entropy and information rate.

H = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]  bits/symbol,  where in this case k = 5

H = p(s0) log2[ 1/p(s0) ] + p(s1) log2[ 1/p(s1) ] + p(s2) log2[ 1/p(s2) ] + p(s3) log2[ 1/p(s3) ] + p(s4) log2[ 1/p(s4) ]

  = (1/2) log2(2) + (1/4) log2(4) + (1/8) log2(8) + (1/16) log2(16) + (1/16) log2(16)

  = 0.5 + 0.5 + 0.375 + 0.25 + 0.25 = 1.875 bits/symbol

R = rs H  bits/sec

The information rate R = (1/10^-3) x 1.875 = 1875 bits/second.
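Assuming a helper like the entropy() function sketched earlier, the information rate example can be verified as follows (rs and the symbol probabilities are taken directly from the example).

import math

def entropy(probs):
    # H = sum of p * log2(1/p) in bits/symbol
    return sum(p * math.log2(1.0 / p) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/16]
rs = 1 / 1e-3                 # one symbol every millisecond -> 1000 symbols/sec
H = entropy(probs)            # 1.875 bits/symbol
R = rs * H                    # information rate, equation (3.4)
print(H, R)                   # 1.875  1875.0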


    Entropy of a Binary Memoryless Source:

To illustrate the properties of H, let us consider a memoryless digital binary source for which symbol 0 occurs with probability p0 and symbol 1 with probability p1 = (1 - p0).

The entropy of such a source equals:

H = p0 log2[ 1 / p0 ] + p1 log2[ 1 / p1 ]
  = p0 log2[ 1 / p0 ] + (1 - p0) log2[ 1 / (1 - p0) ]   bits

We note that:

1- When p0 = 0, the entropy H = 0. This follows from the fact that x log x → 0 as x → 0.

2- When p0 = 1, the entropy H = 0.

3- The entropy H attains its maximum value, Hmax = 1 bit, when p0 = p1 = 1/2, that is when symbols 0 and 1 are equally probable. (i.e. H = log2 k = log2 2 = 1)

(Hmax = 1 can be verified by differentiating H with respect to p0 and equating the derivative to zero.)
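A small Python sketch of the binary entropy function (the name binary_entropy is an assumed label); it confirms numerically that H = 0 at p0 = 0 or 1 and that the maximum of 1 bit occurs at p0 = 1/2.

import math

def binary_entropy(p0):
    # H(p0) = p0*log2(1/p0) + (1-p0)*log2(1/(1-p0)), with the convention 0*log(0) = 0
    H = 0.0
    for p in (p0, 1.0 - p0):
        if p > 0:
            H += p * math.log2(1.0 / p)
    return H

for p0 in (0.0, 0.1, 0.5, 0.9, 1.0):
    print(p0, round(binary_entropy(p0), 3))   # maximum (1.0 bit) at p0 = 0.5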


    CHANNEL CAPACITY

    In Information Theory, the transmission medium is treated as an abstract and noisy filter

    called the channel. The maximum rate of information transmission through a channel is

    called the channel capacity, C.

Channel Coding Theorem

Shannon showed that, if the information rate R [remember that R = rs H bits/sec] is equal to or less than C, i.e. R ≤ C, then there exists a coding technique which enables transmission over the noisy channel with an arbitrarily small frequency of errors.

[A converse to this theorem states that it is not possible to transmit messages without error if R > C.]

    Thus the channel capacity is defined as the maximum rate of reliable (error-free) information

    transmission through the channel.

    Now consider a binary source with an available alphabet of k discrete messages (or symbols)

    which are equiprobable and statistically independent (these messages could be either single

    digit symbols or could be composed of several digits each depending on the situation). We

    assume that each message sent can be identified at the receiver; therefore this case is often

called the discrete noiseless channel. The maximum entropy of the source is log2 k bits, and if T is the transmission time of each message (i.e. rs = 1/T symbols/sec), the channel capacity is

C = R = rs H = rs log2 k   bits per second.

To attain this maximum the messages must be equiprobable and statistically independent. These conditions form a basis for the coding of the information to be transmitted over the channel.

In the presence of noise, the capacity of this discrete channel decreases as a result of the errors made in transmission.

    In making comparisons between various types of communications systems, it is convenient to

    consider a channel which is described in terms of bandwidth and signal-to-noise ratio.


Review of Signal to Noise Ratio

The analysis of the effect of noise on digital transmission will be covered later on in this course but, before proceeding, we will review the definition of signal to noise ratio. It is defined as the ratio of signal power to noise power at the same point in a system. It is normally measured in decibels.

Signal to Noise Ratio (dB) = 10 log10( S / N )  dB

    Noise is any unwanted signal. In electrical terms it is any unwanted introduction of energy

    tending to interfere with the proper reception and reproduction of transmitted signals.

Channel Capacity Theorem

Bit errors and signal bandwidths are of prime importance when designing a communications system. In digital transmission systems noise may change the value of the transmitted digit during transmission (e.g. change a high voltage to a low voltage or vice versa).

This raises the question: Is it possible to invent a system with no bit errors at the output even when noise is introduced into the channel? Shannon's Channel Capacity Theorem (also called the Shannon-Hartley Theorem) answers this question:

C = B log2(1 + S/N)   bits per second,

where C is the channel capacity, B is the channel bandwidth in hertz and S/N is the signal-to-noise power ratio (watts/watts, not dB).

    Although this formula is restricted to certain cases (in particular certain types of random

    noise), the result is of widespread importance to communication systems because many

    channels can be modelled by random noise.

    From the formula, we can see that the channel capacity, C, decreases as the available

    bandwidth decreases. C is also proportional to the log of (1+S/N), so as the signal to noise

    level decreases C also decreases.

    The channel capacity theorem is one of the most remarkable results of information theory. In

    a single formula, it highlights the interplay between three key system parameters: Channel

    bandwidth, average transmitted power (or, equivalently, average received power), and noise

    at the channel output.

The theorem implies that, for a given average transmitted power S and channel bandwidth B, we can transmit information at the rate of C bits per second with arbitrarily small probability of error by employing sufficiently complex encoding systems. It is not possible to transmit at a rate higher than C bits per second by any encoding system without a definite probability of error.


    Hence, the channel capacity theorem defines the fundamental limit on the rate of error-free

    transmission for a power-limited, band-limited Gaussian channel. To approach this limit,

    however, the noise must have statistical properties approximating those of white Gaussian

    noise.
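The Shannon-Hartley formula is straightforward to evaluate numerically; the Python sketch below (an illustrative helper with an assumed name, including the dB-to-ratio conversion) applies it to the 3.4 kHz, 30 dB voice-grade telephone channel used in problem 1(a) below.

import math

def channel_capacity(bandwidth_hz, snr_db):
    # Shannon-Hartley: C = B * log2(1 + S/N), with S/N given in dB and converted to a power ratio
    snr_ratio = 10 ** (snr_db / 10.0)
    return bandwidth_hz * math.log2(1.0 + snr_ratio)

# Voice-grade telephone channel: B = 3.4 kHz, S/N = 30 dB (a power ratio of 1000).
print(channel_capacity(3400, 30))   # roughly 33.9 kbits/sec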

    Problems:

    1. A voice-grade channel of the telephone network has a bandwidth of 3.4 kHz.

    (a) Calculate the channel capacity of the telephone channel for a signal-to-noise ratio of 30

    dB.

    (b) Calculate the minimum signal-to-noise ratio required to support information transmission

    through the telephone channel at the rate of 4800 bits/sec.

    (c) Calculate the minimum signal-to-noise ratio required to support information transmission

    through the telephone channel at the rate of 9600 bits/sec.

    2. Alphanumeric data are entered into a computer from a remote terminal through a voice-

    grade telephone channel. The channel has a bandwidth of 3.4 kHz, and output signal-to-noise

ratio of 20 dB. The terminal has a total of 128 symbols. Assume that the symbols are equiprobable, and that successive transmissions are statistically independent.

    (a) Calculate the channel capacity.

    (b) Calculate the maximum symbol rate for which error-free transmission over the channel is

    possible.

3. A black-and-white television picture may be viewed as consisting of approximately 3 x 10^5 elements, each one of which may occupy one of 10 distinct brightness levels with equal probability. Assume (a) the rate of transmission is 30 picture frames per second, and (b) the

    signal-to-noise ratio is 30 dB.

    Using the channel capacity theorem, calculate the minimum bandwidth required to support

    the transmission of the resultant video signal.

4. What is the minimum time required for the facsimile transmission of one picture over a standard telephone circuit?

There are about 2.25 x 10^6 picture elements to be transmitted and 12 brightness levels are to be used for good reproduction. Assume all brightness levels are equiprobable. The telephone circuit has a 3-kHz bandwidth and a 30-dB signal-to-noise ratio (these are typical parameters).


    THE BINARY SYMMETRIC CHANNEL

    Usually when a "1" or a "0" is sent it is received as a "1" or a "0", but occasionally a "1" will

    be received as a "0" or a "0" will be received as a "1".

Let's say that on average 1 out of 100 digits will be received in error, i.e. there is a probability p = 1/100 that the channel will introduce an error. This is called a Binary Symmetric Channel (BSC), and is represented by the following diagram.

[Figure: Representation of the Binary Symmetric Channel with an error probability of p. Input 0 is received as 0 with probability (1-p) and as 1 with probability p; input 1 is received as 1 with probability (1-p) and as 0 with probability p.]

Now let us consider the use of this BSC model.

Say we transmit one information digit coded with a single even parity bit. This means that if the information digit is 0 then the codeword will be 00, and if the information digit is a 1 then the codeword will be 11.

As the codeword is transmitted through the channel, the channel may (or may not) introduce an error according to the following error patterns:

    E = 00 i.e. no errors

    E = 01 i.e. a single error in the last digit

    E = 10 i.e. a single error in the first digit

    E = 11 i.e. a double error

    The probability of no error , is the probability of receiving the second transmitted digit

    correctly on condition that the first transmitted digit was received correctly.

    Here we have to remember our discussion on joint probability:

    p(AB) = p(A) p(B/A) = p(A) p(B) when the occurrence of any of the two outcomes is

    independent of the occurrence of the other.

Thus the probability of no error is equal to the probability of receiving each digit correctly.

    This probability, according to the BSC model, is equal to (1 - p), where p is the probability of

    one digit being received incorrectly.

Thus the probability of no error = (1 - p)(1 - p) = (1 - p)^2.

Similarly, the probability of a single error in the first digit = p (1 - p)

and the probability of a single error in the second digit = (1 - p) p,

i.e. the probability of a single error is equal to the sum of the above two probabilities (since the two events are mutually exclusive), i.e.


the probability of a single error (when a code with block length n = 2 is used, as in this case) is equal to 2 p (1 - p).

Similarly, the probability of a double error in the above example (i.e. the error pattern E = 11) is equal to p^2.

In summary these probabilities would be

p(E = 00) = (1 - p)^2
p(E = 01) = (1 - p) p
p(E = 10) = p (1 - p)
p(E = 11) = p^2

and if we substitute p = 0.01 (given in the above example) we find that

p(E = 00) = (1 - p)^2 = 0.9801
p(E = 01) = (1 - p) p = 0.0099
p(E = 10) = p (1 - p) = 0.0099
p(E = 11) = p^2 = 0.0001

Thus the probability of a single error per codeword = (1 - p) p + p (1 - p) = 2 p (1 - p) = 0.0198.

This shows that if p < 1/2, then the probability of no error is higher than the probability of a single error occurring, which in turn is higher than the probability of a double error.

Again, if we consider a block code with block length n = 3, then the

probability of no error p(E = 000) = (1 - p)^3,
probability of an error in the first digit p(E = 100) = p (1 - p)^2,
probability of a single error per codeword p(1e) = 3 p (1 - p)^2,
probability of a double error per codeword p(2e) = (3 choose 2) p^2 (1 - p) = 3 p^2 (1 - p),
probability of a triple error per codeword p(3e) = p^3.

And again, if we have a code with block length n = 4, then the

probability of no error p(E = 0000) = (1 - p)^4,
probability of an error in the first digit p(E = 1000) = p (1 - p)^3,
probability of a single error per codeword p(1e) = 4 p (1 - p)^3,
probability of a double error per codeword p(2e) = (4 choose 2) p^2 (1 - p)^2 = 6 p^2 (1 - p)^2,
probability of a triple error per codeword p(3e) = (4 choose 3) p^3 (1 - p) = 4 p^3 (1 - p),
probability of four errors per codeword p(4e) = p^4.

And again, if we have a code with block length n = 5, then the

probability of no error p(E = 00000) = (1 - p)^5,
probability of an error in the first digit p(E = 10000) = p (1 - p)^4,
probability of a single error per codeword p(1e) = 5 p (1 - p)^4,
probability of a double error per codeword p(2e) = (5 choose 2) p^2 (1 - p)^3 = 10 p^2 (1 - p)^3,
probability of a triple error per codeword p(3e) = (5 choose 3) p^3 (1 - p)^2 = 10 p^3 (1 - p)^2,
probability of four errors per codeword p(4e) = (5 choose 4) p^4 (1 - p) = 5 p^4 (1 - p),
probability of five errors per codeword p(5e) = p^5.


From all of this discussion, we realise that if the error pattern (of length n) has weight e, then the probability of occurrence of e errors in a codeword with block length n is

(n choose e) p^e (1 - p)^(n-e).

We also realise that, since p < 1/2, we have (1 - p) > p, and

(1 - p)^n > p (1 - p)^(n-1) > p^2 (1 - p)^(n-2) > ...

Therefore an error pattern of weight 1 is more likely to occur than an error pattern of weight 2, and so on.
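The general expression above can be evaluated directly; the Python sketch below (assumed helper name prob_errors) reproduces the n = 2, p = 0.01 numbers worked out earlier.

from math import comb

def prob_errors(n, e, p):
    # Probability of exactly e errors in a codeword of block length n over a BSC with error probability p
    return comb(n, e) * (p ** e) * ((1 - p) ** (n - e))

p = 0.01
for e in range(3):
    print(e, prob_errors(2, e, p))   # 0.9801, 0.0198 (= 2p(1-p)), 0.0001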


The Communications System from the Channel Coding Theorem point of view

[Figure: block diagram — source, Encoder, Decoder, user]


    Information Theory Summary

1- A discrete memoryless source (DMS) is one that outputs symbols taken from a fixed finite alphabet which has k symbols. These symbols form a set S = { s0, s1, s2, ..., sk-1 }, where each symbol (si) at the output of the source has a probability of occurrence p(si). (The probabilities of occurrence of the symbols are called the source statistics.)

and  Σ(i=0 to k-1) p(si) = 1

2- The amount of information gained after observing the output symbol (si), which occurs with probability p(si), is

I(si) = log2[ 1 / p(si) ]   for i = 0, 1, 2, ..., (k-1)

3- The entropy, H, of a discrete memoryless source with source alphabet composed of the set S = { s0, s1, s2, ..., sk-1 }, is a measure of the average information content per source symbol, and is given by:

H = Σ(i=0 to k-1) p(si) I(si) = Σ(i=0 to k-1) p(si) log2[ 1 / p(si) ]   bits/symbol

4- Information rate (bit rate) = symbol rate * entropy:  R = rs H  bits/sec

5- Channel capacity:  C = BW log2(1 + S/N)  bits/sec

6- BSC = Binary Symmetric Channel

7- Probability of e errors in n digits = (n choose e) p^e (1 - p)^(n-e).


    CHANNEL CODING

    Suppose that we wish to transmit a sequence of binary digits across a noisy channel. If we

    send a one, a one will probably be received; if we send a zero, a zero will probably be

received. Occasionally, however, the channel noise will cause a transmitted one to be mistakenly interpreted as a zero or a transmitted zero to be mistakenly interpreted as a one. Although we are unable to prevent the channel from causing such errors, we can reduce their undesirable effects with the use of coding.

The basic idea is simple. We take a set of k information digits which we wish to transmit, annex to them r check digits, and transmit the entire block of n = k + r channel digits.

    Assuming that the channel noise changes sufficiently few of these transmitted channel digits,

    the r check digits may provide the receiver with sufficient information to enable it to detect

    and/or correct the channel errors.

    (The detection and/or correction capability of a channel code will be discussed at some length

    in the following pages.)

Given any particular sequence of k message digits, the transmitter must have some rule for selecting the r check digits. This is called channel encoding. Any particular sequence of n digits which the encoder might transmit is called a codeword.

Although there are 2^n different binary sequences of length n, only 2^k of these sequences are codewords, because the r check digits within any codeword are completely determined by the k information digits. The set consisting of these 2^k codewords, of length n each, is called a code (sometimes referred to as a code book).

No matter which codeword is transmitted, any of the 2^n possible binary sequences of length n may be received if the channel is sufficiently noisy. Given the n received digits, the decoder must attempt to decide which of the 2^k possible codewords was transmitted.

    Repetition codes and single-parity-check codes

    Among the simplest examples of binary codes are the repetition codes, with k = 1, r arbitrary,

    and n = k + r = 1 + r . The code contains two codewords, the sequence of n zeros and the

    sequence of n ones.

    We may call the first digit the information digit; the other r digits, check digits. The value of

    each check digit (each 0 or 1) in a repetition code is identical to the value of the information

    digit. The decoder might use the following rule:

Count the number of zeros and the number of ones in the received bits. If there are more received zeros than ones, decide that the all-zero codeword was sent; if there are more ones than zeros, decide that the all-one codeword was sent. If the number of ones equals the number of zeros, do not decide (just flag the error).

This decoding rule will decode correctly in all cases when the channel noise changes less than half the digits in any one block. If the channel noise changes exactly half of the digits in any one block, the decoder will be faced with a decoding failure (i.e. it will not decode the received word into any of the possible transmitted codewords), which could result in an ARQ (automatic request to repeat the message). If the channel noise changes more than half of the digits in any one block, the decoder will commit a decoding error; i.e. it will decode the received word into the wrong codeword.

If channel errors occur infrequently, the probability of a decoding failure or a decoding error for a repetition code of long block length is very small indeed. However, repetition codes are not very useful. They have only two codewords and a very low information rate R = k/n (also called the code rate); all but one of the digits are check digits. We are usually more interested in codes which have a higher information rate. A minimal simulation of repetition encoding and majority-vote decoding is sketched below.
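This short Python sketch of repetition encoding and majority-vote decoding uses assumed helper names; ties are flagged as a decoding failure by returning None.

def repetition_encode(bit, n):
    # Repeat the single information bit n times (k = 1, r = n - 1)
    return [bit] * n

def repetition_decode(received):
    # Majority vote; return None on a tie (decoding failure / ARQ)
    ones = sum(received)
    zeros = len(received) - ones
    if ones == zeros:
        return None
    return 1 if ones > zeros else 0

codeword = repetition_encode(1, 5)      # [1, 1, 1, 1, 1]
corrupted = [1, 0, 1, 1, 0]             # two channel errors
print(repetition_decode(corrupted))     # 1 -- still decoded correctly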


Extreme examples of such very high rate codes are those which use a single parity-check digit. This check digit is taken to be the modulo-2 sum (Exclusive-OR) of the codeword's (n - 1) information digits. (The information digits are added according to the exclusive-OR binary operation: 0 + 0 = 0, 0 + 1 = 1, 1 + 0 = 1, 1 + 1 = 0.) If the number of ones in the information word is even, the modulo-2 sum of all the information digits will be equal to zero; if the number of ones in the information word is odd, their modulo-2 sum will be equal to one.

Even parity means that the total number of ones in the codeword is even; odd parity means that the total number of ones in the codeword is odd. Accordingly the parity bit (or digit) is calculated and appended to the information digits to form the codeword.

This type of code can only detect errors. A single digit error (or any number of odd digit errors) will be detected, but any combination of two digit errors (or any number of even digit errors) will cause a decoding error. Thus the single-parity-check type of code cannot correct errors.
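A matching Python sketch of an even-parity encoder and checker (assumed helper names); note that it detects a single error but, as stated above, cannot say where it occurred.

def even_parity_encode(info_bits):
    # Append a check bit equal to the modulo-2 sum (XOR) of the information bits
    parity = 0
    for b in info_bits:
        parity ^= b
    return info_bits + [parity]

def parity_check_ok(codeword):
    # True if the total number of ones is even (no error detected)
    return sum(codeword) % 2 == 0

cw = even_parity_encode([1, 0, 1, 1])   # [1, 0, 1, 1, 1]
print(parity_check_ok(cw))              # True
cw[2] ^= 1                              # introduce a single error
print(parity_check_ok(cw))              # False -- error detected, but not located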

These two examples, the repetition codes and the single-parity-check codes, provide the extreme, relatively trivial, cases of binary block codes. (Although relatively trivial, single-parity-checks are used quite often because they are simple to implement.)

The repetition codes have enormous error-correction capability but only one information bit per block. The single-parity-check codes have a very high information rate but, since they contain only one check digit per block, they are unable to do more than detect an odd number of channel errors.

There are other codes which have moderate information rate and moderate error-correction/detection capability, and we will study a few of them.

    These codes are classified into two major categories:

Block codes and Convolutional codes.

    In block codes, a block of k information digits is encoded to a codeword of n digits

    (n > k). For each sequence of k information digits there is a distinct codeword of n digits.

    In convolutional codes, the coded sequence of n digits depends not only on the k

    information digits but also on the previous N - 1 information digits (N > 1). Hence the coded

    sequence for a certain k information digits is not unique but depends on N - 1 earlier

    information digits.

In block codes, k information digits are accumulated and then encoded into an n-digit codeword. In convolutional codes, the coding is done on a continuous, or running, basis rather than by accumulating k information digits.

We will start by studying block codes (and, if there is time, we might come back to study convolutional codes).

    BLOCK CODES


The block encoder input is a stream of information digits. The encoder segments the input information digit stream into blocks of k information digits and, for each block, it calculates r check digits and outputs a codeword of n digits, where n = k + r (or r = n - k).

The code efficiency (also known as the code rate) is k/n.

Such a block code is denoted as an (n,k) code. Block codes in which the k information digits are transmitted unaltered first, followed by the transmission of the r check digits, are called systematic codes, as shown in figure 1 below.

Since systematic block codes simplify implementation of the decoder and are always used in practice, we will consider only systematic codes in our studies.

(A non-systematic block code is one which has the check digits interspersed between the information digits. For linear block codes it can be shown that a non-systematic block code can always be transformed into a systematic one.)

[ C1  C2  ..........  Ck | Ck+1  ..........  Cn-1  Cn ]
  <-- k information digits -->  <-- r check digits -->

Figure 1: an (n,k) block codeword in systematic form

    LINEAR BLOCK CODES

    Linear block codes are a class of parity check codes that can be characterized by the (n, k)notation described earlier.

    The encoder transforms a block of k information digits (an information word) into a longer

    block of n codeword digits, constructed from a given alphabet of elements. When the

    alphabet consists of two elements (0 and 1), the code is a binary code comprised of binary

    digits (bits). Our discussion of linear block codes is restricted to binary codes.

    Again, the k-bit information words form 2k distinct information sequences referred to as

    k-tuples(sequences of k digits).

    An n-bit block can form as many as 2ndistinct sequences, referred to as n-tuples.

    The encoding procedure assigns to each of the 2kinformation k-tuples one of the 2nn-tuples.

    A block code represents a one-to-one assignment, whereby the 2kinformation k-tuples are

    uniquely mapped into a new set of 2kcodeword n-tuples; the mapping can be accomplished

    via a look-up table, or via some encoding rules that we will study shortly.

    Definition:

An (n, k) binary block code is said to be linear if, and only if, the modulo-2 addition (Ci ⊕ Cj) of any two codewords, Ci and Cj, is also a codeword. This property thus means that (for a linear block code) the all-zero n-tuple must be a member of the code book (because the modulo-2 addition of a codeword with itself results in the all-zero n-tuple).

A linear block code, then, is one in which n-tuples outside the code book cannot be created by the modulo-2 addition of legitimate codewords (members of the code book).

For example, the set of all 2^4 = 16 4-tuples (or 4-bit sequences) is shown below:


    0000 0001 0010 0011 0100 0101 0110 0111

    1000 1001 1010 1011 1100 1101 1110 1111

an example of a block code (which is really a subset of the above set) that forms a linear code is

0000 0101 1010 1111

It is easy to verify that the modulo-2 addition of any two of these 4 codewords in the code book can only yield one of the other members of the code book and, since the all-zero n-tuple is a codeword, this code is a linear binary block code.
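The closure property is easy to test by brute force; the Python sketch below (assumed helper name is_linear) checks the four-codeword example above.

def xor_words(a, b):
    # Digit-by-digit modulo-2 addition of two equal-length binary strings
    return "".join(str(int(x) ^ int(y)) for x, y in zip(a, b))

def is_linear(code):
    # A binary block code is linear if the XOR of every pair of codewords is again a codeword
    return all(xor_words(a, b) in code for a in code for b in code)

code = {"0000", "0101", "1010", "1111"}
print(is_linear(code))   # True (note that "0000" is necessarily a member)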

Figure 5.13 illustrates, with a simple geometric analogy, the structure behind linear block codes. We can imagine the total set comprised of 2^n n-tuples. Within this set (also called the vector space) there exists a subset of 2^k n-tuples comprising the code book. These 2^k codewords or points, shown in bold "sprinkled" among the more numerous 2^n points, represent the legitimate or allowable codeword assignments.

An information sequence is encoded into one of the 2^k allowable codewords and then transmitted. Because of noise in the channel, a corrupted version of the sent codeword


(one of the other 2^n n-tuples in the total n-tuple set) may be received.

The objective of coding is that the decoder would be able to decide whether the received word is a valid codeword, or whether it is a codeword which has been corrupted by noise (i.e. detect the occurrence of one or more errors). Ideally, of course, the decoder should be able to decide which codeword was sent even if this transmitted codeword was corrupted by noise, and this process is called error-correction.

Thinking about it, if one is going to attempt to correct errors in a received word represented by a sequence of n binary symbols, then it is absolutely essential not to allow the use of all 2^n n-tuples as legitimate codewords.

    If, in fact, every possible sequence of n binary symbols were a legitimate codeword, then in

    the presence of noise one or more binary symbols could be changed, and one would have no

    possible basis for determining if a received sequence was any more valid than any other

    sequence.

Carrying this thought a little further, if one wished the coding system to correct the occurrence of a single error, then it is both necessary and sufficient that each codeword sequence differs from every other codeword in at least 3 positions.

In fact, if one wished the coding system to correct the occurrence of e errors, then it is both necessary and sufficient that each codeword sequence differs from every other codeword in at least (2e + 1) positions.

DEFINITION
The number of positions in which any two codewords differ from each other is called the Hamming distance, and is normally denoted by d.

    For example:

    Looking at the (n,k) = (4,2) binary linear block code, mentioned earlier, which has the

    following codewords:

    C1 0000

    C2 0101

    C3 1010

    C4 1111

we see that the Hamming distance, d:

between C2 and C3 is equal to 4
between C2 and C4 is equal to 2
between C3 and C4 is equal to 2

We also observe that the Hamming distance between C1 and any of the other codewords is equal to the "weight", that is the number of ones, in each of the other codewords.

We can also see that the minimum Hamming distance (i.e. the smallest Hamming distance between any pair of the codewords), denoted by dmin, of this code is equal to 2.

(The minimum Hamming distance of a binary linear block code is simply equal to the minimum weight of its non-zero codewords. This is due to the fact that the code is linear, meaning that if any two codewords are added together modulo-2 the result will be another codeword. Thus


to find the minimum Hamming distance of a linear block code all we need to do is find the minimum-weight non-zero codeword.)
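Both definitions can be checked with a few lines of Python (assumed helper names); for the (4,2) code above, the minimum distance, and equivalently the minimum non-zero codeword weight, is 2.

from itertools import combinations

def hamming_distance(a, b):
    # Number of positions in which two equal-length words differ
    return sum(x != y for x, y in zip(a, b))

def minimum_distance(code):
    # Smallest Hamming distance between any pair of distinct codewords
    return min(hamming_distance(a, b) for a, b in combinations(code, 2))

code = ["0000", "0101", "1010", "1111"]
print(minimum_distance(code))                            # 2
print(min(w.count("1") for w in code if w != "0000"))    # 2 -- minimum non-zero weight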

Let us look at the above code again, keeping in mind what we said earlier about the "Hamming distance" property the codewords must have for a code to correct a single error.

We said that, to correct a single error, this code must have each of its codewords differing from every other codeword in at least (2e + 1) positions, where e in our case is 1 (i.e. a single error). That is, the minimum Hamming distance of the code must be at least 3. Therefore the above mentioned code cannot correct the occurrence of a single error (since its dmin = 2), but it can detect it.

To explain this further let us consider the following diagram in figure 2.

[FIGURE 2:
a) Hamming sphere of radius e = 1 around each codeword; Hamming distance between codewords = 3. The code can correct a single error since d = 2e + 1.
b) Hamming sphere of radius e = 1 around each codeword; Hamming distance between codewords = 2. The code can only detect e = 1 error but cannot correct it, because d = e + 1 (i.e. d < 2e + 1).]

Imagine that we draw a sphere (called a Hamming sphere) of radius e = 1 around each codeword. This sphere will contain all n-tuples which are at a distance of 1 away from that codeword (i.e. all n-tuples which differ from this codeword in one position).

If the minimum Hamming distance of the code is dmin < 2e + 1 (as in figure 2b, where d = 2), the occurrence of a single error will result in changing the codeword to the next n-tuple and the decoder does not have enough information to decide whether codeword C1 or C2 was transmitted. The decoder however can detect that an error has occurred.

If we look at figure 2a we see that the code has dmin = 2e + 1 and that the occurrence of a single error results in the next n-tuple being received; in this case the decoder can make an unambiguous decision, based on what is called the nearest neighbour decoding rule, as to which of the two codewords was transmitted.


If the corrupted received n-tuple is not too unlike (not too distant from) the valid codeword, the decoder could decide that the transmitted codeword was the codeword "nearest in distance" to the received word.

Thus, in general, we can say that a binary linear code will correct e errors

if dmin = 2e + 1 (for odd dmin), or
if dmin = 2e + 2 (for even dmin).

A (6, 3) Linear Block Code Example

Examine the following coding assignment that describes a (6, 3) code. There are 2^k = 2^3 = 8 information words, and therefore eight codewords.

There are 2^n = 2^6 = sixty-four 6-tuples in the total 6-tuple set (or vector space).

Information digits      Codewords
C1 C2 C3                C1 C2 C3 C4 C5 C6

000                     000000
001                     001011
010                     010110
011                     011101
100                     100101
101                     101110
110                     110011
111                     111000

The parity check equations for this code are

C4 = C1 ⊕ C2
C5 = C2 ⊕ C3
C6 = C1 ⊕ C3

and its H matrix is

H = | 1 1 0 1 0 0 |
    | 0 1 1 0 1 0 |
    | 1 0 1 0 0 1 |

It is easy to check that the eight codewords shown above form a linear code (the all-zeros codeword is present, and the modulo-2 sum of any two codewords is another codeword, a member of the code). Therefore, these codewords represent a linear binary block code.

It is also easy enough to check that the minimum Hamming distance of the code is dmin = 3; thus we conclude that this code is a single-error-correcting code, since dmin = 2e + 1 (for odd dmin). A short sketch verifying these properties follows.
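A short Python sketch of the (6,3) encoder defined by the parity equations above, together with a brute-force check of dmin (helper names are assumed).

from itertools import product, combinations

def encode_63(c1, c2, c3):
    # Systematic (6,3) encoder: c4 = c1^c2, c5 = c2^c3, c6 = c1^c3
    return [c1, c2, c3, c1 ^ c2, c2 ^ c3, c1 ^ c3]

codebook = [encode_63(*info) for info in product([0, 1], repeat=3)]
for cw in codebook:
    print("".join(map(str, cw)))          # reproduces the code table above

dmin = min(sum(x != y for x, y in zip(a, b))
           for a, b in combinations(codebook, 2))
print(dmin)                                # 3 -> single-error-correcting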


    In the simple case of single-parity-check codes, the single parity was chosen to be the

    modulo-2 sum of all the information digits.

Linear block codes contain several check digits, and each check digit is the modulo-2 sum of some (or all) of the information digits.

Let us consider the (6, 3) code, i.e. n = 6, k = 3, and there are r = n - k = 3 check digits.

We shall label the three information digits C1, C2, C3 and the three check digits C4, C5 and C6.

Let's choose to calculate the check digits from the information digits according to the following rules (each one of these equations must be independent of any or all of the others):

C4 = C1 ⊕ C2
C5 = C2 ⊕ C3
C6 = C1 ⊕ C3

    or in matrix notation

| C4 |   | 1 1 0 |   | C1 |
| C5 | = | 0 1 1 | . | C2 |
| C6 |   | 1 0 1 |   | C3 |

The full codeword consists of the digits C1, C2, C3, C4, C5, C6.

Generally the n-tuple codeword is denoted as C = [C1, C2, C3, C4, C5, C6].

Every codeword must satisfy the parity-check equations

C1 ⊕ C2 ⊕ C4 = 0
C2 ⊕ C3 ⊕ C5 = 0
C1 ⊕ C3 ⊕ C6 = 0

or in matrix notation

| 1 1 0 1 0 0 |   | C1 |   | 0 |
| 0 1 1 0 1 0 | . | C2 | = | 0 |
| 1 0 1 0 0 1 |   | C3 |   | 0 |
                  | C4 |
                  | C5 |
                  | C6 |

which can be written a little more compactly as

H C^t = 0

Here C^t denotes the column vector which is the transpose of the codeword, and H is the parity-check matrix

H = | 1 1 0 1 0 0 |
    | 0 1 1 0 1 0 |
    | 1 0 1 0 0 1 |


We can say that the error pattern was E = [ 000100 ].

If we multiply the transpose of the received word by the parity-check matrix H, what do we get?

H R^t = H (C ⊕ E)^t = H C^t ⊕ H E^t = 0 ⊕ H E^t = S^t

The r-tuple S = [ S1, S2, S3 ] is called the syndrome.

This shows that the syndrome test, whether performed on the corrupted received word or on the error pattern that caused it, yields the same syndrome.

    Since the syndrome digits are defined by the same equations as the parity-check equations,

    the syndrome digits reveal the parity check failures on the received codeword. (This happens

    because the code is linear. An important property of linear block codes, fundamental to the

    decoding process, is that the mapping between correctable error patterns and syndromes is

    one-to-one and this means that we not only can detect an error but we can also correct it.)

For example, using the received word given above, R = [ 110111 ]:

        | 1 1 0 1 0 0 |   | 1 |   | 1 |
H R^t = | 0 1 1 0 1 0 | . | 1 | = | 0 | = S^t,
        | 1 0 1 0 0 1 |   | 0 |   | 0 |
                          | 1 |
                          | 1 |
                          | 1 |

where S = [ S1, S2, S3 ] = [ 100 ]

and, as we can see, the syndrome equals the fourth column of H, which points to the fourth bit being in error.

Now all the decoder has to do (after calculating the syndrome) is to invert the fourth bit position in the received word to produce the codeword that was sent, i.e. C = [ 110011 ].
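The syndrome computation and single-error correction for the (6,3) code can be sketched in Python as follows (assumed helper names); running it on R = 110111 reproduces the syndrome [1, 0, 0] and the corrected codeword 110011.

H = [
    [1, 1, 0, 1, 0, 0],
    [0, 1, 1, 0, 1, 0],
    [1, 0, 1, 0, 0, 1],
]

def syndrome(received):
    # S^t = H R^t over GF(2): each syndrome bit is a parity check on the received word
    return [sum(h * r for h, r in zip(row, received)) % 2 for row in H]

def correct_single_error(received):
    # If the syndrome matches a column of H, flip the corresponding received bit
    s = syndrome(received)
    corrected = received[:]
    if s == [0, 0, 0]:
        return corrected                         # no error detected
    for j in range(len(received)):
        if [row[j] for row in H] == s:
            corrected[j] ^= 1                    # invert the bit the syndrome points to
            break
    return corrected

R = [1, 1, 0, 1, 1, 1]
print(syndrome(R))                # [1, 0, 0]
print(correct_single_error(R))    # [1, 1, 0, 0, 1, 1]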

Having obtained a feel for what channel coding and decoding is about, let's apply this knowledge to a particular type of linear binary block codes called the Hamming codes.


    HAMMING CODES

These are linear binary single-error-correcting codes having the property that the columns of the parity-check matrix, H, consist of all the distinct non-zero r-digit sequences of binary numbers. Thus a Hamming code has as many parity-check matrix columns as there are single-error sequences. These codes will correct all patterns of single errors in any transmitted codeword.

These codes have n = k + r, where n = 2^r - 1 and k = 2^r - 1 - r.

These codes have a guaranteed minimum Hamming distance dmin = 3.

For example, the parity-check matrix for the (7,4) Hamming code is

H = | 1 1 1 0 1 0 0 |
    | 1 1 0 1 0 1 0 |
    | 1 0 1 1 0 0 1 |

    a) Determine the codeword for the information sequence 0011

    b) If the received word, R, is 1000010, determine if an error has occurred. If it has, find the

    correct codeword.

    Solution:

a) Since H C^t = 0, we can use this equation to calculate the parity digits for the given information sequence as follows:

| 1 1 1 0 1 0 0 |   | C1 |   | 0 |
| 1 1 0 1 0 1 0 | . | C2 | = | 0 |
| 1 0 1 1 0 0 1 |   | C3 |   | 0 |
                    | C4 |
                    | C5 |
                    | C6 |
                    | C7 |

i.e., substituting the information digits C1 C2 C3 C4 = 0 0 1 1,

| 1 1 1 0 1 0 0 |   | 0  |   | 0 |
| 1 1 0 1 0 1 0 | . | 0  | = | 0 |
| 1 0 1 1 0 0 1 |   | 1  |   | 0 |
                    | 1  |
                    | C5 |
                    | C6 |
                    | C7 |

By multiplying out the first row of the left hand side we get

(1.0) ⊕ (1.0) ⊕ (1.1) ⊕ (0.1) ⊕ (1.C5) ⊕ (0.C6) ⊕ (0.C7) = 0
0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C5 ⊕ 0 ⊕ 0 = 0
i.e. 1 ⊕ C5 = 0 and C5 = 1

Similarly, by multiplying out the second row of the H matrix by the transpose of the codeword we obtain

(1.0) ⊕ (1.0) ⊕ (0.1) ⊕ (1.1) ⊕ (0.C5) ⊕ (1.C6) ⊕ (0.C7) = 0
0 ⊕ 0 ⊕ 0 ⊕ 1 ⊕ 0 ⊕ C6 ⊕ 0 = 0
i.e. 1 ⊕ C6 = 0 and C6 = 1

Similarly, by multiplying out the third row of the H matrix by the transpose of the codeword we obtain

(1.0) ⊕ (0.0) ⊕ (1.1) ⊕ (1.1) ⊕ (0.C5) ⊕ (0.C6) ⊕ (1.C7) = 0
0 ⊕ 0 ⊕ 1 ⊕ 1 ⊕ 0 ⊕ 0 ⊕ C7 = 0
i.e. 1 ⊕ 1 ⊕ C7 = 0 and C7 = 0


so that the codeword is

C = [C1, C2, C3, C4, C5, C6, C7] = 0011110

b) To find whether an error has occurred or not, we use the equation H R^t = S^t. If the syndrome is zero then no error has occurred; if not, an error has occurred and is pinpointed by the syndrome.

Thus to compute the syndrome we multiply out the rows of H by the transpose of the received word:

| 1 1 1 0 1 0 0 |   | 1 |   | 1 |
| 1 1 0 1 0 1 0 | . | 0 | = | 0 |
| 1 0 1 1 0 0 1 |   | 0 |   | 1 |
                    | 0 |
                    | 0 |
                    | 1 |
                    | 0 |

Because the syndrome is the third column of the parity-check matrix, the third position of the received word is in error and the correct codeword is 1010010.
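The same syndrome procedure applies to the (7,4) Hamming code; a minimal Python sketch (assumed names) reproducing part (b) is shown below.

H74 = [
    [1, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0, 0, 1],
]

def hamming74_decode(received):
    # Compute S^t = H R^t (mod 2) and, if non-zero, flip the bit whose H column equals the syndrome
    s = [sum(h * r for h, r in zip(row, received)) % 2 for row in H74]
    corrected = received[:]
    if any(s):
        for j in range(7):
            if [row[j] for row in H74] == s:
                corrected[j] ^= 1
                break
    return s, corrected

R = [1, 0, 0, 0, 0, 1, 0]
syn, cw = hamming74_decode(R)
print(syn)   # [1, 0, 1] -> the third column of H
print(cw)    # [1, 0, 1, 0, 0, 1, 0] -> corrected codeword 1010010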


The Generator Matrix of a linear binary block code

We saw above that the parity-check matrix of a systematic linear binary block code can be written in the following (n-k) by n matrix form

H = [ h  I(n-k) ]

The Generator Matrix of this same code is written in the following k by n matrix form

G = [ I(k)  h^t ]

The generator matrix is useful in obtaining the codeword from the information sequence according to the following formula

C = m G

where

C is the codeword [C1, C2, ........., Cn-1, Cn],
m is the information digit sequence [m1, m2, ....., mk], and
G is the generator matrix of the code as given by the formula for G above.

Thus if we consider the single-error-correcting (n,k) = (7,4) Hamming code discussed previously, its parity-check matrix was

H = | 1 1 1 0 1 0 0 |
    | 1 1 0 1 0 1 0 |
    | 1 0 1 1 0 0 1 |

and thus its generator matrix would be

G = | 1 0 0 0 1 1 1 |
    | 0 1 0 0 1 1 0 |
    | 0 0 1 0 1 0 1 |
    | 0 0 0 1 0 1 1 |

Now if we had an information sequence given by the digits 0011, the codeword would be given by C = m G, i.e.

                  | 1 0 0 0 1 1 1 |
C = [ 0 0 1 1 ] . | 0 1 0 0 1 1 0 | = [ 0 0 1 1 1 1 0 ]
                  | 0 0 1 0 1 0 1 |
                  | 0 0 0 1 0 1 1 |
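Encoding with the generator matrix is a binary matrix product, m G (mod 2); the Python sketch below (assumed helper name encode_with_G) reproduces the 0011 → 0011110 example and can also regenerate the whole code book.

from itertools import product

G = [
    [1, 0, 0, 0, 1, 1, 1],
    [0, 1, 0, 0, 1, 1, 0],
    [0, 0, 1, 0, 1, 0, 1],
    [0, 0, 0, 1, 0, 1, 1],
]

def encode_with_G(m):
    # C = m G over GF(2): each codeword digit is the modulo-2 sum of the selected rows of G
    return [sum(m[i] * G[i][j] for i in range(len(m))) % 2 for j in range(len(G[0]))]

print(encode_with_G([0, 0, 1, 1]))          # [0, 0, 1, 1, 1, 1, 0]

# Regenerating the full (7,4) code book (16 codewords):
for m in product([0, 1], repeat=4):
    print("".join(map(str, encode_with_G(list(m)))))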

Thus, for the (n,k) = (7,4) Hamming code, the generator matrix and the code book are:


H = [ h  I(n-k) ]
G = [ I(k)  h^t ]

    | 1 0 0 0 1 1 1 |   row 1
G = | 0 1 0 0 1 1 0 |   row 2
    | 0 0 1 0 1 0 1 |   row 3
    | 0 0 0 1 0 1 1 |   row 4

     combination of rows                          codeword
 1   row1                                         1000111
 2   row2                                         0100110
 3   row3                                         0010101
 4   row4                                         0001011
 5   row1 ⊕ row2                                  1100001
 6   row1 ⊕ row3                                  1010010
 7   row1 ⊕ row4                                  1001100
 8   row2 ⊕ row3                                  0110011
 9   row2 ⊕ row4                                  0101101
10   row3 ⊕ row4                                  0011110
11   row1 ⊕ row2 ⊕ row3                           1110100
12   row1 ⊕ row2 ⊕ row4                           1101010
13   row1 ⊕ row3 ⊕ row4                           1011001
14   row2 ⊕ row3 ⊕ row4                           0111000
15   row1 ⊕ row2 ⊕ row3 ⊕ row4                    1111111
16   row1 ⊕ row1 (or 2 ⊕ 2, or 3 ⊕ 3, or 4 ⊕ 4)   0000000
