Optimum Soft Decision Decoding
of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code) → BPSK modulator → s(t) → AWGN channel → Optimal receiver; C = (C_{n-1}, ···, C_0)]
• Assume that the [n,k,d] linear block code C is binary.
• The channel encoder maps the k information bits m_0, ···, m_{k-1} into a codeword C = (C_{n-1}, ···, C_0).
• The transmitted waveform s(t) is
$s_j(t) = (2C_j - 1)\sqrt{\frac{2\varepsilon_c}{T_b}}\cos 2\pi f_c t, \quad t \in [(n-1-j)T_b,\,(n-j)T_b], \quad j = 0, 1, \cdots, n-1$
• The vector representation of s(t) is
$s = \sqrt{\varepsilon_c}\,[(2C_{n-1}-1), \cdots, (2C_0-1)]$
• The receiver is optimal with respect to the coded signals.
Optimum Soft Decision Decoding
of Linear Block Codes
• Assume that the transmitted messages are equally likely.
• The optimal decision rule is the minimum distance decision rule.
• Let s_i be the vector representation of the coded signal waveform s_i(t) corresponding to the codeword C_i = (C_{i(n-1)}, ···, C_{i0}).
•Let us compute the block error prob. Pe:
• Since C is linear, Pe(error|Ci) is the same for all codewords Ci.
Therefore,
Pe = Pe(error|C0)
$s_i = \sqrt{\varepsilon_c}\,[(2C_{i(n-1)}-1), \cdots, (2C_{i0}-1)], \quad 0 \le i \le 2^k - 1$

$P_e = 2^{-k}\sum_{i=0}^{2^k-1} P_e(\text{error}\,|\,C_i)$
Optimum Soft Decision Decoding
of Linear Block Codes
$P_e \le \sum_{i=1}^{2^k-1} Q\!\left(\frac{\|s_0 - s_i\|}{\sqrt{2N_0}}\right) = \sum_{i=1}^{2^k-1} Q\!\left(\sqrt{\frac{2\varepsilon_c\,\mathrm{wt}(C_i)}{N_0}}\right)$

As $\varepsilon_c = \frac{k}{n}\,\varepsilon_b = R_c\,\varepsilon_b$, we have

$P_e \le \sum_{i=1}^{2^k-1} Q\!\left(\sqrt{2 R_c\,\frac{\varepsilon_b}{N_0}\,\mathrm{wt}(C_i)}\right)$
•The above upper bound depends on the weight distribution of C.
•A simpler bound follows by applying the union bound together with wt(C_i) ≥ d for all nonzero codewords:
$P_e \le 2^k\, Q\!\left(\sqrt{2 R_c\,\frac{\varepsilon_b}{N_0}\,d}\right), \quad d = \text{the minimum distance of } C$
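The union bound above is easy to evaluate numerically when the weight distribution is known. Below is a minimal Python sketch (mine, not from the lecture); the function names and the use of the [7,4,3] Hamming code's weight distribution A_3 = A_4 = 7, A_7 = 1 are illustrative assumptions.

```python
# A minimal sketch (not from the lecture) of the soft-decision union bound
#   P_e <= sum_{i=1}^{2^k-1} Q( sqrt(2 * R_c * (Eb/N0) * wt(C_i)) )
# evaluated from a code's weight distribution.
import math

def Q(x):
    """Gaussian tail function Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def union_bound(n, k, weight_dist, ebno_db):
    """Union bound on P_e; weight_dist maps weight w -> number of codewords A_w."""
    rc = k / n                       # code rate R_c = k/n
    ebno = 10 ** (ebno_db / 10)      # Eb/N0 as a linear ratio
    return sum(a * Q(math.sqrt(2 * rc * ebno * w))
               for w, a in weight_dist.items())

# Weight distribution of the [7,4,3] Hamming code: A_3 = A_4 = 7, A_7 = 1.
hamming_wd = {3: 7, 4: 7, 7: 1}
for ebno_db in (2, 4, 6, 8):
    print(ebno_db, union_bound(7, 4, hamming_wd, ebno_db))
```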
Hard Decision Decoding of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code) → BPSK modulator → s(t) → AWGN channel → BPSK demodulator → Channel decoder → Ĉ or m̂_0, ···, m̂_{k-1}]
• In soft-decision decoding, we need to compute M = 2^k distance metrics. When M is large, the computational complexity is very high.
• To reduce the computational burden, we can quantize the analog signals, and the decoding is performed digitally.
Hard decision decoding diagram
Hard Decision Decoding of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code C) → BSC → Channel decoder → Ĉ; C = (C_{n-1}, ···, C_0)]
Equivalent diagram
• The error probability of the BSC is

$p = Q\!\left(\sqrt{2\varepsilon_c/N_0}\right)$

• Assume that p < 1/2.
• Assume that the transmitted messages are equally likely.
• We want to design an optimal decoder and analyze its performance.
• Because of noise, the received vector y = (y_{n-1}, ···, y_0) may be different from the transmitted codeword C.
• The difference vector e = y − C = (e_{n-1}, ···, e_0) is called an error vector. e_j = 0 with probability 1−p, and e_j = 1 with probability p.
• Since there are 2^k possible error vectors consistent with any received y, the decoder cannot decide with certainty which codeword was actually transmitted.
• Since the transmitted messages are equally likely, the MAP rule is equivalent to the ML decision rule. Thus the optimal decoder:
Hard Decision Decoding of Linear Block Codes
Decode y as $\hat{C} = C_j$ iff

$p(y\,|\,C_j) = \max_{0 \le i \le 2^k-1} p(y\,|\,C_i)$

$\Leftrightarrow\; p^{\mathrm{wt}(e_j)}(1-p)^{n-\mathrm{wt}(e_j)} = \max_{0 \le i \le 2^k-1}\, p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)}$

$\Leftrightarrow\; \mathrm{wt}(e_j) = \min_i \mathrm{wt}(e_i) \;\Leftrightarrow\; d(y, C_j) = \min_i d(y, C_i)$

where $e_i = y - C_i$.
Minimum Hamming Distance Decoding Rule
• The optimal decision rule now becomes the minimum Hamming distance decision rule.
• The optimal decoder decodes y as the nearest codeword.
• In other words, the decoder picks the error vector ê of least weight and then forms the estimate
$\hat{C} = y - \hat{e}$
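A minimal brute-force sketch of this rule (mine, not from the lecture): enumerate all codewords, pick the nearest, and recover the error estimate. The [5,2] code used below is the one that reappears in the standard-array example later in these notes.

```python
# A minimal sketch of minimum Hamming distance decoding by brute force:
# compare y with all 2^k codewords and pick the nearest.
def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def min_distance_decode(y, codewords):
    """Return the nearest codeword and the implied error vector e = y - C (mod 2)."""
    c_hat = min(codewords, key=lambda c: hamming_distance(y, c))
    e_hat = [yi ^ ci for yi, ci in zip(y, c_hat)]
    return c_hat, e_hat

# The [5,2] code with G = [10101; 01011] from the example later in the notes.
code = [[0,0,0,0,0], [0,1,0,1,1], [1,0,1,0,1], [1,1,1,1,0]]
print(min_distance_decode([1,1,1,1,1], code))   # -> ([1,1,1,1,0], [0,0,0,0,1])
```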
Error Detection Capability
• Let C be a block code with block length n and minimum Hamming
distance d.
• Suppose C is used only for error detection. That is, the receiver just tests whether the received vector is a codeword or not.
• If it is not, the receiver detects that an error has occurred and asks for a retransmission of the codeword.
• The code C can detect up to d−1 errors: if at least one but no more than d−1 errors occur, the received vector cannot be a codeword. On the other hand, when more than d−1 errors occur, the receiver may be fooled, since an error vector of weight d can transform one codeword into another codeword.
• Therefore C is capable of detecting up to d−1 errors.
Error Detection and Error Correction Capability
When C is used only for error correction, C can correct up to $t = \lfloor (d-1)/2 \rfloor$ errors.
[Figure: codewords C_1, C_2, C_3, ···, C_i, each at the center of a ball of radius t]

We associate with each codeword C_i a ball of radius t centered at C_i, where $t = \lfloor (d-1)/2 \rfloor$. All these balls are non-overlapping.
•If C_i is transmitted and t or fewer errors occur, then the received vector y is inside the ball centered at C_i and is closer to C_i than to any other codeword. Thus nearest neighbor decoding will correct these errors.
•On the other hand, if more than t errors occur, the received vector y may
be closer to some other codeword. If this is the case, then the decoder
will be fooled.
• The block code can also be used for both error correction and error detection. Suppose d = 7. Then C can correct up to 3 errors.
• We may take a different decoding strategy to increase the error
detection capability at the expense of the error correction capability.
For example, C can be used to correct up to 2 errors and at the same time detect up to 4 errors.
Error Detection and Error Correction
[Figure: codewords C_i and C_j at distance d, with balls of radius t_c around each and detection radius t_d]
In general, a block code C with the
minimum distance d can simultaneously
correct tc errors and detect td errors as long
as tc+td≤d-1.
For any two distinct codewords C_i and C_j, the ball of radius t_d centered at C_i and the ball of radius t_c centered at C_j are disjoint.
Syndrome Decoding
• A brute-force method for performing nearest neighbor decoding would involve 2^k comparisons.
• A more efficient method is syndrome decoding:
Given the received vector y, one first computes the vector
s' = Hy'
where H is a fixed parity check matrix for C and ' denotes transpose. The (n−k)-dimensional vector s = (s_{n−k−1}, s_{n−k−2}, ···, s_0) is called the syndrome of y.
The syndrome provides some information about the possible error vector e. There is a one-to-one correspondence between syndromes and sets of all possible error vectors.
Syndrome Decoding
Given the received vector y, the set of all possible error vectors is

$y - \mathcal{C} \triangleq \{\,y - C_i : C_i \in \mathcal{C}\,\} = y + \mathcal{C}$

The set y + C is called a coset of C. Note that y ∈ y + C. Thus

the coset containing y = the set of all possible error vectors with respect to y.
Property 1: Two cosets are either disjoint or identical.
Property 2: Two vectors y1 and y2 have the same syndrome iff y1 and
y2 are in the same coset.
$Hy_1' = Hy_2' \;\Leftrightarrow\; H(y_1 - y_2)' = 0 \;\Leftrightarrow\; (y_1 - y_2) \in \mathcal{C} \;\Leftrightarrow\; y_1 \in y_2 + \mathcal{C}$
Syndrome Decoding
• There is a one-to-one correspondence between syndromes and cosets.
• Each coset contains 2^k vectors. This implies that there are 2^{n−k} cosets and hence 2^{n−k} syndromes.
•In view of the nearest neighbor decoding rule, we get the following
decoding algorithm:
Step 1: Given a received vector y, compute the syndrome s of y.
Step 2: Find the least weight vector in the coset given by the
syndrome s.
Step 3: Decode y as $\hat{C} = y - \hat{e}$
The least weight vector in a coset is called the coset leader of the coset.
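The three steps can be sketched in Python as follows (an illustration, not the lecture's implementation); the bit ordering (lists read left to right) and the helper names are my assumptions, and the example H is the systematic parity-check matrix H = [P' | I_3] derived from the generator matrix G = [I_2 | P] of the [5,2] example code used below.

```python
# A minimal sketch of the three-step syndrome decoder. The coset-leader
# table is built once by scanning error patterns in order of increasing
# weight, so each syndrome gets its least-weight pattern first.
from itertools import combinations

def syndrome(v, H):
    """s = Hv' over GF(2), as a tuple usable as a dictionary key."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def build_leader_table(n, H):
    table = {}
    for w in range(n + 1):
        for pos in combinations(range(n), w):
            e = [1 if i in pos else 0 for i in range(n)]
            table.setdefault(syndrome(e, H), e)
        if len(table) == 2 ** len(H):    # all 2^(n-k) syndromes covered
            break
    return table

def syndrome_decode(y, H, table):
    e_hat = table[syndrome(y, H)]        # Step 1 + Step 2: coset leader
    return [yi ^ ei for yi, ei in zip(y, e_hat)]   # Step 3: C_hat = y - e_hat

# H = [P' | I_3] from the systematic G = [I_2 | P] of the [5,2] example code.
H = [[1,0,1,0,0],
     [0,1,0,1,0],
     [1,1,0,0,1]]
table = build_leader_table(5, H)
print(syndrome_decode([1,0,1,1,1], H, table))   # 10101 with one bit flipped
```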
Syndrome Decoding
Standard array for syndrome decoding (when n − k is small):

Coset leaders                                                        Syndrome
0                C_1                ·····  C_{2^k−1}                 s_0
e_1              C_1+e_1            ·····  C_{2^k−1}+e_1             s_1
···
e_{2^{n−k}−1}    C_1+e_{2^{n−k}−1}  ·····  C_{2^k−1}+e_{2^{n−k}−1}   s_{2^{n−k}−1}
When y is received, its position in the standard array is located. The decoder then decides that the error vector is the left-most vector (the coset leader) in the row containing y, and y is decoded as the codeword at the top of the column containing y.
Example
G = | 1 0 1 0 1 |        d_min = 3
    | 0 1 0 1 1 |

Coset leaders
00000  01011  10101  11110   ← the codewords
00001  01010  10100  11111
00010  01001  10111  11100
00100  01111  10001  11010
01000  00011  11101  10110
10000  11011  00101  01110
11000  10011  01101  00110
10010  11001  00111  01100

The first column lists the coset leaders; each row is one coset, labeled by a distinct syndrome.
What if the actual error is (10100)?
Syndrome Computing in the Case of Cyclic Codes
• Let $g(x) = x^{n-k} + g_{n-k-1}x^{n-k-1} + \cdots + g_1 x + g_0$ be the generator polynomial of C.
• Let $y = (y_{n-1}, \cdots, y_0)$ be the received vector. Associate with y the polynomial
$y(x) = y_{n-1}x^{n-1} + \cdots + y_1 x + y_0$
• Assume that we compute the syndrome $s = (s_{n-k-1}, \cdots, s_0)$ of y by using the systematic parity check matrix H. Associate with s the polynomial
$s(x) = s_{n-k-1}x^{n-k-1} + \cdots + s_1 x + s_0$
• One can show that s(x) is the remainder obtained by dividing y(x) by g(x):
$y(x) = f(x)g(x) + s(x)$
Example
The [7,4,3] binary Hamming code revisited
$g(x) = x^3 + x + 1$

G = | 1 0 0 0 1 0 1 |
    | 0 1 0 0 1 1 1 |
    | 0 0 1 0 1 1 0 |
    | 0 0 0 1 0 1 1 |

H = ?

Suppose y = [1 1 0 1 1 0 1]. What is the syndrome s?
1. Use matrix multiplication
2. Use polynomial division
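A minimal sketch (not part of the notes) checking both methods on this example; the H used is the systematic parity-check matrix H = [P' | I_3] obtained from G = [I_4 | P], and the bit ordering (MSB-first lists) is my assumption.

```python
# A minimal sketch computing the syndrome of y both ways for the [7,4,3]
# Hamming code. H = [P' | I_3] is the systematic parity-check matrix
# obtained from G = [I_4 | P]; bit lists are written MSB first.
def matrix_syndrome(y, H):
    return [sum(h * v for h, v in zip(row, y)) % 2 for row in H]

def poly_remainder(y, g):
    """Remainder of y(x) / g(x) over GF(2), i.e. the syndrome polynomial s(x)."""
    r = y[:]                              # work on a copy of the dividend
    for i in range(len(y) - len(g) + 1):  # one long-division step per shift
        if r[i]:
            for j, gj in enumerate(g):
                r[i + j] ^= gj
    return r[-(len(g) - 1):]              # the last deg(g) coefficients

H = [[1,1,1,0,1,0,0],
     [0,1,1,1,0,1,0],
     [1,1,0,1,0,0,1]]
y = [1,1,0,1,1,0,1]
g = [1,0,1,1]                             # g(x) = x^3 + x + 1
print(matrix_syndrome(y, H))              # -> [1, 0, 0]
print(poly_remainder(y, g))               # -> [1, 0, 0], i.e. s(x) = x^2
```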
Shift Register Implementation
[Figure: shift-register circuit that divides y(x) by g(x) = x^3 + x + 1; the received bits y_0, y_1, ···, y_6 = 1011011 are shifted in, the quotient is shifted out, and the remainder left in the register is the syndrome]
Performance of Hard Decision Decoding
The error probability of the BSC is

$p = Q\!\left(\sqrt{2\varepsilon_c/N_0}\right) < 1/2$
Block error probability
$P_e = \frac{1}{2^k}\sum_{i=0}^{2^k-1} P_e(\text{error}\,|\,C_i) = P_e(\text{error}\,|\,C_0) = 1 - \sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)}$
where the e_i's are the coset leaders and e_0 is the zero vector.
Let $t = \lfloor (d-1)/2 \rfloor$. The balls of radius t centered at the codewords C_i are all disjoint, so any vector of weight ≤ t is a coset leader. Therefore,

$\sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)} \ge \sum_{l=0}^{t}\binom{n}{l} p^l (1-p)^{n-l}$

$\Rightarrow\; P_e \le 1 - \sum_{l=0}^{t}\binom{n}{l}p^l(1-p)^{n-l} = \sum_{l=t+1}^{n}\binom{n}{l}p^l(1-p)^{n-l}$

The equality holds if the coset leaders consist of all vectors of weight ≤ t. In this case, we have

$\sum_{l=0}^{t}\binom{n}{l} = 2^{n-k}$
Performance of Hard Decision Decoding
In this case, the linear code is called a perfect code.
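For example, the [7,4,3] binary Hamming code has t = 1 and $\sum_{l=0}^{1}\binom{7}{l} = 1 + 7 = 8 = 2^{7-4}$, so the equality holds and the code is perfect.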
In general,

$\sum_{l=0}^{t}\binom{n}{l} \le 2^{n-k}$  (Hamming bound)

The Hamming bound gives another relationship among n, k, and d.
An [n,k,d] linear binary code is called quasi-perfect if

$2^{n-k} \le \sum_{l=0}^{t+1}\binom{n}{l}$
•For an [n,k,d] perfect code, the balls of radius t are disjoint and
together contain all vectors of length n.
•For an [n,k,d] quasi-perfect code, the balls of radius t+1 may overlap
and together contain all vectors of length n.
Performance of Hard Decision Decoding
Error Probability for Quasi-Perfect Codes
The coset leaders have weight ≤ t+1, and the number of coset leaders of weight t+1 is $2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}$. Hence

$\sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)} = \sum_{l=0}^{t}\binom{n}{l}p^l(1-p)^{n-l} + \left[2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}\right]p^{t+1}(1-p)^{n-t-1}$

$\Rightarrow\; P_e = \sum_{l=t+1}^{n}\binom{n}{l}p^l(1-p)^{n-l} - \left[2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}\right]p^{t+1}(1-p)^{n-t-1}$
This formula is also a lower bound to Pe for any [n, k, d] linear binary
block code.
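A minimal sketch (mine, not from the notes) of this formula; for a perfect code the correction term vanishes, which the [7,4,3] Hamming code illustrates.

```python
# A minimal sketch of the quasi-perfect P_e formula above, which is also a
# lower bound on P_e for any [n,k,d] binary linear code.
from math import comb

def quasi_perfect_pe(n, k, t, p):
    tail = sum(comb(n, l) * p**l * (1 - p)**(n - l)
               for l in range(t + 1, n + 1))
    n_extra = 2**(n - k) - sum(comb(n, l) for l in range(t + 1))
    return tail - n_extra * p**(t + 1) * (1 - p)**(n - t - 1)

# For a perfect code the correction term vanishes (n_extra = 0), e.g. the
# [7,4,3] Hamming code with t = 1:
print(quasi_perfect_pe(7, 4, 1, 0.01))
```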
Other Bounds
Consider the communication of two equally likely n-dimensional vectors
a = (an-1, ··· , a0), b =(bn-1, ··· , b0) over the BSC.
The minimum decoding error probability depends only on the Hamming distance d(a, b) between a and b. Denote this minimum error probability by p_2(d(a, b)):
$p_2(d(a,b)) = \begin{cases} \displaystyle\sum_{l=u+1}^{2u+1}\binom{2u+1}{l}p^l(1-p)^{2u+1-l}, & \text{if } d(a,b) = 2u+1 \\[2ex] \displaystyle\sum_{l=u+1}^{2u}\binom{2u}{l}p^l(1-p)^{2u-l} + \frac{1}{2}\binom{2u}{u}p^u(1-p)^u, & \text{if } d(a,b) = 2u \end{cases}$
p_2(·) is strictly decreasing on the set of possible integers: p_2(l+1) < p_2(l).
In terms of union bounds,

$\max_{i \ne 0}\, p_2(d(C_i, C_0)) \le P_e(\text{error}\,|\,C_0) \le \sum_{i=1}^{2^k-1} p_2(d(C_i, C_0))$

That is,

$p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} p_2(\mathrm{wt}(C_i))$
d = the minimum distance of C
The upper bound depends on the weight distribution of C. Since p_2(l+1) < p_2(l), we also have

$p_2(d) \le P_e \le (2^k - 1)\, p_2(d)$
Other Bounds (cont'd)
We can also show that
$p_2(l) \le P(x_1 + \cdots + x_l \ge l/2) \le [4p(1-p)]^{l/2}$

for any l ≥ 1, where x_1, ···, x_l are i.i.d. random variables with

$x_i = \begin{cases} 1 & \text{with prob. } p \\ 0 & \text{with prob. } 1-p \end{cases}$
$p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} p_2(\mathrm{wt}(C_i)) \;\Rightarrow\; p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} [4p(1-p)]^{\mathrm{wt}(C_i)/2}$
Comparison of Performance Between Hard-
Decision and Soft-Decision Decoding
Method 1: use the bounds developed in the last two sections to
evaluate the hard decision performance and soft decision
performance of specific linear block codes.
Method 2: Hard decision results in a BSC with crossover probability

$p = Q\!\left(\sqrt{2E_b/N_0}\right)$

[BSC transition diagram: 0→0 and 1→1 with probability 1−p, 0→1 and 1→0 with probability p]
On the other hand, soft decision gives rise to the discrete input,
continuous output channel:
$Y = \pm\sqrt{E_c} + n$
n is a Gaussian random variable. Compute the channel capacity
of these two channels and compare them.
Method 3: Use the random coding argument to find the hard decision
and soft decision random coding rates. Then compare these two
rates.
All these methods reveal that soft decision performance is roughly
2dB better than hard decision performance.
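As a rough numerical check (my own illustration, not from the notes), one can sweep Eb/N0 until the simple hard- and soft-decision bounds, p_2(d)-based and Q-based respectively, reach a target block error rate and read off the gap; the 1e-5 target and the [7,4,3] Hamming code are arbitrary choices.

```python
# A rough numerical check of the ~2 dB claim for the [7,4,3] Hamming code:
# sweep Eb/N0 until the simple bounds (2^k - 1) p2(d) (hard decision) and
# (2^k - 1) Q(sqrt(2 Rc (Eb/N0) d)) (soft decision) hit a target P_e.
import math
from math import comb

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def p2(d, p):
    """Two-codeword error probability at Hamming distance d over a BSC(p)."""
    u = d // 2
    total = sum(comb(d, l) * p**l * (1 - p)**(d - l) for l in range(u + 1, d + 1))
    if d % 2 == 0:                        # even distance: ties broken at random
        total += 0.5 * comb(d, u) * p**u * (1 - p)**(d - u)
    return total

def required_ebno_db(bound, target=1e-5):
    ebno_db = 0.0
    while bound(ebno_db) > target:        # crude 0.01 dB sweep
        ebno_db += 0.01
    return ebno_db

n, k, d = 7, 4, 3
rc = k / n
soft = lambda x: (2**k - 1) * Q(math.sqrt(2 * rc * 10**(x / 10) * d))
hard = lambda x: (2**k - 1) * p2(d, Q(math.sqrt(2 * rc * 10**(x / 10))))
gap = required_ebno_db(hard) - required_ebno_db(soft)
print(f"hard-decision penalty: about {gap:.2f} dB")
```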
Concatenated Block Codes
Most of the codes discussed so far are designed for correcting and detecting random errors; they are not well suited to correcting burst errors.
Definition: An error burst of length b in an n-bit received vector is a
contiguous sequence of b bits in which the first and the last bits and
any number of intermediate bits are received in error.
An error vector containing a single error burst of length 7 looks like
this:
00…1xxxxx10…0,
where x may be 0 or 1.
Fact: Binary codes obtained from nonbinary codes, particularly from Reed-Solomon (RS) codes, are particularly well suited to correcting burst errors.
Consider a [255, 249, 7] RS code C over GF(2^8). Each code symbol is an element of GF(2^8) and hence represents 8 bits. Replace each code symbol in every codeword of C by its binary representation.
We then get a binary code C'. The resulting binary code C' has parameters:
n = 255 × 8 = 2040, k = 249 × 8 = 1992, d ≥ 7
The original nonbinary RS code C can correct 3 symbol errors. The binary code C' can correct an error burst of length 17 (= m(t−1)+1 with m = 8, t = 3).
Concatenated Block Codes
•In general, if C is an [N, K, D] RS code over GF(2^m) with
N = 2^m − 1, K = N − 2t, D = 2t + 1,
•then the binary code C' obtained from C can correct any error burst of length up to m(t−1) + 1.
•The binary code C' obtained from an [N, K, D] RS code C over GF(2^m) has good burst error correction capability, but it does not help correct random errors.
•To improve its capability of correcting random errors, we may use a
linear [n, m, d] block code to further encode the output sequence of the
binary code C’.
•The resulting overall code is called a concatenated code.
[Block diagram of a concatenated code: Input data → Outer encoder [N,K,D] → Inner encoder [n,m,d] → Modulator → Channel → Demodulator → Inner decoder → Outer decoder → Output data]
Concatenated Block Codes
[Figure: a concatenated code maps K × m information bits to N × m bits after outer encoding and N × n bits after inner encoding]
Concatenated Block Codes
Thus the concatenated code has parameters:
block length = N × n
number of information bits = K × m
minimum distance ≥ d × D
Interleavers
Another effective method for dealing with error bursts is to interleave the
block coded sequence so that a whole codeword is not transmitted in
consecutive time intervals.
As a result, error bursts are spread out among many codewords so that
errors within a codeword appear to be random.
Interleavers
[Figure: block interleaver; an M × n array whose rows are codewords (k information bits and n − k parity bits each). Coded bits from the encoder are written in along one dimension and read out along the other to the modulator as C_0, C_1, ···, C_n, C_{n+1}, ···, so consecutive transmitted bits belong to different codewords.]
If the original [n, k, d] block code can correct an error burst of length b, then the combination of the original block code and a block interleaver of degree M can correct any error burst of length up to Mb.
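A minimal sketch (mine) of such a block interleaver; the M = 4, n = 7 array, the dummy codewords, and the burst position are arbitrary illustration choices.

```python
# A minimal sketch of an M x n block interleaver: codewords are written in
# as rows and transmitted column by column, so a channel burst of length L
# hits each codeword in at most ceil(L/M) positions.
def interleave(rows):                     # rows: M codewords of n bits each
    return [row[j] for j in range(len(rows[0])) for row in rows]

def deinterleave(bits, M, n):
    return [[bits[j * M + i] for j in range(n)] for i in range(M)]

M, n = 4, 7
rows = [[(i + j) % 2 for j in range(n)] for i in range(M)]   # dummy codewords
tx = interleave(rows)
tx[5:10] = [b ^ 1 for b in tx[5:10]]      # a burst of 5 channel errors
rx = deinterleave(tx, M, n)
for sent, received in zip(rows, rx):      # at most ceil(5/4) = 2 errors per row
    print(sum(a != b for a, b in zip(sent, received)))
```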