Optimum Soft Decision Decoding
of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code) → BPSK modulator → s(t) → AWGN channel → Optimal receiver; C = (C_{n-1}, ···, C_0)]
• Assume that the [n,k,d] linear block code C is binary.
• The channel encoder maps the k information bits m_0, ···, m_{k-1} into a codeword C = (C_{n-1}, ···, C_0).
• The transmitted waveform s(t) is
$s_j(t) = (2C_j - 1)\sqrt{\frac{2\varepsilon_c}{T_b}}\cos 2\pi f_c t, \quad t \in [(n-1-j)T_b,\,(n-j)T_b], \quad j = 0, 1, \cdots, n-1$
• The vector representation of s(t) is
$s = \sqrt{\varepsilon_c}\,[(2C_{n-1}-1), \cdots, (2C_0-1)]$
• The receiver is optimal with respect to the coded signals.
Optimum Soft Decision Decoding
of Linear Block Codes
• Assume that the transmitted messages are equally likely.
• The optimal decision rule is the minimum distance decision rule.
• Let s_i be the vector representation of the coded signal waveform s_i(t) corresponding to the codeword C_i = (C_{i(n-1)}, ···, C_{i0}).
•Let us compute the block error prob. Pe:
• Since C is linear, Pe(error|Ci) is the same for all codewords Ci.
Therefore,
Pe = Pe(error|C0)
$s_i = \sqrt{\varepsilon_c}\,[(2C_{i(n-1)}-1), \cdots, (2C_{i0}-1)], \quad 0 \le i \le 2^k - 1$

$P_e = 2^{-k}\sum_{i=0}^{2^k-1} P_e(\text{error}\,|\,C_i)$
Optimum Soft Decision Decoding
of Linear Block Codes
$P_e \le \sum_{i=1}^{2^k-1} Q\!\left(\frac{\|s_0 - s_i\|}{\sqrt{2N_0}}\right) = \sum_{i=1}^{2^k-1} Q\!\left(\sqrt{\frac{2\varepsilon_c\,\mathrm{wt}(C_i)}{N_0}}\right)$

As $\varepsilon_c = \frac{k}{n}\,\varepsilon_b = R_c\,\varepsilon_b$, we have

$P_e \le \sum_{i=1}^{2^k-1} Q\!\left(\sqrt{2 R_c\,\frac{\varepsilon_b}{N_0}\,\mathrm{wt}(C_i)}\right)$
•The above upper bound depends on the weight distribution of C.
•A simpler bound follows by applying the union bound together with wt(C_i) ≥ d for all nonzero codewords:
$P_e \le 2^k\, Q\!\left(\sqrt{2 R_c\,\frac{\varepsilon_b}{N_0}\,d}\right), \quad d = \text{the minimum distance of } C$
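The union bound above is easy to evaluate numerically when the weight distribution is known. Below is a minimal Python sketch (mine, not from the lecture); the function names and the use of the [7,4,3] Hamming code's weight distribution A_3 = A_4 = 7, A_7 = 1 are illustrative assumptions.

```python
# A minimal sketch (not from the lecture) of the soft-decision union bound
#   P_e <= sum_{i=1}^{2^k-1} Q( sqrt(2 * R_c * (Eb/N0) * wt(C_i)) )
# evaluated from a code's weight distribution.
import math

def Q(x):
    """Gaussian tail function Q(x) = P(N(0,1) > x)."""
    return 0.5 * math.erfc(x / math.sqrt(2))

def union_bound(n, k, weight_dist, ebno_db):
    """Union bound on P_e; weight_dist maps weight w -> number of codewords A_w."""
    rc = k / n                       # code rate R_c = k/n
    ebno = 10 ** (ebno_db / 10)      # Eb/N0 as a linear ratio
    return sum(a * Q(math.sqrt(2 * rc * ebno * w))
               for w, a in weight_dist.items())

# Weight distribution of the [7,4,3] Hamming code: A_3 = A_4 = 7, A_7 = 1.
hamming_wd = {3: 7, 4: 7, 7: 1}
for ebno_db in (2, 4, 6, 8):
    print(ebno_db, union_bound(7, 4, hamming_wd, ebno_db))
```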
Hard Decision Decoding of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code) → BPSK modulator → s(t) → AWGN channel → BPSK demodulator → Channel decoder → Ĉ or m̂_0, ···, m̂_{k-1}]
• In soft-decision decoding, we need to compute M = 2^k distance metrics. When M is large, the computational complexity is very high.
• To reduce the computational burden, we can quantize the analog signals, and the decoding is performed digitally.
Hard decision decoding diagram
Hard Decision Decoding of Linear Block Codes
[Block diagram: {m_i} → Channel encoder ((n,k,d) linear block code C) → BSC → Channel decoder → Ĉ; C = (C_{n-1}, ···, C_0)]
Equivalent diagram
• The error probability of the BSC is

$p = Q\!\left(\sqrt{2\varepsilon_c/N_0}\right)$

• Assume that p < 1/2.
• Assume that the transmitted messages are equally likely.
• We want to design an optimal decoder and analyze its performance.
• Because of noise, the received vector y = (y_{n-1}, ···, y_0) may be different from the transmitted codeword C.
• The difference vector e = y − C = (e_{n-1}, ···, e_0) is called an error vector. e_j = 0 with probability 1−p, and e_j = 1 with probability p.
• Since there are 2^k possible error vectors consistent with any received y, the decoder cannot decide with certainty which codeword was actually transmitted.
• Since the transmitted messages are equally likely, the MAP rule is equivalent to the ML decision rule. Thus the optimal decoder:
Hard Decision Decoding of Linear Block Codes
Decode y as $\hat{C} = C_j$ iff

$p(y\,|\,C_j) = \max_{0 \le i \le 2^k-1} p(y\,|\,C_i)$

$\Leftrightarrow\; p^{\mathrm{wt}(e_j)}(1-p)^{n-\mathrm{wt}(e_j)} = \max_{0 \le i \le 2^k-1}\, p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)}$

$\Leftrightarrow\; \mathrm{wt}(e_j) = \min_i \mathrm{wt}(e_i) \;\Leftrightarrow\; d(y, C_j) = \min_i d(y, C_i)$

where $e_i = y - C_i$.
Minimum Hamming Distance Decoding Rule
• The optimal decision rule now becomes the minimum Hamming distance decision rule.
• The optimal decoder decodes y as the nearest codeword.
• In other words, the decoder picks the error vector ê of least weight and then forms the estimate
$\hat{C} = y - \hat{e}$
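A minimal brute-force sketch of this rule (mine, not from the lecture): enumerate all codewords, pick the nearest, and recover the error estimate. The [5,2] code used below is the one that reappears in the standard-array example later in these notes.

```python
# A minimal sketch of minimum Hamming distance decoding by brute force:
# compare y with all 2^k codewords and pick the nearest.
def hamming_distance(a, b):
    return sum(x != y for x, y in zip(a, b))

def min_distance_decode(y, codewords):
    """Return the nearest codeword and the implied error vector e = y - C (mod 2)."""
    c_hat = min(codewords, key=lambda c: hamming_distance(y, c))
    e_hat = [yi ^ ci for yi, ci in zip(y, c_hat)]
    return c_hat, e_hat

# The [5,2] code with G = [10101; 01011] from the example later in the notes.
code = [[0,0,0,0,0], [0,1,0,1,1], [1,0,1,0,1], [1,1,1,1,0]]
print(min_distance_decode([1,1,1,1,1], code))   # -> ([1,1,1,1,0], [0,0,0,0,1])
```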
Error Detection Capability
• Let C be a block code with block length n and minimum Hamming
distance d.
• Suppose C is used only for error detection. That is, the receiver just tests whether the received vector is a codeword or not.
• If it is not, the receiver detects that an error has occurred and asks for a retransmission of the codeword.
• The code C can detect up to d−1 errors: if at least one but no more than d−1 errors occur, the received vector cannot be a codeword. On the other hand, when more than d−1 errors occur, the receiver may be fooled, since an error vector of weight d can transform one codeword into another codeword.
• Therefore C is capable of detecting up to d−1 errors.
Error Detection and Error Correction Capability
When C is used only for error correction, C can correct up to $t = \lfloor (d-1)/2 \rfloor$ errors.
[Figure: codewords C_1, C_2, C_3, ···, C_i, each at the center of a ball of radius t]

We associate with each codeword C_i a ball of radius t centered at C_i, where $t = \lfloor (d-1)/2 \rfloor$. All these balls are non-overlapping.
•If C_i is transmitted and t or fewer errors occur, then the received vector y is inside the ball centered at C_i and is closer to C_i than to any other codeword. Thus nearest neighbor decoding will correct these errors.
•On the other hand, if more than t errors occur, the received vector y may
be closer to some other codeword. If this is the case, then the decoder
will be fooled.
• The block code can also be used for both error correction and error detection. Suppose d = 7. Then C can correct up to 3 errors.
• We may take a different decoding strategy to increase the error
detection capability at the expense of the error correction capability.
For example, C can be used to correct up to 2 errors and at the same time detect up to 4 errors.
Error Detection and Error Correction
[Figure: codewords C_i and C_j at distance d, with balls of radius t_c around each and detection radius t_d]
In general, a block code C with the
minimum distance d can simultaneously
correct tc errors and detect td errors as long
as tc+td≤d-1.
For any two distinct codewords C_i and C_j, the ball of radius t_d centered at C_i and the ball of radius t_c centered at C_j are disjoint.
Syndrome Decoding
• A brute-force method for performing nearest neighbor decoding would involve 2^k comparisons.
• A more efficient method is syndrome decoding:
Given the received vector y, one first computes the vector
s' = Hy'
where H is a fixed parity check matrix for C and ' denotes transpose. The (n−k)-dimensional vector s = (s_{n−k−1}, s_{n−k−2}, ···, s_0) is called the syndrome of y.
The syndrome provides some information about the possible error vector e. There is a one-to-one correspondence between syndromes and sets of all possible error vectors.
Syndrome Decoding
Given the received vector y, the set of all possible error vectors is

$y - \mathcal{C} \triangleq \{\,y - C_i : C_i \in \mathcal{C}\,\} = y + \mathcal{C}$

The set y + C is called a coset of C. Note that y ∈ y + C. Thus

the coset containing y = the set of all possible error vectors with respect to y.
Property 1: Two cosets are either disjoint or identical.
Property 2: Two vectors y1 and y2 have the same syndrome iff y1 and
y2 are in the same coset.
$Hy_1' = Hy_2' \;\Leftrightarrow\; H(y_1 - y_2)' = 0 \;\Leftrightarrow\; (y_1 - y_2) \in \mathcal{C} \;\Leftrightarrow\; y_1 \in y_2 + \mathcal{C}$
Syndrome Decoding
• There is a one-to-one correspondence between syndromes and cosets.
• Each coset contains 2^k vectors. This implies that there are 2^{n−k} cosets and hence 2^{n−k} syndromes.
•In view of the nearest neighbor decoding rule, we get the following
decoding algorithm:
Step 1: Given a received vector y, compute the syndrome s of y.
Step 2: Find the least weight vector in the coset given by the
syndrome s.
Step 3: Decode y as $\hat{C} = y - \hat{e}$
The least weight vector in a coset is called the coset leader of the coset.
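The three steps can be sketched in Python as follows (an illustration, not the lecture's implementation); the bit ordering (lists read left to right) and the helper names are my assumptions, and the example H is the systematic parity-check matrix H = [P' | I_3] derived from the generator matrix G = [I_2 | P] of the [5,2] example code used below.

```python
# A minimal sketch of the three-step syndrome decoder. The coset-leader
# table is built once by scanning error patterns in order of increasing
# weight, so each syndrome gets its least-weight pattern first.
from itertools import combinations

def syndrome(v, H):
    """s = Hv' over GF(2), as a tuple usable as a dictionary key."""
    return tuple(sum(h * x for h, x in zip(row, v)) % 2 for row in H)

def build_leader_table(n, H):
    table = {}
    for w in range(n + 1):
        for pos in combinations(range(n), w):
            e = [1 if i in pos else 0 for i in range(n)]
            table.setdefault(syndrome(e, H), e)
        if len(table) == 2 ** len(H):    # all 2^(n-k) syndromes covered
            break
    return table

def syndrome_decode(y, H, table):
    e_hat = table[syndrome(y, H)]        # Step 1 + Step 2: coset leader
    return [yi ^ ei for yi, ei in zip(y, e_hat)]   # Step 3: C_hat = y - e_hat

# H = [P' | I_3] from the systematic G = [I_2 | P] of the [5,2] example code.
H = [[1,0,1,0,0],
     [0,1,0,1,0],
     [1,1,0,0,1]]
table = build_leader_table(5, H)
print(syndrome_decode([1,0,1,1,1], H, table))   # 10101 with one bit flipped
```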
Syndrome Decoding
Standard array for syndrome decoding (when n − k is small):

Coset leaders                                                        Syndrome
0                C_1                ·····  C_{2^k−1}                 s_0
e_1              C_1+e_1            ·····  C_{2^k−1}+e_1             s_1
···
e_{2^{n−k}−1}    C_1+e_{2^{n−k}−1}  ·····  C_{2^k−1}+e_{2^{n−k}−1}   s_{2^{n−k}−1}
When y is received, its position in the standard array is located. The decoder then decides that the error vector is the left-most vector (the coset leader) in the row containing y, and y is decoded as the codeword at the top of the column containing y.
Example
G = | 1 0 1 0 1 |        d_min = 3
    | 0 1 0 1 1 |

Coset leaders
00000  01011  10101  11110   ← the codewords
00001  01010  10100  11111
00010  01001  10111  11100
00100  01111  10001  11010
01000  00011  11101  10110
10000  11011  00101  01110
11000  10011  01101  00110
10010  11001  00111  01100

The first column lists the coset leaders; each row is one coset, labeled by a distinct syndrome.
What if the actual error is (10100)?
Syndrome Computing in the Case of Cyclic Codes
• Let $g(x) = x^{n-k} + g_{n-k-1}x^{n-k-1} + \cdots + g_1 x + g_0$ be the generator polynomial of C.
• Let $y = (y_{n-1}, \cdots, y_0)$ be the received vector. Associate with y the polynomial
$y(x) = y_{n-1}x^{n-1} + \cdots + y_1 x + y_0$
• Assume that we compute the syndrome $s = (s_{n-k-1}, \cdots, s_0)$ of y by using the systematic parity check matrix H. Associate with s the polynomial
$s(x) = s_{n-k-1}x^{n-k-1} + \cdots + s_1 x + s_0$
• One can show that s(x) is the remainder obtained by dividing y(x) by g(x):
$y(x) = f(x)g(x) + s(x)$
Example
The [7,4,3] binary Hamming code revisited
$g(x) = x^3 + x + 1$

G = | 1 0 0 0 1 0 1 |
    | 0 1 0 0 1 1 1 |
    | 0 0 1 0 1 1 0 |
    | 0 0 0 1 0 1 1 |

H = ?

Suppose y = [1 1 0 1 1 0 1]. What is the syndrome s?
1. Use matrix multiplication
2. Use polynomial division
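A minimal sketch (not part of the notes) checking both methods on this example; the H used is the systematic parity-check matrix H = [P' | I_3] obtained from G = [I_4 | P], and the bit ordering (MSB-first lists) is my assumption.

```python
# A minimal sketch computing the syndrome of y both ways for the [7,4,3]
# Hamming code. H = [P' | I_3] is the systematic parity-check matrix
# obtained from G = [I_4 | P]; bit lists are written MSB first.
def matrix_syndrome(y, H):
    return [sum(h * v for h, v in zip(row, y)) % 2 for row in H]

def poly_remainder(y, g):
    """Remainder of y(x) / g(x) over GF(2), i.e. the syndrome polynomial s(x)."""
    r = y[:]                              # work on a copy of the dividend
    for i in range(len(y) - len(g) + 1):  # one long-division step per shift
        if r[i]:
            for j, gj in enumerate(g):
                r[i + j] ^= gj
    return r[-(len(g) - 1):]              # the last deg(g) coefficients

H = [[1,1,1,0,1,0,0],
     [0,1,1,1,0,1,0],
     [1,1,0,1,0,0,1]]
y = [1,1,0,1,1,0,1]
g = [1,0,1,1]                             # g(x) = x^3 + x + 1
print(matrix_syndrome(y, H))              # -> [1, 0, 0]
print(poly_remainder(y, g))               # -> [1, 0, 0], i.e. s(x) = x^2
```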
Shift Register Implementation
[Figure: shift-register circuit that divides y(x) by g(x) = x^3 + x + 1; the received bits y_0, y_1, ···, y_6 = 1011011 are shifted in, the quotient is shifted out, and the remainder left in the register is the syndrome]
Performance of Hard Decision Decoding
The error probability of the BSC is

$p = Q\!\left(\sqrt{2\varepsilon_c/N_0}\right) < 1/2$
Block error probability
$P_e = \frac{1}{2^k}\sum_{i=0}^{2^k-1} P_e(\text{error}\,|\,C_i) = P_e(\text{error}\,|\,C_0) = 1 - \sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)}$
where the e_i's are the coset leaders and e_0 is the zero vector.
Let $t = \lfloor (d-1)/2 \rfloor$. The balls of radius t centered at the codewords C_i are all disjoint, so any vector of weight ≤ t is a coset leader. Therefore,

$\sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)} \ge \sum_{l=0}^{t}\binom{n}{l} p^l (1-p)^{n-l}$

$\Rightarrow\; P_e \le 1 - \sum_{l=0}^{t}\binom{n}{l}p^l(1-p)^{n-l} = \sum_{l=t+1}^{n}\binom{n}{l}p^l(1-p)^{n-l}$

The equality holds if the coset leaders consist of all vectors of weight ≤ t. In this case, we have

$\sum_{l=0}^{t}\binom{n}{l} = 2^{n-k}$
Performance of Hard Decision Decoding
In this case, the linear code is called a perfect code.
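For example, the [7,4,3] binary Hamming code has t = 1 and $\sum_{l=0}^{1}\binom{7}{l} = 1 + 7 = 8 = 2^{7-4}$, so the equality holds and the code is perfect.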
In general,

$\sum_{l=0}^{t}\binom{n}{l} \le 2^{n-k}$  (Hamming bound)

The Hamming bound gives another relationship among n, k, and d.
An [n,k,d] linear binary code is called quasi-perfect if

$2^{n-k} \le \sum_{l=0}^{t+1}\binom{n}{l}$
•For an [n,k,d] perfect code, the balls of radius t are disjoint and
together contain all vectors of length n.
•For an [n,k,d] quasi-perfect code, the balls of radius t+1 may overlap
and together contain all vectors of length n.
Performance of Hard Decision Decoding
Error Probability for Quasi-Perfect Codes
The coset leaders have weight ≤ t+1, and the number of coset leaders of weight t+1 is $2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}$. Hence

$\sum_{i=0}^{2^{n-k}-1} p^{\mathrm{wt}(e_i)}(1-p)^{n-\mathrm{wt}(e_i)} = \sum_{l=0}^{t}\binom{n}{l}p^l(1-p)^{n-l} + \left[2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}\right]p^{t+1}(1-p)^{n-t-1}$

$\Rightarrow\; P_e = \sum_{l=t+1}^{n}\binom{n}{l}p^l(1-p)^{n-l} - \left[2^{n-k} - \sum_{l=0}^{t}\binom{n}{l}\right]p^{t+1}(1-p)^{n-t-1}$
This formula is also a lower bound to Pe for any [n, k, d] linear binary
block code.
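A minimal sketch (mine, not from the notes) of this formula; for a perfect code the correction term vanishes, which the [7,4,3] Hamming code illustrates.

```python
# A minimal sketch of the quasi-perfect P_e formula above, which is also a
# lower bound on P_e for any [n,k,d] binary linear code.
from math import comb

def quasi_perfect_pe(n, k, t, p):
    tail = sum(comb(n, l) * p**l * (1 - p)**(n - l)
               for l in range(t + 1, n + 1))
    n_extra = 2**(n - k) - sum(comb(n, l) for l in range(t + 1))
    return tail - n_extra * p**(t + 1) * (1 - p)**(n - t - 1)

# For a perfect code the correction term vanishes (n_extra = 0), e.g. the
# [7,4,3] Hamming code with t = 1:
print(quasi_perfect_pe(7, 4, 1, 0.01))
```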
Other Bounds
Consider the communication of two equally likely n-dimensional vectors
a = (an-1, ··· , a0), b =(bn-1, ··· , b0) over the BSC.
The minimum decoding error probability depends only on the Hamming distance d(a, b) between a and b. Denote this minimum error probability by p_2(d(a, b)):
$p_2(d(a,b)) = \begin{cases} \displaystyle\sum_{l=u+1}^{2u+1}\binom{2u+1}{l}p^l(1-p)^{2u+1-l}, & \text{if } d(a,b) = 2u+1 \\[2ex] \displaystyle\sum_{l=u+1}^{2u}\binom{2u}{l}p^l(1-p)^{2u-l} + \frac{1}{2}\binom{2u}{u}p^u(1-p)^u, & \text{if } d(a,b) = 2u \end{cases}$
p_2(·) is strictly decreasing on the set of possible integers: p_2(l+1) < p_2(l).
In terms of union bounds,

$\max_{i \ne 0}\, p_2(d(C_i, C_0)) \le P_e(\text{error}\,|\,C_0) \le \sum_{i=1}^{2^k-1} p_2(d(C_i, C_0))$

That is,

$p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} p_2(\mathrm{wt}(C_i))$
d = the minimum distance of C
The upper bound depends on the weight distribution of C. Since p_2(l+1) < p_2(l), we also have

$p_2(d) \le P_e \le (2^k - 1)\, p_2(d)$
Other Bounds (cont'd)
We can also show that
$p_2(l) \le P(x_1 + \cdots + x_l \ge l/2) \le [4p(1-p)]^{l/2}$

for any l ≥ 1, where x_1, ···, x_l are i.i.d. random variables with

$x_i = \begin{cases} 1 & \text{with prob. } p \\ 0 & \text{with prob. } 1-p \end{cases}$
$p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} p_2(\mathrm{wt}(C_i)) \;\Rightarrow\; p_2(d) \le P_e \le \sum_{i=1}^{2^k-1} [4p(1-p)]^{\mathrm{wt}(C_i)/2}$
Comparison of Performance Between Hard-
Decision and Soft-Decision Decoding
Method 1: use the bounds developed in the last two sections to
evaluate the hard decision performance and soft decision
performance of specific linear block codes.
Method 2: Hard decision results in a BSC with crossover probability

$p = Q\!\left(\sqrt{2E_b/N_0}\right)$

[BSC transition diagram: 0→0 and 1→1 with probability 1−p, 0→1 and 1→0 with probability p]
On the other hand, soft decision gives rise to the discrete input,
continuous output channel:
$Y = \pm\sqrt{E_c} + n$
n is a Gaussian random variable. Compute the channel capacity
of these two channels and compare them.
Method 3: Use the random coding argument to find the hard decision
and soft decision random coding rates. Then compare these two
rates.
All these methods reveal that soft decision performance is roughly
2dB better than hard decision performance.
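As a rough numerical check (my own illustration, not from the notes), one can sweep Eb/N0 until the simple hard- and soft-decision bounds, p_2(d)-based and Q-based respectively, reach a target block error rate and read off the gap; the 1e-5 target and the [7,4,3] Hamming code are arbitrary choices.

```python
# A rough numerical check of the ~2 dB claim for the [7,4,3] Hamming code:
# sweep Eb/N0 until the simple bounds (2^k - 1) p2(d) (hard decision) and
# (2^k - 1) Q(sqrt(2 Rc (Eb/N0) d)) (soft decision) hit a target P_e.
import math
from math import comb

def Q(x):
    return 0.5 * math.erfc(x / math.sqrt(2))

def p2(d, p):
    """Two-codeword error probability at Hamming distance d over a BSC(p)."""
    u = d // 2
    total = sum(comb(d, l) * p**l * (1 - p)**(d - l) for l in range(u + 1, d + 1))
    if d % 2 == 0:                        # even distance: ties broken at random
        total += 0.5 * comb(d, u) * p**u * (1 - p)**(d - u)
    return total

def required_ebno_db(bound, target=1e-5):
    ebno_db = 0.0
    while bound(ebno_db) > target:        # crude 0.01 dB sweep
        ebno_db += 0.01
    return ebno_db

n, k, d = 7, 4, 3
rc = k / n
soft = lambda x: (2**k - 1) * Q(math.sqrt(2 * rc * 10**(x / 10) * d))
hard = lambda x: (2**k - 1) * p2(d, Q(math.sqrt(2 * rc * 10**(x / 10))))
gap = required_ebno_db(hard) - required_ebno_db(soft)
print(f"hard-decision penalty: about {gap:.2f} dB")
```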
Concatenated Block Codes
Most of the codes discussed so far are designed for correcting and detecting random errors; they are not well suited to correcting burst errors.
Definition: An error burst of length b in an n-bit received vector is a
contiguous sequence of b bits in which the first and the last bits and
any number of intermediate bits are received in error.
An error vector containing a single error burst of length 7 looks like
this:
00…1xxxxx10…0,
where x may be 0 or 1.
Fact: Binary codes obtained from nonbinary codes, particularly from Reed-Solomon (RS) codes, are particularly well suited to correcting burst errors.
Consider a [255, 249, 7] RS code C over GF(2^8). Each code symbol is an element of GF(2^8) and hence represents 8 bits. Replace each code symbol in every codeword of C by its binary representation.
We then get a binary code C'. The resulting binary code C' has parameters:
n = 255 × 8 = 2040, k = 249 × 8 = 1992, d ≥ 7
The original nonbinary RS code C can correct 3 symbol errors. The binary code C' can correct an error burst of length 17 (= m(t−1)+1 with m = 8, t = 3).
Concatenated Block Codes
•In general, if C is an [N, K, D] RS code over GF(2^m) with
N = 2^m − 1, K = N − 2t, D = 2t + 1,
•then the binary code C' obtained from C can correct any error burst of length up to m(t−1) + 1.
•The binary code C' obtained from an [N, K, D] RS code C over GF(2^m) has good burst error correction capability, but it does not help correct random errors.
•To improve its capability of correcting random errors, we may use a
linear [n, m, d] block code to further encode the output sequence of the
binary code C’.
•The resulting overall code is called a concatenated code.
[Block diagram of a concatenated code: Input data → Outer encoder [N,K,D] → Inner encoder [n,m,d] → Modulator → Channel → Demodulator → Inner decoder → Outer decoder → Output data]
Concatenated Block Codes
[Figure: a concatenated code maps K × m information bits to N × m bits after outer encoding and N × n bits after inner encoding]
Concatenated Block Codes
Thus the concatenated code has parameters:
block length = N × n
number of information bits = K × m
minimum distance ≥ d × D
Interleavers
Another effective method for dealing with error bursts is to interleave the
block coded sequence so that a whole codeword is not transmitted in
consecutive time intervals.
As a result, error bursts are spread out among many codewords so that
errors within a codeword appear to be random.
Interleavers
[Figure: block interleaver; an M × n array whose rows are codewords (k information bits and n − k parity bits each). Coded bits from the encoder are written in along one dimension and read out along the other to the modulator as C_0, C_1, ···, C_n, C_{n+1}, ···, so consecutive transmitted bits belong to different codewords.]
If the original [n, k, d] block code can correct an error burst of length b, then the combination of the original block code and a block interleaver of degree M can correct any error burst of length up to Mb.
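A minimal sketch (mine) of such a block interleaver; the M = 4, n = 7 array, the dummy codewords, and the burst position are arbitrary illustration choices.

```python
# A minimal sketch of an M x n block interleaver: codewords are written in
# as rows and transmitted column by column, so a channel burst of length L
# hits each codeword in at most ceil(L/M) positions.
def interleave(rows):                     # rows: M codewords of n bits each
    return [row[j] for j in range(len(rows[0])) for row in rows]

def deinterleave(bits, M, n):
    return [[bits[j * M + i] for j in range(n)] for i in range(M)]

M, n = 4, 7
rows = [[(i + j) % 2 for j in range(n)] for i in range(M)]   # dummy codewords
tx = interleave(rows)
tx[5:10] = [b ^ 1 for b in tx[5:10]]      # a burst of 5 channel errors
rx = deinterleave(tx, M, n)
for sent, received in zip(rows, rx):      # at most ceil(5/4) = 2 errors per row
    print(sum(a != b for a, b in zip(sent, received)))
```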