1st Combined R2E Workshop & School-Days Error Detection and Correction Techniques
A. Marchioro / PH-ESE-ME
Outline
SEU Basic Facts
Special technologies for SEU protection
Mitigation techniques
Circuit Techniques
• In logic
• In registers
• In RAMs
Logic (Redundancy) Techniques
Coding techniques
• Error detection only techniques
Conclusions
Significant also in industry
Terrestrial cosmic rays and soft errors, Vol. 40, No. 1, 1996
Soft Errors in Circuits and Systems, Vol. 52, No. 3, 2008
SEU errors in “analog” circuitry
We live in a (mostly) digital world:
• (Occasional) errors in analog circuitry will be ignored or will be fixed at the digital level
Particle strikes at sensing elements:
• Happen all the time in particle detectors; the system should be designed to cope with a single wrong measurement
• Can happen easily in photo-receivers
Particle strikes at critical nodes:
• Biasing nodes: self-recovering; hits at high-current nodes will probably remain unobserved
• DAC registers: not self-recovering, but detectable digitally
• Oscillator circuits and PLLs: recovery could take milliseconds, but should eventually occur
  • May require training or synchronization sequences to be sent
  • Can cause long sequences of errors in applications such as self-clocking serial streams
SEU Basics
SEU: where does it occur
[Figure from Darracq et al., IEEE Trans. on Nuclear Science, Vol. 49, No. 3, June 2002]
Are all particles equally “dangerous” for SEU?
Energy loss (dE/dx) for protons in Si
for reference see: http://pdg.lbl.gov/2008/reviews/rpp2008-rev-passage-particles-matter.pdf
Bethe-Bloch energy loss equation
When and where should we care?
“I have this particular component in my system, should I be worried about SEU?”
SEU: Impact on components
Component type | Technology (likely) used | Digital SEU risk | Mitigation technique applicable
High-end microprocessor and DSP | < 90 nm | Very high | System level redundancy
Low-end microcontroller | > 130 nm | High | System level redundancy, software protection techniques
High density memory | < 90 nm | Very high | Error correction (coding)
Discrete digital logic | > 250 nm | Medium | Logic redundancy
Discrete analog components | > 250 nm & bipolar | Low | n.a.
SRAM FPGA | < 90 nm | Very high (*) | Redundancy or reload (needs special tool)
AntiFuse FPGA | < 90 nm | High (**) | Redundancy
ASICs | 130 or 90 nm | | Architectural and circuit level protections
(*) Both user and configuration logic are sensitive
(**) Only user logic is sensitive
SEU in a circuit
SEU can occur in several places in a circuit:
• In a storage node (register, latch or RAM)
• Along a logic path (needs to be synchronized with clock sampling to be relevant)
• On a clock line (rather bad!)
• On a global line such as Reset (catastrophic!)
Different techniques are necessary to protect against these different events.
No one-size-fits-all solution!
Device Techniques
Device level SEU protection: SOI
The majority of commercial ICs are fabricated in bulk technologies. Charge can be collected from several microns of silicon under a device.
In thin-film SOI, the active silicon layer can be very thin (< 300 nm), therefore little free charge can be produced.
[Figure: device cross-sections with well, substrate, STI and oxide layers. WARNING: drawing not to scale!]
SOI and SEU
[Figure: comparison of Bulk SRAM - A, Bulk SRAM - B, SOI SRAM 1 and SOI SRAM 2. From J. Doff, TNS, 8/2007]
SOI based ASIC design
SOI could be considered for specific and very demanding custom designs, but:
• Requires special technology (few vendors)
• Has virtually no library support
• Has few if any IP blocks available
• Requires high volume
• Price: expensive to very expensive, no second source
• What about the other chips in your system?
Still, it is used in space and military applications.
Circuit Techniques
Single Event Upset in logic
[Figure: logic gates with inputs A, B and output Y, with the clock CLK]
If the spike is longer than the typical gate delay, it will propagate down the logic path and possibly be sampled in the next FF.
This used to be a very rare event in logic up to the 0.25 µm generation.
Unfortunately it is common in 130, 90 and 65 nm (which means in most commercial chips today).
Protection against SEU in logic
[Figure: register fed by regular (fast) gates and by slow gates (which filter glitches)]
... or double sample at the register
Circuit level mitigation techniques
[Figure: four latch schematics, each with inputs Din, CK and CK*: Normal Latch, Strong Feedback Latch, Extra Cap Latch, Large Size Latch]
Special topology D-FF cell
SEU robust FF: DICE cell
From Calin et al. IEEE TNS Dec 1996
Single Event Upset in SRAM
[Figure: SRAM cell with word line WL and bit lines BL and BL*]
Sensitive nodes are the drains of off-state transistors
Circuit level protection
from Canaris, Whitaker: Circuit Techniques for the Radiation Environment of Space, IEEE 1995 CUSTOM INTEGRATED CIRCUITS CONFERENCE
Remarks about SEU in RAMs
In today’s technologies, cells are so small (< 1 µm²) that a single ion can hit two or more locations at once; multiple SEUs are common. Single-bit EDAC is likely not sufficient!
While it is true that most of the memory area is covered by the matrix of cells, hits in other areas (decoder, sense-amp), though rare, can be even more catastrophic.
A 65 nm 2-Billion Transistor Itanium
More on SER…
Logic Techniques
Redundancy
Redundancy is actually a coding technique, technically a simple “repetition” code, where the information is duplicated or triplicated and checked at convenient boundaries.
Redundancy is well suited to control blocks.
Data paths are better protected by other techniques, such as parity etc.
Repetition Code
Take each symbol s_i in S and repeat it n times. This is an (n, 1) code.
For example the word {s1 s2 s3} becomes the codeword {s1 s1 s1 s2 s2 s2 s3 s3 s3}
Efficiency (= rate) of the code is: 1/n
The minimum distance (see later) is n and the number of errors t that can be corrected is:
$$t = \left\lfloor \tfrac{1}{2}(n - 1) \right\rfloor$$
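The repetition code can be tried out in a few lines. The following is only a minimal sketch for bits (symbols in {0,1}) and odd n; the function names `rep_encode` and `rep_decode` are illustrative and do not come from the slides.

```python
def rep_encode(symbols, n):
    """Repeat every symbol n times: an (n, 1) repetition code."""
    return [s for s in symbols for _ in range(n)]


def rep_decode(received, n):
    """Majority-vote each group of n received bits (n assumed odd)."""
    groups = [received[i:i + n] for i in range(0, len(received), n)]
    return [1 if sum(g) > n // 2 else 0 for g in groups]


word = [1, 0, 1]
code = rep_encode(word, 3)

# One flipped bit per group is corrected: t = (3 - 1) / 2 = 1.
one_error = code.copy()
one_error[1] ^= 1

# Two flipped bits in the same group are "corrected" to the wrong value.
two_errors = code.copy()
two_errors[0] ^= 1
two_errors[1] ^= 1
```

With n = 3 the single error is voted away, while the double error in the first group silently decodes to the wrong symbol, which is the weakness noted for triple redundancy.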
Triple Module Redundancy
Triple redundancy:
• Three copies of the same user logic + state_register
• Voting logic decides 2 out of 3 (majority)
Used regularly in:
• High reliability electronics
• Mainframes
Problems:
• 300% area and power
• Corrects only 1 error
• Can go very wrong with two errors
Problem: how do you make sure that the voting logic itself is not affected by SEU?
[Figure: Input and CLK drive three copies FSM1, FSM2 and FSM3, whose results go to the voting logic that produces Output]
Logic for voting: AB + AC + BC
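The voting function AB + AC + BC is easy to model in software. This is a sketch only, applied bitwise to integer state words; the name `vote` and the example values are assumptions made for illustration.

```python
def vote(a, b, c):
    """Bitwise 2-out-of-3 majority: AB + AC + BC."""
    return (a & b) | (a & c) | (b & c)


good = 0b1011

# One upset copy (bit 3 flipped in the third copy) is outvoted.
one_upset = vote(good, good, good ^ 0b1000)

# Two copies upset in the same bit win the vote: the output is wrong.
two_upsets = vote(good, good ^ 0b1000, good ^ 0b1000)
```

The model says nothing about an upset inside the voter itself, which is the open problem raised on the slide.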
Example of triplicated design
Gigabit Optical Link (GOL, CERN design):
• 0.8 and 1.60 Gb/s optical link
• Unidirectional
• < 300 mW
• G-Link and Gigabit Ethernet protocol
• Redundant logic
• More than 20,000 units in Atlas, CMS, LHCb and Alice
(http://proj-gol.web.cern.ch/proj-gol/)
Reduced Module Redundancy
Double redundancy:
• Two copies of the same user logic + state_register
• Voting logic decides if outputs are unequal
• If mismatch: report to system
Problems:
• 200% area and power
• Can’t be used in “real-time”, but may be sufficient for many applications
[Figure: Input and CLK drive FSM1 and FSM2; comparison logic checks the two results, produces Output and raises a Reset Request]
What to duplicate?
[Figure: two alternative duplication schemes between Input and Output, built from registers (Reg), logic blocks and comparison logic]
One scheme is recommended if the clock frequency is high and the technology is “advanced”.
The other is recommended if the clock frequency is low and the technology is “old”.
FSM general structure
[Figure: two FSM structures between Input and Output, built from registers (Reg), logic blocks and comparison logic, labelled “Do this!” and “Not this.”]
Temporal Redundancy
Redundancy in time:
• Single user logic block and two state_registers
• Two clocks (F1 and F2)
• Voting logic decides if outputs are unequal at completion of F2
• If error: compute again
Problems:
• Needs time for 3 evaluations (…not really, three transient time constants are enough)
• No problem at 40 MHz and “modern” technology
• Needs multi-phase clock
[Figure: a single logic block feeds Reg1 (clocked by CLK1) and Reg2 (clocked by CLK2); comparison logic produces Output and a Re-evaluate Request]
Memory Boundary Redundancy
Check for consistency only when results will be committed to memory:
• For instance when two computers/microcontrollers perform a STORE operation
Advantages:
• Processors can be “standard”
• Write operations are relatively rare and therefore requirements on comparison resources are small
• Fewer resources needed for checking
Used in some mainframes with triple redundancy
Problem: if you detect an error in a processor, how do you resync it?
[Figure: uP 1, uP 2, … write to a Shared Memory through comparison logic that raises an Error signal]
I/O Boundary Redundancy
Check for consistency only when results will be used by external devices:
• For instance when two computers/microcontrollers want to commit results to disk
Advantages:
• Synchronization is less of a problem
• Fewer resources needed for checking
  • In some cases it could even be done in software
• uP architectures and/or hardware could even be different
Used in high-reliability computer boxes and avionics
[Figure: processors (uP) with their own memories (Mem) and I/O interfaces; comparison logic sits in front of the I/O device (I/O CLK) and can raise a Re-evaluate Request]
Mission critical redundancy
Various computer configurations used during a Shuttle mission.
from: NASA Shuttle documentation
Redundancy in avionics
from: IEEE Aerospace & Electronic Systems Magazine, October 2000
Coding Techniques
Hamming Coding
“Two weekends in a row I came in and found that all my stuff had been dumped and nothing was done. I was really aroused and annoyed and I wanted those answers and two weekends had been lost. And so I said, ‘Damn it, if the machine can detect an error, why can’t it locate the position of the error and correct it?’”
from an interview with R. Hamming, February 3-4, 1977, quoted in T. Thompson, p. 17
“The purpose of this memorandum is to give some practical codes which may detect and correct all errors of a given probability of occurrence, and which detect errors of even a rarer occurrence”.
from R. Hamming, “Self-Correcting Codes - Case 20878”, Memorandum 1130-RWH-MFW, Bell Telephone Laboratories, July 27, 1947
Coding for memory repair
Mitigating SEU: Forward Error Correction
[Figure: the transmitter sends the data D together with f(D); the receiver recomputes f(R) from the received data R and compares it with the received parity RP: OK / Not OK]
Examples of FEC:
• Simple parity (actually only error detection)
• EDC: Hamming coding
  • Single error correction capability, popular in computer DRAM
• BCH
  • Sophisticated multiple-bit error detection and correction; requires complex logic
• Reed-Solomon
  • Sophisticated and efficient multi-word error detection and correction; requires complex logic
Mitigating SEU: FEC (2)
The “parity” function must be such that, if an error is detected, one can also use it to recover the right data!
[Figure: the receiver computes f(R) from the received data R, compares it with the received parity RP (OK / Not OK), and recovers the data D from R and f⁻¹(R)]
Families of Error Control Methods
Block codes: codeword built only on the current message word
Non-block codes: codeword depends on the current message word and on some past words, e.g. convolutional codes, used (obviously) in streaming channels
Examples of codes:
• Hamming
• Bose-Chaudhuri-Hocquenghem (BCH)
• Golay
• Reed-Solomon (RS)
• Reed-Muller
• Low Density Parity Check Codes
• Turbo Codes
• …
Parity
In B = {0,1}, start with a message word: S = {s1 s2 s3 s4 s5 s6 s7}
Compute a “parity” character c8 defined as:
$$c_8 = s_1 \oplus s_2 \oplus s_3 \oplus s_4 \oplus s_5 \oplus s_6 \oplus s_7$$
where $\oplus$ is the exclusive-OR (or the sum mod 2).
Parity check can detect all single errors (but cannot give the position).
Parity check cannot detect double (or any even number of) errors.
Used:
• often in computer memories
• in serial terminal data transmission
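As a small illustration of the parity rule, the sketch below computes the parity bit of a 7-bit message and shows that one flipped bit is detected while two flipped bits go unnoticed. The message value and the names `parity_bit` and `parity_ok` are arbitrary choices for the example.

```python
from functools import reduce
from operator import xor


def parity_bit(bits):
    """XOR (sum mod 2) of all bits."""
    return reduce(xor, bits, 0)


def parity_ok(word):
    """True if the word (message plus parity bit) has even parity."""
    return parity_bit(word) == 0


message = [1, 0, 1, 1, 0, 0, 1]
codeword = message + [parity_bit(message)]

single = codeword.copy()
single[2] ^= 1                 # one error: parity check fails

double = codeword.copy()
double[2] ^= 1
double[5] ^= 1                 # two errors: parity check passes again
```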
Two-Dimensional Parity
Original block (last column: ParityX, last row: ParityY):
1 0 1 1 1 0 0 | 0
0 1 0 0 0 1 1 | 1
1 1 0 0 0 0 1 | 1
0 1 0 1 1 0 1 | 0
1 0 0 1 0 1 1 | 0
0 0 0 1 0 1 0 | 0
1 1 0 0 1 0 0 | 1
0 0 1 0 1 1 0

Block with 2 errors (rows 4 and 5, same column):
1 0 1 1 1 0 0 | 0
0 1 0 0 0 1 1 | 1
1 1 0 0 0 0 1 | 1
0 1 0 1 0 0 1 | 1
1 0 0 1 1 1 1 | 1
0 0 0 1 0 1 0 | 0
1 1 0 0 1 0 0 | 1
0 0 1 0 1 1 0
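A sketch of how two-dimensional parity locates a single error, using the 7x7 data block of the slide. The helper names are illustrative, and the decoding rule (exactly one failing row and one failing column) is the simplest possible one, assumed here for the example.

```python
# 7x7 data block from the slide (parity row and column not included).
DATA = [
    [1, 0, 1, 1, 1, 0, 0],
    [0, 1, 0, 0, 0, 1, 1],
    [1, 1, 0, 0, 0, 0, 1],
    [0, 1, 0, 1, 1, 0, 1],
    [1, 0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1, 0],
    [1, 1, 0, 0, 1, 0, 0],
]


def parities(block):
    """Return (row parities, column parities) for even parity."""
    rows = [sum(r) % 2 for r in block]
    cols = [sum(c) % 2 for c in zip(*block)]
    return rows, cols


def locate_single_error(block, row_par, col_par):
    """Return (row, col) of a single flipped bit, or None if not locatable."""
    rows, cols = parities(block)
    bad_rows = [i for i, (a, b) in enumerate(zip(rows, row_par)) if a != b]
    bad_cols = [j for j, (a, b) in enumerate(zip(cols, col_par)) if a != b]
    if len(bad_rows) == 1 and len(bad_cols) == 1:
        return bad_rows[0], bad_cols[0]
    return None


ROW_PAR, COL_PAR = parities(DATA)      # stored alongside the data

received = [row[:] for row in DATA]
received[3][4] ^= 1                    # single error
single = locate_single_error(received, ROW_PAR, COL_PAR)

received[4][4] ^= 1                    # second error in the same column
double = locate_single_error(received, ROW_PAR, COL_PAR)
```

With two errors in the same column, two row parities fail but no column parity does, so the errors are detected but cannot be located.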
Hamming (intuitive version)
[Figure: three overlapping circles containing the source bits s1, s2, s3, s4 and the parity bits c5, c6, c7]
Definition: cj = computed to give even parity in its circle

s1 s2 s3 s4 (source) | c5 c6 c7 (parity)
0 0 0 0 | 0 0 0
0 0 0 1 | 0 1 1
0 0 1 0 | 1 1 1
0 0 1 1 | 1 0 0
0 1 0 0 | 1 1 0
0 1 0 1 | 1 0 1
0 1 1 0 | 0 0 1
0 1 1 1 | 0 1 0
1 0 0 0 | 1 0 1
1 0 0 1 | 1 1 0
1 0 1 0 | 0 1 0
1 0 1 1 | 0 0 1
1 1 0 0 | 0 1 1
1 1 0 1 | 0 0 0
1 1 1 0 | 1 0 0
1 1 1 1 | 1 1 1

Notice: the 16 code words in Hamming(7,4) differ from each other by at least 3 bits.
Hamming Codes (3)
[Figure: hardware for encoder. The data bits a0, a1, a2, a3 pass through unchanged and three XOR gates generate the parity bits p0, p1, p2]
Hamming Codes (4)
[Figure: hardware for decoder. XOR gates combine the received bits a’0 … a’3 and p’0 … p’2, and the correction logic restores a0 … a3]
Cost of Hamming SEC
Data word width [bits] | Correction bits | Total bits
4 | 3 | 7
8 | 4 | 12
16 | 5 | 21
32 | 6 | 38
64 | 7 | 71
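The table follows from the standard Hamming bound for single error correction: r check bits are enough for m data bits when 2^r ≥ m + r + 1. A minimal sketch (the function name `check_bits` is illustrative):

```python
def check_bits(m):
    """Smallest r with 2**r >= m + r + 1 (single error correction)."""
    r = 1
    while 2 ** r < m + r + 1:
        r += 1
    return r


# (data bits, check bits, total bits) for the widths listed in the table
table = [(m, check_bits(m), m + check_bits(m)) for m in (4, 8, 16, 32, 64)]
```

The relative overhead drops from 75% for 4-bit words to about 11% for 64-bit words, which is why wide memory words are cheaper to protect.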
Hamming in use
Multiple-Errors
Errors often come in bursts. For example, an ion can strike more than one memory cell in an array:
• In close space proximity
• In close time proximity
The simplest schemes can handle only one error, e.g. parity or Hamming.
Multiple-bit correction schemes exist, but they are considerably more complicated.
Interleaving: Basic idea
[Figure: the bits of Byte_0, Byte_1, Byte_2 and Byte_3 are diffused before storage or transmission and recombined afterwards]
If the error correction capability is limited to one bit per byte, then try to spread error bursts across different data chunks.
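The idea can be sketched with four 8-bit chunks: bits are sent column by column, so a burst of four consecutive errors in the stream lands as one error in each chunk. The data values and the function names are made up for the illustration.

```python
def interleave(chunks):
    """Send bit 0 of every chunk, then bit 1 of every chunk, and so on."""
    return [bit for column in zip(*chunks) for bit in column]


def deinterleave(stream, n_chunks):
    """Inverse of interleave: chunk i gets every n_chunks-th bit."""
    return [stream[i::n_chunks] for i in range(n_chunks)]


chunks = [
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1, 1, 0, 0],
    [0, 0, 0, 1, 1, 1, 1, 0],
]

stream = interleave(chunks)
for k in range(8, 12):          # burst of 4 adjacent errors in the stream
    stream[k] ^= 1

received = deinterleave(stream, len(chunks))
errors_per_chunk = [
    sum(a != b for a, b in zip(sent, got))
    for sent, got in zip(chunks, received)
]
```

Each chunk then carries a single error, which a one-bit-per-chunk correction scheme can handle.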
Interleaving in Memories
Requires more complicated addressing and decoders, but it is comparatively simple to implement in ASICs.
[Figure: memory array with word addresses a0 … an and interleaved bit columns b0 … b7]
Cross-Interleaver
[Figure: cross-interleaver built from chains of elements labelled d]
Techniques for serial links
Today’s high (and low) speed links all use some form of coding for reasons related to the electrical or optical characteristic of the links
Elementary review of link types
Link | Coding | Error Det/Corr | Comment
RS-232 | None + Parity | 1/0 |
USB2 | NRZI + Bit stuffing | 0/0 | Error detect through CRC at protocol layer
Ethernet 1000 Base X | 8b/10b | some/0 | Line balancing
SATA | 8b/10b | some/0 | Line balancing
GOL (CERN design) | 8b/10b or 16/20 | | EC at protocol layer
GBT (CERN design) | Reed-Solomon FEC | 16 out of 120 | Complex block coding
Error detection, no correction
In some cases detecting the presence of an error may be sufficient to avoid problems.
In applications or protocols allowing for re-computation or re-transmission:
• Example: file reading from a disk can be reattempted in case of error
Very often measurements can be repeated without bad consequences for systems.
Error detection with CRC
For occasional single or non-burst errors, an extremely popular and powerful error detection technique is based on computing a “Cyclic Redundancy Check” code to attach to the data.
This is based on the properties of so-called “Cyclic Groups”. The basic mathematics is related to the fact that, while a protection code computed additively is relatively easy to fool, one computed from the remainder of a division turns out to be much more robust.
CRC in practice
Use one of the recognized standard CRC polynomials:
CRC-4: $g(x) = x^{4}+x^{3}+x^{2}+x+1$
CRC-7: $g(x) = x^{7}+x^{6}+x^{4}+1$
CRC-8: $g(x) = x^{8}+x^{7}+x^{6}+x^{4}+x^{2}+1$
CRC-12: $g(x) = x^{12}+x^{11}+x^{3}+x^{2}+x+1$
CRC-ANSI: $g(x) = x^{16}+x^{15}+x^{2}+1$
CRC-CCITT: $g(x) = x^{16}+x^{12}+x^{5}+1$
CRC-24: $g(x) = x^{24}+x^{23}+x^{14}+x^{12}+x^{8}+1$
CRC-32b: $g(x) = x^{32}+x^{26}+x^{23}+x^{22}+x^{16}+x^{12}+x^{11}+x^{10}+x^{8}+x^{7}+x^{5}+x^{4}+x^{2}+x+1$
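To make the “remainder of a division” concrete, here is a bit-serial sketch of a CRC computed with the CRC-CCITT polynomial listed above (0x11021 with the x^16 term included). It assumes a zero initial register, no bit reflection and no final XOR; real protocols often differ in these conventions, so this is not the CRC of any particular standard. The function name and test message are illustrative.

```python
def crc_remainder(data: bytes, poly: int = 0x11021, width: int = 16) -> int:
    """Remainder of data(x) * x^width divided by poly, arithmetic mod 2."""
    reg = 0
    for byte in data:
        for i in range(7, -1, -1):          # most significant bit first
            reg = (reg << 1) | ((byte >> i) & 1)
            if reg >> width:                # degree reached: subtract poly
                reg ^= poly
    for _ in range(width):                  # multiply by x^width
        reg <<= 1
        if reg >> width:
            reg ^= poly
    return reg


message = b"SEU test"
crc = crc_remainder(message)

# The sender appends the CRC; the receiver divides the whole frame again.
frame = message + crc.to_bytes(2, "big")
frame_ok = crc_remainder(frame) == 0

# Any single flipped bit leaves a non-zero remainder.
corrupted = bytes([frame[0] ^ 0x04]) + frame[1:]
corrupted_ok = crc_remainder(corrupted) == 0
```

A frame that arrives intact divides exactly by g(x); a non-zero remainder flags an error, but gives no information on how to correct it.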
Conclusion
SEU events are more and more important in digital logic.
Mitigation of SEU can be performed at several levels:
• Device, circuit, system levels
The correct strategy can only be decided once the relevance of a given error to the overall system is clear.
• Do not apply expensive mission-critical techniques when simple recovery techniques are applicable!
How efficient a given strategy really is can (unfortunately) only be assessed through thorough testing; rough estimations can be very wrong and can lead to disasters.
Extra material
Bibliography on Error Coding
Good books on Coding:
• R. Blahut, Algebraic Codes for Data Transmission, Cambridge U.P., 2003
• O. Pretzel, Error Correcting Codes and Finite Fields, Oxford U.P., 1992
• S. Wicker, Error Control Systems, Prentice Hall, 1995
The Mathematics underneath:
• J. A. Gallian, Contemporary Abstract Algebra, Houghton Mifflin, 2006
• McEliece, Finite Fields for Scientists and Engineers, Kluwer, 1986
Density of e-h pairs is important
[Figure: a heavy ion crossing an Nwell and a p+ diffusion in p- silicon, shown in three steps with the generated electron-hole (e-h) pairs]
1. Ion strike: ionization takes place along the track (column of high-density pairs)
2. Charges start to migrate in the electric field across the junctions. Some drift (fast collection, relevant for SEEs), some diffuse (slow collection, less relevant for SEEs)
3. Charges are collected at circuit nodes. Note that, if the relevant node for the SEE is the p+ diffusion, not all charge deposited by the ion is collected there.
Illustration from F. Faccio, this Course
Units
LET = Linear Energy Transfer, i.e. how much energy (to create charged pairs) has been deposited by an ionizing particle in a given amount of material.
Units:
$$[\mathrm{LET}] = \frac{[\mathrm{MeV}] \cdot [\mathrm{cm}^2]}{[\mathrm{g}]}$$
or, multiplying by the density of the material:
$$[\mathrm{LET}] = \frac{[\mathrm{MeV}]}{[\mathrm{cm}]}$$
Metrics (1)
MTTF: Mean Time To Failure
• Time between two faults in a given component
Total System MTTF:
$$MTTF_{System} = \frac{1}{\sum_{i=0}^{n} \frac{1}{MTTF_i}}$$
(Units: could be measured in hours, days or years)
[Figure: timeline with the intervals MTTF, MTTR and MTBF and the events “Error detected” and “System Re-Start”]
Metrics (2)
FIT: Failure In Time
Definition: 1 FIT is one error in $10^9$ device-hours of operation
Total System FIT:
$$FIT_{System} = \sum_{i=0}^{n} FIT_i$$
Metrics (3)
Converting between them:
$$MTTF\,[\mathrm{years}] = \frac{10^9}{FIT \cdot 24\,[\mathrm{hours}] \cdot 365\,[\mathrm{days}]}$$
Example: a FIT of 500 corresponds to an MTTF of 228 years.
[This conversion is valid for an exponential probability distribution, i.e. a distribution where events (i.e. errors) have no memory of time, which is indeed the case for particle hits, under constant beam intensity assumptions. Notice that this would not apply for a distribution representing ageing]
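The three metrics formulas can be checked numerically. This is a small sketch under the same assumption as the slides (constant failure rate); the function names and the component values in the usage lines are invented for the example.

```python
HOURS_PER_YEAR = 24 * 365


def fit_to_mttf_years(fit):
    """MTTF in years for a failure rate given in FIT (errors per 1e9 hours)."""
    return 1e9 / (fit * HOURS_PER_YEAR)


def system_fit(component_fits):
    """Failure rates of independent components simply add up."""
    return sum(component_fits)


def system_mttf(component_mttfs):
    """Reciprocal of the sum of the reciprocal component MTTFs."""
    return 1.0 / sum(1.0 / m for m in component_mttfs)


example = fit_to_mttf_years(500)            # about 228 years
total_fit = system_fit([100, 200, 200])     # hypothetical components
pair_mttf = system_mttf([100.0, 100.0])     # two equal parts halve the MTTF
```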
A commercial fault-tolerant computer for telecom applications
Coding as a map $F^k \rightarrow F^n$
Error Detection in $F^n$
[Figure: degradation due to transmission or storage (retrieval) moves a codeword within $F^n$; the outcomes are labelled “Recoverable”, “Undetected error” and “Confused, unrecoverable”]
Cyclic Codes: Simple Example
The code:
c0 : 0000000   c1 : 1011100
c2 : 0101110   c3 : 1110010
c4 : 0010111   c5 : 1001011
c6 : 0111001   c7 : 1100101
is cyclic; in fact every codeword can be obtained by shift and linearity, starting with cg = (1011100):
c0 : 0000000   c1 : cg
c2 : cg>>1     c3 : c1+c2
c4 : cg>>2     c5 : c1+c4
c6 : c2+c4     c7 : c1+c2+c4
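The two defining properties of this example (closure under cyclic shift and under modulo-2 addition) can be verified by brute force. A sketch, with codewords held as bit strings and helper names chosen for the illustration:

```python
def rotate_right(word, k=1):
    """Cyclic right shift of a bit string by k positions."""
    k %= len(word)
    return word[-k:] + word[:-k] if k else word


def add_mod2(u, v):
    """Bitwise XOR of two equal-length bit strings."""
    return "".join("1" if a != b else "0" for a, b in zip(u, v))


cg = "1011100"
code = {"0000000"} | {rotate_right(cg, k) for k in range(7)}

closed_under_shift = all(rotate_right(c) in code for c in code)
closed_under_sum = all(add_mod2(u, v) in code for u in code for v in code)
```

The zero word plus the seven shifts of cg give exactly the eight codewords of the slide.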
Hamming (2)
[Figure: the three parity circles, filled with the received bits r1 … r7]
During transmission the message word
s1 s2 s3 s4 c5 c6 c7
is (potentially) modified by an error in (the unknown) position j and is received as:
r1 r2 r3 r4 r5 r6 r7
for example, for j = 2:
$$r_1 r_2 r_3 r_4 r_5 r_6 r_7 = s_1 s_2 s_3 s_4 c_5 c_6 c_7 \oplus 0100000$$
Hamming (3)
[Figure: the parity circles filled with the received word 1100101; the erroneous bit is marked with *]
Example: for an original word 1000101, assume that e = 0100000 occurred, resulting in r = 1100101.
Circles with odd (= wrong) parity are now marked.
Decoding and correcting trick: can we find a single bit (assuming that there was just one error) that lies inside all the marked circles and outside the unmarked ones?
Hamming Codes (1)
Another simple construction of Hamming Code:
Given the four data bits (a0, a1, a2, a3), construct three parity bits as follows:
p0 = a0 + a1 + a2
p1 = a1 + a2 + a3
p2 = a0 + a1 + a3
(here “+” is modulo 2 addition) and send the codeword: (a0, a1, a2, a3, p0, p1, p2).
The valid codewords are therefore given in the table below.
Notice that we use a space of $2^7$ words to represent $2^4$ possible message-words.

a0 a1 a2 a3 | p0 p1 p2
0 0 0 0 | 0 0 0
0 0 0 1 | 0 1 1
0 0 1 0 | 1 1 0
0 0 1 1 | 1 0 1
0 1 0 0 | 1 1 1
0 1 0 1 | 1 0 0
0 1 1 0 | 0 0 1
0 1 1 1 | 0 1 0
1 0 0 0 | 1 0 1
1 0 0 1 | 1 1 0
1 0 1 0 | 0 1 1
1 0 1 1 | 0 0 0
1 1 0 0 | 0 1 0
1 1 0 1 | 0 0 1
1 1 1 0 | 1 0 0
1 1 1 1 | 1 1 1
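The three parity equations translate directly into code. The sketch below regenerates the 16 codewords and measures their minimum distance by exhaustive comparison; the function name `hamming74_encode` is an illustrative choice.

```python
from itertools import combinations, product


def hamming74_encode(a0, a1, a2, a3):
    """Append the three parity bits defined on the slide (mod-2 sums)."""
    p0 = a0 ^ a1 ^ a2
    p1 = a1 ^ a2 ^ a3
    p2 = a0 ^ a1 ^ a3
    return (a0, a1, a2, a3, p0, p1, p2)


codewords = [hamming74_encode(*bits) for bits in product((0, 1), repeat=4)]

# Smallest number of differing positions between any two codewords.
min_distance = min(
    sum(x != y for x, y in zip(u, v))
    for u, v in combinations(codewords, 2)
)
```

A minimum distance of 3 is what allows one error to be corrected (or two to be detected).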
Hamming Codes (2)
The decoder receives (a’0, a’1, a’2, a’3, p’0, p’1, p’2) and computes:
s0 = p’0 + a’0 + a’1 + a’2
s1 = p’1 + a’1 + a’2 + a’3
s2 = p’2 + a’0 + a’1 + a’3
called the “syndromes”. If there has been no error, these are all zero; if there has been one error, at least one of them is non-zero. The syndromes depend only on the error pattern, as in the table below:

Syndrome (s0 s1 s2) | Error (a0 a1 a2 a3 p0 p1 p2)
0 0 0 | 0 0 0 0 0 0 0
0 0 1 | 0 0 0 0 0 0 1
0 1 0 | 0 0 0 0 0 1 0
0 1 1 | 0 0 0 1 0 0 0
1 0 0 | 0 0 0 0 1 0 0
1 0 1 | 1 0 0 0 0 0 0
1 1 0 | 0 0 1 0 0 0 0
1 1 1 | 0 1 0 0 0 0 0
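Syndrome decoding then amounts to a table lookup. A minimal sketch, assuming at most one bit error per received word; the lookup dictionary restates the syndrome table, and the names are illustrative.

```python
# Syndrome (s0, s1, s2) -> index of the erroneous bit in
# (a0, a1, a2, a3, p0, p1, p2), as in the syndrome table.
SYNDROME_TO_POSITION = {
    (1, 0, 1): 0, (1, 1, 1): 1, (1, 1, 0): 2, (0, 1, 1): 3,
    (1, 0, 0): 4, (0, 1, 0): 5, (0, 0, 1): 6,
}


def hamming74_decode(received):
    """Return (corrected data bits, syndrome) for a 7-bit received word."""
    a0, a1, a2, a3, p0, p1, p2 = received
    syndrome = (p0 ^ a0 ^ a1 ^ a2, p1 ^ a1 ^ a2 ^ a3, p2 ^ a0 ^ a1 ^ a3)
    word = list(received)
    if syndrome != (0, 0, 0):
        word[SYNDROME_TO_POSITION[syndrome]] ^= 1   # flip the bad bit back
    return tuple(word[:4]), syndrome


codeword = (1, 0, 1, 1, 0, 0, 0)        # data 1011 with parity 000

hit = list(codeword)
hit[2] ^= 1                             # single upset in a2
data, syndrome = hamming74_decode(tuple(hit))
```

With two errors the syndrome still points at some position, so the decoder would miscorrect; that case needs an extra overall parity bit or a stronger code.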
Hamming Codes (5)
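A compact description of the encoding operation and of the syndrome computations may be given by using matrix notation (all sums modulo 2), such as:
$$\begin{bmatrix} p_2 \\ p_1 \\ p_0 \\ a_3 \\ a_2 \\ a_1 \\ a_0 \end{bmatrix} =
\begin{bmatrix}
1 & 0 & 1 & 1 \\
1 & 1 & 1 & 0 \\
0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 \\
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} a_3 \\ a_2 \\ a_1 \\ a_0 \end{bmatrix}$$
$$\begin{bmatrix} s_0 \\ s_1 \\ s_2 \end{bmatrix} =
\begin{bmatrix}
1 & 1 & 1 & 0 & 1 & 0 & 0 \\
0 & 1 & 1 & 1 & 0 & 1 & 0 \\
1 & 1 & 0 & 1 & 0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} a'_0 \\ a'_1 \\ a'_2 \\ a'_3 \\ p'_0 \\ p'_1 \\ p'_2 \end{bmatrix}$$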
Encoding block in CD
[Figure: Din {24x8} enters the C2 encoder RS(28,24), passes through delay lines (d, 2d, …, 26d, 27d) and then the C1 encoder RS(32,28), giving Dout {32x8}]
RS combined with interleaving
• UDP packets in the TCP/IP protocol do not have guaranteed delivery
• RS is used to replace lost packets (“erasures”)
• The data stream is framed into blocks of 249 bytes and encoded in RS(255,249) blocks; this code has dmin = 7 and can correct 6 erasures
• Messages are interleaved in blocks of 255xN
• Blocks are sent by columns
• If a packet is lost, it is replaced by a “0” column
• The receiver knows that packet “j” is lost because it is missing in the sequence
• The RS code (organized in N rows) can recover up to 6 missing columns

c1,1 c1,2 c1,3 … c1,255
c2,1 c2,2 c2,3 … c2,255
… … … … …
cN,1 cN,2 cN,3 … cN,255
Other coding techniques
Block coding introduces redundancy on finite blocks of data, without reference to previous blocks, and with all redundant information contained in the block itself.
Convolutional coding performs encoding based on the current set of data to be coded and on the history of previous blocks, i.e., a given data set is mapped onto a number of different data sets, depending on the content of the previously coded sets. These coding techniques are extremely powerful and are widely used in telecom and space applications.