Viterbi Algorithm Advanced Architectures...Viterbi Algorithm Advanced Architectures Lecture 15 6.973...
Transcript of Viterbi Algorithm Advanced Architectures...Viterbi Algorithm Advanced Architectures Lecture 15 6.973...
Viterbi AlgorithmAdvanced Architectures
Lecture 15
6.973 Communication System Design – Spring 2006
Massachusetts Institute of Technology
Vladimir Stojanović
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Radix 2 ACS
Radix-2 trellis 2-way ACS Radix-2 ACS Unit
6.973 Communication System Design 2
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Γ
Γ
ΓΓ
Γ
ΓΓ
Γ
Γ Γ Γxn
0 x0 x0 0 λ λ= min ( (n - 1 n - 1,x1 0
n - 1n - 1 + +1x
n
nn
n
0
0
1
1
x0 x0
0x
x1
0x0 λ
n0x1 λ
nx11 λ
x1
n1x0 λ
n1x0 λ
n0x0 λ
n0x0 λ
n0x1 λ n
1x1 λ
n1x0 λ
nx0 d n
x0 d
nx1 dn
nx0
nx1
x0
n - 1
n - 1n - 1
Γ1xn - 1
Γ x1n - 1
Γ0xn - 1
n - 1
Com
pare
Sele
ct
2-wayACS
2-wayACS
⊕
Figure by MIT OpenCourseWare.
3 Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].
Radix-4 trellis
8-state Radix-2 trellis 4-state subtrellis 8-state Radix-4 trellis
Figure by MIT OpenCourseWare.
0
n - 2 n - 2 n - 2n n nn - 1 n - 1
1 1 1
1 1
1 1 12
2
2
2
2
2
3 3 3
3 3
3 3
3
4 4 4 4 4
5 5 5
5
5 5
55
6 6 6
7 7 7 7 7 7 7 7
2 2 4
6
6 6
6
6
4 4
0 0 0 0 0 0 0
2Radix
Idealspeedup
Radix - 2k complexity speed measures
Complexityincrease
Area efficiencyk
1 1 1 110.750.5
2 233
24
4 448816 Figure by MIT OpenCourseWare.
Radix-4 ACS
Radix-4 trellis 4-way ACS Radix-4 ACS unit
6.973 Communication System Design 4
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
nx0000 λ
nx0000 λ n
x0001 λ nx0010 λ n
x0011 λ
nx0100 λ n
x0101 λ nx0110 λ n
x0111 λ
nx1000 λ n
x1001 λ nx1010 λ n
x1011 λ
nx1100 λ n
x1101 λ nx1110 λ n
x1111 λ
nx0001 λ
nx0010 λ
nx0011 λ
Γx00n
Γx01n
Γx10n
Γx11n
Γ x00n - 2
Γ x10n - 2
Γ x01n - 2
Γ x11n - 2
Γ x00n - 2 Γx 00
n
n
Γ x10n - 2 Γx 10
n
Γ x01n - 2 Γx 01
n
Γ x11n - 2 Γx 11
nΓ x11
n - 2
Γ x01n - 2
Γ x10n - 2
Γ x00n - 2
Γx 00n
d x 00n
d x 00n
d x 10n
d x 01n
d x 11n
n - 24-wayACS
4-wayACS
4-wayACS
4-wayACS
Com
pare
Sele
ct⊕
⊕
⊕
Figure by MIT OpenCourseWare.
Radix-4 ACS implementation
� Use ripple carry adders and comparators � Take advantage of the ripple
profile to hide the compare
� Delay 17% longer than 2-way ACS due to� Increased adder fanout� 4:1 mux instead of 2:1 mux
� Overall, results in 1.7x speedup compared to 2-way ACS
6.973 Communication System Design 5
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure from from Black, P. J., and T. H. Meng. "A 140-Mb/s, 32-state, Radix-4 Viterbi Decoder." IEEE Journal of Solid-State Circuits 27 (1992): 1877-1885. Copyright 1992 IEEE. Used with permission.
Image removed due to copyright restrictions.
Radix-4 placement and routing
� Paths given as Hamiltonian cycles (visit each node in the graph once)
6.973 Communication System Design 6
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Images removed due to copyright restrictions.
Modulo arithmetic for ACS� Viterbi algorithm inherently bounds the maximum
dynamic range ∆max of state metrics ∆max ≤ λmaxlog2N (N-number of states, λmax maximum
branch metric of the radix-2 trellis) � Number theory
� Given two numbers a and b such that |a-b|< ∆ � Comparison |a-b| can be evaluated as |a-b| mod 2∆ without
ambiguity � Hence state metrics can be updated and compared
modulo 2∆max � Choose state metric precision to implement modulo by ignoring
the state metric overflow � Required state metric precision equal to twice the maximum
dynamic range of the updated state metrics � Required number of bits is Γbits =ceil[ log2(∆max+kλmax) ] + 1
� k – accounts for branch metric addition � Example values (for the 32-state radix-4 decoder)
� k=2, λmax=14, ∆max=70, Γbits=8
6.973 Communication System Design 7
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Branch metric unit� Example (8-level soft input, R=1/2, K=6 (32 state)
� λ(S1S2)=|G1-S1|+|G2-S2| (G-received sample, S-expected sample) (λmax=14) � 4 bits required for radix-2 branch metrics � 5 bits for the radix-4 branch metrics
6.973 Communication System Design 8
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Images removed due to copyright restrictions.
State-metric initialization
� Need to start from right state metrics for dynamic range bound to hold (and for modulo arithmetic to be valid)
� This is b/c there are constraints on the state metric values imposed by the trellis structure � For example state-0 and state-1 have a common ancestor state
one iteration back � This constrains the state metrics to differ at most by the λmax
� Find the right initial metric through simulation (with all zero inputs) until steady state is reached
6.973 Communication System Design 9
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure removed due to copyright restriction.
Decoder block diagram
6.973 Communication System Design 10
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure by MIT OpenCourseWare.
Decoder output
Decoder inputs
2 x 3 G 1, G 2 G 1, G 2
4 x 4 4 x 4
2 x 3
Radix-16 trace-back unit
Radix-4 pretrace-back unit
Decision memory
Radix-4 ACS array
Radix-4 branch metric unit
Metrics Metrics
Radix-2 BMU Radix-2 BMU
32 x 4 Radix-16 decisions
32 x 4 Radix-16 decisions
32 x 2 Radix-4 decisions
16 x 5 Metrics
LIFO
4
4 Decisions
Decision memory organization
� Survivor paths always merge L=5K steps back
6.973 Communication System Design 11
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure by MIT OpenCourseWare.
Decision vector
Decodeblock
Survivor pathmerge block
Trace-back
Writeblock
Writeregion
Readregion
Write
D DL
{ {
Advanced algorithmic transformations
� Sliding block Viterbi decoder (SBVD) � Based on two important observations (µ+1=constraint
length of the code) � 1. Survivor paths merge L=5µ iterations back into the trellis � 2. After K=5µ steps, state metrics independent on the initial
value of state metrics � Unknown state at time n can be decoded using only
information from the block [n-K, n+L] � Cannot store all the values in the memory
� Have to obtain them “on-the-fly”
6.973 Communication System Design 12
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
SBVD implementation
Can find shortest path by running forward or backward
At step m � Forward processing
� 4 survivors
� Backward processing � 4 shortest paths
� Combined � Smallest concatenated state metric � Starting state for trace-back of the shortest path
�
�
� Continue fw and backw operation Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.
MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology. Downloaded on [DD Month YYYY].
6.973 Communication System Design 13
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Forward vs. Forward-Backward
� Can decode more than one state (M – states)� Fw-Bw has reduced decoding delay and skew buffer memory
6.973 Communication System Design 14
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Continuous stream processing
�
�
�
Cut the incoming stream in overlapping chunks Process in parallel Outputs are non-overlapping
6.973 Communication System Design 15
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Systolic SBVD architecture
� Since block is bounded � Can unroll and pipeline
� ACS � Trace-back
� M=2L � Has to be a multiple of L
6.973 Communication System Design 16
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
Example for L=2
6.973 Communication System Design 17
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figure from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
ACS units
6.973 Communication System Design 18
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Figures from Black, P. J., and T. Y. Meng. "A 1-Gb/s, Four-state, Sliding Block Viterbi Decoder." IEEE Journal of Solid-State Circuits 32 (1997): 797-805. Copyright 1992 IEEE. Used with permission.
A 64-state example
� Not fully parallel (8 radix-4 ACS units) � 2 radix-4 butterflies in each cycle � 8 cycles for 64 states radix-4 (i.e. two radix-2 steps)
6.973 Communication System Design 19
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 2006.MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
Images removed due to copyright restrictions.
Readings� [1] A.P. Hekstra "An alternative to metric rescaling in Viterbi
decoders," Communications, IEEE Transactions on vol. 37, no. 11, pp. 1220-1222, 1989.
� [2] P.J. Black and T.H. Meng "A 140-Mb/s, 32-state, radix-4 Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 27, no. 12, pp. 1877-1885, 1992.
� [3] P.J. Black and T.Y. Meng "A 1-Gb/s, four-state, sliding block Viterbi decoder," Solid-State Circuits, IEEE Journal of vol. 32, no. 6, pp. 797-805, 1997.
� [4] M. Anders, S. Mathew, R. Krishnamurthy and S. Borkar "A 64-state 2GHz 500Mbps 40mW Viterbi accelerator in 90nm CMOS," VLSI Circuits, 2004. Digest of Technical Papers. 2004 Symposium on, pp. 174-175, 2004.
6.973 Communication System Design 20
Cite as: Vladimir Stojanovic, course materials for 6.973 Communication System Design, Spring 20MIT OpenCourseWare (http://ocw.mit.edu/), Massachusetts Institute of Technology.
Downloaded on [DD Month YYYY].
06.