Rateless Wireless Networking Decoder Mikhail Volkov Edison Achelengwa Minjie Chen.
Rateless Wireless Networking Decoder
Mikhail Volkov, Edison Achelengwa, Minjie Chen
Cortex: a rateless wireless system
• Very recent work here at CSAIL (Perry, 2011)
• Uses a novel rateless code called a spinal code
• Encoder and decoder agree on a seed s0, a hash function h and an IQ constellation mapping
Spinal Encoder
• Wish to transmit a message M = m1m2 ... mn
• Break the message into k-bit segments Mi
• Apply h to generate a spine
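The spine construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the hash is SHA-256 truncated to 64 bits as a stand-in (the slides use Salsa), the recurrence s_i = h(s_{i-1}, M_i) follows the usual spinal-code description, and the names `h`, `split_segments` and `make_spine` are hypothetical.

```python
import hashlib
from typing import List

def h(state: int, segment: int) -> int:
    """Stand-in hash (assumption): SHA-256 truncated to 64 bits, not Salsa."""
    data = state.to_bytes(8, "big") + segment.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def split_segments(bits: str, k: int) -> List[int]:
    """Break a bit string into k-bit segments M_1, M_2, ... (as integers)."""
    if len(bits) % k != 0:
        raise ValueError("message length must be a multiple of k")
    return [int(bits[i:i + k], 2) for i in range(0, len(bits), k)]

def make_spine(bits: str, k: int, s0: int = 0) -> List[int]:
    """Compute s_i = h(s_{i-1}, M_i), starting from the shared seed s0."""
    spine = []
    state = s0
    for segment in split_segments(bits, k):
        state = h(state, segment)
        spine.append(state)
    return spine

# A 12-bit message with k = 3 yields a spine of four 64-bit values.
spine = make_spine("101100011010", k=3)
print(len(spine))
```

Because each spine value depends on every segment before it, two messages that share a prefix share the corresponding prefix of the spine and diverge from the first differing segment onward.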
Spinal Encoder
• Encoder performs passes over the spine, each time generating new constellation points
• These constellation points are sent across an AWGN channel
Spinal Decoder
• Decoder knows s0 so it can generate the 2^k possible candidate symbols s1 using h
• Each time the decoder receives a symbol y it keeps the B best symbols from the 2^k candidates using ML
• The transmitted message is estimated as the one with the lowest ML cost
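A toy version of this decoding loop, for illustration only. The assumptions are loud ones: the hash is truncated SHA-256 rather than Salsa, the constellation mapping `symbol` is hypothetical, received symbols are grouped per spine step with no puncturing, and the ML cost is the accumulated squared Euclidean distance (the ML metric for AWGN). Each of the B survivors is expanded into its 2^k children, so up to B*2^k candidates are scored per step.

```python
import hashlib
import random

K = 3       # bits per message segment (assumed)
B = 4       # beam width: survivors kept per step (assumed)
PASSES = 2  # encoder passes over the spine (assumed)

def h(state, value):
    """Stand-in 64-bit hash (truncated SHA-256); the slides use Salsa."""
    data = state.to_bytes(8, "big") + value.to_bytes(8, "big")
    return int.from_bytes(hashlib.sha256(data).digest()[:8], "big")

def symbol(state, pass_idx):
    """Hypothetical mapping: 6 bits each for I and Q, scaled to [-1, 1]."""
    word = h(state, 0x1000 + pass_idx)  # offset keeps this apart from segment hashing
    i = (word & 0x3F) / 63 * 2 - 1
    q = ((word >> 6) & 0x3F) / 63 * 2 - 1
    return i, q

def encode(segments, s0=0):
    """Return, per spine step, the PASSES noiseless (I, Q) points."""
    out, state = [], s0
    for seg in segments:
        state = h(state, seg)
        out.append([symbol(state, p) for p in range(PASSES)])
    return out

def awgn(symbols, sigma, rng):
    """Add independent Gaussian noise to every I and Q coordinate."""
    return [[(i + rng.gauss(0, sigma), q + rng.gauss(0, sigma)) for i, q in step]
            for step in symbols]

def decode(received, s0=0):
    """Beam search: expand each survivor into 2^K children, keep the B cheapest."""
    beam = [(0.0, s0, [])]  # (accumulated cost, spine value, decoded segments)
    for step_symbols in received:
        candidates = []
        for cost, state, path in beam:
            for seg in range(2 ** K):
                nxt = h(state, seg)
                new_cost = cost
                for p, (yi, yq) in enumerate(step_symbols):
                    xi, xq = symbol(nxt, p)
                    new_cost += (yi - xi) ** 2 + (yq - xq) ** 2
                candidates.append((new_cost, nxt, path + [seg]))
        candidates.sort(key=lambda c: c[0])
        beam = candidates[:B]
    return beam[0][2]  # path with the lowest accumulated cost

message = [5, 1, 7, 2, 0, 3]
noisy = awgn(encode(message), sigma=0.05, rng=random.Random(1))
print(decode(noisy) == message)
```

The full sort of all candidates is the simplest way to pick the B best in software; a hardware decoder would normally use a dedicated selection network instead.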
Spinal Decoder
Objectives
• Implement decoder on an FPGA
• Evaluate feasibility of Cortex in a real communications system
• Identify key performance bottleneck and develop a clear strategy for developing a practical Cortex system
Micro-architecture
• Interface
- Takes stream of constellation symbols as input
- Outputs a message (192-bit packet)
• Decoding Stages
- Code Enumeration
- Add-Compare-Select
- Suggestion Update
- Spine Evaluator Update
- Get output message
[Figure: mkDecoder block diagram. Inputs: I and Q input bit streams via recv (put), plus Send_stat. Blocks: Symbol Mapper f(*), Spine Evaluator (mkSalsa, h(*)), Puncturing Scheduler, Sorting module, backtrackMem. Rules and queues: doEnumerate, doACS, suggupd, evalupd, updateSymQ, toACSQ, outbitsQ, getOutMsg. Data on the edges: EnumReq, Vect(B*2^k, EnumResp), Vect(B*2^k, MarkedCost), Vect(B, MarkedCost), Vect(B, Mark), Schedule, Symbol, Msg. Output: out_msg (get).]
Micro-architecture
• Sub-modules
- Puncturing Scheduler
- Spine Evaluator
- Sorter
- Backtrack Memory
Practical Salsa Implementation
• In practice we cannot have infinite precision floating point numbers
• Salsa produces two outputs: a 64-bit spine and 512-bit arrays of symbol bits
Development and Testing
• 3-point development and testing plan
• Critical to our success with 3 people under time constraints
Step 1: Develop Decoder backbone with dummy Sorter and Spine Evaluator. Develop Sorter and Spine Evaluator independently.
- Sorter tested with MATLAB.
- Spine Evaluator (and Salsa) tested with Python.
Development and Testing
Step 2: Integrate Decoder with Sorter and Spine Evaluator. Ensure correctness at the architectural level:
- Modules instantiate correctly
- Rules fire as expected, no deadlocks etc.
- Timing is correct
- Bits flowing end-to-end
Development and Testing
Step 3: Ensure correctness at the semantic level, i.e. “bit-by-bit debugging”
[Figure: test flow from "in" to "out" through the blocks Python Encoder, AWGN Channel, Python Decoder, and Bluespec Decoder]
- Encode string with Python encoder to produce symbols
- Decode symbols and compare results
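One way to carry out such a bit-by-bit comparison is a small helper that reports the first bit at which two decoded packets disagree. This is a hypothetical sketch, not the project's test code: the name `first_mismatch` and the MSB-first bit numbering are assumptions, and the two byte strings simply stand for the reference decoder's output and the hardware decoder's output.

```python
def first_mismatch(reference: bytes, candidate: bytes):
    """Return (bit_index, ref_bit, cand_bit) for the first differing bit, or None.

    Bits are numbered MSB-first across the packet (an assumed convention).
    """
    if len(reference) != len(candidate):
        raise ValueError("packets must have the same length")
    for byte_idx, (r, c) in enumerate(zip(reference, candidate)):
        diff = r ^ c
        if diff:
            bit_in_byte = 8 - diff.bit_length()  # position of the highest differing bit
            shift = 7 - bit_in_byte
            return byte_idx * 8 + bit_in_byte, (r >> shift) & 1, (c >> shift) & 1
    return None

# A 192-bit packet is 24 bytes; flip one bit in the last byte of the copy.
ref = bytes(range(24))
bad = ref[:-1] + bytes([ref[-1] ^ 0x01])
print(first_mismatch(ref, ref))
print(first_mismatch(ref, bad))
```

Reporting the position of the first mismatch, rather than a plain pass/fail, narrows down which decoding step diverged from the reference.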
Development and Testing
• Finally, the algorithm was tested by adding noise to the transmitted symbols
• Strictly not our concern, as long as our implementation agreed with the source code
• Algorithm worked very well
• Actually “outdid” the reference code at one point: the Python code crashed but our decoder correctly decoded the message!
Performance Analysis – FPGA frequency
• The synthesized FPGA maximum frequency is 98.035 MHz.
• Different Salsa implementations give the same FPGA frequency.
Performance Analysis – Frequency, Latency, Throughput
Performance Analysis - Area
• Sorter and SpineEvaluator take the most area
Performance Analysis - Area
• Our implementation actually fits on the FPGA, taking roughly 30% of the total area.
• Different Salsa implementations do not vary much in device utilization.
Performance Analysis - Code
• The source code totals 3104 lines. Of these, 1135 lines (36.6%) are test code and 1969 lines (63.4%) are non-test code.
How much better can we do?
• We used a naive O(n^2) algorithm for the sorter module. A better algorithm might reduce the sorting step from 149 cycles to 32 in the best case, giving roughly 5 times better performance (149/32 ≈ 4.7) and improving the bit rate to 7.5 Mbit/s.
• Given the current space requirement of Salsa, we can have B (B = 4) separate hashing modules running in parallel. This gives 4 times better performance and improves the bit rate to 7.5 × 4 = 30 Mbit/s.
• Given sufficient area on the FPGA, we could have B × 2^k = 32 hash modules running in parallel. This would give 32 times better performance and improve the bit rate to 7.5 × 32 = 240 Mbit/s.
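The projections above are simple arithmetic, reproduced here as a short Python sketch. It assumes, as the slide does, that throughput scales linearly with the number of parallel hash modules; the function name `projected_rate` is hypothetical, and k = 3 is inferred from B = 4 and B*2^k = 32.

```python
SORT_CYCLES_NAIVE = 149   # cycles per sorting step, from the slide
SORT_CYCLES_BEST = 32     # best-case cycles with a better sorter, from the slide
RATE_AFTER_SORTER = 7.5   # Mbit/s quoted for the improved sorter

def projected_rate(base_rate_mbps: float, parallel_hashes: int) -> float:
    """Assume throughput grows linearly with the number of hash modules."""
    return base_rate_mbps * parallel_hashes

B, k = 4, 3  # k = 3 is inferred from B * 2^k = 32
print(SORT_CYCLES_NAIVE / SORT_CYCLES_BEST)            # sorter speedup, a little under 5x
print(projected_rate(RATE_AFTER_SORTER, B))            # B hash modules in parallel
print(projected_rate(RATE_AFTER_SORTER, B * 2 ** k))   # B * 2^k hash modules in parallel
```

Linear scaling is an upper bound: it holds only if the hash modules are the bottleneck and the sorter and memories keep up.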