Ziria: Wireless Programming for Hardware Dummies Božidar Radunović joint work with Gordon Stewart,...

Ziria: Wireless Programming for Hardware Dummies

Božidar Radunovićjoint work with

Gordon Stewart, Mahanth Gowda, Geoff Mainland, Dimitrios Vytiniotis

http://research.microsoft.com/en-us/projects/ziria/

Layout Introduction Ziria Programming Language Compilation and Execution Case Study - WiFi Design Conclusions

Motivation – why this course?

Lots of innovation in PHY/MAC design Popular experimental platform: GNURadio

Relatively easy to program but slow, no real network deployment

Modern wireless PHYs require high-rate DSP Real-time platforms [SORA, WARP, …]

Achieve protocol processing requirements, difficult to program, no code portability, lots of low-level hand-tuning

Issues for wireless researchers CPU platforms (e.g. SORA)

Manual vectorization, CPU placement Cache / data sizing optimizations

FPGA platforms (e.g. WARP) Latency-sensitive design, difficult for new students/researchers to

break into

Portability/readability Manually highly optimized code is difficult to read and maintain Also: practically impossible to target another platform

Difficulty in writing and reusing code

hampers innovation

Hardware Platforms FPGA: Programmer deals with hardware issues

WARP, Airblue CPUs: SORA bricks [MSR Asia], GNURadio blocks

SORA was a huge breakthrough, design of RX/TX with PCI interface, 16Gbps throughput, ~ μs latency

Very efficient C++ library We build on top of SORA

Many other options now available: E.g. http://myriadrf.org/

What is wrong with current tools?

Current SDR Software Tools Portable (FPGA/CPU), graphical interface:

Simulink, LabView

CPU-based: C/C++/Python GnuRadio, SORA

Control and data separation CodiPhy [U. of Colorado], OpenRadio [Stanford]:

Specialized languages (DSL): Stream processing languages: StreamIt [MIT] DSLs for DSP/arrays, Feldspar [Chalmers]: we put more emphasis on

control Spiral

Issues Programming abstraction is tied to execution model Programmer has to reason about how the program will be

executed/optimized while writing the code

Verbose programming Shared state Low-level optimizationWe next illustrate on Sora code examples(other platforms are have similar problems)

Running example: WiFi receiver

removeDC

DetectCarrier

ChannelEstimation

InvertChannel

Packetstart

Channel info

Decode Header

InvertChannel

Decode Packet

Packetinfo

How do we execute this on CPU?

removeDC

DetectCarrier

ChannelEstimation

InvertChannel

Packetstart

Channel info

Decode Header

InvertChannel

Decode Packet

Packetinfo

Dataflow streaming abstractions

Events (messages) come in

Events (messages) come out

Why unsatisfactory? It does not expose: (1)When is vertex state (re-)

initialized?(2)Under which external “control”

messages can the vertex change behavior?

(3)How can vertex transmit “control” information to other vertices?

Predominant abstraction today [e.g. SORA, StreamIt, GnuRadio] is that of a “vertex” in a dataflow graph

Reasonable as abstraction of the execution model Unsatisfactory as programming and compilation model

Shared statestatic inlinevoid CreateDemodGraph11a_40M (ISource*& srcAll, ISource*& srcViterbi, ISource*& srcCarrierSense){CREATE_BRICK_SINK (drop, TDropAny, BB11aDemodCtx );CREATE_BRICK_SINK (fsink, TBB11aFrameSink, BB11aDemodCtx );CREATE_BRICK_FILTER (desc, T11aDesc, BB11aDemodCtx, fsink );typedef T11aViterbi <5000*8, 48, 256> T11aViterbiComm;CREATE_BRICK_FILTER (viterbi,T11aViterbiComm::Filter,BB11aDemodCtx, desc );CREATE_BRICK_FILTER (vit0, TThreadSeparator<>::Filter, BB11aDemodCtx, viterbi);// 6MCREATE_BRICK_FILTER (di6, T11aDeinterleaveBPSK, BB11aDemodCtx, vit0 );CREATE_BRICK_FILTER (dm6, T11aDemapBPSK::filter, BB11aDemodCtx, di6 );…

… CREATE_BRICK_SINK (plcp, T11aPLCPParser, BB11aDemodCtx );CREATE_BRICK_FILTER (sviterbik, T11aViterbiSig, BB11aDemodCtx, plcp );CREATE_BRICK_FILTER (dibpsk, T11aDeinterleaveBPSK, BB11aDemodCtx, sviterbik );CREATE_BRICK_FILTER (dmplcp, T11aDemapBPSK::filter, BB11aDemodCtx, dibpsk );CREATE_BRICK_DEMUX5 ( sigsel,TBB11aRxRateSel, BB11aDemodCtx,dmplcp, dm6, dm12, dm24, dm48 );CREATE_BRICK_FILTER (pilot, TPilotTrack, BB11aDemodCtx, sigsel );CREATE_BRICK_FILTER (pcomp, TPhaseCompensate, BB11aDemodCtx, pilot );CREATE_BRICK_FILTER (chequ, TChannelEqualization, BB11aDemodCtx, pcomp );CREATE_BRICK_FILTER (fft, TFFT64, BB11aDemodCtx, chequ );; CREATE_BRICK_FILTER (fcomp, TFreqCompensation, BB11aDemodCtx, fft );CREATE_BRICK_FILTER (dsym, T11aDataSymbol, BB11aDemodCtx, fcomp );CREATE_BRICK_FILTER (dsym0, TNoInline, BB11aDemodCtx, dsym );Shared

Separation of control and datavoid Reset() { Next0()->Reset(); // No need to reset all path, just reset the path we used in this frame

switch (data_rate_kbps) {case 6000:case 9000:

Next1()->Reset();break;

case 12000:case 18000:

case 24000:case 36000:

case 48000:case 54000:

Resetting whoever* is downstream*we don’t know who that is when we write this

component

VerbosityDEFINE_LOCAL_CONTEXT(TBB11aRxRateSel, CF_11RxPLCPSwitch, CF_11aRxVector );template<TDEMUX5_ARGS>class TBB11aRxRateSel : public TDemux<TDEMUX5_PARAMS>{ CTX_VAR_RO (CF_11RxPLCPSwitch::PLCPState, plcp_state ); CTX_VAR_RO (ulong, data_rate_kbps ); // data rate in kbps

public: …..public: REFERENCE_LOCAL_CONTEXT(TBB11aRxRateSel); STD_DEMUX5_CONSTRUCTOR(TBB11aRxRateSel) BIND_CONTEXT(CF_11RxPLCPSwitch::plcp_state, plcp_state) BIND_CONTEXT(CF_11aRxVector::data_rate_kbps, data_rate_kbps) {}

- Declarations are written in host language- Language is not specialized, so often verbose

- Hinders fast prototyping

SDR manual optimizations (LUT)struct _init_lut { void operator()(uchar (&lut)[256][128]) { int i,j,k;

uchar x, s, o; for ( i=0; i<256; i++) {

for ( j=0; j<128; j++) { x = (uchar)i; s = (uchar)j; o = 0; for ( k=0; k<8; k++) {

uchar o1 = (x ^ (s) ^ (s >> 3)) & 0x01;

s = (s >> 1) | (o1 << 6);

o = (o >> 1) | (o1 << 7);

x = x >> 1; } lut [i][j] = o; } } } }

Hand-written bit-fiddling code to create lookup

tables for specific computations that must run

very fast

Vectorization

removeDC

DetectCarrier

ChannelEstimation

InvertChannel

Packetstart

Channel info

Decode Header

InvertChannel

Decode Packet

Packetinfo

- Beneficial to process items in chunks

- But how large can chunks be?

My Own Frustrations Implemented several PHY algorithms in FPGA

Never been able to reuse them: Complexity of interfacing (timing and precision) was higher than

rewriting!

Implemented several PHY algorithms in Sora

Better reuse but still difficult Spent 2h figuring out which internal state variable I haven’t

initialized when borrowed a piece of code from other project.

I want tools to allow me to write reusable codeand incrementally build ever more complex systems!

Improving this situation New wireless programming platform

1. Code written in a high-level language2. Compiler deals with low-level code optimization3. Same code compiles on different platforms (not there just yet!)

Challenges1. Design PL abstractions that are intuitive and expressive2. Design efficient compilation schemes (to multiple platforms)

What is special about wireless1. … that affects abstractions: large degree of separation b/w data

and control2. … that affects compilation: need high-throughput stream

processing

Our Choice: Domain Specific Language What are domain-specific languages? Examples:

Make SQL

Benefits: Language design captures specifics of the task This enables compiler to optimize better

Why is wireless code special? Wireless = lots of signal processing Control vs data flow separation Data processing elements:

FFT/IFFT, Coding/Decoding, Scrambling/Descrambling Predictable execution and performance, independent of data

Control flow elements: Header processing, rate adaptation

Programming model

removeDC

DetectCarrier

ChannelEstimation

InvertChannel

Packetstart

Channel info

Decode Header

InvertChannel

Decode Packet

Packetinfo

How do we want code to look like? Example: IEEE 802.11a scrambler: S(x) = x7 + x4 + 1

Ziria:x <- take; do{

tmp := (scrmbl_st[3] ^ scrmbl_st[0]);scrmbl_st[0:5] := scrmbl_st[1:6];scrmbl_st[6] := tmp;y := x ^ tmp

}; emit (y)

What do we not want to optimize? We assume efficient DSP libraries:

FFT Viterbi/Turbo decoding

Same are used in many standards: WiFi, WiMax, LTE

This is readily available: FPGA (Xilinx, Altera) DSP (coprocessors) CPUs (Volk, Sora libraries, Spiral)

Most of PHY design is in connecting these blocks

Ziria: A 2-layer design Lower layer

Imperative C-like code for manipulating bits, bytes, arrays, etc. NB: You can plug-in any C function in this layer

Higher layer A monadic language for specifying and staging stream processors Enforces clean separation between control and data flow, clean state

semantics

Runtime implements low-level execution model

Monadic pipeline staging language facilitates aggressive compiler optimizations

A stream transformer t, of type:

ST T a b

Ziria: control-aware stream abstractions

inStream (a)

outStream (b)

inStream (a)

outStream (b)

outControl (v)

A stream computer c, of type:

ST (C v) a b

Staging a pipeline, in diagrams

repeat { v <- (c1 >>> t1) ; t2 >>> t3 }

“Vertical composition” (along data path -- “arrows”)

“Horizontal composition” (along control path --

“monads”)

Running example:WiFi Scrambler

let comp scrambler() = var scrmbl_st: arr[7] bit := {'1,'1,'1,'1,'1,'1,'1}; var tmp: bit; var y:bit;

repeat seq { x <- take;

do { tmp := (scrmbl_st[3] ^ scrmbl_st[0]); scrmbl_st[0:5] := scrmbl_st[1:6]; scrmbl_st[6] := tmp; y := x ^ tmp; };

emit y }in ...

emit y }in <rest of the code>

Start defining computational method

End defining computational method

emit y }in ...

Local variables

Types:- Bit- Array of

Constants

emit y }in ...

Special-purpose computers:

emit y }in ...

Imperative (C/Matlab-like) code:

emit y }in ...

repeat

take doemi

Computers and transformers

Whole program

Read >>> do_something >>> write

Reads and writes can come from RF, IP, file, dummy

Computation language primitives Define control flow Two groups:

Transformers Computers

Transformers Map:

let f(x : int) =

var y : int = 42;

y := y + 1;

return (x+y);

read >>> map f >>> write

Repeat

let f(x : int) =

x <- take;

if (x > 0) then

emit 1

read >>> repeat f >>> write

Computers While:

while (!crc > 0) {

x <- take;

do {crc = search(x);}

If-then-else:

if (rate == CR_12) then

emit enc12(x);

emit enc23(x);

Also: take, emit, for

Expression language – data processing Mix of C and Matlab Can be directly linked to any C function Subset of data types (mainly fixed point):

Expression language - examplelet build_coeff(pcoeffs:arr[64] complex16, ave:int16, delta:int16) =

var th:int16;

th := ave - delta * 26; for i in [64-26, 26] { pcoeffs[i] := complex16{re=cos_int16(th);im=-sin_int16(th)}; th := th + delta }; th := th + delta; for i in [1,26] { pcoeffs[i] := complex16{re=cos_int16(th);im=-sin_int16(th)}; th := th + delta }in

Array (equivalent to [64-26:64])

Fixed-point complex numbers

External C function

Function

Libraries Ziria header: let external v_sub_complex32(c:arr complex32, a:arr[length(c)] complex32, b:arr[length(c)] complex32 ) : () in

C method:int __ext_v_add_complex32(struct complex32* c, int len, struct complex32* a, int __unused_2, struct complex32* b, int __unused_1)

Libraries (mainly linked to existing Sora libraries): SIMD instructions, FFT and Viterbi, fixed-point trigonometry,

visualisation

Frequently Asked Questions Why defining a new language? Why not use C/Matlab/<your favourite language>?

How do you share state? Why using let x = 20+3*z in instead ofx := 20 + 3*z;?

Why x <- take and not x := take?

Question: How do you implement teleport message?

Decoding

Frequency mixing

Equalizernew_freq

reconfigurationmessage

Answer: Use repeat to reinitialize in the new state

let processor() = var new_freq := X; // initializerepeat { ret <- ( freq_mixing(new_freq)

>>> equalizer >>> decoding

) ; do{ new_freq := ret } }

Freq_mixing

Decoding

repeatEqualize

How to write a compiler? Haskell + libraries

Parsing, code generation, flexible types, pattern matching

First version in <2 months Easily extendible Moral: compilers can be a useful tool!

Compilation – High-level view Expression language -> C code Computation language -> Execution model Numerous optimizations on the way:

Vectorization Lookup tables Conventional optimizations: Folding, inlining, …

Execution model: How to execute code?

removeDC

DetectCarrier

ChannelEstimation

InvertChannel

Packetstart

Channel info

Decode Header

InvertChannel

Decode Packet

Packetinfo

Runtime

tick()

process(x)

YIELD (data_val)

DONE (control_val)

B2process(x)

tick()

Q: Why do we need ticks?

Actions: Return values:

A: Example: emit 1; emit 2; emit 3

Execution model - examplelet comp test1() =

repeat{

(x:int) <- take;

emit x;

tick()

YIELD(n)

read[int] >>> test1() >>> test1() >>> write[int]

process(n)

YIELD(n)

process(n)

process(n)DONE(n)

process(n)

YIELD(n)

Runtime main loop

L1: t.init() // init top-level componentL2: whatis := t.tick()L3: if (whatis == Yield b) then { put_buf(b); goto L2 } else if (whatis==Skip) then goto L2 else if (whatis==Done) then exit() else if (whatis==NeedInput) then { c = get_buf(); whatis := t.process(x); goto L3; }

In reality:• Very few function calls with a CPS-

based translation: every “process” function knows its continuation

• Optimizations: never tick components with trivial tick(), never generate process() for tick()-only components

• Only indirection is for bind: at different points in times, function pointers point to the correct “process” and “tick”

• Slightly different approach to input/output

How about performance?let comp test1() = repeat{ (x:int) <- take; emit x + 1; }in

read[int] >>> test1() >>> test1() >>> write[int]

(((read >>> let auto_map_6(x: int32) = x + 1 in {map auto_map_6}) >>> let auto_map_7(x: int32) = x + 1 in {map auto_map_7}) >>> write)

buf_getint32(pbuf_ctx, &__yv_tmp_ln10_7_buf);__yv_tmp_ln11_5_buf = auto_map_6_ln2_9(__yv_tmp_ln10_7_buf); __yv_tmp_ln12_3_buf = auto_map_7_ln2_10(__yv_tmp_ln11_5_buf); buf_putint32(pbuf_ctx, __yv_tmp_ln12_3_buf);

Type-preserving transformationslet block_VECTORIZED (u: unit) = var y: int; repeat let vect_up_wrap_46 () = var vect_ya_48: arr[4] int; (vect_xa_47 : arr[4] int) <- take1; __unused_174 <- times 4 (\vect_j_50. (x : int) <- return vect_xa_47[0*4+vect_j_50*1+0]; __unused_1 <- return y := x+1; return vect_ya_48[vect_j_50*1+0] := y); emit vect_ya_48 in vect_up_wrap_46 (tt)

let block_VECTORIZED (u: unit) = var y: int; repeat let vect_up_wrap_46 () = var vect_ya_48: arr[4] int; (vect_xa_47 : arr[4] int) <- take1; emit let __unused_174 = for vect_j_50 in 0, 4 { let x = vect_xa_47[0*4+vect_j_50*1+0] in let __unused_1 = y := x+1 in vect_ya_48[vect_j_50*1+0] := y } in vect_ya_48 in vect_up_wrap_46 (tt)

Dataflow graph iteration converted to tight loop! In this case we got x3

speedup

Transformations on Abstract Syntax Tree One of main benefit of compiler Computation optimizations:

Vectorization, inlining, convert to map, optimize tick/process, …

Expression optimization: Lookup tables, inlining,

calculate constant expressions, unroll loops, …

Tests: Array boundary checks

EArrRead

y := x[0,10]

Flexible compiler design Example: x[0,length(x)] x

subarr_inline_step e | EArrRead evals estart (LILength n) <- unExp e , EVal (VInt 0) <- unExp estart , TArr (Literal m) _ <- info evals , n == m = rewrite evals

Easy to add new transformations

expressionOptimization function

If expression e is of type EArrRead on array evals == m with start estart == 0 and length == (LILength n)

evals => xestart => 0(LILength n) => length(x)

Vectorization Idea: batch processing over multiple data itemsrepeat {(x:int)<-take; emit x} repeat {(x:arr[64] int)<-take; emit x}

Modifications of the execution model: Possible since the execution model is not hardcoded in the code We need to respect the operational semantics

Benefits: LUT: bits -> bytes Lower overhead of the execution model (ticks/processes) Faster memcpy Better cache locality

Vectorization Challenges

ParseHeader

CRC(Len,Rate)

If rate == 6 Mbps

scrambler

½ encoder

interleaver

2 bit1 bit

48 bit48 bit

1 bit1 complex

1 bit1 bit

scrambler

¾ encoder

interleaver

64 QAM

4 bit3 bit

288 bit288 bit

6 bit1 complex

1 bit1 bit

8 bit4 bit

48 bit48 bit

8 bit8 complex

8 bit8 bit

32 bit24 bit

288 bit288 bit

12 bit2 complex

8 bit8 bit

32 bit24 bit

288 bit288 bit

12 bit2 complex

8 bit8 bit

8 bit216 bit

8 bit4 bit

48 bit48 bit

8 bit8 complex

8 bit8 bit

8 bit24 bit

Look-up Table (LUT) Optimizations Key optimization for Sora TX Identify block of expressions that

transform data has limited input and output size

Replace it with a LUT Similar to FPGA compilation

Especially beneficial for bit operations

LUT Optimizations (by example)let comp scrambler() = var scrmbl_st: arr[7] bit := {'1,'1,'1,'1,'1,'1,'1}; var tmp,y: bit; repeat { (x:bit) <- take; do { tmp := (scrmbl_st[3] ^ scrmbl_st[0]); scrmbl_st[0:5] := scrmbl_st[1:6]; scrmbl_st[6] := tmp; y := x ^ tmp };

emit (y) }

let comp v_scrambler () = var scrmbl_st: arr[7] bit := {'1,'1,'1,'1,'1,'1,'1}; var tmp,y: bit;

var vect_ya_26: arr[8] bit; let auto_map_71(vect_xa_25: arr[8] bit) = LUT for vect_j_28 in 0, 8 { vect_ya_26[vect_j_28] := tmp := scrmbl_st[3]^scrmbl_st[0]; scrmbl_st[0:+6] := scrmbl_st[1:+6]; scrmbl_st[6] := tmp; y := vect_xa_25[0*8+vect_j_28]^tmp; return y }; return vect_ya_26 in map auto_map_71

Vectorization

Automatic lookup-table-compilationInput-vars = scrmbl_st, vect_xa_25 = 15 bitsOutput-vars = vect_ya_26, scrmbl_st = 2 bytesIDEA: precompile to LUT of 2^15 * 2 = 64K

Question How to implement bit permutation as LUT?(idea from SORA):

out_arr := perm({0,2,3,1}, in_arr);in_arr = {1,2,3,4} out_arr = {1,3,4,2}

Hint: permutation indices are constants!

Answerlet perm(p : arr int, iarr : arr bit) =

var oarr : arr[length(p)] bit;

var oarrtmp : arr[length(p)] bit;

var iarr1 : arr[8] bit;

unroll for j in [0,length(p)/8] {

let p1 = p[j*8,8];

iarr1 := iarr[j*8,8];

perm8(p1,iarr1,oarrtmp)

oarr := v_or(oarr,oarrtmp);

return oarr;

let perm8(p : arr[8] int, iarr : arr[8] bit, oarr : arr bit) =

for i in [0,8] {

oarr[p[i]] := iarr[i];

LUT size: 2^8 * sizeof(oarr)

Constants!

out_arr := perm({0,2,3,1}, in_arr);

Supporting different HW architectures Work in progress… SMP vs FPGA vs ASIC Pipeline and data parallelism SIMD, coprocessors (DSP or ASIC)

Pipeline parallelismofdm |>>>| decode >>> packetize

ofdm >>> write(q1) >>> read(q1) >>> decode >>> packetize

ofdm >>> write(q1)

Thread 1, pin to Core 1

read(q1) >>> decode >>> packetize

Thread 2, pin to Core 2

Sync queue

Code Examples

Performance evaluation

Msample*/sec (sample = 32bits)

SORA Ziria Wifi

RX 6Mbps 164 156 40

RX 12Mbps 125 100 40

RX 24Mbps 81 67 40

RX 48Mbps 61 52 40

RX CCA 289 163 40

Mbps/sec SORA Ziria Wifi

TX 6Mbps 54 51 6

TX 12Mbps 98 45 12

TX 24Mbps 145 53 24

TX 48Mbps 231 70 48

- WiFi RX and TX measurements

Real-time LTE-like demo

Status Released to GitHub under Apache 2.0 WiFi implementation included in release Currently supports SORA platform Essential dependency on CPU/SIMD Looking into porting to other CPU-based SDRs

Motivation Wireless architecture and design are fragmented:

EE considers PHY and parts of MAC CS considers MAC and above Opportunities for synergies missed: Q: How to change PHY to allow better/simpler network designs?

Examples: HARQ and/or rateless codes: don’t care about rate adaptation, just keep sending

Correlation and detection: detection takes time => huge MAC overheads and terrible efficiency

Channel impulse response and localization: more precise location information

Conventional WiFi design PHY: standardized and cannot be changed

data pipe, correct bits getting in and out

MAC: innovation happen Conventional: CSMA Many alternatives

Conventional cellular design PHY and MAC are standardized

Several modes of operations but none can be changed by applications

IP layer and above: innovations can happen

In reality: We have standard blocks:

Correlators, scramblers, coders/decoders, interleavers, FFT/IFFT

Why not allow building network from these standard blocks? Respect certain ground rules Allow innovations

How to design a wireless transceiver? Challenges: performance, complexity, (cost/patents)

CDMA: Uses pseudo-random code against multi-path Not as complicated to implement as OFDM based systems Difficult to equalise the overall wide spectrum

OFDM: Uses subcarriers to combat multi-path combat multipath with greater robustness and less complexity. OFDMA can achieve higher spectral efficiency with MIMO than CDMA

Rule of a thumb: For data rates >= 10 Mbps use OFDM

OFDM OFDM used in most of the contemporary PHYs: WiFi, LTE, WiMax, 60 GHz, UWB

OFDMA is OFDM variant used in cellular (LTE, WiMax): Multiple users sharing the same OFDM symbol MAC scheduling at sub-symbol level Blurred distinction between MAC and PHY

WiFi TX Overview OFDM transmitter (@56Mbps):emits createSTSinTime(); emits createLTSinTime();crc216(h.len) >>> scrambler() >>> encode34() >>> interleaver_m64qam() >>> modulate_64qam() >>> add_pilots() >>> ifft() Preamble for detection and channel estimation

CRC (with padding) to check for errors Scrambler prevents high peaks Interleaver decouples errors

IFFT example

OFDM Symbol Add pilots, perform IFFT and add cyclic prefix

Channel impulse response

Effect on OFDM symbol in time

Effect on OFDM symbol in frequency

Channel Estimation

Channel inversion/equalization

How to deal with multi-path? Add cyclic prefix

Missing Cyclic Prefix

Scrambler

Avoids large peak-to-avg power ratios

WiFi RX Overviewlet comp receiver() =seq{ (removeDC() >>> t<-detectSTS()) ; params <- ChannelEstimation() ; removeCP() >>> FFT() >>> ChannelEqualization(params) >>> PilotTrack() >>> RemovePilots() >>> receiveBits() }in

Pilots Channel estimation changes across OFDM symbols Channel changes Drift in oscillators between sender and receiver

Pilots are used for channel re-estimation Similar as initial channel estimation Interpolation for data points

Carrier Sensing and Synchronization Find where packet starts Accurate timing needed for the rest of RX Also used to estimate CFO Also used in carrier sensing

Preamble

Detection using correlation Correlate for a known preamble

How do we implement this in CPU/SIMD?

Detection

Performance of Detector

Conclusions Wireless innovations will happen at intersections of PHY and MAC levels

We need prototypes and test-beds to evaluate ideas

PHY programming in its infancy Difficult, limited portability and scalability Steep learning curve, difficult to compare and extend previous works

Wireless programming is easy and fun – go for it!http://research.microsoft.com/en-us/projects/

ziria/

Ziria: Wireless Programming for Hardware Dummies Božidar Radunović joint work with Gordon Stewart,...

Documents

Transcript of Ziria: Wireless Programming for Hardware Dummies Božidar Radunović joint work with Gordon Stewart,...

Rhea: automatic filtering for unstructured cloud storage Christos Gkantsidis, Dimitrios Vytiniotis, Orion Hodson, Dushyanth Narayanan, Florin Dinu, and.

TABLICE: - Centar za mir · Web viewBranko Dmitrović, Slobodan Borojević, Milinko Janjetović, Momčilo Kovačević, Stevo Radunović, Veljko Radunović, Katica Pekić, Marin Krivošić

EDITORIALA - Alpino Tabira urak ziria balitz bezala, dardarka bi- zian jartzen ñau. Goizeko zortzirak dira, ... Gozada bat zen egunetik egunero toki guztiz desberdinak

DRAGANA RADUNOVIĆ-Zaštita vinove loze

Prevela s ruskog Ana Jakovljević Radunović - delfi.rs · PDF fileStigli smo u nepoznato selo i nepoznati ljudi su nas odveli u kuće . Dugo nisam govorila . ... Kako je tata bežao

Contentsweb.mit.edu/avytin/www/site_files/Antonios Vytiniotis... · 2009-07-08 · 4 1 Introduction 1.1 Liquefaction Seismic risk is a very significant design factor in many areas

HASKELL TO LOGIC THROUGH DENOTATIONAL SEMANTICS Dimitrios Vytiniotis, Koen Claessen, Simon Peyton Jones, Dan Rosén POPL 2013, January 2013 1.

Prevela s ruskog Ana Jakovljević Radunović

Project Snow ake: Non-blocking Safe Manual Memory ... · PDF fileProject Snow ake: Non-blocking Safe Manual Memory Management in .NET Matthew Parkinson Dimitrios Vytiniotis Kapil Vaswani

HALO: Haskell to Logic through Denotational Semantics · HALO: Haskell to Logic through Denotational Semantics As submitted to POPL’13, July 11, 2012 Dimitrios Vytiniotis Simon

Stop when you are Almost-Full Adventures in constructive termination Dimitrios Vytiniotis Microsoft Research, Cambridge Thierry Coquand, David Wahlstedt.

Project Snowflake: Non-blocking Safe Manual Memory ... · 95 Project Snowflake: Non-blocking Safe Manual Memory Management in .NET MATTHEW PARKINSON, DIMITRIOS VYTINIOTIS, KAPIL VASWANI,

Ziria: Wireless Programming for Hardware Dummies Gordon Stewart (Princeton), Mahanth Gowda (UIUC), Geoff Mainland (Drexel), Cristina Luengo (UPC), Anton.

Prevela s ruskog Ana Jakovljević Radunović · 2016-11-01 · kupio je sve usput . Stigli smo u nepoznato selo i nepoznati ljudi su nas odveli u kuće . Dugo nisam govorila . Samo

Gordon Stewart (Princeton), Mahanth Gowda (UIUC), Geoff Mainland (Drexel), Cristina Luengo Agullo (UPC), Bozidar Radunovic (MSR), Dimitrios Vytiniotis.

CRNOGORSKI JEZIK · Dizajn i tehnička priprema Nevena Matijević Lektura urednička Korektura Jasmina Radunović Tehnički urednikRajko Radulović Štampa „Suton“ d. o. o. –

WiFi-NC: WiFi over Narrow Channels Krishna Chintalapudi, Božidar Radunović Vlad Balan, Michael Buettner, Vishnu Navda, Ram Ramjee, Srinivas Yerramalli.

Vitoria-Gasteiz, 2012 › contenidos › ...NESKA, ZUK KONDOIRIK ERABILI BADUZU, EDO JARTZEN DU EDO JO GERATZEN CARA. BATZUK ZIRIA SARTU DABILTZA, DUT BERDIN SENTITZEN' ESATEN; BANA

Ziria: A DSL for wireless systems programming

Ziria: Wireless Programming for Hardware Dummies