Sparse Regression Codes: Recent Results & Future Directions
Ramji Venkataramanan (University of Cambridge)
Sekhar Tatikonda (Yale University)
(Acknowledgements: A. Barron, A. Joseph, S. Cho, T. Sarkar)
GOAL: Efficient, rate-optimal codes for
Point-to-point communication
Lossy compression
Multi-terminal models:
Wyner-Ziv, Gelfand-Pinsker, multiple-access, broadcast, multiple descriptions, …
Achieving the Shannon limits
Textbook Constructions
Random codes for point-to-point source & channel coding
Random Binning, Superposition
High Coding and Storage Complexity
- exponential in block length n
WANT: Compact representation + fast encoding/decoding
‘Fast’ ⇒ polynomial in n
LDPC/LDGM codes, Polar codes
- For finite input-alphabet channels & sources
Sparse Regression Codes (SPARC)
Introduced by Barron & Joseph [ISIT '10, arXiv '12]
- Provably achieve AWGN capacity with efficient decoding
Also achieve the Gaussian R-D function [RV-Sarkar-Tatikonda '13]
Codebook structure facilitates binning, superposition [RV-Tatikonda, Allerton '12]
Overview of results and open problems
Codebook Construction
Block length n, rate R: construct e^{nR} codewords
Gaussian source/channel coding
[Figure: design matrix A (n rows) alongside the sparse row vector β^T, whose nonzero entries are c1, c2, …, cL]
A: design matrix or 'dictionary' with independent N(0, 1) entries
Codewords Aβ: sparse linear combinations of columns of A
SPARC Codebook
[Figure: A has n rows and ML columns, split into L sections of M columns each; β^T has one nonzero entry (c1, c2, …, cL) per section]
Choosing M and L:
For a rate-R codebook, need M^L = e^{nR}
Choose M polynomial in n ⇒ L ∼ n/log n
Storage complexity ↔ size of A: polynomial in n
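A minimal sketch of the construction above. The parameter values and the flat coefficient choice are illustrative assumptions, not values from the analysis:

```python
import numpy as np

# Toy SPARC codebook (n, M, L below are illustrative choices).
n, M, L = 256, 64, 16                 # block length, columns/section, sections
R = L * np.log(M) / n                 # rate in nats, since M^L = e^{nR}

rng = np.random.default_rng(0)
A = rng.standard_normal((n, M * L))   # design matrix, i.i.d. N(0,1) entries

def sparc_codeword(A, message, coeffs, M):
    """Codeword A @ beta, where beta has one nonzero entry per section."""
    beta = np.zeros(A.shape[1])
    for sec, idx in enumerate(message):
        beta[sec * M + idx] = coeffs[sec]
    return A @ beta

coeffs = np.ones(L)                        # flat coefficients, for illustration
message = rng.integers(0, M, size=L)       # one column index per section
x = sparc_codeword(A, message, coeffs, M)  # length-n codeword
```

Note that only the n × ML matrix A needs to be stored, not the e^{nR} codewords, which is what makes the storage polynomial in n.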
Point-to-point AWGN Channel
[Figure: X → + → Z, with additive noise]
Z = X + Noise, ‖X‖²/n ≤ P, Noise ∼ N(0, N)
GOAL: Achieve rates close to C = (1/2) log(1 + P/N)
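A small worked instance of the capacity formula on this slide; the SNR value is an arbitrary example:

```python
import numpy as np

# AWGN capacity from the slide, C = (1/2) log(1 + P/N), in nats per use.
def awgn_capacity(P, N):
    return 0.5 * np.log(1 + P / N)

# Example: at SNR P/N = 15, C = (1/2) log 16 = 2 log 2 nats, i.e. 2 bits/use.
C = awgn_capacity(15.0, 1.0)
```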
SPARC Construction
[Figure: design matrix A and sparse β^T, as in the codebook diagram]
How to choose c1, c2, …, cL for efficient decoding?
Think of the sections as L users of a MAC (L ∼ n/log n)
If each user has rate C/L, then for successive decoding:
c_i² = κ · exp(−2Ci/L), i = 1, …, L
c_i² = Θ(1/L), with κ determined by Σ_i c_i² = P
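The power allocation above can be computed directly; the values of P, N and L here are illustrative assumptions:

```python
import numpy as np

# Exponentially decaying power allocation from the slide:
#   c_i^2 = kappa * exp(-2*C*i/L),  with kappa set by  sum_i c_i^2 = P.
P, N, L = 15.0, 1.0, 32               # illustrative values
C = 0.5 * np.log(1 + P / N)           # AWGN capacity in nats

i = np.arange(1, L + 1)
shape = np.exp(-2 * C * i / L)        # decaying profile across sections
kappa = P / shape.sum()               # normalize total power to P
c_sq = kappa * shape                  # per-section powers, each Theta(1/L)
```

Earlier sections get more power, mirroring the decoding order of successive decoding on a Gaussian MAC.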
Efficient Decoder
[Figure: design matrix A and sparse β^T, as before]
Z = Aβ + Noise
Each section has rate R/L
Successive decoding is efficient, but does poorly
- section errors propagate!
Efficient Decoder
Z = Aβ + Noise
Adaptive Successive Decoder [Barron-Joseph]
Set R0 = Z, the channel output. In step k = 1, 2, …:
1. Compute the inner product of each column with R_{k−1}
2. Pick the columns whose test statistic crosses the threshold τ
3. Set R_k = R_{k−1} − Fit_k
Complexity: O(ML log M)
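A simplified hard-decision sketch of these steps. The threshold value, stopping rule, and toy parameters are assumptions chosen for a clear decoding margin, not those of the Barron-Joseph analysis:

```python
import numpy as np

def adaptive_successive_decode(A, Y, M, L, coeffs, tau, max_steps=10):
    """Sketch: correlate every column of the undecoded sections with the
    residual, accept sections whose best normalized statistic crosses tau,
    and subtract the corresponding fit from the residual."""
    residual = Y.copy()
    decoded = {}                                    # section -> column index
    for _ in range(max_steps):
        stats = A.T @ residual / np.linalg.norm(residual)
        progress = False
        for sec in range(L):
            if sec in decoded:
                continue
            block = stats[sec * M:(sec + 1) * M]
            j = int(np.argmax(block))
            if block[j] >= tau:                     # statistic crosses threshold
                decoded[sec] = j
                residual = residual - coeffs[sec] * A[:, sec * M + j]
                progress = True
        if not progress or len(decoded) == L:
            break
    return decoded

# Noiseless demo with toy parameters.
n, M, L = 512, 16, 8
rng = np.random.default_rng(0)
A = rng.standard_normal((n, M * L))
coeffs = np.ones(L)
true_msg = rng.integers(0, M, size=L)
Y = sum(coeffs[s] * A[:, s * M + true_msg[s]] for s in range(L))
decoded = adaptive_successive_decode(A, Y, M, L, coeffs, tau=5.0)
```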
Performance
How many sections are decoded correctly?
Test statistics in each step depend on the previous steps
A variant of the adaptive successive decoder is analyzable
Theorem ([Barron-Joseph '10])
Let δ_M = 1/√(π log M) and R = C(1 − δ_M − ∆). Then the probability that the section error rate exceeds δ_M/(2C) is at most
P_e = exp(−cL∆²)
(Use an outer Reed-Solomon code of rate 1 − δ_M/C to get block error probability P_e)
Open Questions
Better decoding algorithms
- Different power allocation across sections
- [Cho-Barron ISIT '12]: adaptive soft-decision decoder
- Connections to message passing
Codebook has good structural properties
- With maximum-likelihood decoding, SPARCs attain capacity
- Error exponent of the same order as the random-coding exponent [Barron-Joseph IT '12]
Low-complexity dictionaries
- Maximum-likelihood performance with a ±1 matrix [Takeishi et al., ISIT '13]
Lossy Compression
Source coding with SPARCs…

Lossy Compression
[Figure: source S = S1, …, Sn encoded at R bits/sample into one of e^{nR} codewords, reconstructed as Ŝ = Ŝ1, …, Ŝn]
Distortion criterion: (1/n) Σ_k (S_k − Ŝ_k)²
For an i.i.d. N(0, σ²) source, the minimum distortion is σ²e^{−2R}
GOAL: achieve this with low-complexity algorithms
- computation & storage
Compression with SPARC
[Figure: design matrix A and sparse β^T, as before]
M^L codewords of the form Aβ (M^L = e^{nR})
With L ∼ n/log n, can we efficiently find β such that
(1/n)‖S − Aβ‖² < σ²e^{−2R} + ∆ ?
Successive Encoding
[Figure: Section 1 of A (M columns); β^T has nonzero entry c1 there]
Step 1: choose the column in Section 1 that minimizes ‖S − c1·Aj‖²
- i.e., the max among the inner products ⟨S, Aj⟩
- Residue: R1 = S − c1·A1
Successive Encoding
[Figure: Section 2 of A (M columns); β^T has nonzero entry c2 there]
Step 2: choose the column in Section 2 that minimizes ‖R1 − c2·Aj‖²
- i.e., the max among the inner products ⟨R1, Aj⟩
- Residue: R2 = R1 − c2·A2
Successive Encoding
[Figure: Section L of A (M columns); β^T has nonzero entry cL there]
Step L: choose the column in Section L that minimizes ‖R_{L−1} − cL·Aj‖²
- i.e., the max among the inner products ⟨R_{L−1}, Aj⟩
- Final residue: R_L = R_{L−1} − cL·A_L
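The L encoding steps above can be sketched as follows. The toy parameters and the geometrically decaying coefficient choice are assumptions made here so each section removes roughly a 2R/L fraction of the remaining residual power; the distortion of this sketch only approaches σ²e^{−2R} for large n:

```python
import numpy as np

def successive_encode(A, S, M, L, coeffs):
    """Sketch of the successive encoder: in section l, maximize the inner
    product with the current residual, then peel off the chosen column."""
    residual = S.astype(float).copy()
    chosen = []
    for sec in range(L):
        block = A[:, sec * M:(sec + 1) * M]
        j = int(np.argmax(block.T @ residual))   # max inner product with R_{l-1}
        chosen.append(j)
        residual = residual - coeffs[sec] * block[:, j]
    return chosen, residual

# Toy instance (parameters are illustrative assumptions).
n, M, L = 512, 64, 16
R = L * np.log(M) / n                            # rate in nats
rng = np.random.default_rng(1)
A = rng.standard_normal((n, M * L))
S = rng.standard_normal(n)                       # unit-variance source
# Assumed coefficient schedule: shrink geometrically across sections.
l = np.arange(L)
coeffs = np.sqrt(2 * R / L) * (1 - 2 * R / L) ** (l / 2)

chosen, residual = successive_encode(A, S, M, L, coeffs)
distortion = np.mean(residual ** 2)              # falls below the source variance
```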
Performance
Theorem
For any ergodic source of variance σ²,
P(Distortion > σ²e^{−2R} + ∆) ≤ e^{−cL∆}
for ∆ ≥ (log log M)/(log M).
Encoding complexity: ML inner products and comparisons ⇒ polynomial in n
Storage complexity: design matrix A is n × ML ⇒ polynomial in n
Simulation
Gaussian source: mean 0, variance 1
[Plot: distortion vs. rate (bits/sample) over rates 0 to 5, comparing SPARC (M = L³, L ∈ [30, 100]) against the Shannon limit]
Why does the algorithm work?
[Figure: Section 1 of A, as before]
Unlike channel coding, there is no 'correct' codeword
L-stage successive refinement (L ∼ n/log n)
Each section is a code of rate R/L
Open Questions
Better encoders with a smaller gap to D*(R)?
- Iterative soft-decision algorithm?
With optimal encoding, SPARCs achieve:
- the R-D function and the optimal error exponent for D < D*
Compression performance with ±1 dictionaries
Compression of finite-alphabet sources
Codes for multi-terminal problems
Key ingredients
Superposition (MAC, broadcast, …)
Random binning (Wyner-Ziv, Gelfand-Pinsker, …)
SPARC is based on superposition coding!
Codes for multi-terminal problems
Binning
[Figure: binning — 2^{nR} bins, 2^{nR1} codewords]
High-rate source code divided into bins of lower-rate channel codes
(or vice versa)
Binning with SPARCs
[Figure: each section of M columns subdivided into sub-sections of M′ columns each]
Divide each section into sub-sections
Bin: defined by one sub-section from each section
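The index arithmetic behind this binning can be made concrete. The values of M, M′ and L here are toy assumptions purely for illustration:

```python
# Toy SPARC binning: each section of M columns is split into sub-sections of
# M_sub (= M') columns; a bin fixes one sub-section per section, and a
# codeword in that bin picks one column inside each chosen sub-section.
M, M_sub, L = 64, 8, 4
subsections_per_section = M // M_sub

def column_index(section, subsection, offset):
    """Global column index of the `offset`-th column inside the chosen
    sub-section of the given section."""
    return section * M + subsection * M_sub + offset

bin_choice = [1, 0, 7, 3]                 # one sub-section per section (the bin)
offsets = [2, 5, 0, 7]                    # column within each chosen sub-section
cols = [column_index(s, b, o)
        for s, (b, o) in enumerate(zip(bin_choice, offsets))]

num_bins = subsections_per_section ** L   # (M/M')^L bins in total
codewords_per_bin = M_sub ** L            # (M')^L codewords in each bin
```

Together the bins partition all M^L codewords, since (M/M')^L · (M')^L = M^L.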
Binning with SPARCs
Theorem (RV-Tatikonda, Allerton '12)
SPARCs attain the optimal information-theoretic rates for the Gaussian Wyner-Ziv and Gelfand-Pinsker problems, with probability of error decaying exponentially in L (∼ n/log n).
Summary
Sparse Regression Codes
Rate-optimal for Gaussian compression and communication
Low-complexity coding algorithms
Nice structure that enables binning and superposition
Future Directions
Interference channels, multiple descriptions, …
Better channel decoders and source encoders
- message passing, ℓ1 minimization, etc.?
Simplified design matrices
Finite-field analogs: binary SPARCs?