
  • The Gallager Converse

    Abbas El Gamal, Director, Information Systems Laboratory

    Department of Electrical Engineering

    Stanford University

    Gallager’s 75th Birthday 1


  • Information Theoretic Limits

    • Establishing information theoretic limits, e.g., channel capacity, requires:

    ◦ Finding a single-letter expression for the limit

    ◦ Proving achievability

    ◦ Proving a converse

    • While achievability tells us about how to improve system design,

    converse is necessary to prove optimality

    • Proving a converse is typically harder and there are very few tools

    available, e.g., Fano’s inequality, data processing inequality, convexity

    • Gallager’s identification of the auxiliary random variable:

    ◦ Crucial step in proof of the degraded broadcast channel converse

    ◦ Has been used in the proof of almost all subsequent converses in

    multi-user information theory

    ◦ I used it many times in my papers


  • Broadcast Channel

    T. M. Cover, “Broadcast Channels,” IEEE Trans. Info. Theory,

    vol. IT-18, pp. 2-14, Jan. 1972

    • Discrete memoryless (DM) broadcast channel (X , p(y1, y2|x),Y1,Y2):

    [Figure: the encoder maps (W1, W2) to X^n; the channel p(y1, y2|x) delivers Y1^n to decoder 1, which outputs Ŵ1, and Y2^n to decoder 2, which outputs Ŵ2]

    • Send independent message Wj ∈ [1, 2^{nRj}] to receiver j = 1, 2

    • Average probability of error: P_e^(n) = P{Ŵ1 ≠ W1 or Ŵ2 ≠ W2}

    • (R1, R2) achievable if there exists a sequence of codes with P_e^(n) → 0

    • The capacity region is the closure of the set of achievable rates


  • Example 1: Binary Symmetric BC

    • Assume that p1 ≤ p2 < 1/2; H(a), a ∈ [0, 1], denotes the binary entropy function

    [Figure: binary symmetric BC Y1 = X ⊕ Z1, Y2 = X ⊕ Z2 with Z1 ∼ Bern(p1), Z2 ∼ Bern(p2); the (R1, R2) region meets the axes at 1 − H(p1) and 1 − H(p2) and lies above the time-sharing line]


  • Example 1: Binary Symmetric BC

    • Assume that p1 ≤ p2 < 1/2

    [Figure: same channel; codebook picture in {0, 1}^n: cloud centers U^n surrounded by clouds of radius ≈ nβ, with each satellite codeword X^n lying inside a cloud]

    • Superposition coding: Use random coding. Let 0 ≤ β ≤ 1/2; U ∼ Bern(1/2) and X′ ∼ Bern(β) independent, X = U ⊕ X′

    ◦ Generate 2^{nR2} i.i.d. U^n(w2), w2 ∈ [1, 2^{nR2}] (cloud centers)

    ◦ Generate 2^{nR1} i.i.d. X′^n(w1), w1 ∈ [1, 2^{nR1}]

    ◦ Send X^n(w1, w2) = U^n(w2) ⊕ X′^n(w1) (satellite codeword)


  • Example 1: Binary Symmetric BC

    • Assume that p1 ≤ p2 < 1/2

    [Figure: same channel; the (R1, R2) region with corner points C1 and C2 on the axes, lying above the time-sharing line]

    • Decoding (β ∗ p = β(1 − p) + (1 − β)p denotes binary convolution):

    ◦ Decoder 2 decodes U^n(W2) ⇒ R2 < 1 − H(β ∗ p2)

    ◦ Decoder 1 first decodes U^n(W2), subtracts it off, then decodes X′^n(W1) ⇒ R1 < H(β ∗ p1) − H(p1)
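The two rate expressions above can be traced numerically by sweeping β. A minimal sketch (not from the slides; the names `Hb`, `conv`, and `bs_bc_boundary` are introduced here):

```python
import math

def Hb(a):
    """Binary entropy function H(a) in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def conv(b, p):
    """Binary convolution b * p = b(1 - p) + (1 - b)p."""
    return b * (1 - p) + (1 - b) * p

def bs_bc_boundary(p1, p2, steps=200):
    """Trace the superposition-coding boundary of the binary symmetric BC
    by sweeping the cloud parameter beta over [0, 1/2]."""
    pts = []
    for k in range(steps + 1):
        beta = 0.5 * k / steps
        r1 = Hb(conv(beta, p1)) - Hb(p1)   # decoder 1: satellite rate
        r2 = 1 - Hb(conv(beta, p2))        # decoder 2: cloud-center rate
        pts.append((r1, r2))
    return pts

pts = bs_bc_boundary(0.1, 0.2)
```

At β = 0 the boundary gives (0, 1 − H(p2)); at β = 1/2 it gives (1 − H(p1), 0), the two point-to-point capacities.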


  • Example 2: AWGN BC

    • Assume N1 ≤ N2, average power constraint P on X

    [Figure: AWGN BC Y1 = X + Z1, Y2 = X + Z2 with Z1 ∼ N(0, N1), Z2 ∼ N(0, N2)]

    • Superposition coding: Let α ∈ [0, 1]; U ∼ N(0, (1 − α)P) and X′ ∼ N(0, αP) independent, X = U + X′; here C(x) = (1/2) log(1 + x)

    ◦ Decoder 2 decodes the cloud center ⇒ R2 ≤ C((1 − α)P/(αP + N2))

    ◦ Decoder 1 decodes the cloud center, subtracts it off, then decodes the satellite codeword ⇒ R1 ≤ C(αP/N1)
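The AWGN boundary can be swept the same way over the power split α. A sketch (names `C` and `awgn_bc_boundary` are introduced here; P = 10, N1 = 1, N2 = 4 are illustrative values, not from the slides):

```python
import math

def C(x):
    """AWGN capacity function C(x) = (1/2) log2(1 + x), in bits."""
    return 0.5 * math.log2(1 + x)

def awgn_bc_boundary(P, N1, N2, steps=200):
    """Trace the superposition-coding boundary of the AWGN BC by sweeping
    the power split alpha over [0, 1]."""
    pts = []
    for k in range(steps + 1):
        a = k / steps
        r1 = C(a * P / N1)                  # decoder 1: satellite rate
        r2 = C((1 - a) * P / (a * P + N2))  # decoder 2 treats satellite power as noise
        pts.append((r1, r2))
    return pts

pts = awgn_bc_boundary(P=10, N1=1, N2=4)
```

The endpoints α = 0 and α = 1 recover the single-user capacities C(P/N2) and C(P/N1).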



  • Degraded Broadcast Channel

    • Stochastically degraded BC [Cover 1972]: There exists p(y2|y1) such that

    p(y2|x) = ∑_{y1} p(y1|x) p(y2|y1)

    Special case of channel inclusion in:

    C. E. Shannon, “A Note on a Partial Ordering for Communication

    Channels,” Information and Control, vol. 1, pp. 390-397, 1958

    • Physically degraded version [Bergmans 73]:

    [Figure: cascade X → p(y1|x) → Y1 → p(y2|y1) → Y2]

    • Since the capacity region of any BC depends only on the marginals p(y1|x) and p(y2|x), the capacity region of the degraded broadcast channel is the same as that of its physically degraded version
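For the binary symmetric example, the degrading channel p(y2|y1) is itself a BSC with crossover (p2 − p1)/(1 − 2p1), and the degradedness identity can be checked numerically. A small sketch (the helper `bsc` and the values of p1, p2 are introduced here):

```python
p1, p2 = 0.1, 0.2
alpha = (p2 - p1) / (1 - 2 * p1)  # crossover of the degrading BSC p(y2|y1)

def bsc(p):
    """2x2 transition matrix of a BSC(p); rows index x, columns index y."""
    return [[1 - p, p], [p, 1 - p]]

W1 = bsc(p1)       # p(y1|x)
W12 = bsc(alpha)   # p(y2|y1)
# cascade: p(y2|x) = sum over y1 of p(y1|x) p(y2|y1)
W2 = [[sum(W1[x][y1] * W12[y1][y2] for y1 in range(2)) for y2 in range(2)]
      for x in range(2)]
direct = bsc(p2)   # the BSC(p2) marginal seen by receiver 2
```

The cascade matrix `W2` agrees entry by entry with `direct`, confirming stochastic degradedness for these parameters.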


  • Degraded Broadcast Channel

    • Cover conjectured that the capacity region is the set of all (R1, R2) such that

    R2 ≤ I(U; Y2),

    R1 ≤ I(X; Y1|U),

    for some p(u, x)

    • First time an auxiliary random variable is used in characterizing an

    information theoretic limit


  • Achievability

    P. P. Bergmans, “Random Coding Theorem for Broadcast Channels

    with Degraded Components,” IEEE Trans. Info. Theory, vol. IT-19,

    pp. 197-207, Mar. 1973

    • Use superposition coding: Fix p(u)p(x|u)

    [Figure: superposition codebook: cloud centers U^n with satellite codewords X^n around each]

    ◦ Decoder 2 decodes the cloud center U^n ⇒ R2 ≤ I(U; Y2)

    ◦ Decoder 1 first decodes the cloud center, then the satellite codeword X^n ⇒ R1 ≤ I(X; Y1|U)
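The two bounds I(U; Y2) and I(X; Y1|U) can be evaluated for any choice of p(u), p(x|u) and channel marginals. A sketch (the function `mi_terms` and the helpers are introduced here), checked against the closed-form binary symmetric rates:

```python
import math

def _H(p):
    """Entropy in bits of a pmf given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mi_terms(pU, pXgU, pY1gX, pY2gX):
    """Return (I(U; Y2), I(X; Y1|U)) for p(u), p(x|u) and the channel
    marginals p(y1|x), p(y2|x), all given as nested lists."""
    nU, nX = len(pU), len(pXgU[0])
    nY1, nY2 = len(pY1gX[0]), len(pY2gX[0])
    # p(y2|u) = sum_x p(x|u) p(y2|x), then I(U; Y2) = H(Y2) - H(Y2|U)
    pY2gU = [[sum(pXgU[u][x] * pY2gX[x][y] for x in range(nX))
              for y in range(nY2)] for u in range(nU)]
    pY2 = [sum(pU[u] * pY2gU[u][y] for u in range(nU)) for y in range(nY2)]
    I_U_Y2 = _H(pY2) - sum(pU[u] * _H(pY2gU[u]) for u in range(nU))
    # I(X; Y1|U) = H(Y1|U) - H(Y1|X), averaged over u
    pY1gU = [[sum(pXgU[u][x] * pY1gX[x][y] for x in range(nX))
              for y in range(nY1)] for u in range(nU)]
    I_X_Y1gU = sum(pU[u] * (_H(pY1gU[u])
                            - sum(pXgU[u][x] * _H(pY1gX[x]) for x in range(nX)))
                   for u in range(nU))
    return I_U_Y2, I_X_Y1gU

# binary symmetric example: U ~ Bern(1/2), X = U xor X', X' ~ Bern(beta)
bsc = lambda p: [[1 - p, p], [p, 1 - p]]
beta, p1, p2 = 0.25, 0.1, 0.2
R2, R1 = mi_terms([0.5, 0.5], bsc(beta), bsc(p1), bsc(p2))
# agrees with R2 = 1 - H(beta * p2) and R1 = H(beta * p1) - H(p1)
```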



  • The Converse

    A. Wyner, “A Theorem on the Entropy of Certain Binary Sequences

    and Applications: Part II,” IEEE Trans. Info. Theory, vol. IT-19,

    pp. 772-777, Nov. 1973

    ◦ Proved the weak converse for the binary symmetric broadcast channel

    (used Mrs. Gerber’s Lemma)
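Mrs. Gerber's Lemma lower-bounds the output entropy of a memoryless BSC: H(X^n ⊕ Z^n)/n ≥ H(h^{-1}(H(X^n)/n) ∗ p). For n = 1 it holds with equality, which a quick numeric sketch can check (the helpers `Hb`, `conv`, and the bisection inverse `Hb_inv` are names introduced here):

```python
import math

def Hb(a):
    """Binary entropy function in bits."""
    if a in (0.0, 1.0):
        return 0.0
    return -a * math.log2(a) - (1 - a) * math.log2(1 - a)

def conv(a, p):
    """Binary convolution a * p = a(1 - p) + (1 - a)p."""
    return a * (1 - p) + (1 - a) * p

def Hb_inv(v, tol=1e-12):
    """Inverse of Hb restricted to [0, 1/2], by bisection."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if Hb(mid) < v:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

# scalar (n = 1) case: H(X xor Z) = Hb(conv(a, p)) equals
# Hb(conv(Hb_inv(Hb(a)), p)) for X ~ Bern(a), Z ~ Bern(p)
checks = []
for a in (0.05, 0.3, 0.8):
    for p in (0.1, 0.25):
        lhs = Hb(conv(a, p))
        rhs = Hb(conv(Hb_inv(Hb(a)), p))
        checks.append(abs(lhs - rhs))
```

The equality works because conv(1 − a, p) = 1 − conv(a, p) and Hb is symmetric about 1/2; the lemma's strength is that the inequality survives for vectors X^n with memory.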

    P.P. Bergmans, “A Simple Converse for Broadcast Channels with

    Additive White Gaussian Noise (Corresp.),” IEEE Trans. Info. Theory,

    vol. IT-20, pp. 279-280, Mar. 1974

    ◦ Proved the converse for the AWGN BC (used the Entropy Power Inequality, in a proof very similar to Wyner’s for the binary symmetric case)

    R. Gallager, “Capacity and Coding for Degraded Broadcast Channels,”

    Problemy Peredachi Informatsii, vol. 10, no. 3, pp. 3-14, July-Sept 1974

    ◦ Proved the weak converse for the discrete-memoryless degraded

    BC — identification of auxiliary random variable

    ◦ Established bound on cardinality of the auxiliary random variable


  • Weak Converse for Shannon Channel Capacity

    • Shannon capacity: C = max_{p(x)} I(X; Y) (only uses “channel” variables)

    • Weak converse: Show that for any sequence of (n, R) codes with P_e^(n) → 0, R ≤ C

    ◦ For each code, form the empirical joint pmf:

    (W, X^n, Y^n) ∼ p(w) p(x^n|w) ∏_{i=1}^n p(y_i|x_i)

    ◦ Fano’s inequality: H(W|Y^n) ≤ nR P_e^(n) + H(P_e^(n)) = nε_n

    ◦ Now consider

    nR = I(W; Y^n) + H(W|Y^n)

    ≤ I(W; Y^n) + nε_n

    ≤ I(X^n; Y^n) + nε_n (data processing inequality)

    = H(Y^n) − H(Y^n|X^n) + nε_n

    ≤ ∑_{i=1}^n I(X_i; Y_i) + nε_n (convexity, memorylessness)

    ≤ n(C + ε_n)
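The Fano step above can be sanity-checked numerically in the standard single-letter form H(W|Y) ≤ H(Pe) + Pe log(|W| − 1), which with |W| = 2^{nR} gives the bound used in the chain. A sketch with a hypothetical 4×4 joint pmf (the numbers are illustrative, not from the slides):

```python
import math

def H(p):
    """Entropy in bits of a pmf given as a list."""
    return -sum(q * math.log2(q) for q in p if q > 0)

# hypothetical joint pmf p(w, y): W uniform on {0, 1, 2, 3}, Y a noisy observation
joint = [[0.20, 0.02, 0.02, 0.01],
         [0.02, 0.20, 0.02, 0.01],
         [0.02, 0.02, 0.20, 0.01],
         [0.01, 0.01, 0.01, 0.22]]
M = 4
pY = [sum(joint[w][y] for w in range(M)) for y in range(4)]
# conditional entropy H(W|Y)
HWgY = sum(pY[y] * H([joint[w][y] / pY[y] for w in range(M)]) for y in range(4))
# MAP decoder W_hat(y) = argmax_w p(w, y); its error probability Pe
Pe = 1 - sum(max(joint[w][y] for w in range(M)) for y in range(4))
# Fano: H(W|Y) <= H(Pe) + Pe * log2(M - 1)
fano = H([Pe, 1 - Pe]) + Pe * math.log2(M - 1)
```

Here Pe = 0.18 and H(W|Y) ≈ 0.95 sits just below the Fano bound ≈ 0.97, showing how tight the inequality can be for near-symmetric distributions.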


  • Gallager Converse

    • Show that given a sequence of (n, R1, R2) codes with P_e^(n) → 0,

    R1 ≤ I(X; Y1|U), R2 ≤ I(U; Y2)

    for some p(u, x) such that U → X → (Y1, Y2)

    • Key is identifying U

    ◦ The converses for binary symmetric and AWGN BCs are direct and

    do not explicitly identify U

    • As before, by Fano’s inequality

    H(W1|Y1^n) ≤ nR1 P_e^(n) + 1 = nε_{1n}