The Role of Feedback in Communications
Solmaz Torabi
Dept. of Electrical and Computer Engineering, Drexel University, [email protected]
Advisor: Dr. John M. Walsh
April 22, 2015
References I
M. Horstein, “Sequential transmission using noiseless feedback,” IEEE Transactions on Information Theory, vol. 9, no. 3, pp. 136–143, 1963.
J. Schalkwijk and T. Kailath, “A coding scheme for additive noise channels with feedback–I: No bandwidth constraint,” IEEE Transactions on Information Theory, vol. 12, no. 2, pp. 172–182, 1966.
O. Shayevitz and M. Feder, “Optimal feedback communication via posterior matching,” IEEE Transactions on Information Theory, vol. 57, no. 3, pp. 1186–1222, 2011.
N. Gaarder and J. K. Wolf, “The capacity region of a multiple-access discrete memoryless channel can increase with feedback (corresp.),” IEEE Transactions on Information Theory, vol. 21, no. 1, pp. 100–102, 1975.
G. Kramer, “Directed information for channels with feedback,” Ph.D. dissertation, University of Manitoba, Canada, 1998.
References II
A. El Gamal and Y.-H. Kim, Network Information Theory. Cambridge University Press, 2011.
R. Venkataramanan and S. S. Pradhan, “Source coding with feed-forward: Rate-distortion theorems and error exponents for a general source,” IEEE Transactions on Information Theory, vol. 53, no. 6, pp. 2154–2179, 2007.
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
DMC with feedback
[Block diagram: M → Encoder → X_i → p(y|x) → Y_i → Decoder → M̂, with feedback Y^{i−1} from the channel output to the encoder]
I Memoryless channel: P_{Y_n | X^n, Y^{n−1}}(· | x^n, y^{n−1}) = P_{Y|X}(· | x_n)
I Messages: M
I Encoding function: g_i : M × Y^{i−1} → X
I X_n is a function of (M, Y_1, Y_2, ..., Y_{n−1})
Point to point feedback communication system
[Block diagram: M → Encoder → X_i → p(y|x) → Y_i → Decoder → M̂, with feedback Y^{i−1} from the channel output to the encoder]
The channel is memoryless if P(y_n | x^n, y^{n−1}) = P(y_n | x_n)
The channel is used without feedback if P(x_n | x^{n−1}, y^{n−1}) = P(x_n | x^{n−1})
DMC without feedback: P(y^N | x^N) = ∏_{n=1}^{N} P(y_n | x_n)
Point to point feedback communication system
[Block diagram: M → Encoder → X_i → p(y|x) → Y_i → Decoder → M̂, with feedback Y^{i−1} from the channel output to the encoder]
I If the channel is memoryless, the feedback provides no information that can help increase the rate:
C_FB = max_{p(x)} I(X; Y) = C
Capacity of Memoryless Feedback Channel
I Shannon 56: Feedback does not increase the capacity of a memoryless channel
I Simplifying schemes for attaining it:
  I Horstein 63: developed a recursive coding strategy for the BSC with noiseless feedback (sequential coding scheme, varying block length)
  I Schalkwijk-Kailath 66: AWGN channel
  I Shayevitz 2008: extends the SK and Horstein schemes to general memoryless channels
I Gaarder-Wolf 75: feedback enlarges the capacity region of multiuser channels
Point to point feedback communication system
Feedback can:
I Simplify coding scheme
I Improve reliability (decreases error prob. much faster)
I Increase capacity of channels with memory
I Enlarge capacity region of multiuser channels (Gaarder-Wolf 1975)
Iterative refinement for BEC
I First send a message at a rate higher than the channel capacity (without coding)
I Then iteratively refine the receiver’s knowledge about the message
[Figure: BEC(p) transition diagram: inputs 0 and 1 pass through with probability 1 − p and are erased to ? with probability p]
n + pn + p²n + ... = n / (1 − p) channel uses suffice to transmit n bits reliably
I We can achieve the capacity C = 1 − p by simply retransmitting each bit after it is erased.
I There is no need for sophisticated error-correcting codes.
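The retransmission argument can be checked numerically. The sketch below (illustrative, not from the talk; the function name and parameters are my own) simulates delivering n bits over a BEC(p) with a retransmit-until-received strategy and compares the empirical rate with C = 1 − p:

```python
import random

def bec_retransmit(n_bits, p, rng):
    """Total channel uses to deliver n_bits over a BEC(p) when each
    erased bit is simply retransmitted (the feedback link tells the
    sender which bits were erased, so no coding is needed)."""
    uses = 0
    for _ in range(n_bits):
        while True:
            uses += 1
            if rng.random() >= p:   # bit got through un-erased
                break
    return uses

rng = random.Random(0)
n, p = 100_000, 0.3
uses = bec_retransmit(n, p, rng)
rate = n / uses          # empirical bits per channel use
capacity = 1 - p         # C = 1 - p for the BEC
```

With p = 0.3 the empirical rate should come out very close to 0.7 bits per channel use.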
Noiseless binary forward channel
A binary search algorithm provides an effective procedure for transmitting the information involved in the source's choice.
I The receiver starts out with a uniform prior distribution for the selected message point
I The a priori median of the receiver distribution is m_0 = 1/2
I Suppose 1 was sent. Hence, the new receiver distribution is uniform over the interval (1/2, 1).
[Figure: the interval [0, 1] with message point θ = 0.101... (binary), located by the successive medians 1/2, 3/4, 5/8]
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Horstein Scheme
I Horstein developed a recursive coding strategy for the BSC withnoiseless feedback
I Horstein's feedback scheme can be viewed as a binary search algorithm with lies.
I The transmitter sends a 0 on the (i + 1)st transmission when the true message point θ is to the left of the current median m_i, and a 1 otherwise.
I However, since crossovers can occur in the channel, the transmitter does not know what the receiver's current median m_i is.
I A noiseless feedback channel is used to provide the transmitter withthis information.
Horstein Scheme
I Divide the interval [0, 1] into 2^{nR} equidistant subintervals.
I Represent each message by the midpoint of each interval.
I The receiver has no prior knowledge of the location of θ, so the receiver density is initially uniform: f_0(θ) = 1 for θ ∈ [0, 1]
x_1 = g(θ_0) = 1 if θ_0 > 1/2, and 0 otherwise
[Figure: the initial uniform density f(θ) = 1 on [0, 1], with median m_0 = 1/2]
Horstein Scheme
I Assume x_1 = 1 is sent through the channel. It gets corrupted with probability p.
I After the channel output is observed, the receiver distribution and median are updated.
I Through the noiseless feedback, the encoder learns the distribution.
f(θ | y_1) = f(θ) p(y_1 | θ) / ∫_0^1 f(θ) p(y_1 | θ) dθ
I Assume y_1 = 1 is received.
  I For 0 ≤ θ ≤ 1/2, f(θ | y_1 = 1) = 2p
  I For 1/2 ≤ θ ≤ 1, f(θ | y_1 = 1) = 2p̄
Horstein Scheme
I The encoder transmits 1 if θ > median of f(θ | y_1), and 0 otherwise.
[Figure: evolution of the posterior for X_1 = 1, Y_1 = 1; X_2 = 0, Y_2 = 0; X_3 = 1: the uniform f(θ) with median m_0, then f_{θ|Y_1}(θ | 1) taking values 2p and 2p̄ around m_1, then f_{θ|Y^2}(θ | 10) taking values 4pp̄ and 4p̄² around m_2]
I Terminates when most of the probability mass is concentrated in the neighborhood of one of the possible message points
Horstein Scheme
The scheme admits a simple recursive structure:
f(θ | y^{i−1}) = 2p̄ f(θ | y^{i−2}) if θ lies on the side of the median of f(θ | y^{i−2}) indicated by y_{i−1}, and 2p f(θ | y^{i−2}) otherwise
Terminates when the receiver distribution is sufficiently steep.
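As a sanity check, here is a small discretized sketch of the recursion (my own illustrative implementation on a finite grid, not Horstein's original analysis): the transmitter compares θ with the receiver's posterior median, the BSC flips each bit with probability p, and the receiver applies the 2p / 2p̄ reweighting.

```python
import random

def horstein(theta, p, n_uses, grid_size=4096, seed=1):
    """Discretized sketch of the Horstein recursion over a BSC(p) with
    noiseless feedback.  The transmitter sends 1 iff theta lies right of
    the receiver's posterior median; the receiver multiplies the posterior
    by (1 - p) on the half agreeing with the received bit and by p on the
    other half, then renormalizes (the 2p / 2p-bar update)."""
    rng = random.Random(seed)
    pts = [(i + 0.5) / grid_size for i in range(grid_size)]
    w = [1.0 / grid_size] * grid_size        # uniform prior on [0, 1]
    for _ in range(n_uses):
        acc, m = 0.0, grid_size - 1          # locate the posterior median
        for i, wi in enumerate(w):
            acc += wi
            if acc >= 0.5:
                m = i
                break
        median = pts[m]
        x = 1 if theta > median else 0            # transmitted bit
        y = x ^ (1 if rng.random() < p else 0)    # BSC crossover
        for i in range(grid_size):
            agree = (pts[i] > median) == (y == 1)
            w[i] *= (1 - p) if agree else p
        total = sum(w)
        w = [wi / total for wi in w]
    return sum(wi * xi for wi, xi in zip(w, pts))  # posterior mean

est = horstein(theta=0.3125, p=0.1, n_uses=60)
```

After 60 channel uses at p = 0.1, the posterior mean should land close to the true message point.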
[Figure: the receiver CDF of θ, with medians m_0, m_1 and crossover probability p]
Horstein Scheme - Decoding
I The decoder uses maximum a posteriori decoding
I It finds the interval of length 2^{−nR} that maximizes
∫_β^{β + 2^{−nR}} f(θ | y^n) dθ
Horstein Scheme - Error probability
I By analyzing the evolution of f(θ | Y^i), i ∈ [1 : n], based on the iterated function system, it can be shown that
p(θ ∉ [β, β + 2^{−nR}]) → 0 as n → ∞ if R < C
I With high probability, θ(M) is the unique message point within [β, β + 2^{−nR})
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Block Feedback Coding Scheme for BSC
Ahlswede 1973:
I Implement the iterative refinement at the block level
I The encoder initially transmits an uncoded block of information
I It then refines the receiver’s knowledge about it in subsequent blocks
Block Feedback Coding Scheme for BSC
I Tx. 1: Sends N uncoded data bits over channel.
I Ch. 1: Adds (modulo-2) N samples of Bern(p) noise
I Rx. 1: Feeds its N noisy observations back to Tx.
I Tx. 2:
  (a) Finds the N samples of noise added by the channel.
  (b) Compresses the noise into NH(p) new data bits.
  (c) Sends these data bits uncoded over the channel.
I Ch.2: Adds (modulo-2) NH(p) samples of Bern(p) noise.
I Rx.2: Feeds its NH(p) noisy observations back to Tx
Block Feedback Coding Scheme for BSC
The number of channel inputs used to send the N bits would be
N + NH(p) + NH(p)² + ... = N / (1 − H(p)),
which corresponds to a rate of 1 − H(p), the capacity of the BSC(p).
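The geometric series can be evaluated directly. The sketch below (illustrative; `Hb` and `block_feedback_rate` are my own names) truncates the sum after a finite number of refinement rounds and compares the resulting rate with 1 − H(p):

```python
import math

def Hb(p):
    """Binary entropy H(p) in bits."""
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def block_feedback_rate(p, rounds):
    """Rate of the block feedback scheme after `rounds` refinement
    blocks: per source bit it spends sum_k H(p)^k channel uses, which
    tends to 1 / (1 - H(p)) as rounds grows."""
    uses_per_bit = sum(Hb(p) ** k for k in range(rounds))
    return 1.0 / uses_per_bit

p = 0.11
rate = block_feedback_rate(p, rounds=50)
capacity = 1 - Hb(p)     # BSC(p) capacity
```

Since H(p) < 1 for p ≠ 1/2, the truncated rate converges to capacity geometrically fast in the number of rounds.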
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Robbins-Monro procedure
How can we determine θ, a zero of a function g(x), without knowing the shape of the function?
I The observations are noisy
I Instead of g(x), one obtains Y(x) = g(x) + Z
Robbins-Monro procedure
X_{n+1} = X_n − a_n Y_n(X_n), n = 1, 2, ...
I where Σ a_n = ∞ and Σ a_n² < ∞ ⇒ X_n → θ almost surely
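A minimal sketch of the iteration (my own illustrative code, assuming the linear g(x) = x − θ used on the next slides and unit-variance Gaussian observation noise):

```python
import random

def robbins_monro(g, x1, n_steps, noise_std, seed=0):
    """Robbins-Monro iteration X_{n+1} = X_n - a_n * Y_n(X_n) with
    a_n = 1/n, where Y_n(x) = g(x) + Z_n is a noisy observation of g.
    The steps satisfy sum a_n = infinity and sum a_n^2 < infinity."""
    rng = random.Random(seed)
    x = x1
    for n in range(1, n_steps + 1):
        y = g(x) + rng.gauss(0.0, noise_std)
        x -= y / n
    return x

theta = 0.7
root = robbins_monro(lambda x: x - theta, x1=0.5,
                     n_steps=20_000, noise_std=1.0)
```

The iterate should converge to the zero θ despite never observing g(x) without noise.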
[Figure: the line g(x) = x − θ with noisy observations Y_1(X_1) = g(X_1) + Z_1, and the iterates X_1, X_2, X_3 produced with step sizes a_1, a_2]
Schalkwijk-Kailath scheme
Put the straight line g(x) = x − θ
I Start with X_1 = 1/2, and send the receiver the number g(X_1) = (X_1 − θ)
I The receiver obtains the number Y_1(X_1) = (X_1 − θ) + Z_1, where Z_1 ∼ N(0, 1)
[Figure: the interval [0, 1] with θ and X_1 on the line g(X) = X − θ, and the noisy observation Y_1(X_1) = g(X_1) + Z_1]
Schalkwijk-Kailath scheme
The recursion is easily solved to yield
X_{n+1} = θ − (1/n) Σ_{i=1}^{n} Z_i ∼ N(θ, 1/n)
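The claimed distribution can be checked empirically. This sketch (illustrative; the parameter values are my own) runs the recursion X_{n+1} = X_n − (1/n) Y_n(X_n) many times and compares the empirical mean and variance of the final iterate with θ and 1/n:

```python
import random
import statistics

rng = random.Random(2)
theta, n, trials = 0.25, 50, 4000
finals = []
for _ in range(trials):
    x = 0.5                                    # X_1 = 1/2
    for i in range(1, n + 1):
        y = (x - theta) + rng.gauss(0.0, 1.0)  # Y_i(X_i) = g(X_i) + Z_i
        x -= y / i                             # step size a_i = 1/i
    finals.append(x)

mean_hat = statistics.fmean(finals)     # should be close to theta
var_hat = statistics.pvariance(finals)  # should be close to 1/n
```

Note that the starting point X_1 drops out after the first step for this linear g, so the final iterate is exactly θ minus the running average of the noise.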
[Block diagram: the encoder sends g(X_n) = X_n − θ, the channel adds Z_n to give Y_n(X_n) = g(X_n) + Z_n, and both ends update X_{n+1} = X_n − (1/n) Y_n(X_n)]
Schalkwijk-Kailath scheme
[Figure: the iterates X_1, X_2, X_3 converging to θ along g(x) = x − θ]
Schalkwijk-Kailath Coding
Another interpretation of the SK scheme, with an expected average transmitted power constraint
Schalkwijk-Kailath Coding
Y = X + Z , Z ∼ N(0, 1)
I Expected average transmitted power constraint:
Σ_{i=1}^{n} E(g_i²(m, Y^{i−1})) ≤ nP, m ∈ [1 : 2^{nR}]
I Divide the interval [−√P, √P] into 2^{nR} message intervals
I Represent each message m by the midpoint of its interval
[Figure: the interval [−√P, √P] divided into message intervals of width ∆ = 2√P · 2^{−nR}, with message point θ(m)]
Schalkwijk-Kailath Coding
I The transmitter first sends the message point itself:
X0 = θ(m)
I It is corrupted by additive Gaussian noise, so received with some bias
Y0 = θ(m) + Z0
I The goal of the transmitter is to refine the receiver's knowledge of the bias
I It computes the MMSE estimate of the bias given the output sequence observed thus far, and sends the error term
Schalkwijk-Kailath Coding
I For i = 1, the encoder learns Z_0 = Y_0 − X_0 and transmits
X_1 = γ_1 Z_0
where γ_1 = √P is chosen so that E(X_1²) = P
I It thus sends the Gaussian random variable Z_0 to the receiver, reducing the effect of the noise on the original transmission
I For i ∈ [2 : n], it transmits
X_i = γ_i (Z_0 − E(Z_0 | Y^{i−1}))
where γ_i is chosen to meet the power constraint
Schalkwijk-Kailath Decoding rule
I After the n transmissions to convey Z_0, the receiver combines its estimate of Z_0 with Y_0 to get an estimate of the message point
I The receiver uses a nearest-neighbor decoding rule to recover the message point:
Θ̂_n = Y_0 − E(Z_0 | Y^n) = θ(m) + Z_0 − E(Z_0 | Y^n)
Schalkwijk-Kailath Error Analysis
Theorem: The probability of decoding error decreases as a second-order exponent in block length for rates below capacity.
I The decoder makes an error if Θ̂_n is closer to a neighboring message point than to θ(m), i.e. if
|Θ̂_n − θ(m)| > ∆/2
p_e^{(n)} ≤ 2Q(2^{nC(P)} ∆/2), where Q(x) = ∫_x^∞ (1/√(2π)) e^{−t²/2} dt
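The bound can be evaluated numerically to see the doubly exponential decay. In this sketch (my own illustrative parameters, with ∆ = 2√P · 2^{−nR} as in the setup above):

```python
import math

def Q(x):
    """Gaussian tail probability Q(x) = 0.5 * erfc(x / sqrt(2))."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

P, R = 1.0, 0.3
C = 0.5 * math.log2(1 + P)   # C(P) = (1/2) log2(1 + P) = 0.5 here
bounds = []
for n in (10, 20, 40):
    delta = 2 * math.sqrt(P) * 2 ** (-n * R)   # message spacing
    # the Q argument grows like 2^{n(C - R)}, so the bound is a
    # double exponential in n
    bounds.append(2 * Q(2 ** (n * C) * delta / 2))
```

Even at these tiny block lengths the bound collapses: doubling n does far more than square the error probability.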
Schalkwijk-Kailath Error Analysis
Distribution of Θ̂_n: Gaussian with mean θ(m) and what variance?
Θ̂_n = Y_0 − E(Z_0 | Y^n) = θ(m) + Z_0 − E(Z_0 | Y^n)
I(Z_0; Y^n) = h(Z_0) − h(Z_0 | Y^n) = (1/2) log(1 / Var(Z_0 | Y^n))
⇒ Var(Z_0 | Y^n) = 2^{−2 I(Z_0; Y^n)}
Schalkwijk-Kailath Error Analysis
I(Z_0; Y^n) = Σ_{i=1}^{n} I(Z_0; Y_i | Y^{i−1})
            = Σ_{i=1}^{n} (h(Y_i | Y^{i−1}) − h(Y_i | Z_0, Y^{i−1}))
            = Σ_{i=1}^{n} (h(Y_i) − h(Z_i | Z_0, Y^{i−1}))
            = Σ_{i=1}^{n} (h(Y_i) − h(Z_i))
            = (n/2) log(1 + P)
            = n C(P)
Schalkwijk-Kailath Error Analysis
Theorem: The channel input X_i is independent of the previous outputs Y^{i−1}.
Proof.
I Z_0 ⊥ Z_1, and both are Gaussian
I Y_1 = γ_1 Z_0 + Z_1 ⇒ E(Z_0 | Y_1) is linear in Y_1
I X_2 = γ_2 (Z_0 − E(Z_0 | Y_1)) is Gaussian and ⊥ Y_1
I Z_2 is Gaussian and ⊥ Y_1
I Y_2 = X_2 + Z_2 is Gaussian and ⊥ Y_1
...
Error Exponent
Var(Z_0 | Y^n) = 2^{−2nC(P)}
Θ̂_n ∼ N(θ(m), 2^{−2nC(P)})
[Shannon 59] No feedback:
p_e^{(n)} = e^{−O(n)}
With feedback:
p_e^{(n)} = exp(−exp(O(n(C − R))))
Recursion rule for SK scheme
X_i = γ_i (Z_0 − E(Z_0 | Y^{i−1}))
    = γ_i (Z_0 − E(Z_0 | Y^{i−2}) + E(Z_0 | Y^{i−2}) − E(Z_0 | Y^{i−1}))
    = (γ_i / γ_{i−1}) (X_{i−1} − E(X_{i−1} | Y^{i−1}))
    = (γ_i / γ_{i−1}) (X_{i−1} − E(X_{i−1} | Y_{i−1}))
X_i ∝ X_{i−1} − E(X_{i−1} | Y_{i−1})
Schalkwijk-Kailath Coding
Important observation:
X_1 ∝ Z_0 ∼ N(0, 1)
X_i ∝ Z_0 − E(Z_0 | Y^{i−1}) ⊥ Y^{i−1}
Schalkwijk-Kailath observation
Theorem: The channel input X_i is independent of the previous outputs Y^{i−1}.
Proof.
I Z_0 ⊥ Z_1, and both are Gaussian
I Y_1 = γ_1 Z_0 + Z_1 ⇒ E(Z_0 | Y_1) is linear in Y_1
I X_2 = γ_2 (Z_0 − E(Z_0 | Y_1)) is Gaussian and ⊥ Y_1
I Z_2 is Gaussian and ⊥ Y_1
I Y_2 = X_2 + Z_2 is Gaussian and ⊥ Y_1
...
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Posterior Matching Scheme
I At each time, the receiver calculates the a posteriori density function of the message point: f_n(θ) = f_{θ|Y^n}(θ | y^n)
I The transmitter can track f_n(θ) as well.
I The goal is to select the transmission functions g_n for the fastest concentration of f_n(θ) around θ_0
What is the best selection of the transmission functions g_i?
g_i(θ, Y^{i−1}) = F_X^{−1} ∘ F_{Θ|Y^{i−1}}(θ | Y^{i−1})

I The encoder first extracts the information regarding θ_0 still missing at the receiver from the a posteriori distribution by
  I generating a random variable that is statistically independent of past observations,
  I which, when coupled with those observations, uniquely determines the intended message θ_0
I This information is then matched to the optimal input distribution of the channel, F_X, to achieve capacity
I The posterior is thus stretched into the desired input distribution
Posterior Matching Scheme
The inputs to the channel are the random variables
X_1 = F_X^{−1}(F_Θ(Θ))
X_i = F_X^{−1}(F_{Θ|Y^{i−1}}(Θ | Y^{i−1}))
Note that because F_{Θ|Y^{i−1}}(Θ | Y^{i−1}) is distributed uniformly on [0, 1] regardless of the sequence Y^{i−1}, it follows that
I X_i is independent of Y^{i−1} and, due to the memoryless nature of the channel, Y_i is independent of Y^{i−1}
I The marginal distribution of X_i is P_X, the capacity-achieving distribution. Consequently, {Y_i} are i.i.d.
Proposition
Let X be a continuous real-valued random variable. The random variable Z = F_X(X) is uniformly distributed on [0, 1].
Proof. Let Z = F_X(X). Then
F_Z(x) = p(Z ≤ x)
       = p(F_X(X) ≤ x)
       = p(X ≤ F_X^{−1}(x))
       = F_X(F_X^{−1}(x))
       = x
for any x ∈ [0, 1], which shows that Z is a uniform random variable on [0, 1].
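The proposition is easy to verify by simulation. This sketch (illustrative; the exponential distribution is just an arbitrary continuous choice, not from the talk) draws samples of Z = F_X(X) and compares their empirical CDF with the uniform CDF:

```python
import math
import random

rng = random.Random(3)
# X exponential with rate 1, so F_X(x) = 1 - exp(-x); the proposition
# says Z = F_X(X) should be uniform on [0, 1]
samples = [1.0 - math.exp(-rng.expovariate(1.0)) for _ in range(100_000)]

# compare the empirical CDF of Z with the uniform CDF at a few points
max_dev = max(abs(sum(z <= t for z in samples) / len(samples) - t)
              for t in (0.1, 0.25, 0.5, 0.75, 0.9))
```

The deviation shrinks at the usual 1/√n Monte Carlo rate, consistent with Z being uniform.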
Proposition
Suppose that Θ ∼ U[0, 1] and let X be a real-valued random variable. Then the random variable Y = F_X^{−1}(Θ) has the same distribution as X.
Proof.
F_Y(x) = p(Y ≤ x)
       = p(F_X^{−1}(Θ) ≤ x)
       = p(Θ ≤ F_X(x))
       = F_X(x)
F_X(x) ∈ [0, 1] for all x ⇒ X and Y have the same distribution
Posterior matching AWGN channel
I Let p_{Y|X} be an AWGN channel with noise variance N
I Set a Gaussian input distribution X ∼ N(0, P) (capacity achieving for an input power constraint P)
I Derive the posterior matching scheme in this case
I Let SNR = P/N
X_{i+1} = √(1 + SNR) (X_i − (SNR / (1 + SNR)) Y_i)
I The transmitter sends the error term pertaining to the MMSE estimate of X_i from Y_i
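The recursion preserves the input distribution, which can be checked with a deterministic second-moment calculation (my own sketch; the values of P and N are arbitrary): if Var(X_i) = P, then X_{i+1} again has variance P and is uncorrelated with Y_i.

```python
# second-moment check of the posterior matching recursion
# X_{i+1} = sqrt(1 + SNR) * (X_i - SNR/(1 + SNR) * Y_i),  Y_i = X_i + Z_i
P, N = 2.0, 0.5                  # arbitrary power and noise variance
snr = P / N
rho = snr / (1 + snr)            # MMSE coefficient, equals P / (P + N)
scale = (1 + snr) ** 0.5

var_x = P                        # Var(X_1) = P
for _ in range(20):
    cov_xy = var_x               # Cov(X_i, Y_i) = Var(X_i)
    var_y = var_x + N            # Var(Y_i) = Var(X_i) + N
    cov_next_y = scale * (cov_xy - rho * var_y)   # Cov(X_{i+1}, Y_i)
    var_x = scale ** 2 * (var_x - 2 * rho * cov_xy + rho ** 2 * var_y)
```

Since all quantities are jointly Gaussian, zero correlation here is the same as the independence of X_{i+1} and Y^i claimed by the general theory.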
Posterior Matching Scheme-BSC
I Set P_X = Bern(1/2) (capacity achieving)
I The PM scheme coincides with Horstein's median rule:
X_{n+1} = F_X^{−1}(F_{Θ_0|Y^n}(Θ_0 | Y^n)) = 1 if Θ_0 > median of f_{Θ_0|Y^n}(· | Y^n), and 0 otherwise
I F_X^{−1} quantizes above/below 1/2
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Multiple Access channel, No feedback
[Block diagram: inputs X_1 and X_2 enter the multiple access channel p(Y | X_1, X_2), producing output Y]
Capacity region:
R_1 < I(X_1; Y | X_2)
R_2 < I(X_2; Y | X_1)
R_1 + R_2 < I(X_1, X_2; Y)
for some p(x_1) p(x_2), with channel p(y | x_1, x_2)
Multiple access channel, No feedback
[Figure: capacity region of the erasure MAC without feedback: R_1 < 1, R_2 < 1, R_1 + R_2 < 1.5]
Does feedback help in MAC?
Yes! Gaarder-Wolf 1975
Erasure MAC with feedback
I R_sym = 2/3: N uncoded transmissions + N/2 one-sided retransmissions:
transmitter 1: 010010101011100
transmitter 2: 110100011011001
Output: 120110112022101
Probability of erasure = 1/2 ⇒ N/2 bits are erased.
I Transmitter 1 retransmits the erased bits over the next N/2 transmissions
N bits are sent over N + N/2 transmissions ⇒ R = 2/3 is achievable
Block feedback coding scheme: Erasure MAC with feedback
I R_sym = 3/4: N uncoded transmissions + N/4 two-sided retransmissions + N/16 + ...
I The two encoders can cooperate by each sending half of the N/2 erased bits over the following N/4 transmissions
R = N / (N + N/4 + N/16 + ...) = 3/4
[Figure: the symmetric rate pair (3/4, 3/4) on the boundary of the feedback region]
Erasure MAC with feedback (Gaarder-Wolf 1975)
I R_sym = 0.7602: N uncoded transmissions + N/(2 log2 3) cooperative retransmissions
I The encoders can cooperate and use three symbols: (0, 0), (1, 1), and (1, 0)
They resolve erasures at log2 3 bits/channel use:
R = N / (N + N/(2 log2 3)) = 0.7602
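The three symmetric rates above can be recovered from the corresponding geometric series (a small illustrative calculation, normalizing N = 1):

```python
import math

N = 1.0   # normalize the initial uncoded block length
# one-sided retransmission: N + N/2 channel uses to deliver N bits
r_one_sided = N / (N + N / 2)                    # 2/3
# two-sided cooperation: N + N/4 + N/16 + ... = N / (1 - 1/4) uses
r_two_sided = N / (N / (1 - 1 / 4))              # 3/4
# cooperative ternary retransmission resolves erasures at log2(3)
# bits per channel use, costing N / (2 log2 3) extra uses
r_ternary = N / (N + N / (2 * math.log2(3)))     # about 0.7602
```

Each step of extra cooperation strictly improves the symmetric rate: 2/3 < 3/4 < 0.7602.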
[Figure: the symmetric rate pair (0.76, 0.76)]
Can we do better? Cover-Leung inner bound
Rsym = 0.7911 (Cover-Leung 1981)
Theorem: (R_1, R_2) is achievable for the MAC with feedback if
R_1 < I(X_1; Y | X_2, U)
R_2 < I(X_2; Y | X_1, U)
R_1 + R_2 < I(X_1, X_2; Y)
for some p(u) p(x_1 | u) p(x_2 | u)
[Block diagram: Enc 1 sends X_1^n(j) and Enc 2 sends X_2^n(j) over P(Y | X_1, X_2); from Y^n(j − 1) the encoders learn M̃_{2,j−1}, and block j carries M_{1,j}, M_{2,j} together with M_{2,j−1}]
Cover-Leung Achievability proof
I Block Markov coding:
Messages are sent over b blocks of transmission.
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Two way channel
Shannon inner bound:
R1 < I (X1;Y |X2)
R2 < I (X2;Y |X1)
for some p(x1)p(x2)
Two way channel
Shannon outer bound:
R1 < I (X1;Y |X2)
R2 < I (X2;Y |X1)
for some p(x1, x2)
Directed information
1 Entropy
H(Y^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1})
2 Conditional entropy
H(Y^n | X^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1}, X^n)
3 Causally conditioned entropy
H(Y^n || X^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1}, X^i)
1 − 2 ⇒ I(Y^n; X^n), the mutual information
1 − 3 ⇒ I(X^n → Y^n), the directed information
Directed information
The directed information from a random vector A^N to another random vector B^N is
I(A^N → B^N) = Σ_{n=1}^{N} I(A^n; B_n | B^{n−1})
Mutual information:
I(A^N; B^N) = Σ_{n=1}^{N} I(A^N; B_n | B^{n−1})
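The definitions can be exercised on a toy two-step feedback example (my own construction, not from the talk): X_1 ~ Bern(1/2) passes through a BSC(p) to give Y_1, which is fed back and retransmitted noiselessly as X_2 = Y_1, Y_2 = X_2. With feedback, ordinary mutual information overcounts (it picks up the feedback dependence), while directed information does not:

```python
import math
from itertools import product

def H(pmf):
    """Entropy in bits of a pmf given as an iterable of probabilities."""
    return -sum(p * math.log2(p) for p in pmf if p > 0)

def marginal(joint, keep):
    """Marginalize a dict {outcome tuple: prob} onto the given indices."""
    out = {}
    for k, p in joint.items():
        kk = tuple(k[i] for i in keep)
        out[kk] = out.get(kk, 0.0) + p
    return out

def cond_mi(joint, a_idx, b_idx, c_idx):
    """I(A;B|C) = H(A,C) + H(B,C) - H(A,B,C) - H(C) from a joint pmf."""
    return (H(marginal(joint, a_idx + c_idx).values())
            + H(marginal(joint, b_idx + c_idx).values())
            - H(marginal(joint, a_idx + b_idx + c_idx).values())
            - H(marginal(joint, c_idx).values()))

p = 0.25
joint = {}
for x1, z1 in product((0, 1), repeat=2):
    y1 = x1 ^ z1                       # BSC(p) output
    x2, y2 = y1, y1                    # feedback: X2 = Y1, sent noiselessly
    key = (x1, x2, y1, y2)             # indices 0,1 = X^2; 2,3 = Y^2
    joint[key] = joint.get(key, 0.0) + 0.5 * (1 - p if z1 == 0 else p)

# I(X^2 -> Y^2) = I(X1; Y1) + I(X1,X2; Y2 | Y1)
directed = cond_mi(joint, (0,), (2,), ()) + cond_mi(joint, (0, 1), (3,), (2,))
# ordinary mutual information I(X^2; Y^2)
mutual = cond_mi(joint, (0, 1), (2, 3), ())
```

Here mutual = 1 bit, because X_2 = Y_1 makes X^2 fully informative about Y^2, while directed = 1 − H(p) bits, the information actually carried through the channel.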
TWC - Capacity region (Kramer 2003)
Theorem: Let R_N be the set of rate pairs (R_1, R_2) with
R_1 ≤ (1/N) I(X_1^N → Y^N || X_2^N)
R_2 ≤ (1/N) I(X_2^N → Y^N || X_1^N)
for some p(x_1^N || y^{N−1}) p(x_2^N || y^{N−1}). Then R = ∪_N R_N.
Outline
I Introduction
I Point to point communication
  I Horstein coding scheme
  I Block feedback coding scheme for BSC
  I Schalkwijk-Kailath coding scheme
  I Posterior matching scheme
I Multiuser channel
  I Multiple access channel
  I Two-way channel
I Source coding with feedforward
Source coding with side information
[Figure: source coding with side information (block length = 5): over time steps 1-10, the encoder maps each block of source symbols X_1, ..., X_10 to an index W, and the decoder combines W with the side information Y_1, ..., Y_10 to produce the reconstructions X̂_1, ..., X̂_5]
Source coding with feedforward
[Figure: source coding with feedforward (block length = 5): the encoder maps the source block X_1, ..., X_10 to an index W, and the decoder forms each X̂_n from W and delayed past source samples X_1, X_2, ...; shown with delay 6, and with delay 1 feedforward in the lower diagram]
Source coding with Feedforward
An (N, 2^{NR}) source code consists of an encoding function
f : X^N → {1, ..., 2^{NR}}
and decoding functions
g_n : {1, ..., 2^{NR}} × X^{n−1} → X̂, n = 1, ..., N
Directed information
1 Entropy
H(Y^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1})
2 Conditional entropy
H(Y^n | X^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1}, X^n)
3 Causally conditioned entropy
H(Y^n || X^n) = Σ_{i=1}^{n} H(Y_i | Y^{i−1}, X^i)
1 − 2 ⇒ I(Y^n; X^n), the mutual information
1 − 3 ⇒ I(X^n → Y^n), the directed information
Directed information
The directed information from a random vector A^N to another random vector B^N is
I(A^N → B^N) = Σ_{n=1}^{N} I(A^n; B_n | B^{n−1})
Mutual information:
I(A^N; B^N) = Σ_{n=1}^{N} I(A^N; B_n | B^{n−1})
The directed information from the reconstruction X̂^N to the source X^N is
I(X̂^N → X^N) = I(X^N; X̂^N) − Σ_{n=2}^{N} I(X^{n−1}; X̂_n | X̂^{n−1})
I A direct coding theorem for a general source with feedforward, assuming that the joint random process {X_n, X̂_n} is discrete, stationary, and ergodic
I For stationary and ergodic joint processes, the directed information rate exists and is defined by
I(X̂ → X) = lim_{N→∞} (1/N) I(X̂^N → X^N)
Theorem: For a discrete stationary and ergodic source X characterized by a distribution P_X, all rates R such that
R ≥ R*(D) = inf_{P_{X̂|X} : lim_{N→∞} E[d_N(X^N, X̂^N)] ≤ D} I(X̂ → X)
are achievable at expected distortion D.
Proof. The proof uses an AEP for directed quantities,
−(1/N) log P(X̂^N || X^N) → H(X̂ || X) w.p. 1,
and defines a new kind of typicality called "directed typicality".
Thank you
Questions?