MATH3411 Chapter 3
Chapter 3: Compression Coding
Lectures 10-11
Compression coding
Variable length codes
Assume that there is no channel noise: source coding.
Define
  a source S with symbols s1, . . . , sq and probabilities p1, . . . , pq,
  a code C with codewords c1, . . . , cq of lengths ℓ1, . . . , ℓq, and radix r.
A code C is
  uniquely decodable (UD) if it can always be decoded unambiguously,
  instantaneous if no codeword is the prefix of another.
Such a code is an I-code.
Example
Morse code is an I-code (due to the stop pause).
(Table of Morse codewords for A–Z and 0–9 omitted here; see Appendix 1 for the full list.)
Example
The standard comma code of length 5 is
s1 → c1 = 0
s2 → c2 = 10
s3 → c3 = 110
s4 → c4 = 1110
s5 → c5 = 11110
s6 → c6 = 11111
This code is an I-code.
Decode
1 1 0 0 1 1 1 1 0 1 1 0 1 0 1 1 1 1 1 1 1 0
as s3 s1 s5 s3 s2 s6 s3.
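Decoding an I-code needs no lookahead: we can read bits left to right and emit a symbol as soon as a codeword matches. A minimal sketch for the comma-code example above (the dictionary CODE and the function decode are illustrative names, not from the notes):

```python
# Greedy decoder for an instantaneous (prefix) code.
# The prefix property guarantees the first match is the intended symbol.
CODE = {"0": "s1", "10": "s2", "110": "s3",
        "1110": "s4", "11110": "s5", "11111": "s6"}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in CODE:          # a codeword is complete: emit it
            out.append(CODE[buf])
            buf = ""
    if buf:
        raise ValueError("trailing bits do not form a codeword")
    return out

print(decode("1100111101101011111110"))
# ['s3', 's1', 's5', 's3', 's2', 's6', 's3']
```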
Example
Consider the code C:
s1 → c1 = 0
s2 → c2 = 01
s3 → c3 = 11
s4 → c4 = 00
This code is not uniquely decodable since, for example,
0011
can be decoded as both s1s1s3 and s4s3.
Example
Consider the code C:
s1 → c1 = 0
s2 → c2 = 01
s3 → c3 = 011
s4 → c4 = 0111
s5 → c5 = 1111
This code is uniquely decodable but is not instantaneous.
Example
Consider the code C:
s1 → c1 = 00
s2 → c2 = 01
s3 → c3 = 10
s4 → c4 = 11
This code is a block code and is thus an I-code.
Example
Consider the code C:
s1 → c1 = 0
s2 → c2 = 100
s3 → c3 = 1011
s4 → c4 = 110
s5 → c5 = 111
This code is an I-code.
Decode
0 0 1 1 0 0 1 0 1 1 1 1 1
as s1 s1 s4 s1 s3 s5.
Decision trees can represent I-codes.
Example
The standard comma code of length 5 is
s1 → c1 = 0
s2 → c2 = 10
s3 → c3 = 110
s4 → c4 = 1110
s5 → c5 = 11110
s6 → c6 = 11111
(Decision tree: from the root, each 0-branch is a leaf emitting the next symbol s1, . . . , s5, and each 1-branch descends; the final 1-branch is the leaf s6.)
Example
Consider the block code C:
s1 → c1 = 00
s2 → c2 = 01
s3 → c3 = 10
s4 → c4 = 11
(Decision tree: the complete binary tree of depth 2, with leaves s1, s2, s3, s4 read from the top down.)
Example
Consider the code C:
s1 → c1 = 0
s2 → c2 = 100
s3 → c3 = 1011
s4 → c4 = 110
s5 → c5 = 111
(Decision tree: leaves s1 at 0, s2 at 100, s3 at 1011, s4 at 110, s5 at 111.)
Decision trees can represent I-codes.
  Branches are numbered from the top down.
  Any radix r is allowed.
  Two codes are equivalent if their decision trees are isomorphic.
  By shuffling source symbols, we may assume that ℓ1 ≤ ℓ2 ≤ · · · ≤ ℓq.
Example
s1 → c1 = 00
s2 → c2 = 01
s3 → c3 = 02
s4 → c4 = 1
s5 → c5 = 20
s6 → c6 = 21
Decode 10111202120 as s4 s2 s4 s4 s5 s6 s5.
(Decision tree diagram omitted; equivalent codes correspond to isomorphic trees obtained by permuting branches.)
The Kraft-McMillan Theorem
The following are equivalent:
1. There is a radix r UD-code with codeword lengths ℓ1 ≤ ℓ2 ≤ · · · ≤ ℓq.
2. There is a radix r I-code with codeword lengths ℓ1 ≤ ℓ2 ≤ · · · ≤ ℓq.
3. K = ∑_{i=1}^{q} (1/r)^{ℓi} ≤ 1.
Example
Is there a radix 2 UD-code with codeword lengths 1, 2, 2, 3?
No, by the Kraft-McMillan Theorem:

(1/2)¹ + (1/2)² + (1/2)² + (1/2)³ = 9/8 > 1
Example
Is there a radix 3 I-code with codeword lengths 1, 2, 2, 2, 2, 3?
Yes, by the Kraft-McMillan Theorem:

(1/3)¹ + (1/3)² + (1/3)² + (1/3)² + (1/3)² + (1/3)³ = 22/27 ≤ 1
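The Kraft-McMillan sum is easy to check mechanically. A small sketch with exact rational arithmetic (the helper name kraft_sum is mine):

```python
from fractions import Fraction

# Kraft-McMillan check: a radix-r I-code (equivalently, UD-code) with the
# given codeword lengths exists iff K = sum of r^(-l) is at most 1.
# Fractions keep the arithmetic exact.
def kraft_sum(r, lengths):
    return sum(Fraction(1, r**l) for l in lengths)

print(kraft_sum(2, [1, 2, 2, 3]))        # 9/8  -> no binary UD-code exists
print(kraft_sum(3, [1, 2, 2, 2, 2, 3]))  # 22/27 -> a radix-3 I-code exists
```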
For instance,
s1 → c1 = 0
s2 → c2 = 10
s3 → c3 = 11
s4 → c4 = 12
s5 → c5 = 20
s6 → c6 = 210
(Decision tree diagram omitted.)
This is a standard I-code.
Proof
2 ⇒ 1 is trivial. We now prove that 1 ⇒ 3.
Suppose that a radix r UD-code has codeword lengths ℓ1 ≤ ℓ2 ≤ · · · ≤ ℓq. Note that

K^n = ( ∑_{i=1}^{q} (1/r)^{ℓi} )^n = ∑_{j=1}^{∞} Nj / r^j

where Nj = |{(i1, . . . , in) ∈ {ℓ1, . . . , ℓq}^n : i1 + · · · + in = j}|.
For instance, if (ℓ1, . . . , ℓq) = (2, 3) and n = 3, then

K^n = ( ∑_{i=1}^{q} (1/r)^{ℓi} )^n = (1/r² + 1/r³)³
    = (1/r² + 1/r³)(1/r² + 1/r³)(1/r² + 1/r³)
    = 1/r^{2+2+2} + 1/r^{2+2+3} + 1/r^{2+3+2} + 1/r^{2+3+3}
      + 1/r^{3+2+2} + 1/r^{3+2+3} + 1/r^{3+3+2} + 1/r^{3+3+3}
    = 1/r⁶ + 3/r⁷ + 3/r⁸ + 1/r⁹

Here, N6 = N9 = 1 and N7 = N8 = 3, and Nj = 0 if j ≠ 6, 7, 8, 9.
Now, Nj counts the ways to write n-codeword messages of length j. Since the code is UD, each such message can only be written in one way, so Nj ≤ r^j. Therefore,

K^n = ∑_{j=1}^{nℓq} Nj / r^j ≤ ∑_{j=1}^{nℓq} r^j / r^j = ∑_{j=1}^{nℓq} 1 = nℓq

This inequality holds for all n = 1, 2, . . . . If K were greater than 1, the left-hand side would grow exponentially in n whereas the right-hand side grows only linearly. We conclude that K ≤ 1.
We have proved that 2 ⇒ 1 and that 1 ⇒ 3.
To conclude the proof, let us also prove that 3 ⇒ 2 (just for r = 2).
Therefore, suppose that K ≤ 1; we wish to construct an I-code.
Set c1 = 0···0 (ℓ1 zeros) and c2 = 0···01 (ℓ1 digits) followed by 0···0 (ℓ2 − ℓ1 zeros).
For i ≥ 3, set ci = ci1 ci2 · · · ciℓi, where the binary digits ci1, ci2, . . . , ciℓi satisfy

∑_{j=1}^{i−1} 1/2^{ℓj} = ci1/2 + ci2/2² + · · · + ciℓi/2^{ℓi} = ∑_{k=1}^{ℓi} cik/2^k

Such ci1, . . . , ciℓi exist since ∑_{j=1}^{q} 1/2^{ℓj} = K ≤ 1.
For instance,

ℓ1 = 2   c1 = 00
ℓ2 = 3   c2 = 010
ℓ3 = 3   c3 = 011
ℓ4 = 4   c4 = 1000

since ∑_{j=1}^{3} 1/2^{ℓj} = 1/2² + 1/2³ + 1/2³ = 1/2 = 1/2¹ + 0/2² + 0/2³ + 0/2⁴.
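The construction in the proof can be sketched in a few lines: each codeword ci is the ℓi-digit binary expansion of the running sum of 1/2^{ℓj} over the earlier codewords (binary case only; the helper name kraft_code is mine):

```python
from fractions import Fraction

# Build an I-code from lengths satisfying K <= 1, following the proof:
# codeword i is the l_i-digit binary expansion of sum_{j<i} 2^(-l_j).
def kraft_code(lengths):
    codewords, acc = [], Fraction(0)
    for l in sorted(lengths):
        digits, x = [], acc
        for _ in range(l):            # write acc = 0.c1 c2 ... cl in binary
            x *= 2
            digits.append("1" if x >= 1 else "0")
            if x >= 1:
                x -= 1
        codewords.append("".join(digits))
        acc += Fraction(1, 2**l)      # running Kraft sum
    return codewords

print(kraft_code([2, 3, 3, 4]))   # ['00', '010', '011', '1000']
```

This reproduces the worked instance above.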
These binary expansions are unique, so the code is UD.
Assume that the code is not instantaneous.
Then some cu is a prefix of some cv where u < v. Then ℓu < ℓv, so

∑_{j=1}^{v−1} 1/2^{ℓj} − ∑_{j=1}^{u−1} 1/2^{ℓj} = ∑_{j=u}^{v−1} 1/2^{ℓj} ≥ 1/2^{ℓu}

However, since cu is a prefix of cv we have cvk = cuk for k ≤ ℓu, so we also have

∑_{j=1}^{v−1} 1/2^{ℓj} − ∑_{j=1}^{u−1} 1/2^{ℓj} = ∑_{k=1}^{ℓv} cvk/2^k − ∑_{k=1}^{ℓu} cuk/2^k = ∑_{k=ℓu+1}^{ℓv} cvk/2^k
≤ ∑_{k=ℓu+1}^{ℓv} 1/2^k < ∑_{k=ℓu+1}^{∞} 1/2^k = 1/2^{ℓu}

This is a contradiction, so the proof is finished. ✷
Chapter 3: Compression Coding
Lecture 12
Define
  a source S with symbols s1, . . . , sq and probabilities p1, . . . , pq,
  a code C with codewords c1, . . . , cq of lengths ℓ1, . . . , ℓq, and radix r.
By shuffling source symbols, we may assume that p1 ≥ p2 ≥ · · · ≥ pq.
The (expected or) average length and variance of codewords in C are

L = ∑_{i=1}^{q} pi ℓi        V = ( ∑_{i=1}^{q} pi ℓi² ) − L²

A UD-code is minimal with respect to p1, . . . , pq if it has minimal average length L.
Example
A code C has the codewords 0, 10, 11 with probabilities 1/2, 1/4, 1/4.
Its average length and variance are

L = (1/2)×1 + (1/4)×2 + (1/4)×2 = 3/2
V = (1/2)×1² + (1/4)×2² + (1/4)×2² − L² = 5/2 − (3/2)² = 1/4

It is easy to see that C is minimal with respect to 1/2, 1/4, 1/4.
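These two formulas translate directly into code; a quick check (the helper name stats is mine):

```python
# Average length L = sum(p_i * l_i) and variance V = sum(p_i * l_i^2) - L^2.
def stats(probs, lengths):
    L = sum(p * l for p, l in zip(probs, lengths))
    V = sum(p * l * l for p, l in zip(probs, lengths)) - L * L
    return L, V

# The example code 0, 10, 11 with probabilities 1/2, 1/4, 1/4:
print(stats([0.5, 0.25, 0.25], [1, 2, 2]))   # (1.5, 0.25)
```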
Example
A code C′ has the codewords 10, 0, 11 with probabilities 1/2, 1/4, 1/4.
Its average length is

L = (1/2)×2 + (1/4)×1 + (1/4)×2 = 7/4 > 3/2

We see that C′ is not minimal with respect to 1/2, 1/4, 1/4.
Theorem
If a binary UD-code has minimal average length L with respect to p1, . . . , pq, then, possibly after permuting codewords of equally likely symbols,
1. ℓ1 ≤ ℓ2 ≤ · · · ≤ ℓq
2. The code may be assumed to be instantaneous.
3. K = ∑_{i=1}^{q} 2^{−ℓi} = 1
4. ℓq−1 = ℓq
5. cq−1 and cq differ only in their last place.
Proof
1. Suppose that pm > pn and ℓm > ℓn. Swapping cm and cn gives a new code with smaller L, a contradiction.
2. Use the Kraft-McMillan Theorem.
3. If K < 1, then the code can be shortened, reducing L, a contradiction.
4. We know that ℓq−1 ≤ ℓq. If ℓq−1 < ℓq, then there must be nodes in the decision tree where no choice is made, implying K < 1, a contradiction.
5. The tree must end with a simple fork: sq−1 on the 0-branch and sq on the 1-branch. Therefore, cq−1 and cq differ only in their last place. ✷
Huffman’s Algorithm (binary)
Input: a source S = {s1, . . . , sq} and probabilities p1, . . . , pq
Output: a code C for S, given by a decision tree
Combining phase
  Replace the last 2 symbols sq−1 and sq by a new symbol sq−1,q with probability pq−1 + pq.
  Reorder the symbols s1, . . . , sq−2, sq−1,q by their probabilities.
  Repeat until there is only one symbol left.
Splitting phase
  Root-label this symbol.
  Draw edges from symbol sa,b to symbols sa and sb.
  Label edge sa sa,b by 0 and label edge sb sa,b by 1.
The resulting code depends on the reordering of the symbols.
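The combining phase is naturally expressed with a min-heap. The sketch below (names are mine, not from the notes) breaks heap ties by insertion order, so the tree need not match any particular hand-drawn tree; but every Huffman code for the same probabilities has the same average length L.

```python
import heapq
from itertools import count

# Binary Huffman coding: repeatedly merge the two least likely subtrees,
# prefixing 0 to one side's codewords and 1 to the other's.
def huffman(probs):
    tick = count()                       # tie-breaker: dicts are never compared
    heap = [(p, next(tick), {i: ""}) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    while len(heap) > 1:
        p0, _, c0 = heapq.heappop(heap)  # two smallest probabilities
        p1, _, c1 = heapq.heappop(heap)
        merged = {s: "0" + w for s, w in c0.items()}
        merged.update({s: "1" + w for s, w in c1.items()})
        heapq.heappush(heap, (p0 + p1, next(tick), merged))
    return heap[0][2]                    # symbol index -> codeword

probs = [0.3, 0.2, 0.2, 0.1, 0.1, 0.1]
code = huffman(probs)
L = sum(p * len(code[i]) for i, p in enumerate(probs))
print(code, L)   # L = 2.5, matching the examples below
```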
Example
In the place-low strategy, we place sa,b as low as possible.
Consider a source s1, . . . , s6 with probabilities 0.3, 0.2, 0.2, 0.1, 0.1, 0.1.
(Combining and splitting diagram omitted.) The resulting codewords are

s1 → 00, s2 → 10, s3 → 11, s4 → 011, s5 → 0100, s6 → 0101

L = 0.3×2 + 0.2×2 + 0.2×2 + 0.1×3 + 0.1×4 + 0.1×4 = 2.5
V = 0.3×2² + 0.2×2² + 0.2×2² + 0.1×3² + 0.1×4² + 0.1×4² − L² = 0.65
Example
In the place-high strategy, we place sa,b as high as possible.
Consider a source s1, . . . , s6 with probabilities 0.3, 0.2, 0.2, 0.1, 0.1, 0.1.
(Combining and splitting diagram omitted.) The resulting codewords are

s1 → 01, s2 → 11, s3 → 000, s4 → 001, s5 → 100, s6 → 101

L = 0.3×2 + 0.2×2 + 0.2×3 + 0.1×3 + 0.1×3 + 0.1×3 = 2.5
V = 0.3×2² + 0.2×2² + 0.2×3² + 0.1×3² + 0.1×3² + 0.1×3² − L² = 0.25

The average length is the same as for the place-low strategy, but the variance is smaller. It turns out that this is always the case, so we will only use the place-high strategy.
The Huffman Code TheoremFor any given source S and corresponding probabilities,the Huffman algorithm yields an instantaneous minimum UD-code.
Chapter 3: Compression Coding
Lectures 13-14
The Huffman Code TheoremFor any given source S and corresponding probabilities,the Huffman algorithm yields an instantaneous minimum UD-code.
Proof
We proceed by induction on q = |S|. For q = 2, each Huffman code is an instantaneous UD-code with minimum average length L = 1.
Now assume that each Huffman code on q−1 symbols is an instantaneous UD-code with minimum average length.
Let C be a Huffman code on q symbols with average length L and let C* be any UD-code on q symbols with minimum average length L*.
Denote the codeword lengths of C and C* by ℓ1, . . . , ℓq and ℓ*1, . . . , ℓ*q.
By construction, cq and cq−1 in C differ only in their last coordinate.
By minimality, C* has codewords c*q, c*q−1 differing only in the last coordinate.
Combine cq and cq−1 in C to get a Huffman code on q−1 symbols and combine c*q and c*q−1 in C* to get a UD-code on q−1 symbols.
Denote the average lengths of these codes by M and M*, respectively.
By the induction hypothesis, M ≤ M*, so
L − L* = ∑_{i=1}^{q} ℓi pi − ∑_{i=1}^{q} ℓ*i pi
       = ( ∑_{i=1}^{q−2} ℓi pi + ℓq−1 pq−1 + ℓq pq ) − ( ∑_{i=1}^{q−2} ℓ*i pi + ℓ*q−1 pq−1 + ℓ*q pq )
       = ( ∑_{i=1}^{q−2} ℓi pi + ℓq (pq−1 + pq) ) − ( ∑_{i=1}^{q−2} ℓ*i pi + ℓ*q (pq−1 + pq) )
       = ( ∑_{i=1}^{q−2} ℓi pi + (ℓq − 1)(pq−1 + pq) ) − ( ∑_{i=1}^{q−2} ℓ*i pi + (ℓ*q − 1)(pq−1 + pq) )
       = M − M* ≤ 0

Thus L ≤ L*, so the Huffman code has minimum average length.
The code is created using a decision tree, so it is instantaneous.
The proof follows by induction. ✷
Theorem (Knuth)
The average codeword length L of each Huffman code is the sum of all child node probabilities.
For the place-high example above (probabilities 0.3, 0.2, 0.2, 0.1, 0.1, 0.1; codewords 01, 11, 000, 001, 100, 101):

L = 0.3×2 + 0.2×2 + 0.2×3 + 0.1×3 + 0.1×3 + 0.1×3 = 2.5
L = 1.0 + 0.6 + 0.4 + 0.3 + 0.2 = 2.5

Proof
The tree-path for symbol si passes through exactly ℓi child nodes. But pi occurs as part of the sum in each of these child nodes. So adding all child node probabilities adds ℓi copies of pi for each si; this is L. ✷
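Knuth's theorem gives a particularly short way to compute L in code: sum the merged node probabilities and never build codewords at all (a sketch; average_length is an illustrative name):

```python
import heapq

# Knuth's sum: merge the two smallest probabilities repeatedly; the sum of
# all merged (child-node) probabilities is the Huffman average length L.
def average_length(probs):
    heap = list(probs)
    heapq.heapify(heap)
    total = 0.0
    while len(heap) > 1:
        m = heapq.heappop(heap) + heapq.heappop(heap)
        total += m                     # probability of the new child node
        heapq.heappush(heap, m)
    return total

print(average_length([0.3, 0.2, 0.2, 0.1, 0.1, 0.1]))  # 2.5
```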
Huffman’s Algorithm also works for radix r: just combine r symbols at each step instead of 2.
However, there are at least two ways to do this:
1. Combine the last r symbols at each combining step.
2. First add dummy symbols; then combine the last r symbols at each step.
It turns out that 2 is the best strategy. If there are k combining steps, then we need

|S| = k(r − 1) + r = (k + 1)(r − 1) + 1

initial symbols. In other words, we must have |S| ≡ 1 (mod r − 1). We can ensure this by adding dummy symbols.
Huffman’s Algorithm (radix r)
Input: a source S = {s1, . . . , sq} and probabilities p1, . . . , pq
Output: a code C for S, given by a decision tree
Add dummy symbols until q = |S| ≡ 1 (mod r − 1).
Combining phase
  Replace the last r symbols sq−r+1, . . . , sq by a new symbol with probability pq−r+1 + · · · + pq.
  Reorder the symbols by their probabilities.
  Repeat until there is only one symbol left.
Splitting phase
  Root-label this symbol.
  Draw edges from each child node to the r preceding nodes.
  Label these edges from top to bottom by 0, . . . , r − 1.
Example
Consider a source s1, . . . , s6 with probabilities 0.3, 0.2, 0.2, 0.1, 0.1, 0.1.
With radix r = 3 we have 6 ≡ 0 (mod r − 1 = 2), so we need to add 1 dummy symbol of probability 0.
(Combining and splitting diagram omitted.) The resulting codewords are

s1 → 1, s2 → 00, s3 → 01, s4 → 02, s5 → 20, s6 → 21

L = 1.0 + 0.5 + 0.2 = 1.7
V = 0.3×1² + · · · + 0.1×2² − L² = 0.21
Extensions
Given a source S = {s1, . . . , sq} with associated probabilities p1, . . . , pq, the nth extension is the Cartesian product

S^n = S × · · · × S (n factors) = {s′1 · · · s′n : s′1, . . . , s′n ∈ S} = {σ1, . . . , σ_{q^n}}

together with the following probability for each new symbol σi ∈ S^n:

p_i^{(n)} = P(σi) = P(s′1 · · · s′n) = P(s′1) · · · P(s′n)

Note that we implicitly assume that successive source symbols are independent.
We usually order the symbols σi so that p_1^{(n)}, . . . , p_{q^n}^{(n)} are non-increasing.
Example
Consider the source S = {a, b} with associated probabilities 3/4, 1/4.
We apply the (binary) Huffman algorithm:

S¹ = S   p_i    c_i
a        3/4    0
b        1/4    1

S²   p_i^{(2)}  c_i
aa       9/16   0
ab       3/16   11
ba       3/16   100
bb       1/16   101

S³   p_i^{(3)}  c_i
aaa     27/64   1
aab      9/64   001
aba      9/64   010
baa      9/64   011
abb      3/64   00000
bab      3/64   00001
bba      3/64   00010
bbb      1/64   00011

L(1) = L = 1,  L(2) = 27/16,  L(3) = 158/64

Average length per S-symbol for S:  1
Average length per S-symbol for S²: L(2)/2 = 27/32 ≈ 0.84
Average length per S-symbol for S³: L(3)/3 = 158/192 ≈ 0.82
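These extension computations can be checked mechanically. The sketch below (helper names are mine) builds the nth extension probabilities, assuming independent symbols as above, and computes L(n) exactly via Knuth's sum of merged probabilities:

```python
import heapq
from fractions import Fraction
from itertools import product
from math import prod

# Huffman average length via Knuth's sum, with exact Fractions.
def huffman_length(probs):
    heap = list(probs)
    heapq.heapify(heap)
    total = Fraction(0)
    while len(heap) > 1:
        m = heapq.heappop(heap) + heapq.heappop(heap)
        total += m                    # child-node probability
        heapq.heappush(heap, m)
    return total

p = [Fraction(3, 4), Fraction(1, 4)]   # the source S = {a, b}
for n in (1, 2, 3):
    ext = [prod(t) for t in product(p, repeat=n)]   # nth extension probabilities
    L = huffman_length(ext)
    print(n, L, float(L / n))   # per-symbol lengths: 1, 27/32, 158/192
```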
Markov sources
A k-memory source S is one whose symbols each depend on the previous k.
If k = 0, then no symbol depends on any other, and S is memoryless.
If k = 1, then S is a Markov source.
pij = P (si|sj) is the probability of si occurring right after a given sj .
The matrix M = (pij) is the transition matrix.
Entry pij is the probability of getting from state sj to state si.
A Markov process is a set of states (the source S) and probabilities pij = P(si|sj) of getting from state sj to state si.
Example
Consider Sydney, Melbourne, and Elsewhere in Australia. A simple Markov model for the populations of these is that, each year,
  population growth by births, deaths, and emigration/immigration is 0%;
  of people living in Sydney, 5% move to Melbourne and 3% move Elsewhere;
  of people living in Melbourne, 4% move to Sydney and 2% move Elsewhere;
  of people living Elsewhere, 7% move to Sydney and 6% move to Melbourne.

S = {Sydney, Melbourne, Elsewhere}

(State diagram omitted.) The transition matrix, with columns indexed by From and rows by To, is

        S     M     E
  S   0.92  0.04  0.07
  M   0.05  0.94  0.06
  E   0.03  0.02  0.87

Lemma
The sum of entries in any column of M is 1.
Let xk = (sk, mk, ek)ᵀ denote the population distribution after k years. Then

xk+1 = M xk   and   xk = M^k x0

Suppose that the initial population distribution is x0 = (4.5M, 4M, 14M)ᵀ.
After k = 20 years, the population distribution is then

x20 = M^20 x0 = [ 0.41 0.34 0.38 ] [ 4.5M ]   [ 9.5M  ]
                [ 0.42 0.52 0.44 ] [ 4M   ] = [ 10.5M ]
                [ 0.16 0.15 0.19 ] [ 14M  ]   [ 4M    ]

Note that S and M^20 also form a Markov chain. E.g., after 20 years, most people will have left Sydney (41% remain), whereas most people will have stayed in Melbourne (52%).
To find a stable population distribution, we need to find a state x0 for which xk = xk−1 = · · · = x1 = x0; that is, Mx0 = x0.
In other words, we need an eigenvector x0 of M for the eigenvalue 1; e.g., x0 = (0.6, 0.76, 0.26)ᵀ. If we want an eigenvector with actual population numbers, then we must scale x0 by (4.5M + 4M + 14M)/(0.6 + 0.76 + 0.26):

x0 = (8.3M, 10.6M, 3.6M)ᵀ

A Markov process M is in equilibrium p if p = Mp. In this case, p is an eigenvector of M for the eigenvalue 1.
We will assume that
  M is ergodic: we can get from any state j to any state i;
  M is aperiodic: the gcd of cycle lengths is 1.
Theorem
Under the above assumptions, M has a non-zero equilibrium state.
We will only consider equilibria p with |p| = 1.
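The equilibrium can also be found numerically, without an eigenvalue solver. A power-iteration sketch in plain Python (the function step and the iteration count are illustrative choices, not from the notes): for an ergodic, aperiodic M, repeatedly applying M drives any starting distribution to the eigenvector for eigenvalue 1.

```python
# Transition matrix from the Sydney/Melbourne/Elsewhere example
# (columns = From, rows = To; each column sums to 1).
M = [[0.92, 0.04, 0.07],
     [0.05, 0.94, 0.06],
     [0.03, 0.02, 0.87]]

def step(M, x):
    # one year: x -> Mx
    return [sum(M[i][j] * x[j] for j in range(len(x))) for i in range(len(M))]

x = [4.5, 4.0, 14.0]          # initial populations in millions
for _ in range(1000):
    x = step(M, x)

# x is now (numerically) a fixed point of M, approximately (8.39, 10.55, 3.56);
# the notes' (8.3M, 10.6M, 3.6M) comes from scaling the rounded eigenvector.
print([round(v, 2) for v in x])
```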
Chapter 3: Compression Coding
Lecture 15
Huffman coding for stationary Markov sources
Consider a Markov source S = {s1, . . . , sq} with probabilities p1, . . . , pq, transition matrix M and equilibrium p.
Define
  HuffE: the binary Huffman code on p (ordered);
  Huff(i): the binary Huffman code on the (ordered) ith column of M;
  HuffM: the first symbol of a message is encoded by HuffE; each later symbol is encoded by Huff(i), where si is the symbol immediately before it.
This gives average lengths

LE,  L(1), . . . , L(q),  LM ≈ p1 L(1) + · · · + pq L(q)

Importantly, LM ≤ LE.
Example
The transition matrix

M = [ 0.3  0.1  0.10 ]
    [ 0.5  0.1  0.55 ]
    [ 0.2  0.8  0.35 ]

has equilibrium p = (1/8)(1, 3, 4)ᵀ.

     pi    HuffE        pi    Huff(1)      pi    Huff(2)      pi     Huff(3)
s1   1/8   01      s1   0.3   00      s1   0.1   10      s1   0.10   11
s2   3/8   00      s2   0.5   1       s2   0.1   11      s2   0.55   0
s3   4/8   1       s3   0.2   01      s3   0.8   0       s3   0.35   10

LE = 1.5   L(1) = 1.5   L(2) = 1.2   L(3) = 1.45

LM = (1/8) L(1) + (3/8) L(2) + (4/8) L(3) ≈ 1.36 < LE

Therefore, compared to a 2-bit block code C, this Huffman code compresses the message length to LM/LC = 1.36/2 = 68%.
Let us now encode the message s1s2s3s3s2s1s2:

  symbol   code to use   encoded symbol
  s1       HuffE         01
  s2       Huff(1)       1
  s3       Huff(2)       0
  s3       Huff(3)       10
  s2       Huff(3)       0
  s1       Huff(2)       10
  s2       Huff(1)       1

The message is encoded as 0110100101.

Let us now decode the message 0110100101:

  code to use   encoded symbol   decoded symbol
  HuffE         01               s1
  Huff(1)       1                s2
  Huff(2)       0                s3
  Huff(3)       10               s3
  Huff(3)       0                s2
  Huff(2)       10               s1
  Huff(1)       1                s2

The message is decoded as s1s2s3s3s2s1s2.
Compression Coding
  Huffman coding
  Huffman coding of extensions
  Huffman coding of Markov sources
  Arithmetic coding
  Dictionary methods
  Lossy compression
  …and much more
Arithmetic coding
Input: a source S = {s1, . . . , sq} where sq = • is a stop-symbol; probabilities p1, . . . , pq; a message si1 · · · sin where sin = sq = •
Output: the message encoded, given by a number between 0 and 1
Algorithm:
  Partition the interval [0, 1) into sub-intervals of lengths p1, . . . , pq.
  Crop to the i1th sub-interval.
  Partition this sub-interval according to relative lengths p1, . . . , pq.
  Crop to the i2th sub-sub-interval.
  Repeat in this way until the whole message has been encoded.
Example
Consider symbols s1, s2, s3, s4 = • with probabilities 0.4, 0.3, 0.15, 0.15, so the cut-points of [0, 1) are 0, .4, .7, .85, 1. Let us encode the message s2s1s3•:

  symbol   subinterval start            subinterval width
           0                            1
  s2       0 + .4 = .4                  .3 × 1 = .3
  s1       .4 + 0 × .3 = .4             .4 × .3 = .12
  s3       .4 + .7 × .12 = .484         .15 × .12 = .018
  •        .484 + .85 × .018 = .4993    .15 × .018 = .0027

We must therefore choose a number in the interval

[0.4993, 0.4993 + 0.0027) = [0.4993, 0.5020)

For instance, we may simply choose the number 0.5.
Example
Consider symbols s1, s2, s3, s4 = • with probabilities 0.4, 0.3, 0.15, 0.15, so the cut-points 0, .4, .7, .85, 1 give the intervals for s1, s2, s3, •. Let us decode the number 0.5:

  code number                       in interval   decoded symbol
  0.5                               [0.4, 0.7)    s2
  (0.5 − 0.4)/.3 = 0.33333          [0, 0.4)      s1
  (0.33333 − 0)/.4 = 0.83333        [0.7, 0.85)   s3
  (0.83333 − 0.7)/.15 = 0.88889     [0.85, 1)     •

The decoded message is then s2s1s3•.
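The cropping and rescaling steps translate into a short sketch. PROBS and STARTS hard-code this example's intervals, and the function names are illustrative; real implementations use integer arithmetic to avoid precision loss, but floats suffice at this message length.

```python
# Arithmetic coding for the four-symbol example with stop symbol ".".
PROBS  = {"s1": 0.4, "s2": 0.3, "s3": 0.15, ".": 0.15}
STARTS = {"s1": 0.0, "s2": 0.4, "s3": 0.7, ".": 0.85}

def encode(msg):
    lo, width = 0.0, 1.0
    for s in msg:                  # crop to the symbol's sub-interval
        lo += STARTS[s] * width
        width *= PROBS[s]
    return lo, lo + width          # any number in [lo, lo + width) will do

def decode(x):
    out = []
    while True:
        # the symbol whose sub-interval contains x
        s = max((t for t in STARTS if STARTS[t] <= x), key=STARTS.get)
        out.append(s)
        if s == ".":               # stop symbol ends the message
            return out
        x = (x - STARTS[s]) / PROBS[s]   # rescale back to [0, 1)

lo, hi = encode(["s2", "s1", "s3", "."])
print(round(lo, 4), round(hi, 4))   # the interval [0.4993, 0.5020)
print(decode(0.5))                  # ['s2', 's1', 's3', '.']
```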
Dictionary methods
LZ77, LZ78, LZW, and others; used, for instance, in gzip, GIF, and PostScript.
LZ78
Input: a message r = r1 · · · rn
Output: the message encoded, given by a dictionary
Algorithm:
  Begin with a dictionary D = {∅}.
  Find the longest prefix s of r in D (possibly ∅), in entry ℓ.
  Find the symbol c just after s.
  Append sc to D, remove it from r, and output (ℓ, c).
  Repeat in this way until the whole message has been encoded.
Loosely speaking, LZ78 encodes by finding new codewords, adding them to a dictionary, and recognising them subsequently.
Example
Let us encode the message abbcbcababcaa:

  r               s     ℓ   new dictionary entry   output
  abbcbcababcaa   ∅     0   1. a                   (0,a)
  bbcbcababcaa    ∅     0   2. b                   (0,b)
  bcbcababcaa     b     2   3. bc                  (2,c)
  bcababcaa       bc    3   4. bca                 (3,a)
  babcaa          b     2   5. ba                  (2,a)
  bcaa            bca   4   6. bcaa                (4,a)

The message is encoded as (0,a)(0,b)(2,c)(3,a)(2,a)(4,a)
Example
Let us decode the message (0,c)(0,a)(2,a)(3,b)(4,c)(4,b):

  output   new dictionary entry
  (0,c)    1. c
  (0,a)    2. a
  (2,a)    3. aa
  (3,b)    4. aab
  (4,c)    5. aabc
  (4,b)    6. aabb

The message is decoded as caaaaabaabcaabb