Arithmetic coding
Irina Bocharova, University of Aerospace Instrumentation,
St.-Petersburg, Russia
Lund, Sweden, February 2005
Outline
- Shannon-Fano-Elias coding
- Gilbert-Moore coding
- Arithmetic coding as a generalization of SFE and GM coding
- Implementation of arithmetic coding
Shannon-Fano-Elias coding
Let x ∈ X = {1, . . . , M}, p(x) > 0, and p(1) ≥ p(2) ≥ · · · ≥ p(M). The cumulative sum associated with the symbol x is

Q(x) = Σ_{a<x} p(a),

that is, Q(1) = 0, Q(2) = p(1), . . . , Q(M) = Σ_{i=1}^{M−1} p(i).

Then ⌊Q(m)⌋_{l_m}, the binary expansion of Q(m) truncated to l_m bits, is the codeword for m, where l_m = ⌈−log2 p(m)⌉.
Example
x   p(x)   Q     Q in binary   l(x)   codeword
1   0.6    0.0   0.0           1      0
2   0.3    0.6   0.1001...     2      10
3   0.1    0.9   0.1110...     4      1110

L = 1.6 bits, H(X) = 1.3 bits
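As an illustration (not part of the original slides), a short Python sketch that reproduces this table; sfe_code is a hypothetical helper name, and the probabilities are assumed to be already sorted in descending order:

from math import ceil, log2

def sfe_code(probs):
    """Return (Q, l, codeword) per symbol; codeword = binary Q truncated to l bits."""
    codes = []
    Q = 0.0
    for p in probs:
        l = ceil(-log2(p))                    # l_m = ceil(-log2 p(m))
        bits = int(Q * 2**l)                  # truncate the binary expansion of Q(m)
        codes.append((Q, l, format(bits, f'0{l}b')))
        Q += p                                # Q(m+1) = Q(m) + p(m)
    return codes

print(sfe_code([0.6, 0.3, 0.1]))   # -> codewords '0', '10', '1110'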
If l_m binary symbols have already been transmitted, then the length of the interval of uncertainty is 2^{−l_m}. Thus we can decode uniquely if

2^{−l_m} ≤ p(m),

or

l_m ≥ −log2 p(m).

In choosing the length l_m we used only the segment to the right of the point Q(m). This segment is never longer than the corresponding left segment, since the symbol probabilities are ordered in descending order.

H(X) ≤ L < H(X) + 1.
Gilbert-Moore coding
Let x ∈ X = {1, . . . , M}, p(x) > 0 (the probabilities need not be sorted). The cumulative sum associated with the symbol x is

Q(x) = Σ_{a<x} p(a),

that is, Q(1) = 0, Q(2) = p(1), . . . , Q(M) = Σ_{i=1}^{M−1} p(i).

Introduce σ(x) = Q(x) + p(x)/2.

Then σ̂(m) = ⌊σ(m)⌋_{l_m} is the codeword for m, where l_m = ⌈−log2(p(m)/2)⌉.
We put the point σ(m) = Q(m) + p(m)/2 at the center of the symbol's segment and choose the codeword length in such a manner that, after l_m binary symbols have been transmitted, the length of the interval of uncertainty is less than or equal to p(m)/2.
Example
x   p(x)   Q (binary)    σ (binary)    l   GM      ShFE
1   0.1    0.0           0.00001...    5   00001   0000
2   0.6    0.0001...     0.01100...    2   01      0
3   0.3    0.10110...    0.11011...    3   110     10

L = 2.6 bits, H(X) = 1.3 bits
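A similar sketch (again not from the slides; gm_code is a hypothetical name) reproduces the GM column of the table:

from math import ceil, log2

def gm_code(probs):
    """Gilbert-Moore codewords: truncate sigma = Q + p/2 to ceil(-log2(p/2)) bits."""
    codes, Q = [], 0.0
    for p in probs:
        sigma = Q + p / 2
        l = ceil(-log2(p / 2))               # one bit more than the Shannon length
        bits = int(sigma * 2**l)             # truncated binary expansion of sigma
        codes.append(format(bits, f'0{l}b'))
        Q += p
    return codes

print(gm_code([0.1, 0.6, 0.3]))   # -> ['00001', '01', '110'] as in the table above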
Let i < j; then σ(j) > σ(i) and

σ(j) − σ(i) = Σ_{l=1}^{j−1} p(l) − Σ_{l=1}^{i−1} p(l) + p(j)/2 − p(i)/2
            = Σ_{l=i}^{j−1} p(l) + (p(j) − p(i))/2
            ≥ p(i) + (p(j) − p(i))/2
            = (p(i) + p(j))/2
            ≥ max{p(i), p(j)}/2.

Since l_m = ⌈−log2(p(m)/2)⌉ ≥ −log2(p(m)/2), i.e. 2^{−l_m} ≤ p(m)/2, we obtain

σ(j) − σ(i) ≥ max{p(i), p(j)}/2 ≥ 2^{−min{l_i, l_j}},

so the first min{l_i, l_j} bits of σ(i) and σ(j) cannot coincide and the code is prefix-free. The average length satisfies

H(X) + 1 ≤ L < H(X) + 2.
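The inequality can be checked numerically for the example distribution above (a sketch, not part of the slides):

from math import ceil, log2
from itertools import combinations

p = [0.1, 0.6, 0.3]                          # the example distribution
Q = [0.0, 0.1, 0.7]
sigma = [q + pi / 2 for q, pi in zip(Q, p)]
l = [ceil(-log2(pi / 2)) for pi in p]        # l = [5, 2, 3]

# sigma(j) - sigma(i) >= 2^{-min(l_i, l_j)} for every pair i < j,
# so truncating to l bits keeps all codewords distinct (prefix-free).
for i, j in combinations(range(len(p)), 2):
    assert sigma[j] - sigma[i] >= 2 ** -min(l[i], l[j])
print("prefix condition holds")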
When is symbol-by-symbol coding not efficient?
1. Memoryless source

For symbol-by-symbol coding

R = H(X) + α,

where α is the coding redundancy. For block coding

R = (H(X^n) + α)/n = (nH(X) + α)/n = H(X) + α/n,

where H(X^n) denotes the entropy of n random variables. If H(X) << 1, then R ≥ 1 for symbol-by-symbol coding. For a binary memoryless source with p(0) = 0.99, p(1) = 0.01 we have H(X) = 0.081 bits; we can easily construct a Huffman code with R = 1 bit, but it is impossible to obtain R < 1 bit.

2. Source with memory

H(X^n) ≤ nH(X) and

R = (H(X^n) + α)/n ≤ H(X) + α/n.

R → H∞(X) when n → ∞, where H∞(X) denotes the entropy rate.
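The 0.081-bit value is easy to verify with a small snippet (a sketch; entropy is a hypothetical helper):

from math import log2

def entropy(probs):
    """Entropy of a discrete distribution, in bits."""
    return -sum(p * log2(p) for p in probs if p > 0)

print(round(entropy([0.99, 0.01]), 3))   # -> 0.081 bits, yet any prefix code needs R >= 1 bit/symbol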
How to implement block coding?
Let x ∈ X = {1, . . . , M}, and suppose we encode sequences x = (x1, . . . , xn) which appear at the output of X during n consecutive time moments.

We can consider a new source X^n whose symbols correspond to the sequences x = (x1, . . . , xn) of length n and apply any method of symbol-by-symbol coding to these symbols. We obtain

R = H(X^n)/n + α/n,

where α depends on the chosen coding procedure.

The problem is coding complexity: the alphabet of the new source has size M^n. For example, if M = 2^8 = 256, then for n = 2 we get M^2 = 65536, and for n = 3, M^3 = 16777216.

Arithmetic coding provides redundancy 2/n with complexity of order n^2.
Arithmetic coding
Arithmetic coding is a direct extension of the Gilbert-Moore coding scheme.

Let x = (x1, x2, . . . , xn) be an M-ary sequence of length n. We construct the modified cumulative distribution function

σ(x) = Σ_{a ≺ x} p(a) + p(x)/2 = Q(x) + p(x)/2,

where a ≺ x means that a is lexicographically less than x, and the codeword length is l(x) = ⌈−log2(p(x)/2)⌉.

The code rate R is equal to

(1/n) Σ_x p(x) l(x) = (1/n) Σ_x p(x) (⌈log2(1/p(x))⌉ + 1) < (H(X^n) + 2)/n.

If the source generates symbols independently we obtain

R < H(X) + 2/n.

For a source with memory,

R → H∞(X) when n → ∞.
Consider

Q(x_[1,n]) = Σ_{a ≺ x} p(a)
           = Σ_{a: a_[1,n−1] ≺ x_[1,n−1]} p(a) + Σ_{a: a_[1,n−1] = x_[1,n−1], a_n ≺ x_n} p(a),

where x_[1,i] = (x1, x2, . . . , xi). It is easy to see that

Q(x_[1,n]) = Q(x_[1,n−1]) + Σ_{a: a_[1,n−1] = x_[1,n−1], a_n ≺ x_n} p(a)
           = Q(x_[1,n−1]) + p(x_[1,n−1]) Σ_{a_n ≺ x_n} p(a_n | x_[1,n−1]).

If the source generates symbols independently, then

p(x_[1,n−1]) = Π_{i=1}^{n−1} p(x_i),    Σ_{a_n ≺ x_n} p(a_n | x_[1,n−1]) = Σ_{a_n ≺ x_n} p(a_n) = Q(x_n),

where Q(x_i) denotes the cumulative probability for x_i.
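A small sketch (not from the slides) checking this recursion against the direct lexicographic sum for a memoryless source:

from itertools import product
from math import prod

p = {'a': 0.1, 'b': 0.6, 'c': 0.3}           # alphabet in lexicographic order a < b < c
Q = {'a': 0.0, 'b': 0.1, 'c': 0.7}           # per-symbol cumulative probabilities

x = ('b', 'c', 'b')                          # an example sequence

# Direct definition: sum p(a) over all sequences a of the same length with a < x.
direct = sum(prod(p[s] for s in a) for a in product('abc', repeat=len(x)) if a < x)

# Recursion: Q(x_[1,i]) = Q(x_[1,i-1]) + p(x_[1,i-1]) * Q(x_i)
recursive, G = 0.0, 1.0
for s in x:
    recursive += G * Q[s]
    G *= p[s]

print(direct, recursive)                     # both equal 0.538 (up to float rounding)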
We obtain the following recurrent equations:

Q(x_[1,n]) = Q(x_[1,n−1]) + p(x_[1,n−1]) Q(x_n),
p(x_[1,n]) = p(x_[1,n−1]) p(x_n).
Coding procedure
Input: x = (x1, . . . , xn)

Initialization:
  F = 0; G = 1; Q(1) = 0;
  for j = 2 : M
    Q(j) = Q(j − 1) + p(j − 1);
  end;

for i = 1 : n
  F ← F + Q(x_i) × G;
  G ← G × p(x_i);
end;

F ← F + G/2;  l = ⌈−log2(G/2)⌉;  F̂ ← ⌊F · 2^l⌋ (the first l bits of F).
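A runnable floating-point sketch of this procedure (not part of the slides; arithmetic_encode is a hypothetical name, and the floating-point version only works for short sequences, which is exactly the accuracy problem discussed later):

from math import ceil, floor, log2

def arithmetic_encode(x, p, Q):
    """Encode sequence x given per-symbol probabilities p and cumulative sums Q."""
    F, G = 0.0, 1.0
    for s in x:
        F += Q[s] * G          # F accumulates Q(x_[1,i])
        G *= p[s]              # G accumulates p(x_[1,i])
    F += G / 2                 # midpoint sigma(x) = Q(x) + p(x)/2
    l = ceil(-log2(G / 2))     # codeword length
    return format(floor(F * 2**l), f'0{l}b')

p = {'a': 0.1, 'b': 0.6, 'c': 0.3}
Q = {'a': 0.0, 'b': 0.1, 'c': 0.7}
print(arithmetic_encode('bcbab', p, Q))   # -> '100010101' (9 bits), as in the example below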
Example
X = {a, b, c}, p(a) = 0.1, p(b) = 0.6, p(c) = 0.3
x = (b, c, b, a, b), n = 5

i   x_i   p(x_i)   Q(x_i)   F        G
0   -     -        -        0.0000   1.0000
1   b     0.6      0.1      0.1000   0.6000
2   c     0.3      0.7      0.5200   0.1800
3   b     0.6      0.1      0.5380   0.1080
4   a     0.1      0.0      0.5380   0.0108
5   b     0.6      0.1      0.5391   0.0065
Codeword length: −⌊log2 G⌋ + 1 = 9.
F + G/2 = 0.5423...; the codeword is F̂ = ⌊F + G/2⌋_l = 100010101.
H(X) = 1.3 bits, R = 9/5 = 1.8 bits/symbol.
At each step of the coding algorithm we perform 1 addition and 2 multiplications. Let p(1), . . . , p(M) be numbers with binary representations of length d. Then after the first step F and G are numbers with binary representations of length 2d; the following steps require binary representations of length 3d, . . . , nd.

The complexity of the coding procedure can therefore be estimated as

d + 2d + · · · + nd = n(n + 1)d/2.

PROBLEMS

1. The algorithm requires high computational accuracy (theoretically infinite precision).
2. The computational delay equals the length of the sequence to be encoded.
Decoding of Gilbert-Moore code
Q(m), m = 1, . . . , M are known. Input: σ̂.

Set m = 1
while Q(m + 1) < σ̂
  m ← m + 1
end;
Output: x(m)

Example. σ̂ = 0.01 (binary) = 0.25 (decimal):
Q(2) = 0.1 < 0.25, so m = 2;
Q(3) = 0.7 > 0.25, so stop with m = 2.
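A sketch of this loop in Python (gm_decode is a hypothetical name):

def gm_decode(sigma_hat, Q):
    """Return the symbol index m such that Q(m) <= sigma_hat < Q(m+1)."""
    m = 1
    while m < len(Q) and Q[m] < sigma_hat:   # Q is 0-based here, so Q[m] plays the role of Q(m+1)
        m += 1
    return m

Q = [0.0, 0.1, 0.7]                          # Q(1), Q(2), Q(3) for p = (0.1, 0.6, 0.3)
print(gm_decode(0.25, Q))                    # codeword '01' -> sigma_hat = 0.01_2 = 0.25 -> symbol 2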
Decoding procedure:
F̂ ← F̂ / 2^l;  S = 0;  G = 1;

for i = 1 : n
  j = 1;
  while j < M and S + Q(j + 1) × G < F̂
    j ← j + 1;
  end;
  S ← S + Q(j) × G;
  G ← G × p(j);
  x_i = j;
end;

After the ith step, G = p(x_[1,i]) and S = Q(x_[1,i]).
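The corresponding floating-point decoder sketch (again hypothetical names, not part of the slides; it inverts the encoder sketch given earlier):

def arithmetic_decode(codeword, n, symbols, p, Q):
    """Decode n symbols from the binary codeword produced by the encoder above."""
    F_hat = int(codeword, 2) / 2**len(codeword)    # F_hat <- F_hat / 2^l
    S, G = 0.0, 1.0
    out = []
    for _ in range(n):
        j = 0
        while j + 1 < len(symbols) and S + Q[symbols[j + 1]] * G < F_hat:
            j += 1
        s = symbols[j]
        S += Q[s] * G          # S = Q(x_[1,i])
        G *= p[s]              # G = p(x_[1,i])
        out.append(s)
    return ''.join(out)

p = {'a': 0.1, 'b': 0.6, 'c': 0.3}
Q = {'a': 0.0, 'b': 0.1, 'c': 0.7}
print(arithmetic_decode('100010101', 5, 'abc', p, Q))   # -> 'bcbab'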
Example
X = {a, b, c}, p(a) = 0.1, p(b) = 0.6, p(c) = 0.3
Codeword 100010101, so F̂ = 0.100010101 (binary) = 0.541...

S        G        Hyp. j   Q(j)   S + Q(j)·G      x_i   p(x_i)
0.0000   1.0000   a        0.0    0.0000 < F̂
                  b        0.1    0.1000 < F̂      b     0.6
                  c        0.7    0.7000 > F̂
0.1000   0.6000   a        0.0    0.1000 < F̂
                  b        0.1    0.1600 < F̂      c     0.3
                  c        0.7    0.5200 < F̂
0.5200   0.1800   a        0.0    0.5200 < F̂
                  b        0.1    0.5380 < F̂      b     0.6
                  c        0.7    0.6460 > F̂
0.5380   0.1080   a        0.0    0.5380 < F̂
                  b        0.1    0.5488 > F̂      a     0.1
0.5380   0.0108   a        0.0    0.5380 < F̂
                  b        0.1    0.5391 < F̂      b     0.6
                  c        0.7    0.5456 > F̂

Decoded sequence: x = (b, c, b, a, b).
Implementation of arithmetic coding
1. High < 0.5:

Bit = 0;
Normalization: Low = Low × 2; High = High × 2.

Low = 0; High = 0.00011000001
Bit = 0; High = 0.0011000001
Bit = 0; High = 0.011000001
Bit = 0; High = 0.11000001

2. Low ≥ 0.5:

Bit = 1;
Normalization: Low = (Low − 0.5) × 2; High = (High − 0.5) × 2.

Low = 0.11000011
Bit = 1; Low = 0.1000011
Bit = 1; Low = 0.000011
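A small Python sketch of these two renormalization rules (not from the slides; a complete finite-precision coder also needs the "straddle" case 0.25 ≤ Low, High < 0.75, which is omitted here):

def renormalize(low, high):
    """Emit bits while the interval [low, high) lies entirely in one half of [0, 1)."""
    bits = []
    while True:
        if high < 0.5:                       # rule 1: interval in the lower half
            bits.append(0)
            low, high = low * 2, high * 2
        elif low >= 0.5:                     # rule 2: interval in the upper half
            bits.append(1)
            low, high = (low - 0.5) * 2, (high - 0.5) * 2
        else:
            return bits, low, high

# Rule 1 example from the slide: Low = 0, High = 0.00011000001 (binary)
print(renormalize(0.0, int('00011000001', 2) / 2**11))   # -> three 0-bits
# Rule 2 example: Low = 0.11000011 (binary); High is not given on the slide, 1.0 assumed
print(renormalize(int('11000011', 2) / 2**8, 1.0))        # -> two 1-bits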