Huffman coding.ppt



Huffman coding

Optimal codes - I

A code is optimal if it has the shortest expected codeword length $L$.

This can be seen as an optimization problem:

minimize $L = \sum_{i=1}^{m} p_i l_i$

subject to $\sum_{i=1}^{m} D^{-l_i} \le 1$

Optimal codes - II

Let's make two simplifying assumptions:
- no integer constraint on the codelengths
- the Kraft inequality holds with equality

Lagrange-multiplier problem:

$J = \sum_{i=1}^{m} p_i l_i + \lambda \left( \sum_{i=1}^{m} D^{-l_i} - 1 \right)$

$\frac{\partial J}{\partial l_j} = p_j - \lambda D^{-l_j} \ln D = 0 \quad \Rightarrow \quad D^{-l_j} = \frac{p_j}{\lambda \ln D}$

Optimal codes - III

Substitute $D^{-l_j} = \frac{p_j}{\lambda \ln D}$ into the Kraft inequality (taken with equality):

$\sum_{i=1}^{m} D^{-l_i} = \frac{1}{\lambda \ln D} \sum_{i=1}^{m} p_i = \frac{1}{\lambda \ln D} = 1 \quad \Rightarrow \quad \lambda = \frac{1}{\ln D}$

that is, $D^{-l_i} = p_i$, so the optimal codelengths are

$l_i^* = -\log_D p_i$

Note that

$L^* = \sum_{i=1}^{m} p_i l_i^* = -\sum_{i=1}^{m} p_i \log_D p_i = H_D(X)$ !!

the entropy, when we use base $D$ for logarithms.
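As a numeric illustration (not on the original slide), here is a minimal Python check that the optimal lengths $l_i^* = -\log_D p_i$ make $L^*$ equal to the entropy; it uses the dyadic distribution that appears in the Shannon-Fano example later in the deck, and the helper names are mine:

```python
import math

def entropy(probs, D=2):
    """Entropy of a distribution, using base-D logarithms."""
    return -sum(p * math.log(p, D) for p in probs if p > 0)

probs = [1/2, 1/4, 1/8, 1/16, 1/32, 1/32]
lengths = [-math.log2(p) for p in probs]          # l_i* = -log_D p_i  (D = 2)
L_star = sum(p * l for p, l in zip(probs, lengths))
print(entropy(probs), L_star)                     # both print 1.9375
```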

Optimal codes - IV

In practice the codeword lengths must be integer values, so the result obtained above is only a lower bound.

Theorem
The expected length of any instantaneous D-ary code for a r.v. X satisfies

$L \ge H_D(X)$

This fundamental result derives from the work of Shannon.

Optimal codes - V

What about the upper bound?

Theorem
Given a source alphabet (i.e. a r.v.) of entropy $H(X)$, it is possible to find an instantaneous binary code whose length satisfies

$H(X) \le L < H(X) + 1$

A similar theorem can be stated if we use the wrong probabilities $q_i$ instead of the true ones $p_i$; the only difference is an extra term accounting for the relative entropy $D(p \| q)$.
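The standard argument behind this theorem, left implicit on the slide, rounds the optimal lengths up to integers (shown here for $D = 2$): taking $l_i = \lceil -\log_2 p_i \rceil$ satisfies the Kraft inequality, since $\sum_i 2^{-l_i} \le \sum_i 2^{\log_2 p_i} = \sum_i p_i = 1$, and gives $L = \sum_i p_i l_i < \sum_i p_i (-\log_2 p_i + 1) = H(X) + 1$.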

The redundancy

It is defined as the average codeword length minus the entropy:

$\text{redundancy} = L - H(X) = L + \sum_i p_i \log p_i$

Note that $0 \le \text{redundancy} \le 1$ (why?)

Compression ratio

It is the ratio between the average number of bits/symbol in the original message and the same quantity for the coded message, which is $L(X)$:

$C = \frac{\text{average original symbol length}}{\text{average compressed symbol length}}$

Uniquely decodable codes

The set of instantaneous codes is a small subset of the uniquely decodable codes.

Is it possible to obtain a lower average codeword length L using a uniquely decodable code that is not instantaneous? NO.

So we use instantaneous codes, which are easier to decode.

Summary

Average codeword length L:
- $L \ge H(X)$ for uniquely decodable codes (and hence for instantaneous codes)
- in practice, for each r.v. X with entropy $H(X)$ we can build a code whose average codeword length satisfies $H(X) \le L < H(X) + 1$

Shannon-Fano coding

The main advantage of the Shannon-Fano technique is its simplicity:
- Source symbols are listed in order of non-increasing probability.
- The list is divided in such a way as to form two groups with totals as nearly equal in probability as possible.
- Each symbol in the first group receives a 0 as the first digit of its codeword, while the others receive a 1.
- Each of these groups is then divided according to the same criterion, and additional code digits are appended.
- The process is continued until each group contains only one message.
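Below is a minimal, illustrative Python sketch of this recursive splitting; the function name shannon_fano and the tie-breaking rule are my own choices, not prescribed by the slides:

```python
def shannon_fano(symbols):
    """symbols: list of (symbol, prob), sorted by non-increasing prob.
    Returns {symbol: codeword}."""
    if len(symbols) == 1:
        return {symbols[0][0]: ""}
    total = sum(p for _, p in symbols)
    # find the split point that makes the two groups as nearly equal as possible
    acc, split, best_diff = 0.0, 1, float("inf")
    for i in range(1, len(symbols)):
        acc += symbols[i - 1][1]
        diff = abs(acc - (total - acc))
        if diff < best_diff:
            best_diff, split = diff, i
    codes = {}
    for sym, code in shannon_fano(symbols[:split]).items():
        codes[sym] = "0" + code          # first group: prepend 0
    for sym, code in shannon_fano(symbols[split:]).items():
        codes[sym] = "1" + code          # second group: prepend 1
    return codes

probs = [("a", 1/2), ("b", 1/4), ("c", 1/8), ("d", 1/16), ("e", 1/32), ("f", 1/32)]
print(shannon_fano(probs))  # a:0, b:10, c:110, d:1110, e:11110, f:11111
```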

Example

Symb.  Prob.  Codeword
a      1/2    0
b      1/4    10
c      1/8    110
d      1/16   1110
e      1/32   11110
f      1/32   11111

H = 1.9375 bits
L = 1.9375 bits

Shannon-Fano coding - exercise

Encode, using the Shannon-Fano algorithm:

Symb.  Prob.
*      12%
?       5%
!      13%
&       2%
$      29%
€      13%
§      10%
°       6%
@      10%
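One possible way to check a hand-derived answer, reusing the shannon_fano sketch above (remember that different but equally valid codes can result when the two groups can be balanced in more than one way):

```python
exercise = [("$", 0.29), ("!", 0.13), ("€", 0.13), ("*", 0.12),
            ("§", 0.10), ("@", 0.10), ("°", 0.06), ("?", 0.05), ("&", 0.02)]
print(shannon_fano(exercise))  # one valid Shannon-Fano code for this source
```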

Is Shannon-Fano coding optimal?

H = 2.2328 bits

Symb.  Prob.  Shannon-Fano code  Alternative code
a      0.35   00                 0
b      0.17   01                 100
c      0.17   10                 101
d      0.16   110                110
e      0.15   111                111

The Shannon-Fano code gives L = 2.31 bits, while the alternative code achieves L1 = 2.3 bits: Shannon-Fano coding is not always optimal.

Huffman coding - I

There is another algorithm whose performance is slightly better than Shannon-Fano: the famous Huffman coding.

It works by constructing a tree bottom-up, with the symbols at the leaves.

The two leaves with the smallest probabilities become siblings under a parent node whose probability is equal to the sum of the two children's probabilities.

Huffman coding - II

The operation is then repeated, considering the new parent node as well and ignoring its children.

The process continues until only one parent node remains, with probability 1: this is the root of the tree.

Then the two branches of every non-leaf node are labeled 0 and 1 (typically 0 on the left branch, but the order is not important).
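A minimal Python sketch of this bottom-up construction, using a binary heap to find the two smallest probabilities; the names build_huffman_codes and assign are mine, and the tie-breaking counter is an implementation convenience, not part of the algorithm as stated:

```python
import heapq
import itertools

def build_huffman_codes(probs):
    """probs: dict {symbol: probability}. Returns {symbol: codeword}."""
    counter = itertools.count()  # tie-breaker so the heap never compares trees
    # heap entries: (probability, tie_breaker, tree); a tree is a symbol or (left, right)
    heap = [(p, next(counter), sym) for sym, p in probs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # merge the two smallest-probability nodes under a new parent node
        p1, _, t1 = heapq.heappop(heap)
        p2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (p1 + p2, next(counter), (t1, t2)))
    codes = {}
    def assign(tree, prefix):
        if isinstance(tree, tuple):      # internal node: label branches 0 and 1
            assign(tree[0], prefix + "0")
            assign(tree[1], prefix + "1")
        else:                            # leaf: the path from the root is the codeword
            codes[tree] = prefix or "0"
    assign(heap[0][2], "")
    return codes

probs = {"a": 0.05, "b": 0.05, "c": 0.1, "d": 0.2, "e": 0.3, "f": 0.2, "g": 0.1}
codes = build_huffman_codes(probs)
print(round(sum(probs[s] * len(c) for s, c in codes.items()), 2))  # L = 2.6 bits
```

Ties may produce different trees, but every Huffman code for a given source achieves the same minimal expected length, so the printed L does not depend on the tie-breaking.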

Huffman coding - example

Symbol  Prob.
a       0.05
b       0.05
c       0.1
d       0.2
e       0.3
f       0.2
g       0.1

[Tree figure: a and b merge into a node of probability 0.1, which merges with c into 0.2; f and g merge into 0.3; d and (abc) merge into 0.4; e and (fg) merge into 0.6; the two remaining nodes merge into the root with probability 1.0, with each pair of branches labeled 0 and 1.]

Huffman coding - example

Exercise: evaluate H(X) and L(X)

Symbol  Prob.  Codeword
a       0.05   0000
b       0.05   0001
c       0.1    001
d       0.2    01
e       0.3    10
f       0.2    110
g       0.1    111

H(X) = 2.5464 bits
L(X) = 2.6 bits !!
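A quick numeric check of the exercise (the table below is just a transcription of the slide; the variable names are mine):

```python
import math

table = {"a": (0.05, "0000"), "b": (0.05, "0001"), "c": (0.1, "001"),
         "d": (0.2, "01"), "e": (0.3, "10"), "f": (0.2, "110"), "g": (0.1, "111")}

H = -sum(p * math.log2(p) for p, _ in table.values())   # entropy
L = sum(p * len(code) for p, code in table.values())    # average codeword length
print(round(H, 4), round(L, 2))                         # 2.5464 2.6
```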

Huffman coding - exercise

Code the sequence aeebcddegfced and calculate the compression ratio.

Symbol  Prob.  Codeword
a       0.05   0000
b       0.05   0001
c       0.1    001
d       0.2    01
e       0.3    10
f       0.2    110
g       0.1    111

Sol: 0000 10 10 0001 001 01 01 10 111 110 001 10 01
Average original symbol length = 3 bits
Average compressed symbol length = 34/13 bits
C = 3/(34/13) = 39/34 ≈ 1.15

Huffman coding - exercise

Decode the sequence 0111001001000001111110

Symbol  Prob.  Codeword
a       0.05   0000
b       0.05   0001
c       0.1    001
d       0.2    01
e       0.3    10
f       0.2    110
g       0.1    111

Sol: 01 110 01 001 0000 01 111 110 → dfdcadgf

Huffman coding - exercise

Encode with Huffman the sequence 01$cc0a02ba10 and evaluate the entropy, average codeword length and compression ratio.

Symb.  Prob.
0      0.40
1      0.22
2      0.04
$      0.07
a      0.10
b      0.03
c      0.14
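One possible way to work this exercise numerically, reusing the build_huffman_codes sketch from earlier; the assumption that an original symbol costs 3 bits (7 symbols, so ceil(log2 7) = 3) follows the convention of the previous exercise:

```python
import math

seq = "01$cc0a02ba10"
probs = {"0": 0.40, "1": 0.22, "2": 0.04, "$": 0.07,
         "a": 0.10, "b": 0.03, "c": 0.14}

H = -sum(p * math.log2(p) for p in probs.values())      # entropy
codes = build_huffman_codes(probs)
L = sum(p * len(codes[s]) for s, p in probs.items())    # average codeword length
compressed = "".join(codes[s] for s in seq)
C = (3 * len(seq)) / len(compressed)   # assumes 3 bits/symbol in the original message
print(round(H, 4), round(L, 2), round(C, 2))
```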

Huffman coding - exercise

Decode (if possible) the Huffman-coded bit stream 01001011010011110101...

Symb.  Prob.
0      0.16
1      0.02
2      0.15
3      0.29
4      0.17
5      0.04
%      0.17

Huffman coding - notes

In Huffman coding, if at any time there is more than one way to choose the smallest pair of probabilities, any such pair may be chosen.

Sometimes the list of probabilities is initialized to be non-increasing and reordered after each node creation. This detail doesn't affect the correctness of the algorithm, but it provides a more efficient implementation.

Huffman coding - notes

There are cases in which Huffman coding does not uniquely determine the codeword lengths, due to the arbitrary choice among equal minimum probabilities.

For example, for a source with probabilities {0.4, 0.2, 0.2, 0.1, 0.1} it is possible to obtain codeword lengths of {1, 2, 3, 4, 4} as well as {2, 2, 2, 3, 3} (both give L = 2.2 bits).

It would be better to have a code whose codeword lengths have minimum variance, as this solution needs the minimum buffer space in the transmitter and in the receiver.

Huffman coding - notes

Schwarz defines a variant of the Huffman algorithm that allows one to build the code with minimum $l_{max}$.

There are several other variants; we will explain the most important ones in a while.

Optimality of Huffman coding - I

It is possible to prove that, in the case of character coding (one symbol, one codeword), Huffman coding is optimal.

In other terms, the Huffman code has minimum redundancy.

An upper bound for the redundancy has been found:

$\text{redundancy} \le p_1 + 1 - \log_2 e + \log_2 \log_2 e \approx p_1 + 0.086$

where $p_1$ is the probability of the most likely symbol.
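The constant in this bound can be checked numerically:

```python
import math
print(1 - math.log2(math.e) + math.log2(math.log2(math.e)))  # ≈ 0.0861
```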

Optimality of Huffman coding - II

Why does the Huffman code "suffer" when there is one symbol with very high probability? Remember the notion of uncertainty: as $p(x) \to 1$, the self-information $-\log p(x) \to 0$.

The main problem is given by the integer constraint on the codelengths!! A symbol with probability close to 1 carries almost no information, yet its codeword must still be at least one bit long.

This consideration opens the way to a more powerful coding... we will see it later.

Huffman coding - implementation

A Huffman code can be generated in O(n) time, where n is the number of source symbols, provided that the probabilities have been presorted (however, this sort costs O(n log n)...).

Nevertheless, encoding is very fast.

Huffman coding - implementation

However, the spatial and temporal complexity of the decoding phase is far more important, because on average decoding happens more frequently.

Consider a Huffman tree with n symbols:
- n leaves and n-1 internal nodes
- each leaf stores a pointer to its symbol and the flag marking it as a leaf
- each internal node stores two pointers

$2n + 2(n-1) \approx 4n$ words (32 bits each)

Huffman coding - implementation

1 million symbols → 16 MB of memory!

Moreover, traversing a tree from root to leaf involves following a lot of pointers, with little locality of reference. This causes several page faults or cache misses.

To solve this problem a variant of Huffman coding has been proposed: canonical Huffman coding.

Canonical Huffman coding - I

Symb.  Prob.  Code 1  Code 3
a      0.11   000     000
b      0.12   001     001
c      0.13   100     010
d      0.14   101     011
e      0.24   01      10
f      0.26   11      11

[Tree figure: a+b merge into 0.23 and c+d into 0.27; 0.23+e gives 0.47 and f+0.27 gives 0.53; the root is 1.0. Code 1 is read off this tree; the slide also shows a second valid branch labeling, Code 2, with the same codeword lengths.]

Code 3 has the same codeword lengths, but where does it come from?

Canonical Huffman coding - II

This code cannot be obtained through a Huffman tree! We still call it a Huffman code because it is instantaneous and its codeword lengths are the same as those of a valid Huffman code.

Numerical sequence property:
- codewords with the same length are ordered lexicographically
- when the codewords are sorted in lexical order they are also in order from the longest to the shortest codeword

Symb.  Code 3
a      000
b      001
c      010
d      011
e      10
f      11

Canonical Huffman coding - III

The main advantage is that it is not necessary to store a tree in order to decode.

We need:
- a list of the symbols, ordered according to the lexical order of the codewords
- an array with the first codeword of each distinct length

Canonical Huffman coding - IV

Encoding. Suppose there are n distinct symbols, and that for symbol i we have calculated the Huffman codelength l_i, with l_i <= maxlength.

    for k = 1 to maxlength { numl[k] = 0; }
    for i = 1 to n { numl[l_i] = numl[l_i] + 1; }
    firstcode[maxlength] = 0;
    for k = maxlength-1 downto 1 {
        firstcode[k] = ceil( (firstcode[k+1] + numl[k+1]) / 2 );
    }
    for k = 1 to maxlength { nextcode[k] = firstcode[k]; }
    for i = 1 to n {
        codeword[i] = nextcode[l_i];
        symbol[l_i, nextcode[l_i] - firstcode[l_i]] = i;
        nextcode[l_i] = nextcode[l_i] + 1;
    }

where
- numl[k] = number of codewords with length k
- firstcode[k] = integer for the first code of length k
- nextcode[k] = integer for the next codeword of length k to be assigned
- symbol[-,-] is used for decoding
- codeword[i]: the rightmost l_i bits of this integer are the code for symbol i
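A runnable Python transcription of the pseudocode above; array names follow the slide, while the dict-based symbol table and the 0-based list indexing are my adaptation:

```python
import math

def canonical_code(lengths):
    """lengths: dict {symbol: codeword length}.
    Returns (codeword, firstcode, symbol_table)."""
    maxlength = max(lengths.values())
    numl = [0] * (maxlength + 1)           # numl[k]: number of codewords of length k
    for l in lengths.values():
        numl[l] += 1
    firstcode = [0] * (maxlength + 1)      # firstcode[k]: integer of first code of length k
    for k in range(maxlength - 1, 0, -1):
        firstcode[k] = math.ceil((firstcode[k + 1] + numl[k + 1]) / 2)
    nextcode = firstcode[:]
    codeword, symbol_table = {}, {}
    for sym in sorted(lengths):            # assign codes in lexical order of symbols
        l = lengths[sym]
        codeword[sym] = format(nextcode[l], "0{}b".format(l))  # rightmost l bits
        symbol_table[(l, nextcode[l] - firstcode[l])] = sym    # used for decoding
        nextcode[l] += 1
    return codeword, firstcode, symbol_table

lengths = {"a": 2, "b": 5, "c": 5, "d": 3, "e": 2, "f": 5, "g": 5, "h": 2}
codes, firstcode, symtab = canonical_code(lengths)
print(codes)          # a:01, b:00000, c:00001, d:001, e:10, f:00010, g:00011, h:11
print(firstcode[1:])  # [2, 1, 1, 2, 0], as in the worked example below
```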

Canonical Huffman - example

Symb.  l_i
a      2
b      5
c      5
d      3
e      2
f      5
g      5
h      2

1. Evaluate the array numl (k = 1..5): [0 3 1 0 4]

2. Evaluate the array firstcode, using firstcode[k] = ceil((firstcode[k+1] + numl[k+1]) / 2) for k = maxlength-1 downto 1: [2 1 1 2 0]

3. Construct the arrays codeword and symbol:

Symb.  codeword (integer)  bits
a      1                   01
b      0                   00000
c      1                   00001
d      1                   001
e      2                   10
f      2                   00010
g      3                   00011
h      3                   11

symbol   n=0  n=1  n=2  n=3
k=1      -    -    -    -
k=2      a    e    h    -
k=3      d    -    -    -
k=4      -    -    -    -
k=5      b    c    f    g

Canonical Huffman coding - V

Decoding. We have the arrays firstcode and symbol.

    v = nextinputbit();
    k = 1;
    while v < firstcode[k] {
        v = 2*v + nextinputbit();
        k = k + 1;
    }
    return symbol[k, v - firstcode[k]];

where
- nextinputbit() is a function that returns the next input bit
- firstcode[k] = integer for the first code of length k
- symbol[k, n] returns symbol number n among those with codelength k
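A runnable Python sketch of this decoding loop, reusing the firstcode array and symbol table from the encoding example (the generator interface is my choice):

```python
def decode(bits, firstcode, symbol_table):
    """bits: string of '0'/'1' characters. Yields decoded symbols one at a time."""
    it = iter(bits)
    while True:
        try:
            v = int(next(it))                 # v = nextinputbit()
        except StopIteration:
            return                            # input exhausted
        k = 1
        while v < firstcode[k]:               # codeword not complete yet
            v = 2 * v + int(next(it))         # v = 2*v + nextinputbit()
            k += 1
        yield symbol_table[(k, v - firstcode[k])]

firstcode = [0, 2, 1, 1, 2, 0]                # firstcode[1..5] = 2 1 1 2 0
symtab = {(2, 0): "a", (2, 1): "e", (2, 2): "h",
          (3, 0): "d", (5, 0): "b", (5, 1): "c", (5, 2): "f", (5, 3): "g"}
print("".join(decode("00111100000001001", firstcode, symtab)))  # dhebad
```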

Canonical Huffman - example

Decode the bit stream 00111100000001001, using firstcode = [2 1 1 2 0] and the symbol array above:

001   → symbol[3,0] = d
11    → symbol[2,2] = h
10    → symbol[2,1] = e
00000 → symbol[5,0] = b
01    → symbol[2,0] = a
001   → symbol[3,0] = d

Decoded: dhebad