Shannon-Fano algorithm


About This Presentation

The Shannon-Fano algorithm is a compression algorithm that uses variable-length coding for data compression.


Slide Content

Shannon-Fano Algorithm
Variable Length Coding

Introduction
The Shannon-Fano algorithm was independently developed by Shannon at Bell Labs
and Robert Fano at MIT.
The encoding steps of the Shannon-Fano algorithm can be presented in the following
top-down manner:
1. Sort the symbols according to the frequency count of their occurrences.
2. Recursively divide the symbols into two parts, each with approximately the same
number of counts, until all parts contain only one symbol.
A natural way of implementing the above procedure is to build a binary tree. As a
convention, let's assign bit 0 to the left branches and 1 to the right branches.
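As a concrete illustration, here is a minimal Python sketch of this procedure. The function name shannon_fano and the equal-count split heuristic are illustrative choices, not part of the original slides:

```python
def shannon_fano(symbol_counts):
    """Build Shannon-Fano codes for a {symbol: count} mapping (illustrative sketch)."""
    # Step 1: sort the symbols by descending frequency count.
    symbols = sorted(symbol_counts, key=symbol_counts.get, reverse=True)
    codes = {s: "" for s in symbols}

    def split(group):
        if len(group) <= 1:       # a part containing one symbol is a leaf
            return
        # Step 2: find the split point that makes the two parts'
        # total counts as close to equal as possible.
        total = sum(symbol_counts[s] for s in group)
        running, best_i, best_diff = 0, 1, float("inf")
        for i in range(1, len(group)):
            running += symbol_counts[group[i - 1]]
            diff = abs(total - 2 * running)   # |right-part count - left-part count|
            if diff < best_diff:
                best_diff, best_i = diff, i
        left, right = group[:best_i], group[best_i:]
        for s in left:            # convention: 0 for left branches,
            codes[s] += "0"
        for s in right:           # 1 for right branches
            codes[s] += "1"
        split(left)
        split(right)

    split(symbols)
    return codes
```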

Example
Symbols to be coded are the characters in the word SPEAKER.
The frequency counts of the symbols are listed below. The total number of symbols is 7.


S 1
P 1
E 2
A 1
K 1
R 1

[Tree diagram: STEP 1 sorts the symbols by count: E:2, S:1, P:1, A:1, K:1, R:1. STEP 2.1 divides the root (7) into E,S:(3) on the 0 branch and P,A,K,R:(4) on the 1 branch.]
The first division yields two parts: (a) E and S, with a total count of 3, denoted E,S:(3); and (b) P, A, K, and R, with a total count of 4, denoted P,A,K,R:(4).

[Tree diagram: STEP 2.2 splits E,S:(3) into E:(2) and S:(1), and P,A,K,R:(4) into P,A:(2) and K,R:(2). STEP 2.3 splits P,A:(2) into P:(1) and A:(1), and K,R:(2) into K:(1) and R:(1); every part now holds a single symbol, so the tree is complete. Left branches carry 0, right branches 1.]
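Running the earlier sketch on the word SPEAKER reproduces these codes (again using the illustrative shannon_fano helper; ties among the count-1 symbols are broken by insertion order, which here happens to match the slides):

```python
from collections import Counter

codes = shannon_fano(Counter("SPEAKER"))
# -> {'E': '00', 'S': '01', 'P': '100', 'A': '101', 'K': '110', 'R': '111'}
```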

Symbol   Count   Code   Bits used        Probability (Pi)
E        2       00     4 (2 bits x 2)   2/7 = 0.29
S        1       01     2                1/7 = 0.14
P        1       100    3                1/7 = 0.14
A        1       101    3                1/7 = 0.14
K        1       110    3                1/7 = 0.14
R        1       111    3                1/7 = 0.14

Total number of bits: 18

Compression Ratio
If the total number of bits required to represent the data before compression is B0,
and the total number of bits required to represent the data after compression is B1,
then we define the compression ratio as

Compression Ratio = B0 / B1

B0 = 8 * 7 = 56 bits (assuming each character symbol requires 8 bits)
B1 = 18 bits

Compression Ratio = 56 / 18 = 3.11 [positive compression]

Average number of bits used per symbol in the above solution = 18 / 7 = 2.57
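These numbers can be checked with a few lines of Python (a sketch based on the code table above; the variable names are illustrative):

```python
counts = {'E': 2, 'S': 1, 'P': 1, 'A': 1, 'K': 1, 'R': 1}
codes  = {'E': '00', 'S': '01', 'P': '100', 'A': '101', 'K': '110', 'R': '111'}

b0 = 8 * sum(counts.values())                        # before: 8 bits per char -> 56
b1 = sum(counts[s] * len(codes[s]) for s in counts)  # after: 18 bits
print(b0 / b1)                    # compression ratio -> 3.11
print(b1 / sum(counts.values()))  # average bits per symbol -> 2.57
```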

Entropy (η)
According to Claude E. Shannon, the entropy η of an information source with
alphabet S = {S1, S2, ..., Sn} is defined as:

η = H(S) = Σ (i = 1 to n) Pi * log2(1/Pi)

where Pi is the probability that symbol Si in S will occur.
The term log2(1/Pi) indicates the amount of information contained in Si, which
corresponds to the number of bits needed to encode Si.

η = 0.29 * log2(1/0.29) + [0.14 * log2(1/0.14)] * 5
  = 0.29 * log2(3.45) + [0.14 * log2(7.14)] * 5
  = 0.29 * 1.79 + [0.14 * 2.84] * 5
  = 0.52 + 0.40 * 5
  = 0.52 + 2.00
  = 2.52
This suggests that the minimum average number of bits needed to code each character in the
word SPEAKER is 2.52. At 2.57 bits per symbol, the Shannon-Fano code comes close to this
lower bound, so the algorithm delivers satisfactory coding results for data compression.
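For comparison, the entropy can also be computed directly from the exact probabilities (a sketch; the tiny difference from 2.52 comes from the rounding in the hand calculation above):

```python
from math import log2

counts = {'E': 2, 'S': 1, 'P': 1, 'A': 1, 'K': 1, 'R': 1}
n = sum(counts.values())                              # 7 symbols in total
eta = sum((c / n) * log2(n / c) for c in counts.values())
print(eta)   # ~2.5216 bits per symbol, matching the ~2.52 above
```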






References
● Ze-Nian Li and Mark S. Drew, Fundamentals of Multimedia, Pearson Education, 2009.
● Log calculator: https://ncalculators.com