Huffman Coding

SaqibRehan2 864 views 57 slides May 14, 2018

About This Presentation

A complete slide deck presenting Huffman coding and Huffman trees, with examples explained, and very easy to understand


Slide Content

DATA STRUCTURE & ALGORITHM: HUFFMAN CODING / TREE

Presented By: M. Saqib Rehan 503 Presented to: Sir Naveed Hassan & the whole class

OUTLINE

History of Huffman Coding: an algorithm developed by David A. Huffman while he was a Ph.D./Sc.D. (Doctor of Science) student at MIT, published in the 1952 paper "A Method for the Construction of Minimum-Redundancy Codes". A minimum-redundancy code is one constructed in such a way that the average number of coding digits per message is minimized.

What is Huffman Coding: an application of binary trees and priority queues. Huffman coding is a method of shortening messages sent from one computer to another so that they can be sent more quickly. Each code is a binary string used for transmission of the corresponding message. For example, the plain text BAGGAGE becomes the code 100 11 0 0 11 0 101.

Huffman Codes: Huffman codes are an effective technique of lossless data compression, which means no information is lost. The idea is to assign variable-length codes to input characters; the lengths of the assigned codes are based on the frequencies of the corresponding characters. The most frequent character gets the smallest code, and the least frequent character gets the largest code.

How Huffman Coding is Greedy: Huffman coding is greedy because at each step it locally chooses and merges the two smallest nodes (nodes are weighted by occurrence/frequency: those that occur most often have the largest values, and those that occur least have the lowest), until no nodes are left over and a binary tree has been built.

Applications: used to compress files for transmission and to decrease storage cost. Huffman coding is widely used for image compression, since it reduces coding redundancy and results in lossless data at the receiver end. It uses statistical coding: more frequently used symbols have shorter code words. It works well for text and fax transmissions.

Fixed & Variable Length Codes
Fixed-length code: if every word in the code has the same length, the code is called a fixed-length code, or a block code.
Advantage: access is fast because the computer knows where each word starts.
Disadvantage: with fixed-length records, the records are usually larger, and therefore need more storage space and are slower to transfer.

Variable-length codes: the advantage of variable-length codes over fixed-length codes is that short codes can be given to characters that occur frequently, so on average the length of the encoded message is less than with fixed-length encoding. Potential problem: how do we know where one character ends and another begins? This is not a problem if the number of bits per character is fixed.
Example: A = 00, B = 01, C = 10, D = 11
00 10 11 01 11 00 11 11 11 11 11 decodes as A C D B D A D D D D D

Prefix property: a code has the prefix property if no character's code is the prefix (start) of another character's code.
Example: 000 is not a prefix of 11, 01, 001, or 10; 11 is not a prefix of 000, 01, 001, or 10; and so on.
Symbol codes: P = 000, Q = 11, R = 01, S = 001, T = 10
01001101100010 decodes as R S T Q P T
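Because of the prefix property, the bit stream can be decoded greedily, emitting a symbol as soon as the buffered bits match a code word. A minimal sketch using the P–T table above:

```python
# Decode a prefix code left to right: the prefix property
# guarantees the first code word we match is the only possible one.
codes = {"P": "000", "Q": "11", "R": "01", "S": "001", "T": "10"}
by_bits = {v: k for k, v in codes.items()}

def decode(bits):
    out, buf = [], ""
    for b in bits:
        buf += b
        if buf in by_bits:        # unambiguous thanks to the prefix property
            out.append(by_bits[buf])
            buf = ""
    return "".join(out)

print(decode("01001101100010"))   # -> RSTQPT
```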

Code without prefix property: the following code does not have the prefix property, because the pattern 1110 can be decoded as QQQP, QTP, QQS, or TS.
Symbol codes: P = 0, Q = 1, R = 01, S = 10, T = 11

Problem Analysis: consider a file of 100 characters from a–f, with these frequencies: a = 45, b = 13, c = 12, d = 16, e = 9, f = 5.

Fixed-Length Code: typically each character in a file is stored as a single byte (8 bits). If we know we only have six characters, we can use a 3-bit code instead: a = 000, b = 001, c = 010, d = 011, e = 100, f = 101. This is called a fixed-length code. With this scheme, we can encode the whole file in 300 bits (45·3 + 13·3 + 12·3 + 16·3 + 9·3 + 5·3). We can do better: better compression, more flexibility.

Variable-Length Code: variable-length codes can perform significantly better. Frequent characters are given short code words, while infrequent characters get longer code words. Consider this scheme: a = 0, b = 101, c = 100, d = 111, e = 1101, f = 1100. How many bits are now required to encode our file? 45·1 + 13·3 + 12·3 + 16·3 + 9·4 + 5·4 = 224 bits. This is in fact an optimal character code for this file.
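The two bit counts can be checked directly. A quick sketch of the arithmetic, using the frequencies and codes given above:

```python
# Compare the 3-bit fixed-length code with the variable-length
# code for the 100-character a-f example file.
freq = {"a": 45, "b": 13, "c": 12, "d": 16, "e": 9, "f": 5}
code = {"a": "0", "b": "101", "c": "100", "d": "111",
        "e": "1101", "f": "1100"}

fixed_bits = sum(freq.values()) * 3       # 3 bits for every character
variable_bits = sum(f * len(code[ch]) for ch, f in freq.items())
print(fixed_bits, variable_bits)          # -> 300 224
```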

Techniques of Huffman coding:
Compression: data compression in Huffman coding shrinks a string so that it takes up less space. This is desirable for data storage and data communication. Storage space on disks is expensive, so a file that occupies less disk space is "cheaper" than an uncompressed string.
Decompression: in decompression we convert the binary codes back into the original string.

The (Real) Basic Algorithm
1. Scan the text to be compressed and tally the occurrences of all characters.
2. Sort or prioritize the characters based on their number of occurrences in the text.
3. Build the Huffman code tree based on the prioritized list.
4. Perform a traversal of the tree to determine all code words.
5. Scan the text again and create the new file using the Huffman codes.
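The five steps can be sketched with Python's heapq as the priority queue. This is a minimal illustration, not the code behind the slides; the function name and tuple-based tree representation are assumptions:

```python
import heapq
from collections import Counter

def huffman_codes(text):
    # Steps 1-2: tally occurrences and prioritize by frequency.
    freq = Counter(text)
    # Heap entries are (weight, tie_breaker, node); a node is either
    # a character (leaf) or a (left, right) pair (internal node).
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)
    tick = len(heap)
    # Step 3: repeatedly merge the two lowest-weight nodes.
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        tick += 1
        heapq.heappush(heap, (f1 + f2, tick, (left, right)))
    # Step 4: traverse the tree; 0 = left branch, 1 = right branch.
    codes = {}
    def walk(node, path):
        if isinstance(node, str):
            codes[node] = path or "0"   # lone-character edge case
        else:
            walk(node[0], path + "0")
            walk(node[1], path + "1")
    walk(heap[0][2], "")
    return codes

# Step 5: encode the text with the derived code words.
codes = huffman_codes("BAGGAGE")
encoded = "".join(codes[ch] for ch in "BAGGAGE")
print(len(encoded))   # -> 13, matching the BAGGAGE example earlier
```

Tie-breaking may change the individual code words, but the total encoded length of a Huffman code is the same for any valid tree.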

Example: consider the following short text: "Huffman coding ,the greedy algorithm". Its size is 36 characters. Scan the text and prepare a frequency table.

Frequency table for "Huffman coding ,the greedy algorithm":
H = 1, u = 1, f = 2, m = 2, a = 2, n = 2, t = 2, c = 1, o = 2, d = 2, i = 2, g = 3, , = 1, h = 2, r = 2, y = 1, l = 1, e = 3, space = 4
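Step 1 of the algorithm, tallying the example text, can be reproduced with collections.Counter (a small sketch; variable names are illustrative):

```python
from collections import Counter

# Tally every character of the 36-character example text.
text = "Huffman coding ,the greedy algorithm"
freq = Counter(text)
print(len(text), freq["g"], freq["e"], freq[" "])   # -> 36 3 3 4
```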

Building a Tree
1. Prioritize the characters.
2. Create binary tree nodes with each character and its frequency.
3. Place the nodes in a priority queue: the lower the occurrence, the higher the priority in the queue.

(Slides 21–47 show the tree being built step by step: the weighted character nodes are enqueued, and at each step the two lowest-weight nodes are dequeued and merged into a parent node whose weight is their sum: H1 and u1 into 2; c1 and ,1 into 2; y1 and l1 into 2; f2 and m2 into 4; a2 and n2 into 4; o2 and d2 into 4; i2 and t2 into 4; h2 and r2 into 4; and so on through subtrees of weight 6, 8, 12, 16, and 20, until a single tree of weight 36 remains.) After enqueuing, only one node is left.

Building a Tree (final step): dequeue the single node left in the queue. This tree contains the new code words for each character. The weight of the root node should equal the number of characters in the text: "Huffman coding ,the greedy algorithm" = 36 characters.

(Slide 49: diagram of the completed Huffman tree, with root weight 36.)

Coding: now we assign codes to the tree by placing a 0 on every left branch and a 1 on every right branch. A traversal of the tree from root to leaf gives the Huffman code for that particular leaf's character. Note that no code is the prefix of another code.
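The traversal just described can be sketched as a small recursion. The tiny five-leaf tree here is an assumed example, not the tree from the slides:

```python
# Assigning codes by traversal: 0 on each left branch, 1 on each
# right branch; the path from root to a leaf is that leaf's code.
# Leaves are characters; internal nodes are (left, right) pairs.
tree = (("a", "b"), ("c", ("d", "e")))

def assign(node, prefix="", codes=None):
    codes = {} if codes is None else codes
    if isinstance(node, str):              # leaf: record path as code
        codes[node] = prefix
    else:
        assign(node[0], prefix + "0", codes)
        assign(node[1], prefix + "1", codes)
    return codes

print(assign(tree))
# -> {'a': '00', 'b': '01', 'c': '10', 'd': '110', 'e': '111'}
```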

(Slide 51: the Huffman tree with a 0 placed on every left branch and a 1 on every right branch.)

New codes after traversing:
space = 000, e = 1111, g = 1110, a = 0010, t = 0111, i = 0110, d = 0101, o = 0100, n = 0011, r = 1001, h = 1000, u = 10101, H = 10100, c = 10110, , = 10111, y = 11000, l = 11001, f = 11010, m = 11011
Frequent letter = fewer bits; less frequent letter = more bits. The tree could have been built in a number of ways; each would have yielded different codes, but the code would still be minimal.

Huffman encoding: original text "Huffman coding ,the greedy algorithm". Using fixed-length coding, the length is characters × 8: 36 × 8 = 288 bits. Using the variable-length code, the encoded form is:
10100101011101011010110110010001100010110010001010110001111100001011101111000111100011101001111111110101110000000010110011110100101100111100011011
The compressed form needs 150 bits, saving about 48% of the memory.
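A sketch of the comparison, using the frequency table and the code table from the "New codes after traversing" slide (note the deck originally quoted 146 bits; summing frequency × code length over its own table gives 150):

```python
# Bit counts for the 36-character example text: 8-bit fixed-length
# encoding versus the Huffman code table from the slides.
freq = {"H": 1, "u": 1, "f": 2, "m": 2, "a": 2, "n": 2, " ": 4,
        "c": 1, "o": 2, "d": 2, "i": 2, "g": 3, ",": 1, "t": 2,
        "h": 2, "e": 3, "r": 2, "y": 1, "l": 1}
code = {" ": "000", "e": "1111", "g": "1110", "a": "0010",
        "t": "0111", "i": "0110", "d": "0101", "o": "0100",
        "n": "0011", "r": "1001", "u": "10101", "h": "1000",
        "m": "11011", "f": "11010", "y": "11000", ",": "10111",
        "c": "10110", "H": "10100", "l": "11001"}

fixed_bits = sum(freq.values()) * 8          # 36 * 8 = 288
huffman_bits = sum(f * len(code[ch]) for ch, f in freq.items())
print(fixed_bits, huffman_bits)              # -> 288 150
```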

Decoding the File: once the receiver has the tree, it scans the incoming bit stream: 0 ⇒ go left, 1 ⇒ go right.
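That tree walk can be sketched in a few lines. The tiny tree here (codes A = 0, B = 10, C = 11) is an assumed example, not the tree from the slides:

```python
# Decoding by walking the tree: start at the root, go left on 0,
# go right on 1, and emit a character on reaching a leaf.
# Leaves are characters; internal nodes are (left, right) pairs.
tree = ("A", ("B", "C"))

def decode(bits, tree):
    out, node = [], tree
    for b in bits:
        node = node[int(b)]          # 0 -> left branch, 1 -> right branch
        if isinstance(node, str):    # leaf: emit and restart at the root
            out.append(node)
            node = tree
    return "".join(out)

print(decode("0101100", tree))   # -> ABCAA
```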

(Slide 55: the encoded bit stream being decoded step by step against the Huffman tree.)