Huffman Algorithm and its Application by Ekansh Agarwal

EkanshAgarwal19 441 views 17 slides Dec 17, 2021
Slide 1
Slide 1 of 17
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12
Slide 13
13
Slide 14
14
Slide 15
15
Slide 16
16
Slide 17
17

About This Presentation

This presentation contains the explanation on the topic Huffman algorithm backed by a sample problem for explaination


Slide Content

Huffman Algorithm and its Applications by Ekansh Agarwal(J001)

Contents Introduction Steps/A lgorithm to build Huffman tree (Example) Building Huffman tree with sample inputs Steps/Algorithm for traversing the Huffman tree Real Life applications of Algorithm Advantages/Disadvantages References 2

Introduction Huffman coding is a lossless data compression algorithm. The idea is to assign variable-length codes to input characters, lengths of the assigned codes are based on the frequencies of corresponding characters. The most frequent character gets the smallest code and the least frequent character gets the largest code. The variable-length codes assigned to input characters are Prefix Codes, means the codes (bit sequences) are assigned in such a way that the code assigned to one character is not the prefix of code assigned to any other character. This is how Huffman Coding makes sure that there is no ambiguity when decoding the generated bit stream. 3

Introduction There are mainly two major parts in Huffman Coding-: Build a Huffman Tree from input characters. Traverse the Huffman Tree and assign codes to characters . 4

Steps/Algorithm to build Huffman table Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is used as a priority queue. The value of frequency field is used to compare two nodes in min heap. Initially, the least frequent character is at root) Extract two nodes with the minimum frequency from the min heap. Create a new internal node with a frequency equal to the sum of the two nodes frequencies. Make the first extracted node as its left child and the other extracted node as its right child. Add this node to the min heap. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node and the tree is complete. 5

Example-: Sample Data 6

Step A Create a leaf node for each character and build a min heap using all the nodes (The frequency value is used to compare two nodes in min heap) Resulted Leaf Nodes for each Character 7

Step B - Repeat the following steps till heap has more than one nodes Step 1: Extract two nodes, say x and y, with minimum frequency from the heap Step 2 : Create a new internal node z with x as its left child and y as its right child. Also frequency(z)= frequency(x)+frequency(y) Step 3 : Add z to min heap. Then Extract and Combine node u with an internal node having 4 as the frequency then add the new internal node to priority queue 8

Step B - Repeat the following steps till heap has more than one nodes Step 4: Extract and combine node a with an internal node having 8 as the frequency then add the new internal node to priority queue Step 5: Extract and Combine nodes i and s then add the new internal node to priority queue- Step 6: Extract and Combine nodes i and s then add the new internal node to priority queue- 9

Step B - Repeat the following steps till heap has more than one nodes Step 7: Extract and Combine node e with an internal node having 18 as the frequency then add the new internal node to priority queue- Step 8: Finally , Extract and Combine internal nodes having 25 and 33 as the frequency then add the new internal node to priority queue- 10

Steps to traversing Huffman Tree Create an auxiliary array Traverse the tree starting from root node Add 0 to array while traversing the left child and add 1 to array while traversing the right child Print the array elements whenever a leaf node is found 11

Need for traversing Huffman Tree Suppose the string “ staeiout ” needs to be transmitted from computer A (sender) to computer B (receiver) across a network. Using concepts of Huffman encoding, the string gets encoded to binary codes at sender address 12

Conclusion of all the steps used to demonstrated in above sample example 12 11 10 9 8 7 6 5 4 3 2 1 Create leaf nodes for all the characters and add them to the min heap. Extract two nodes, say x and y, with minimum frequency from the heap Add z to min heap  Since internal node with frequency 58 is the only node in the queue, it becomes the root of  Huffman tree . Create an auxiliary array Add 0 to array while traversing the left child and add 1 to array while traversing the right child Repeat the following steps till heap has more than one nodes Create a new internal node z with x as its left child and y as its right child. Also frequency(z)= frequency(x)+frequency(y) Extract and Combine node u with an internal node having 4 as the frequency Last node in the heap is the root of Huffman tree Traverse the tree starting from root node Print the array elements whenever a leaf node is found 13

Real-life Applications of Huffman Codding Applications of Huffman Codding-: Huffman encoding is widely used in compression formats like GZIP, PKZIP (WinZip) and BZIP2. Multimedia codecs like JPEG, PNG and MP3 uses Huffman encoding (to be more precise the prefix codes) Huffman encoding still dominates the compression industry since newer arithmetic and range coding schemes are avoided due to their patent issues. 14

Advantages of Huffman Encoding- This encoding scheme results in saving lot of storage space, since the binary codes generated are variable in length It generates shorter binary codes for encoding symbols/characters that appear more frequently in the input string The binary codes generated are prefix-free Advantages/ Disadvantages 15

Disadvantages of Huffman Encoding- Lossless data encoding schemes, like Huffman encoding, achieve a lower compression ratio compared to lossy encoding techniques. Thus, lossless techniques like Huffman encoding are suitable only for encoding text and program files and are unsuitable for encoding digital images. Huffman encoding is a relatively slower process since it uses two passes- one for building the statistical model and another for encoding. Thus, the lossless techniques that use Huffman encoding are considerably slower than others. Since length of all the binary codes is different, it becomes difficult for the decoding software to detect whether the encoded data is corrupt. This can result in an incorrect decoding and subsequently, a wrong output. Advantages/ Disadvantages 16

References Bao Ergude ; Li Weisheng ; Fan A Study and Implementation of the Huffman Algorithm Based on Condensed Huffman  Table Rabia Arshad ; Adeel Saleem ; Danista Khan Performance comparison of  Huffman Coding and double  Huffman  Coding Hoang- Anh Pham ; Van-Hieu 2010 Fifth IEEE International Symposium on Electronic Design, Test & Applications An Adaptive Huffman Decoding Algorithm for MP3 Decoder Studytonight Huffman Coding Algorithm GeekforGeeks Application of Huffman algorithm 17