Data Structures and Agorithm: DS 24 Hash Tables.pptx
RashidFaridChishti
17 views
14 slides
May 18, 2024
Slide 1 of 14
1
2
3
4
5
6
7
8
9
10
11
12
13
14
About This Presentation
Data Structures and Agorithm: DS 24 Hash Tables.pptx
Size: 415 KB
Language: en
Added: May 18, 2024
Slides: 14 pages
Slide Content
International Islamic University H-10, Islamabad, Pakistan Data Structure Lecture No. 24 Hash Tables Engr . Rashid Farid Chishti http://youtube.com/rfchishti http://sites.google.com/site/chishti
Hash table is a data structure that allows very fast retrieval of data no matter how much data there is For that reason hash tables are widely used in database indexing, cashing, program compilation, error checking and much more. Consider a simple 1 Dimensional Array To Find an item in the list you can apply a linear search approach. For a very big array this could take a very long time. Now suppose you want to retrieve a value from an element in an array and you already know the index number of that element. You can lookup the value very quickly indeed and it does not depend on size of an array or position of an element. Introduction 1 2 3 4 5 6 7 8 9 10 Jan Jim Mia Sam Leo Ted Bea Lou Ada Max Zoe
Each index number can be calculated by the value itself so that the index number is a some way related to the data. We get each letter of the word and get the ASCII code. E.g. for a Word Ada Its ASCII Code is: A = 65, d = 100, a = 97 Now get sum of these ASCII codes: sum = 65+100+97= 262 Now divide it by size of any array and take reminded: 262 % 11 = 9 So Ada will be stored at index number 9 of an array. Similarly all the words will be stores in an array. Suppose we want to search Ada , we will add its ASCII codes and then we will divide it’s sum by size of an array and will take reminder which is index no. 9. We can perform a fast array lookup using index no.9. Introduction 1 2 3 4 5 6 7 8 9 10 Ada
Hash Table Word Letter 1 ASCII 1 Letter 2 ASCII 2 Letter 3 ASCII 3 Sum Mod 11 Ada A 65 d 100 a 97 262 9 Bea B 66 e 101 a 97 264 Jan J 74 a 97 n 110 281 6 Leo L 76 e 101 o 111 288 2 Lou L 76 o 111 u 117 304 7 Max M 77 a 97 x 120 294 8 Mia M 77 i 105 a 97 279 4 Sam S 83 a 97 m 109 289 3 Ted T 84 e 101 d 100 285 10 Tim T 84 i 105 m 109 298 1 Zoe Z 90 o 111 e 101 302 5 1 2 3 4 5 6 7 8 9 10 Bea Tim Leo Sam Mia Zoe Jan Lou Max Ada Ted
Hash Function 1 2 3 4 5 6 7 8 9 10 Bea Tim Leo Sam Mia Zoe Jan Lou Max Ada Ted int hash( const string &key, int tableSize ) { int hasVal = 0; for ( int i = 0 ; i < key.length (); i ++) hashVal += key[ i ]; return hashVal % tableSize ; }
Rather that just storing individual item of data in an array. Hash tables are often used store key value pairs For example Ada name would be the key which will be used to calculate index. Date of birth and other data would be her corresponding values. Using an array of Objects you can store as much data as you like for each key. A Hashing Algorithm also known as a Hash Function is the calculation applied to the key which may be a very large number or a very large string . A Calculation is applied to a key to transform it into an address Hash Table 1 2 3 4 5 6 7 8 9 10 Bea Tim Leo Sam Mia Zoe Jan Lou Max Ada Ted 27-01-1941 08-06-1955 31-12-1945 27-04-1791 20-02-1986 19-06-1978 13-02-1956 27-12-1822 23-04-1858 10-12-1815 17-06-1937 English English American American Russian American Polish French German English American Astronomer Inventor Mathematician Inventor Space Station Actress Logician Biologist Physicist Mathematician Philosopher
For numeric keys, divide the key by the number of available addresses (n) and take the remainder. Address = key Mod n For alphanumeric keys, divide the sum of ASCII codes in a key by the number of available addresses (n) and take the remainder. Folding method divides key into equal parts then adds the parts together E.g. The telephone number 01452 8345654, becomes 01 +45 + 28 +34 +56 + 54 = 218 Depending on size of table, may then divide by some constant and take remainder There are lot of different hash algorithms to choose from. Some are more appropriate then others depending on nature of your data. Hash Table
Some times if you apply a hash function to two different keys, it generates the same index number for both. But both items can not go to same place. This is known as a collision and need to resolve it. There are several methods for dealing with this: Close addressing Separate chaining Open addressing Linear Probing Lets load the array with different set of data. Collision
In linear probing , collisions are resolved by sequentially scanning an array (with wraparound) until an empty cell is found. Open Addressing Word Letter ASCII Letter ASCII Letter ASCII Sum Mod 11 New index Mia M 77 i 105 a 97 279 4 4 Tim T 84 i 105 m 109 298 1 1 Bea B 66 e 101 a 97 264 Zoe Z 90 o 111 e 101 302 5 5 Sue S 83 u 117 e 101 301 4 6 Len L 76 e 101 n 110 287 1 2 Moe M 77 o 111 e 101 289 3 3 Lou L 76 o 111 u 117 304 7 7 Rae R 82 a 97 e 101 280 5 8 Max M 77 a 97 x 120 294 8 9 Tod T 84 o 111 d 100 295 9 10 1 2 3 4 5 6 7 8 9 10 Bea Tim Len Moe Mia Zoe Sue Lou Rae Max Tod Hash (“Mia”, 11) = 4 Hash (“Sue”, 11) = 4 Hash (“Tim”, 11) = 1 Hash (“Len”, 11) = 1
The idea is to keep a list of all elements that hash to the same value. The array elements are pointers to the first nodes of the lists. A new item is inserted to the front of the list. Advantages : Better space utilization for large items. Simple collision handling: searching linked list. Overflow: we can store more items than the hash table size. Deletion is quick and easy: deletion from the linked list . Separate Chaining
Separate Chaining
Separate Chaining To Find Rae Rae = 82 + 97 + 101 = 280 280 % 11 = 5 Now Search from Linked List starting from index 5 of an array
It should minimize the collision so that less time is spent in collision resolution. And the data retrieval will be quicker. Ideally it should give uniform distribution of hash values. Therefore data will be spread across the hash table. The hash function should be easy to calculate . It should include a suitable method to resolve any collision that do occur. Objectives of Hash Function
Hash Tables are used to index large amount of data. The address of each key is calculated using the key itself. resolution. And the data retrieval will be quicker. Collision is resolved with either open or closed addressing. Hashing is widely used in database indexing, compilers, cashing, password authentication and more. Insertion, Deletion and retrieval of data from hash table occur in a constant time. Summary