DEFN., Hashing in the data structure is a technique of mapping a large chunk of data into small tables using a hashing function. It is also known as the message digest function. It is a technique that uniquely identifies a specific item from a collection of similar items
Hash function A hash function is a function that takes a set of inputs of any arbitrary size and fits them into a table or other data structure that contains fixed-size elements . The table or data structure generated is usually called a hash table.
Separate chaining/Open hashing To handle the collision, This technique creates a linked list to the slot for which collision occurs. The new key is then inserted in the linked list. These linked lists to the slots appear like chains. That is why, this technique is called as separate chaining .
Advantages: 1) Simple to implement. 2) Hash table never fills up, we can always add more elements to the chain. 3) Less sensitive to the hash function or load factors. 4) It is mostly used when it is unknown how many and how frequently keys may be inserted or deleted. Disadvantages: 1) Cache performance of chaining is not good as keys are stored using a linked list. Open addressing provides better cache performance as everything is stored in the same table. 2) Wastage of Space (Some Parts of hash table are never used) 3) If the chain becomes long, then search time can become O(n) in the worst case. 4) Uses extra space for links.
Disadvantage of separate chaining
Open addressing/Closed Hashing Linear probing Linear probing is a scheme in computer programming for resolving collisions in hash tables , data structures for maintaining a collection of key–value pairs and looking up the value associated with a given key. In these schemes, each cell of a hash table stores a single key–value pair . h´(𝑥) = 𝑥 𝑚𝑜𝑑 𝑚 ℎ(𝑥, 𝑖) = (ℎ´(𝑥) + 𝑖)𝑚𝑜𝑑 𝑚 The value of i| = 0, 1, . . ., m – 1. So we start from i =
Drawbacks of linear probing
Quadratic probing Quadratic probing operates by taking the original hash index and adding successive values of an arbitrary quadratic polynomial until an open slot is found . h´ = (𝑥) = 𝑥 𝑚𝑜𝑑 𝑚 ℎ(𝑥, 𝑖) = (ℎ´(𝑥) + 𝑖 2 )𝑚𝑜𝑑 𝑚 We can put some other quadratic equations also using some constants The value of i = 0, 1, . . ., m – 1. So we start from i = 0, and increase this until we get one free space.
Properties of quadratic probing
Double hashing Double hashing uses the idea of applying a second hash function to key when a collision occurs. is size of hash table . First hash function is typically hash1(key ) = key % TABLE_SIZE A popular second hash function is : hash2(key ) = PRIME – (key % PRIME) where PRIME is a prime smaller than the TABLE_SIZE.
R ehashing Rehashing is the process of re-calculating the hashcode of already stored entries (Key-Value pairs), to move them to another bigger size hashmap when the threshold is reached/crossed. Rehashing of a hash map is done when the number of elements in the map reaches the maximum threshold value.
Extendible hasing Extendible hashing is a dynamically updateable disk-based index structure which implements a hashing scheme utilizing a directory . The index is used to support exact match queries, i.e., find the record with a given key. Overflows are handled by doubling the directory which logically doubles the number of buckets.
Content beyon d syllabus RED BLACK TREES Red Black Tree is a Binary Search Tree in which every node is colored either RED or BLACK . Rules That Every Red-Black Tree Follows: Every node has a colour either red or black. The root of the tree is always black. There are no two adjacent red nodes (A red node cannot have a red parent or red child). Every path from a node (including root) to any of its descendants NULL nodes has the same number of black nodes . http:// www.btechsmartclass.com/data_structures/red-black-trees.html