Hashing a searching technique in data structures

shiks1234 20 views 28 slides Oct 19, 2024

Slide 1 of 28

About This Presentation

searching technique

Size: 1016.42 KB

Language: en

Added: Oct 19, 2024

Slides: 28 pages

Slide Content

Course Name: Data Structure and Algorithm Topic: Hashing Course code : CS 2103 Credits : 4 Mode of delivery : Hybrid (Power point presentation) Faculty : Dr. Atul Kumar Verma Email-id : [email protected] B.TECH III Sem CSE Academic YEAR: 2024-2025 1 8/2/2024 Satpal Singh Kushwaha

Assessment criteria’s Assignment quiz Mid term examination End term Examination 8/2/2024 Satpal Singh Kushwaha 2

Hashing

Why do we need hashing? Many applications deal with lots of data Search engines and web pages There are myriad look ups. The look ups are time critical. Typical data structures like arrays and lists, may not be sufficient to handle efficient lookups In general : When look-ups need to occur in near constant time. O(1)

Why do we need hashing? Consider the internet(2002 data): By the Internet Software Consortium survey at http://www.isc.org/ in 2001 there are 125,888,197 internet hosts, and the number is growing by 20% every six month! Using the best possible binary search it takes on average 27 iterations to find an entry. By an survey by NUA at http://www.nua.ie/ there are 513.41 million users world wide.

Why do we need hashing? We need something that can do better than a binary search, O(log N). We want, O(1). Solution: Hashing In fact hashing is used in: Web searches Spell checkers Databases Compilers passwords Many others

Bu ildi ng a n i n d ex u si ng H as hM ap s WORD N D O C S PTR jezebel 20 jezer 3 jezerit 1 jeziah 1 jeziel 1 jezliah 1 jezoar 1 jezrahliah 1 jezreel 39 j e z o a r 34 6 1 118 2087 3922 3981 5002 44 3 215 2291 3010 56 4 5 22 134 992 D O C I D O CCU R POS 1 POS 2 . . . 566 3 203 245 287 67 1 132 . . .

The concept Suppose we need to find a better way to maintain a table (Example: a Dictionary) that is easy to insert and search in O(1) .

Big Idea in Hashing Let S={a 1 ,a 2,… a m } be a set of objects that we need to map into a table of size N. Find a function such that H:S [1…n] Ideally we’d like to have a 1-1 map But it is not easy to find one Also function must be easy to compute It is a good idea to pick a prime as the table size to have a better distribution of values

Finding a hash Function Assume that N = 5 and the values we need to insert are: cab, bea, bad etc. Let a=0, b=1, c=2, etc Define H such that H[data] = ( ∑ characters) Mod N H[cab] = (2+0+1) Mod 5 = 3 H[bea] = (1+4+0) Mod 5 = H[bad] = (1+0+3) Mod 5 = 4

Collisions What if the values we need to insert are “abc”, “cba”, “bca” etc… They all map to the same location based on our map H ( obviously H is not a good hash map) This is called “ Collision ” When collisions occur, we need to “handle” them Collisions can be reduced with a selection of a good hash function

Choosing a Hash Function A good hash function must Be easy to compute Avoid collisions How do we find a good hash function? A bad hash function Let S be a string and H(S) = Σ S i where S i is the i th character of S Why is this bad?

Collisions Hash functions can be many-to-1 They can map different search keys to the same hash key. hash1(`a`) == 9 == hash1(`w`) Must compare the search key with the record found If the match fails, there is a collision

Collision Resolving strategies Separate chaining Open addressing Linear Probing Quadratic Probing Double Probing Etc.

Separate Chaining Collisions can be resolved by creating a list of keys that map to the same value

Linear Probing The idea: Table remains a simple array of size N On insert(x), compute f(x) mod N, if the cell is full, find another by sequentially searching for the next available slot Go to f(x)+1, f(x)+2 etc.. On find(x), compute f(x) mod N, if the cell doesn’t match, look elsewhere. Linear probing function can be given by h(x, i) = (f(x) + i) mod N (i=1,2,….)

Linear Probing f(i) = i Probe sequence: th probe = h(k) mod TableSize 1 th probe = (h(k) + 1) mod TableSize 2 th probe = (h(k) + 2) mod TableSize . . . i th probe = (h(k) + i) mod TableSize

Figure 20.4 Linear probing hash table after each insertion

Linear Probing Example Consider H(key) = key Mod 6 (assume N=6) H(11)=5, H(10)=4, H(17)=5, H(16)=4,H(23)=5 Draw the Hash table 1 1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5

Clustering Problem As long as table is big enough, a free cell can always be found, but the time to do so can get quite large. Worse, even if the table is relatively empty, blocks of occupied cells start forming. This effect is known as primary clustering . Any key that hashes into the cluster will require several attempts to resolve the collision, and then it will add to the cluster.

Linear Probing – Clustering no collision no collision collision in small cluster collision in large cluster Primary Clustering. Nodes hash to same key clump. Nodes hash to same area clump.

Linear Probing How about deleting items from Hash table? Item in a hash table connects to others in the table Deleting items will affect finding the others “ Lazy Delete ” – Just mark the items as inactive rather than removing it.

Lazy Delete Naïve removal can leave gaps! I n se rt f Re m o ve e a b c 3 e 5 d 8 j 8 u 10 g 8 s a b c 5 d 3 f 8 j 8 u 10 g 8 s a b c 3 e 5 d 3 f 8 j 8 u 10 g 8 s F i nd f a b c 5 d 3 f 8 j 8 u 10 g 8 s “3 f” means search key f and hash key 3

Lazy Delete Clever removal I n se rt f Re m o ve e a b c 3 e 5 d 8 j 8 u 10 g 8 s a b c g o n e 5 d 3 f 8 j 8 u 10 g 8 s a b c 3 e 5 d 3 f 8 j 8 u 10 g 8 s F i nd f a b c g o n e 5 d 3 f 8 j 8 u 10 g 8 s “3 f” means search key f and hash key 3

Quadratic Probing Quadratic probing is a scheme in computer programming for resolving collisions in hash tables. Quadratic probing operates by taking the original hash value and adding successive values of an arbitrary quadratic polynomial to the starting value.

26 Quadratic Probing f(i) = i 2 Probe sequence: th probe = h(k) mod TableSize 1 th probe = (h(k) + 1) mod TableSize 2 th probe = (h(k) + 4) mod TableSize 3 th probe = (h(k) + 9) mod TableSize . . . i th probe = (h(k) + i 2 ) mod TableSize f ( i +1) = f ( i ) + 2 i + 1 Less likely to encounter Primary Clustering

Quadratic Probing Problem: We may not be sure that we will probe all locations in the table (i.e. there is no guarantee to find an empty cell if table is more than half full.) If the hash table size is not prime this problem will be much severe. Quadratic probing has not yet been mathematically analyzed. Although quadratic probing eliminates primary clustering, elements that hash to the same location will probe the same alternative cells. This is know as secondary clustering.

CENG 213 Data Structures 28 Hashing Applications Compilers use hash tables to implement the symbol table (a data structure to keep track of declared variables). Game programs use hash tables to keep track of positions it has encountered ( transposition table ) Online spelling checkers.

Hashing a searching technique in data structures

About This Presentation

Slide Content

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Hashing a searching technique in data structures

About This Presentation

Slide Content

Slide 1

Slide 2

Slide 3

Slide 4

Slide 5

Slide 6

Slide 7

Slide 8

Slide 9

Slide 10

Slide 11

Slide 12

Slide 13

Slide 14

Slide 15

Slide 16

Slide 17

Slide 18

Slide 19

Slide 20

Slide 21

Slide 22

Slide 23

Slide 24

Slide 25

Slide 26

Slide 27

Slide 28

Tags

Categories

Download

Quick Actions

Statistics

Related Slideshows

Pray For The Peace Of Jerusalem and You Will Prosper

Don_t_Waste_Your_Life_God.....powerpoint

VILLASUR_FACTORS_TO_CONSIDER_IN_PLATING_SALAD_10-13.pdf

Fertility awareness methods for women in the society

Chapter 5 Arithmetic Functions Computer Organisation and Architecture

syakira bhasa inggris (1) (1).pptx.......