RABIN KARP ALGORITHM STRING MATCHING

AbhishekSingh1904 1,264 views 12 slides Jul 09, 2017
Slide 1
Slide 1 of 12
Slide 1
1
Slide 2
2
Slide 3
3
Slide 4
4
Slide 5
5
Slide 6
6
Slide 7
7
Slide 8
8
Slide 9
9
Slide 10
10
Slide 11
11
Slide 12
12

About This Presentation

RABIN KARP algorithm with hash function and hash collision, analysis, algorithm and code for implementation. Besides it contains applications of RABIN KARP algorithm also


Slide Content

PRESENTED BY ABHISHEK KUMAR SINGH RABIN KARP String Matching Algorithm

Let text string be T of length N Pattern string be P of length M Example T=“hello world”; N=11; P=“llo”; M=3 Problem Statement H E L L O W O R L D L L O

The Rabin-Karp string matching algorithm calculates a hash value for the pattern, as well as for each M-character subsequence of given text to be compared. If the hash values for a particular subsequence are unequal, the algorithm will calculate the hash value for next M-character sequence. If the hash values are equal, the algorithm will do a Brute Force comparison between the pattern and the M-character sequence. Therefore there is only one comparison per text subsequence, and Brute Force is only needed when hash values match. Note: Only one comparison for sub sequence Rabin-Karp Algorithm

Consider an M-character sequence as an M-digit number in the alphabet. The text subsequence t[I ... i+M-1] is mapped to the number x(i) = t[i]*3^0+ t[i+1]*b^1+...+ t[i+M-1]*3^(M-1) Furthermore, given x(i) we can compute x(i+1) for the next subsequence t[i+1 .. i+M] in constant time,as follows: x(i+1) = (x(i)-(t[i]))/3+(3^(M-1))*t[i+M-1] Rolling Hash Function

A B D A B C Example tHash = Hash(“ABD”) = 1*3^0+2*3^1+4*3^2= 43 pHash = Hash(“ABC”) = 1*3^0+2*3^1+3*3^2 = 34 tHash == pHash  FALSE tHash = Hash(“BDA”) = 2*3^0+4*3^1+1*3^2 = 23 tHash == pHash  FALSE tHash = Hash(“DAB”) = 4*3^0+1*3^1+2*3^2 = 25 tHash == pHash  FALSE A B D A B C A B D A B C A B D A B C tHash = Hash(“ABC”) = 30+30+40 = 34 tHash == pHash  TRUE

TEXT : PATTERN : HERE PATTERN MATCHES THE SUBSTRING SO INDEX NO. 3 IS RETURNED Incase of HIT A B D A B C A B C

int main() { string s1,s2; cin>>s1>>s2; int l1=s1.size(),l2=s2.size(),i,j,k; int hs1=0,p=0,ar[l1]; for(i=0;i<l2;i++) hs1+=(int)s2[i]*pow(3,i); int hs2=0; for(i=0;i<l1;i++) { if(i==0) { for(k=i;k<l2;k++) hs2+=(int)s1[k]*pow(3,k); } else { hs2-=(int)s1[i-1]; hs2/=3; hs2+=(int)s1[i+l2-1]*pow(3,l2-1); } int c=0; if(hs2==hs1) { for(j=i;j<i+l2;j++) if(s1[j]==s2[j-i]) c++; else break; } if(c==l2) ar[p++]=i; } cout<<l1<<endl; cout<<"Stored index are“\n; for(i=0;i<p;i++) cout<<ar[i]<<" "; return 0; }

Hash to two string match then it is called Hit There is possibility Hash of “abc” is 34 Hash of “dga” is 34 This is called Hash Collision Minimize Collision by Taking mod with prime number HASH COLLISION

Hash of Pattern O(m) Hash Comparison O(n-m+1) = O(n) ; m < n Average Running Time O(m+n) Worst Case Running Time m comparison in each iteration O(mn) ANALYSIS

Keyword matching in large files Good for Plagiarism detection Searching engines Database searching Bioinformatics APPLICATION

CLRS – 2 nd Edition, Page 911-916 slideshare.net/GajanandSharma1/rabin-karp-string-matching-algorithm stoimen.com/blog/2012/04/02/computer-algorithms-rabin-karp-string-searching/ en.wikipedia.org/wiki/Rabin%E2%80%93Karp_algorithm cs.princeton.edu/courses/archive/fall04/cos226/lectures/string.4up.pdf References

THANK YOU