KMP by Rajeev K Krishna
Transcript of KMP by Rajeev K Krishna
8/14/2019 KMP by Rajeev K Krishna
Introduction Knuth-Morris-Pratt algorithm Final remarks
The Knuth-Morris-Pratt algorithm
Kshitij Bansal
Chennai Mathematical Institute
Undergraduate Student
August 25, 2008
Introduction
    The Problem
    Naive algorithm
Knuth-Morris-Pratt algorithm
    Can we do better?
    Improved algorithm
    Computing f
Final remarks
    Conclusion
    Think/read about
    References
    Thank you
The Problem
Given a piece of text, find if a smaller string occurs in it.
Let T[1..n] be an array that holds the text. Call the smaller string to be searched for the pattern: P[1..m].
Naive Algorithm
Naive-String-Matcher(T[1..n],P[1..m])
for i from 0 to n - m
    if P[1..m] = T[i+1..i+m]
        print "Pattern found starting at position", i + 1
Time complexity: O(mn). Space complexity: O(1).
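The naive matcher can be sketched in Python as follows. This is an illustrative translation, not code from the slides: the function name `naive_string_matcher` and the choice to return a list of 1-based start positions (shift + 1) instead of printing are assumptions made here.

```python
def naive_string_matcher(text: str, pattern: str) -> list[int]:
    """Return every 1-based position at which pattern starts in text."""
    n, m = len(text), len(pattern)
    positions = []
    # i is the shift: text[i:i + m] is T[i+1 .. i+m] in the slides'
    # 1-based notation.
    for i in range(n - m + 1):
        if text[i:i + m] == pattern:  # up to m character comparisons
            positions.append(i + 1)   # convert the shift to a 1-based position
    return positions


print(naive_string_matcher("abaabaabac", "abaabac"))  # [4]
```

Each of the n - m + 1 shifts may compare up to m characters, which is where the O(mn) bound comes from.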
Can we do better?
Text: abaabaabac
Pattern: abaabac
The naive algorithm fails when it tries to match the seventh character of the text against the pattern. It then discards everything it learned from the first seven characters of the text and starts afresh. Can we somehow use this information to do better?
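To make "this information" concrete, here is a small brute-force sketch (not from the slides; the helper name `reusable_overlap` is hypothetical). For the example above it measures how many pattern characters matched before the failure, and how many of the most recently read text characters already form a prefix of the pattern.

```python
def reusable_overlap(text: str, pattern: str) -> tuple[int, int]:
    """Return (matched, overlap) for an attempt at shift 0.

    matched: number of leading pattern characters that agree with the text.
    overlap: length of the longest proper suffix of the matched part that
             is also a prefix of the pattern (found by brute force).
    """
    matched = 0
    while (matched < len(pattern) and matched < len(text)
           and text[matched] == pattern[matched]):
        matched += 1

    seen = pattern[:matched]
    overlap = 0
    for k in range(matched - 1, 0, -1):  # try the longest candidate first
        if seen.endswith(pattern[:k]):
            overlap = k
            break
    return matched, overlap


print(reusable_overlap("abaabaabac", "abaabac"))  # (6, 3)
```

Six characters match before the mismatch, and the last three of them ("aba") are again the first three characters of the pattern. A matcher that knows this can resume by comparing the fourth pattern character with the same text character, without re-reading any text.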
Improved algorithm
Improved-String-Matcher(T[1..n],P[1..m],f[1..m])
1 q = 0 (number of characters matched)
2 for i from 1 to n:
3     if P[q + 1] = T[i]:
4         q = q + 1
5     else if q > 0:
6         q = f[q]
7         goto line 3
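Does this really help? What happens when the text is abaabadaba...?
Note that q can increase by at most 1 at each step. Each fallback q = f[q] strictly decreases q, and q never drops below 0, so the total number of fallbacks over the whole run is at most n.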
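A possible Python rendering of Improved-String-Matcher is sketched below. Several details are assumptions rather than part of the slides: the `while` loop stands in for the `goto`, `f` is passed as a list with an unused entry at index 0 so that `f[q]` lines up with the 1-based table defined under "Computing f", and the `q == m` branch that records a match and keeps searching is an addition, since the pseudocode does not show how a match is reported.

```python
def improved_string_matcher(text: str, pattern: str, f: list[int]) -> list[int]:
    """Return the 1-based start positions of pattern in text.

    f[q], for q = 1..m, must be the length of the longest proper suffix of
    pattern[:q] that is also a prefix of pattern; f[0] is unused padding.
    """
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # assumption: an empty pattern yields no matches
    positions = []
    q = 0  # number of pattern characters currently matched
    for i in range(n):
        # Pseudocode lines 5-7: fall back until text[i] can extend the
        # match or nothing is matched any more.
        while q > 0 and pattern[q] != text[i]:
            q = f[q]
        if pattern[q] == text[i]:  # pseudocode lines 3-4
            q += 1
        if q == m:  # addition: record the match, then continue searching
            positions.append(i - m + 2)  # 1-based start position
            q = f[q]
    return positions


f = [0, 0, 0, 1, 1, 2, 3, 0]  # hand-computed table for "abaabac"
print(improved_string_matcher("abaabaabac", "abaabac", f))  # [4]
print(improved_string_matcher("abaabadaba", "abaabac", f))  # []
```

On the text abaabadaba, the character d triggers three fallbacks in a row (q goes 6, 3, 1, 0), but those fallbacks are paid for by the six earlier increments of q, and the index i never moves backwards.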
Computing f
f[i] denotes the length of the longest proper suffix of P[1..i] that is also a prefix of P[1..m].
Just match the pattern with itself.
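One way to read "match the pattern with itself" is to run the same fallback loop with the pattern playing both roles, starting from its second character, and to record q after each step. The sketch below follows that reading. The function name `compute_f` and the padded list layout (index 0 unused, so the list index equals the 1-based i) are assumptions made for illustration.

```python
def compute_f(pattern: str) -> list[int]:
    """Return f as a list of length m + 1, where f[i] (1 <= i <= m) is the
    length of the longest proper suffix of pattern[:i] that is also a
    prefix of pattern. f[0] is unused padding."""
    m = len(pattern)
    f = [0] * (m + 1)  # f[1] = 0: a single character has no proper overlap
    q = 0  # length of the prefix currently matched against the suffix
    for i in range(1, m):  # pattern[i] is character i + 1 in 1-based terms
        # Fall back through shorter overlaps using entries already computed.
        while q > 0 and pattern[q] != pattern[i]:
            q = f[q]
        if pattern[q] == pattern[i]:
            q += 1
        f[i + 1] = q
    return f


print(compute_f("abaabac"))  # [0, 0, 0, 1, 1, 2, 3, 0]
```

By the same counting argument as for the matcher (q rises by at most 1 per step), this takes O(m) time, and the table itself accounts for the O(m) extra space.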
Conclusion
KMP algorithm
Time complexity: O(n + m)
Space complexity: O(m)
Think/read about
Number of distinct substrings in a string
Multiple patterns and single text
Matching regular expressions
References
Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, Clifford Stein (2001). Section 32.4: The Knuth-Morris-Pratt algorithm. Introduction to Algorithms, Second edition, MIT Press and McGraw-Hill, 923–931. ISBN 978-0-262-03293-3.
The original paper: Donald Knuth, James H. Morris, Jr., Vaughan Pratt (1977). Fast pattern matching in strings. SIAM Journal on Computing 6 (2): 323–350. doi:10.1137/0206024.
An explanation of the algorithm and sample C++ code by David Eppstein: http://www.ics.uci.edu/~eppstein/161/960227.html
Thank you
Questions?
The End