Parameterized Matching Amir, Farach, Muthukrishnan Orgad Keller Modified by Ariel Rosenfeld.
Parameterized Matching
Amir, Farach, Muthukrishnan
Orgad Keller
Modified by Ariel Rosenfeld
Orgad Keller - Algorithms 2 - Recitation 9 2
Parametrized Match Relation
Definition: Two strings $S, S' \in (\Sigma \cup \Pi)^*$ with $|S| = |S'| = n$ parametrized match (p-match) if the following 3 conditions apply:
Conditions
1. For all $1 \le i \le n$: $S_i \in \Sigma \Leftrightarrow S'_i \in \Sigma$.
2. For all $1 \le i \le n$: $S_i, S'_i \in \Sigma \Rightarrow S_i = S'_i$.
3. For all $1 \le i, j \le n$: $S_i = S_j \Leftrightarrow S'_i = S'_j$.
Example
$\Sigma = \{a, b, c\}$, $\Pi = \{x, y, z, w\}$
$S$  = b a x x y x b b z y z x b y z
$S'$ = b a y y x y b b w x w y b x w
We can see it as a bijection $f : \Pi \to \Pi$:
$f(x) = y$, $f(z) = w$
$f(y) = x$, $f(w) = z$
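The relation can be checked directly from its definition. The following is a minimal sketch, not part of the original slides: the function name `p_match` and the choice to pass the fixed alphabet as a Python set are assumptions made for illustration. It builds the bijection between parameter symbols one position at a time and rejects as soon as a condition fails.

```python
def p_match(s, s2, sigma):
    """Return True if s and s2 p-match; sigma is the set of fixed symbols."""
    if len(s) != len(s2):
        return False
    fwd, bwd = {}, {}  # partial bijection between parameter symbols
    for c, d in zip(s, s2):
        # A fixed symbol may only face a fixed symbol.
        if (c in sigma) != (d in sigma):
            return False
        if c in sigma:
            # Fixed symbols must be identical.
            if c != d:
                return False
            continue
        # Parameter symbols must map one-to-one in both directions.
        if fwd.setdefault(c, d) != d or bwd.setdefault(d, c) != c:
            return False
    return True


# The example strings from the slide, with the spaces removed.
S = "baxxyxbbzyzxbyz"
S2 = "bayyxybbwxwybxw"
print(p_match(S, S2, set("abc")))  # True
```

Keeping both a forward and a backward dictionary is what enforces a bijection: a forward map alone would accept two different parameters mapped to the same symbol.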
Parametrized Matching
Input: $T = t_1 \ldots t_n$, $P = p_1 \ldots p_m$ over $\Sigma \cup \Pi$.
Output: All locations $i$ where $t_i \ldots t_{i+m-1}$ p-matches $P$.
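Observation
Given $T, P$ we'll define $T_\Sigma, T_\Pi, P_\Sigma, P_\Pi$ (in linear time):
$T_\Sigma[i] = \begin{cases} t_i & t_i \in \Sigma \\ a & t_i \in \Pi \end{cases}$, $T_\Pi[i] = \begin{cases} t_i & t_i \in \Pi \\ b & t_i \in \Sigma \end{cases}$
$P_\Sigma[i] = \begin{cases} p_i & p_i \in \Sigma \\ a & p_i \in \Pi \end{cases}$, $P_\Pi[i] = \begin{cases} p_i & p_i \in \Pi \\ b & p_i \in \Sigma \end{cases}$
where $a \notin \Sigma$, $b \notin \Pi$.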
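Observation
Now $T_\Sigma, P_\Sigma$ are over $\Sigma \cup \{a\}$ and $T_\Pi, P_\Pi$ are over $\Pi \cup \{b\}$.
We get the algorithm for p-match:
1. Create $T_\Sigma, T_\Pi, P_\Sigma, P_\Pi$.
2. Find all the places $P_\Sigma$ appears in $T_\Sigma$ (using KMP) $\to L_1$ (cond. 1+2).
3. Find all the places $P_\Pi$ m-matches in $T_\Pi$ (we'll show later how) $\to L_2$ (cond. 3).
4. Return $L_1 \cap L_2$.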
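The four steps can be sketched end to end as follows. This is an illustration only, under several assumptions: the names `m_match` and `p_match_locations` are hypothetical, the placeholder symbols `"#"` and `"$"` are assumed not to occur in the input, positions are 0-based, and both searches are naive quadratic stand-ins (a `str.startswith` scan instead of KMP, and a direct mapping check instead of the efficient m-match developed later).

```python
def m_match(p, w):
    """Naive check: does a one-to-one symbol mapping take p to w?"""
    fwd, bwd = {}, {}
    for c, d in zip(p, w):
        if fwd.setdefault(c, d) != d or bwd.setdefault(d, c) != c:
            return False
    return len(p) == len(w)


def p_match_locations(t, p, sigma, a="#", b="$"):
    """Return the 0-based positions where p p-matches a window of t."""
    # Step 1: project onto the fixed symbols (parameters become a)
    # and onto the parameters (fixed symbols become b).
    def proj_sigma(s):
        return "".join(c if c in sigma else a for c in s)

    def proj_pi(s):
        return "".join(b if c in sigma else c for c in s)

    t_s, p_s = proj_sigma(t), proj_sigma(p)
    t_p, p_p = proj_pi(t), proj_pi(p)
    m = len(p)
    windows = range(len(t) - m + 1)
    # Step 2: exact occurrences of the fixed-symbol projection (stand-in for KMP).
    l1 = {i for i in windows if t_s.startswith(p_s, i)}
    # Step 3: m-matches of the parameter projection (naive stand-in).
    l2 = {i for i in windows if m_match(p_p, t_p[i:i + m])}
    # Step 4: a position is reported only if it is in both lists.
    return sorted(l1 & l2)


print(p_match_locations("aybxazbz", "axby", set("ab")))  # [0]
```

In this run position 4 is in the first list but not the second (both pattern parameters would map to `z`), and position 2 is in the second list but not the first, so only position 0 survives the intersection.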
Exercise
Why is that enough? In other words: prove there is a p-match at location $i$ iff $i \in L_1 \cap L_2$. (HW)
We are left with the question: how do we solve step 3 efficiently?
Ariel Rosenfeld- Algorithms 2 - Recitation 9 9
M-match
When is the last occurrence?
We'll build an array $A[1], \ldots, A[m]$:
$A[i] = \begin{cases} i & p_i \notin \{p_1, \ldots, p_{i-1}\} \\ k & p_k = p_i,\ p_i \notin \{p_{k+1}, \ldots, p_{i-1}\} \end{cases}$
So, if $A[j] = j$, we know $p_j$ hasn't appeared before. Otherwise, we know exactly where it last appeared.
Can we do this efficiently?
Building the Array $A$
We'll hold a balanced binary search tree for the symbols of the alphabet. Initially it will be empty.
We'll go over the pattern. For each symbol $p_i$: if it isn't in the tree, we'll add it with its index and set $A[i] = i$. Otherwise, the tree tells us exactly where it last appeared, so we'll set $A[i]$ to that index and then update the symbol in the tree with the new index.
Time: $O(m \log \pi)$ where $\pi = \min\{m, |\Pi|\}$.
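A minimal sketch of this single pass is shown below. It deliberately swaps the balanced BST for a Python `dict` (a hash map), which gives expected constant-time lookups rather than the logarithmic worst case of the tree; the function name `build_last_occurrence` is an assumption, and index 0 of the returned list is left unused so that the indices agree with the 1-based notation of the slides.

```python
def build_last_occurrence(p):
    """A[i] = i if p_i is a first occurrence, else the index of its previous occurrence."""
    last = {}                # symbol -> index of its most recent occurrence
    A = [0] * (len(p) + 1)   # A[0] is unused padding for 1-based indexing
    for i, c in enumerate(p, start=1):
        A[i] = last.get(c, i)  # previous index if seen before, otherwise i itself
        last[c] = i            # record the new most recent occurrence
    return A


print(build_last_occurrence("xyxxz")[1:])  # [1, 2, 1, 3, 5]
```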
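The Matching Itself
Suppose $p_1 \ldots p_j$ is already matched against $t_{i-j+1} \ldots t_i$. We move forward if either
$p_{j+1} \notin \{p_1, \ldots, p_j\}$ and $t_{i+1} \notin \{t_{i-j+1}, \ldots, t_i\}$,
or $\exists\, 1 \le l \le j$ s.t. $p_{j+1} = p_{j+1-l}$ and $t_{i+1} = t_{i+1-l}$.
We'll hold and update a balanced BST as we go over the text as well. Time: $O(n \log \min\{n, |\Pi|\})$.
So the overall algorithm time is $O(n \log \min\{n, |\Pi|\})$. Can we improve this further?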
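The move-forward rule can be illustrated on a single alignment. The sketch below is not the KMP-style scan of the slides: it checks only one window, uses hash maps instead of balanced BSTs, and the name `m_match_window` and the 0-based offsets are assumptions. At each step it compares where the next pattern symbol and the next text symbol last appeared inside the window; the step succeeds when both are new, or when both last appeared at the same offset.

```python
def m_match_window(t, p, start):
    """Check whether p m-matches t[start:start+len(p)] using the move-forward rule."""
    m = len(p)
    if start < 0 or start + m > len(t):
        return False
    last_p, last_t = {}, {}  # symbol -> last offset seen in pattern / in window
    for j in range(m):
        pc, tc = p[j], t[start + j]
        kp = last_p.get(pc)  # None means pc has not appeared before
        kt = last_t.get(tc)  # None means tc has not appeared in the window
        # Move forward only if both symbols are new, or both last
        # appeared at the same offset (the same distance l back).
        if kp != kt:
            return False
        last_p[pc] = j
        last_t[tc] = j
    return True


print([m_match_window("abacc", "xyx", s) for s in range(3)])  # [True, False, False]
```

Comparing last occurrences is enough here because the prefix already matched: if the pattern symbol last appeared at some offset, the only text symbol consistent with the mapping is the one at that same offset.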
The Trick
We'll split the text into $\frac{n}{m}$ overlapping segments of size $2m$ like this:
[Figure: a text of length $n$ covered by overlapping segments of size $2m$.]
So every match in the text must appear in whole in one of the segments.
We'll run the algorithm for each such segment. Time: $O(m \log \pi)$ where $\pi = \min\{m, |\Pi|\}$.
Overall for all segments: $\frac{n}{m} \cdot O(m \log \pi) = O(n \log \pi)$.
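One way to realize the split is sketched below, assuming that consecutive segments start $m$ positions apart (the slides only show this in the figure) and using 0-based half-open ranges; the function name `segments` is hypothetical. With this layout any window of length $m$ starting at position $i$ lies inside the segment that starts at the largest multiple of $m$ not exceeding $i$.

```python
def segments(n, m):
    """Half-open (start, end) ranges of length at most 2m, one starting every m positions."""
    # A segment is only needed if a full window of length m can start in it,
    # so the last start is the largest multiple of m that is <= n - m.
    return [(s, min(s + 2 * m, n)) for s in range(0, n - m + 1, m)]


print(segments(10, 3))  # [(0, 6), (3, 9), (6, 10)]
```

A window that starts exactly on a segment boundary lies inside two neighbouring segments, so a caller that runs the matcher per segment would need to deduplicate the reported positions.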