Function Matching Amihood Amir Yonatan Aumann Moshe Lewenstein Ely Porat Bar Ilan University.



Page 1: Function Matching Amihood Amir Yonatan Aumann Moshe Lewenstein Ely Porat Bar Ilan University.

Function Matching

Amihood Amir

Yonatan Aumann

Moshe Lewenstein

Ely Porat

Bar Ilan University

Page 2

Baker’s Parameterized Matching

Prog.c

int a,b;

a = 1;
a = g(a)*5+f(a);
b = 2;
a = func(a,b);
a = a*g(b);
b = 1;
b = g(b)*5+f(b);
….

Page 3

Baker’s Parameterized Matching

Prog.c

int a,b;

a = 1;
a = g(a)*5+f(a);
b = 2;
a = func(a,b);
a = a*g(b);
b = 1;
b = g(b)*5+f(b);
….

Pattern

c = 1;
c = g(c)*5+f(c);

Baker’s work: pdup, dupstat, psearch

SICOMP 1997, JCSS 1996
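The matching shown on this slide can be sketched in a few lines. This is a naive window-by-window check, not Baker's suffix-tree or Boyer-Moore method: it assumes a simple tokenizer and treats the variable names a, b, c as the parameters (both are illustrative choices, not from the slides). Each window is normalized by replacing a parameter with the rank of its first appearance, so two fragments compare equal exactly when a one-to-one renaming of parameters maps one to the other.

```python
import re

# Hypothetical tokenizer: identifiers, integers, or single punctuation marks.
TOKEN = re.compile(r"[A-Za-z_]\w*|\d+|\S")

def normalize(tokens, params):
    """Replace each parameter by the rank of its first appearance in the window."""
    rank = {}
    out = []
    for tok in tokens:
        if tok in params:
            rank.setdefault(tok, len(rank))
            out.append(("param", rank[tok]))
        else:
            out.append(("fixed", tok))
    return out

def p_match_offsets(text_src, pattern_src, params):
    """Return the 0-based token offsets at which the pattern parameterized-matches."""
    text = TOKEN.findall(text_src)
    pattern = TOKEN.findall(pattern_src)
    want = normalize(pattern, params)
    m = len(pattern)
    return [i for i in range(len(text) - m + 1)
            if normalize(text[i:i + m], params) == want]

prog = "a=1;a = g(a)*5+f(a);b=2;a = func(a,b);a = a*g(b);b=1;b = g(b)*5+f(b);"
pattern = "c=1;c = g(c)*5+f(c);"
offsets = p_match_offsets(prog, pattern, params={"a", "b", "c"})
print(offsets)
```

The two reported offsets correspond to the statements starting with a=1 and b=1.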

Page 4

Two dimensional parameterized matching

pattern

‘A horse is a horse, it ain’t make a difference what color it is’ (John Wayne)

Page 5

Parameterized Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ for which a bijection $\pi: \Sigma_P \to \Sigma_T$ exists s.t.

$\pi(P) = \pi(p_1)\pi(p_2)\dots\pi(p_m) = t_i \dots t_{i+m-1}$

Page 6

Parameterized Matching

• One dimensional

• Baker 1996, JCSS - Suffix Trees

• Baker 1997, SICOMP - Boyer Moore

• Amir, Farach, Muthu 1995, IPL - Knuth-Morris-Pratt

• Two dimensional

Regular methods fail !!

Page 7

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

Page 8

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

Page 9

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

f(h) = b, f(e) = c, f(a) = a

Page 10

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

f(h) = a, f(e) = d, f(a) = b

Page 11

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

f(h) = d, f(e) = a, f(a) = d

Page 12

Function Matching

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

f(h) = ?? No match!

Page 13

Function Matching vs. Parameterized Matching

P p-matches $t_i \dots t_{i+m-1}$ iff

1. P f-matches $t_i \dots t_{i+m-1}$, and

2. # of distinct symbols in $t_i \dots t_{i+m-1}$ = # of distinct symbols in P

P = h e h a e h
T = a b c b a c b a d a b d a d d a d

f(h) = d, f(e) = a, f(a) = d (an f-match but not a p-match: the text window has only two distinct symbols)

f(h) = b, f(e) = c, f(a) = a (both an f-match and a p-match)
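The two conditions above translate directly into a check on a single text window. The following is a minimal sketch; the helper names `f_matches` and `p_matches` are illustrative and not from the slides.

```python
def f_matches(pattern, window):
    """True if some function maps each pattern symbol to the text symbol beneath it."""
    image = {}
    for p, t in zip(pattern, window):
        # setdefault records the first image of p and returns it on later visits.
        if image.setdefault(p, t) != t:
            return False
    return True

def p_matches(pattern, window):
    """Condition 1 (f-match) plus condition 2 (same number of distinct symbols)."""
    return f_matches(pattern, window) and len(set(window)) == len(set(pattern))

P = "hehaeh"
T = "abcbacbadabdaddad"
for location in (2, 12):  # 1-based locations taken from the slide's two mappings
    window = T[location - 1:location - 1 + len(P)]
    print(location, window, f_matches(P, window), p_matches(P, window))
```

At location 2 both checks succeed; at location 12 only the f-match check does, because h and a are both mapped to d.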

Page 14

Naïve Algorithm

At each location i of text T, check if the pattern f-matches.

Check:
  For each letter ‘a’ in the pattern:
    Are the text elements aligned with the pattern’s ‘a’s all the same? If not, declare ‘no match’.
  All letters “OK”: declare ‘match’.

Running time: O(nm), where m = |P| and n = |T|
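A minimal sketch of the naive check, assuming the text and pattern are plain strings and that locations are reported 1-based as on the slides (the function name is illustrative):

```python
def naive_function_matching(text, pattern):
    """Return the 1-based locations of text at which pattern f-matches, in O(nm) time."""
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):
        image = {}  # pattern symbol -> the text symbol it must map to at this location
        ok = True
        for j in range(m):
            p, t = pattern[j], text[i + j]
            if image.setdefault(p, t) != t:
                ok = False  # two occurrences of p sit above different text symbols
                break
        if ok:
            matches.append(i + 1)
    return matches

print(naive_function_matching("abcbacbadabdaddad", "hehaeh"))
```

On the running example this reports the three locations whose mappings were shown on the earlier slides.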

Page 15

Function Matching with Don’t Cares

Input: $P = p_1 \dots p_m$ over alphabet $\Sigma_P \cup \{?\}$; $T = t_1 \dots t_n$ over alphabet $\Sigma_T$.

Output: locations $i$ of $T$ where a function $f: \Sigma_P \to \Sigma_T$ exists s.t.

$f(P) = f(p_1)f(p_2)\dots f(p_m) = t_i \dots t_{i+m-1}$, where f(?) is a wildcard.

P = h e ? ? e h
T = a b c b a c b c d b c d a d d a d

Page 16

Why do we need don’t cares?

[Figure: Pattern and Text]

Page 17

Linearize Text and Pattern

[Figure: Text and Pattern; the text is written out line by line: T = Line 1, Line 2, …]

Page 18

Linearize Text and Pattern

[Figure: Text (lines of length n) and Pattern (lines of length m), both written out line by line. T = Line 1, Line 2, …, Line 5, Line 6, …; in P, runs of n-m don’t cares (?) fill the space between consecutive pattern lines.]

Page 19

Polynomial Multiplication - Convolutions

  t1 t2 t3 t4 . . . tn-2 tn-1 tn
× pm pm-1 . . . p2 p1

Partial products (each row is shifted one position to the left of the row above it):

  p1t1 p1t2 . . . p1tn-2 p1tn-1 p1tn
  p2t1 p2t2 p2t3 . . . p2tn-2 p2tn-1 p2tn
  p3t1 p3t2 p3t3 p3t4 . . . p3tn-1 p3tn
  . . .
  pmt1 . . . pmtm pmtm+1 . . . pmtn-1 pmtn

Running time: O(n log m)
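The multiplication above can be carried out with a fast Fourier transform. The sketch below is a textbook recursive FFT in pure Python, offered as an illustration rather than the authors' implementation; the names `fft`, `multiply` and `correlate` are assumptions. It runs in O(n log n); the O(n log m) bound is usually obtained by cutting the text into overlapping pieces of length 2m and multiplying each piece separately, which is not shown here.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out

def multiply(t, p):
    """Coefficients of the product of two integer polynomials."""
    size = 1
    while size < len(t) + len(p) - 1:
        size *= 2
    ft = fft([complex(x) for x in t] + [0j] * (size - len(t)))
    fp = fft([complex(x) for x in p] + [0j] * (size - len(p)))
    prod = fft([x * y for x, y in zip(ft, fp)], invert=True)
    # The inverse transform above is unnormalized, so divide by size here.
    return [round(v.real / size) for v in prod[:len(t) + len(p) - 1]]

def correlate(t, p):
    """Entry i (0-based) is the sum over j of p[j] * t[i + j], for every full alignment."""
    m = len(p)
    full = multiply(t, p[::-1])  # multiply the text by the reversed pattern
    return full[m - 1:len(t)]

print(multiply([1, 2, 3], [4, 5]))
print(correlate([1, 2, 3, 4], [1, 1]))
```

Reversing the pattern before multiplying is what turns polynomial multiplication into the sliding sum of products used on the following slides.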

Page 20

Convolutions: Fischer-Paterson [1974]

  t1 t2 t3 t4 . . . tn-2 tn-1 tn
× pm pm-1 . . . p2 p1

Partial products (each row is shifted one position to the left of the row above it):

  p1t1 p1t2 . . . p1tn-2 p1tn-1 p1tn
  p2t1 p2t2 p2t3 . . . p2tn-2 p2tn-1 p2tn
  p3t1 p3t2 p3t3 p3t4 . . . p3tn-1 p3tn
  . . .
  pmt1 . . . pmtm pmtm+1 . . . pmtn-1 pmtn

With the pattern p1 p2 p3 p4 . . . pm aligned at the first text location, the corresponding column of the product sums to

$\sum_{i=1}^{m} p_i t_i$

Page 21

Convolutions: Fischer-Paterson [1974]

  t1 t2 t3 t4 . . . tn-2 tn-1 tn
× pm pm-1 . . . p2 p1

Partial products (each row is shifted one position to the left of the row above it):

  p1t1 p1t2 . . . p1tn-2 p1tn-1 p1tn
  p2t1 p2t2 p2t3 . . . p2tn-2 p2tn-1 p2tn
  p3t1 p3t2 p3t3 p3t4 . . . p3tn-1 p3tn
  . . .
  pmt1 . . . pmtm pmtm+1 . . . pmtn-1 pmtn

With the pattern p1 p2 p3 p4 . . . pm shifted one position to the right, the corresponding column of the product sums to

$\sum_{i=1}^{m} p_i t_{i+1}$

Page 22

How does this help for Function Matching?

The property that needs to be checked is: beneath each symbol from the pattern alphabet, all text characters must be the same.

Page 23

Example

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

Page 24

Example: h in P vs. a in T

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

T_a = 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
P^R_h = 0 0 1 0 0 1 0 1

Page 25

Example: h - a

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

T_a (first row) multiplied by P^R_h (second row):

              1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
                                    0 0 1 0 0 1 0 1

Partial products, one shifted copy of T_a for each 1 in P^R_h (the rows for its 0 entries are all zeros):

              1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
          1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
    1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1

Column sums:

0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1

Page 26

Example: h - a

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

T_a (first row) multiplied by P^R_h (second row):

              1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
                                    0 0 1 0 0 1 0 1

Partial products, one shifted copy of T_a for each 1 in P^R_h (the rows for its 0 entries are all zeros):

              1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
          1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
    1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1

Column sums, with the pattern h e h a e h ? e laid beneath them:

0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1

Page 27

Example: h - a

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

T_a = 1 0 0 0 1 0 0 1 0 1 0 0 1 0 0 1 0 0 1
P^R_h = 0 0 1 0 0 1 0 1

Product: 0 0 1 0 0 1 1 1 0 2 0 2 1 0 3 0 1 2 0 1 2 0 1 1 0 1

=> in O(n log m) time!!

Page 28

Example

T = a b c b a c b a c a b d a d d a d e a
P = h e h a e h ? e
P^R = e ? h e a h e h

h - a: 1 0 2 0 2 1 0 3 0 1 2 0
h - b: 0 3 0 1 1 1 1 0 1 0 1 0
h - c: 2 0 1 2 0 1 1 0 1 0 0 0
h - d: 0 0 0 0 0 0 1 0 1 2 0 3

Match(h): 0 1 0 0 0 0 0 1 0 0 0 1

=> in O($|\Sigma_T|$ n log m) time!!
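The rows of this example can be reproduced with a few lines of code. In this sketch the sliding sums are computed directly in O(nm) to keep the example short, where the slides would use an FFT-based convolution; `indicator` and `correlate` are illustrative helper names.

```python
def indicator(seq, symbol):
    """0/1 vector marking where symbol occurs in seq."""
    return [1 if c == symbol else 0 for c in seq]

def correlate(t, p):
    """Entry i is the sum over j of p[j] * t[i + j] (direct evaluation, no FFT)."""
    m = len(p)
    return [sum(p[j] * t[i + j] for j in range(m)) for i in range(len(t) - m + 1)]

T = "abcbacbacabdaddadea"
P = "hehaeh?e"

P_h = indicator(P, "h")
occurrences = sum(P_h)  # h appears three times in P

# counts[b][i] = number of h's in P that sit above text symbol b at location i
counts = {b: correlate(indicator(T, b), P_h) for b in sorted(set(T))}

# Match(h)[i] = 1 when all occurrences of h see the same text symbol
match_h = [1 if any(row[i] == occurrences for row in counts.values()) else 0
           for i in range(len(T) - len(P) + 1)]

for b in "abcd":
    print("h -", b, counts[b])
print("Match(h)", match_h)
```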

Page 29

In general - the Algorithm

• For each character ‘a’ in $\Sigma_P$ create $P_a$

• For each character ‘b’ in $\Sigma_T$ create $T_b$

• For all $P_a$ and $T_b$ multiply them and construct Match(a) for each ‘a’ in $\Sigma_P$

• Announce each location i of T as a ‘match’ if Match(a)[i] = 1 for all a’s in P

=> in O($|\Sigma_T||\Sigma_P|$ n log m) time.
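The four steps above can be sketched end to end as follows. This is an illustration under stated assumptions: the don't-care symbol is taken to be "?", locations are reported 1-based, and each product is evaluated directly rather than by FFT, so the sketch shows the structure of the algorithm and not its running time.

```python
def indicator(seq, symbol):
    """0/1 vector marking where symbol occurs in seq."""
    return [1 if c == symbol else 0 for c in seq]

def correlate(t, p):
    """Entry i is the sum over j of p[j] * t[i + j] (direct evaluation, no FFT)."""
    m = len(p)
    return [sum(p[j] * t[i + j] for j in range(m)) for i in range(len(t) - m + 1)]

def convolution_function_matching(text, pattern, wildcard="?"):
    """Return the 1-based locations at which pattern f-matches text."""
    n, m = len(text), len(pattern)
    alive = [True] * (n - m + 1)
    text_vectors = [indicator(text, b) for b in sorted(set(text))]   # all T_b
    for a in sorted(set(pattern) - {wildcard}):
        p_a = indicator(pattern, a)                                  # P_a
        occurrences = sum(p_a)
        match_a = [False] * (n - m + 1)
        for t_b in text_vectors:
            for i, count in enumerate(correlate(t_b, p_a)):
                if count == occurrences:     # every 'a' in P sits above the same 'b'
                    match_a[i] = True
        alive = [x and y for x, y in zip(alive, match_a)]
    return [i + 1 for i, ok in enumerate(alive) if ok]

print(convolution_function_matching("abcbacbacabdaddadea", "hehaeh?e"))
```

The wildcard is simply skipped when building the P_a vectors, so it places no constraint on the text symbol beneath it.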

Page 30

Improvement

Lemma: Let $a_1, \dots, a_k \in \mathbb{N}$; then

$k \sum_{h=1}^{k} a_h^2 = \left(\sum_{h=1}^{k} a_h\right)^2$ iff for all $i, j$, $a_i = a_j$.

Idea: Let’s encode text with numbers for symbols

and encode pattern to compute their sum

and separately their sum of squares.
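One way to see why the lemma holds (this derivation is not on the slides) is a standard identity: the gap between the two sides is a sum of squared differences, which vanishes exactly when all the values coincide.

```latex
k \sum_{h=1}^{k} a_h^2 - \left( \sum_{h=1}^{k} a_h \right)^2
  = (k-1) \sum_{h=1}^{k} a_h^2 - 2 \sum_{1 \le i < j \le k} a_i a_j
  = \sum_{1 \le i < j \le k} (a_i - a_j)^2 \;\ge\; 0 ,
```

with equality iff $a_i = a_j$ for all $i, j$.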

Page 31

Improvement

Lemma: Let $a_1, \dots, a_k \in \mathbb{N}$; then

$k \sum_{h=1}^{k} a_h^2 = \left(\sum_{h=1}^{k} a_h\right)^2$ iff for all $i, j$, $a_i = a_j$.

Example: Compute the sum of the text characters beneath “e”

T = a b c b a c b a c a b d a d d a d e a
T# = 1 2 3 2 1 3 2 1 3 1 2 4 1 4 4 1 4 5 1
P = h e h a e h ? e
P_e = 0 1 0 0 1 0 0 1

Page 32

Improvement

Lemma: Let $a_1, \dots, a_k \in \mathbb{N}$; then

$k \sum_{h=1}^{k} a_h^2 = \left(\sum_{h=1}^{k} a_h\right)^2$ iff for all $i, j$, $a_i = a_j$.

Example: Compute the sum of squares beneath “e”

T = a b c b a c b a c a b d a d d a d e a
T# = 1 2 3 2 1 3 2 1 3 1 2 4 1 4 4 1 4 5 1
T#^2 = 1 4 9 4 1 9 4 1 9 1 4 16 1 16 16 1 16 25 1
P = h e h a e h ? e
P_e = 0 1 0 0 1 0 0 1

Page 33

Improvement

Lemma: Let $a_1, \dots, a_k \in \mathbb{N}$; then

$k \sum_{h=1}^{k} a_h^2 = \left(\sum_{h=1}^{k} a_h\right)^2$ iff for all $i, j$, $a_i = a_j$.

Running Time:

Two convolutions for each pattern character.

O($|\Sigma_P|$ n log m)
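A minimal sketch of the improved algorithm, assuming text symbols are numbered 1, 2, 3, … in alphabetical order (which reproduces the T# row of the example) and that "?" is the don't-care symbol. As a simplification, the two sliding sums per pattern character are evaluated directly here; in the algorithm itself each would be one convolution.

```python
def correlate(t, p):
    """Entry i is the sum over j of p[j] * t[i + j] (direct evaluation, no FFT)."""
    m = len(p)
    return [sum(p[j] * t[i + j] for j in range(m)) for i in range(len(t) - m + 1)]

def sum_of_squares_matching(text, pattern, wildcard="?"):
    """Return the 1-based locations at which pattern f-matches text."""
    code = {c: rank + 1 for rank, c in enumerate(sorted(set(text)))}
    t_num = [code[c] for c in text]       # T#
    t_sq = [x * x for x in t_num]         # T# squared, entry by entry
    n, m = len(text), len(pattern)
    alive = [True] * (n - m + 1)
    for a in sorted(set(pattern) - {wildcard}):
        p_a = [1 if c == a else 0 for c in pattern]
        k = sum(p_a)                      # number of occurrences of a in P
        sums = correlate(t_num, p_a)      # sum of the text numbers beneath a
        squares = correlate(t_sq, p_a)    # sum of their squares
        # By the lemma, k * (sum of squares) == (sum)^2 iff all numbers beneath a agree.
        alive = [ok and k * q == s * s for ok, s, q in zip(alive, sums, squares)]
    return [i + 1 for i, ok in enumerate(alive) if ok]

print(sum_of_squares_matching("abcbacbacabdaddadea", "hehaeh?e"))
```

Compared with the indicator-vector version, the dependence on the text alphabet disappears because one numeric text vector replaces the family of T_b vectors.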

Page 34

Can we do better for big alphabets?

We have seen 2 algorithms for Function Matching:

1. O(nm) - naïve algorithm

2. O($|\Sigma_P|$ n log m) - convolution based

We will see:

1. O(n log^2 m) - randomized, convolutions based

2. Lower bound of $\Omega(nm)$ for deterministic convolutions based methods

Page 35

Def: A pattern is 2-charactered if every character appears at most twice in the pattern.

Example:
P = a b c b c c b b
P1 = a1 b1 c1 b1 c1 c2 b2 b2 (even pairs)
P2 = a1 b1 c1 b2 c2 c2 b2 b3 (odd pairs)

Lemma: Let P be a pattern and T a text. There exist 2-charactered patterns P1 and P2 s.t. at location i of T, P f-matches iff P1 and P2 f-match.
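The construction of P1 and P2 in the example can be sketched as follows. The labeling rule is inferred from the example (P1 pairs up occurrences 1-2, 3-4, …; P2 pairs up occurrences 2-3, 4-5, …), and the function name is illustrative.

```python
from collections import defaultdict

def split_two_charactered(pattern):
    """Relabel a pattern into two 2-charactered patterns, as lists of labels such as 'b2'."""
    seen = defaultdict(int)   # how many times each symbol has occurred so far
    p1, p2 = [], []
    for ch in pattern:
        j = seen[ch]          # 0-based occurrence index of ch
        seen[ch] += 1
        p1.append(f"{ch}{j // 2 + 1}")        # groups occurrences (0,1), (2,3), ...
        p2.append(f"{ch}{(j + 1) // 2 + 1}")  # groups occurrences (1,2), (3,4), ...
    return p1, p2

p1, p2 = split_two_charactered("abcbccbb")
print(" ".join(p1))
print(" ".join(p2))
```

Between them, P1 and P2 link every two consecutive occurrences of a symbol, so requiring both to f-match forces all occurrences of that symbol to sit above the same text character.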

Page 36

Situation: An algorithm for Function Matching with 2-charactered patterns gives a general algorithm for Function Matching.

So, all that needs to be checked is that each pair in P has equal text symbols beneath it.

Page 37

New Randomized Algorithm

1. For each character:
   - a in T, randomly choose r_a in {0, 1}; replace all a’s in T with r_a; get T’
   - b in P, randomly choose s_b in {1,2}; set the first b to be s_b and the second b to be -s_b; get P’

2. Convolve T’ and P’^R

3. For each location i for which the convolution T’*P’^R[i] equals 0, declare a ‘match’
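The steps above can be sketched as follows. Several choices here are assumptions made for illustration and are not from the slides: the weights s_b are drawn from a large range (1 to 10^6), a general pattern is first split into its two 2-charactered variants, the rounds are repeated 40 times with a fixed seed, and each convolution entry is evaluated directly rather than by FFT.

```python
import random

def split_two_charactered(pattern, wildcard="?"):
    """Label occurrences so that each label appears at most twice (variants P1 and P2)."""
    seen = {}
    p1, p2 = [], []
    for ch in pattern:
        if ch == wildcard:
            p1.append(None)
            p2.append(None)
            continue
        j = seen.get(ch, 0)
        seen[ch] = j + 1
        p1.append((ch, j // 2))
        p2.append((ch, (j + 1) // 2))
    return p1, p2

def signed_weights(labels, rng, max_weight):
    """First occurrence of a pair gets +s, the second gets -s, everything else 0."""
    first_pos = {}
    weights = [0] * len(labels)
    for pos, label in enumerate(labels):
        if label is None:
            continue
        if label in first_pos:
            s = rng.randint(1, max_weight)
            weights[first_pos.pop(label)] = s
            weights[pos] = -s
        else:
            first_pos[label] = pos
    return weights

def randomized_function_matching(text, pattern, rounds=40, seed=0, max_weight=10**6):
    """Return 1-based candidate locations; true matches are never discarded."""
    rng = random.Random(seed)
    n, m = len(text), len(pattern)
    alive = [True] * (n - m + 1)
    variants = split_two_charactered(pattern)
    for _ in range(rounds):
        r = {c: rng.randint(0, 1) for c in sorted(set(text))}  # random 0/1 per text symbol
        t_num = [r[c] for c in text]                           # T'
        for labels in variants:
            w = signed_weights(labels, rng, max_weight)        # P'
            for i in range(n - m + 1):
                # Direct evaluation of one convolution entry.
                if alive[i] and sum(w[j] * t_num[i + j] for j in range(m)) != 0:
                    alive[i] = False
    return [i + 1 for i, ok in enumerate(alive) if ok]

print(randomized_function_matching("abaababacabdabcbdba", "vqvuqu?s"))
```

The error is one-sided: at a true match every pair contributes s_b·r − s_b·r = 0, so the sum is always 0, while a non-matching location survives a round only by chance.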

Page 38

Example:

P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a

f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1

g(v) = 2, g(q) = 6, g(u) = 8

f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1

g(P) = 2 6 -2 8 -6 -8 0 0

2+0-2+8+0-8+0+0 = 0

h(v) = a, h(q) = b, h(u) = a, h(s) = a

Page 39

Example:

P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a

f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1

g(v) = 2, g(q) = 6, g(u) = 8

f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1

g(P) = 2 6 -2 8 -6 -8 0 0

0+6-2+0-6+0+0+0 = -2

Page 40

Example:

P = v q v u q u ? s
T = a b a a b a b a c a b d a b c b d b a

f(a) = 1, f(b) = 0, f(c) = 0, f(d) = 1

g(v) = 2, g(q) = 6, g(u) = 8

f(T) = 1 0 1 1 0 1 0 1 0 1 0 1 1 0 0 0 1 0 1

g(P) = 2 6 -2 8 -6 -8 0 0

2+6+0+0+0-8+0+0 = 0

Page 41

Correctness:

• If P f-matches at location i of T, then f(T)*g(P)^R[i+m-1] is trivially always equal to 0.

• If P does not f-match at location i of T, then for each convolution <f,g>, f(T)*g(P)^R[i+m-1] equals 0 with probability ½; with k rounds of amplification the probability is (½)^k.

Running Time:

• O(nk log m) with error probability 2^-k

• O(n log^2 m) with error probability 1/m
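The step from the first running-time bound to the second is a choice of the number of rounds. Taking the per-round bound of ½ stated above as given, and assuming the rounds are independent:

```latex
\Pr[\text{a non-matching location survives } k \text{ rounds}] \le 2^{-k},
\qquad
k = \lceil \log_2 m \rceil
\;\Longrightarrow\;
2^{-k} \le \frac{1}{m},
\quad
O(nk \log m) = O(n \log^2 m).
```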

Page 42

Limitation of the Convolutions Model

Can we do the same deterministically? No!

To show this we use the model of communication complexity.

[Figure: Alice holds x, Bob holds y; together they compute f(x,y)]

Page 43

Limitation of the Convolutions Model

Known: for x, y in {0,1}^k the communication complexity of equals(x,y) is $\Omega(k)$.

Take pattern $P = a_1 a_2 a_3 \dots a_m a_1 a_2 a_3 \dots a_m$, where $\forall i \ne j$: $a_i \ne a_j$.

Given a collection of convolutions {<g(P), f(T)>}, the convolution at location i is

$(g(P) * f(T))[i+m-1] = \sum_{j=1}^{m} g(a_j) f(t_{i+j-1}) + \sum_{j=1}^{m} g(a_j) f(t_{i+j+m-1})$.

Since we are in essence comparing $t_i \dots t_{i+m-1}$ to $t_{i+m} \dots t_{i+2m-1}$, we get the equality information from the convolution. This is lower bounded by $\Omega(m)$ for each location; in general, $\Omega(nm)$.
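The reduction rests on one observation: the doubled pattern f-matches a window exactly when the two halves of the window are equal. A brute-force check of that claim over a small binary alphabet (an illustrative choice, with hypothetical helper names) could look like this:

```python
import itertools

def f_matches(pattern, window):
    """True if some function maps each pattern symbol to the text symbol beneath it."""
    image = {}
    for p, t in zip(pattern, window):
        if image.setdefault(p, t) != t:
            return False
    return True

def halves_equal(window):
    """True if the first half of the window equals the second half."""
    m = len(window) // 2
    return window[:m] == window[m:]

def doubled_pattern_tests_equality(m, alphabet="ab"):
    """Check that P = a1..am a1..am f-matches a window iff its two halves are equal."""
    pattern = list(range(m)) * 2          # m distinct symbols, written twice
    return all(f_matches(pattern, w) == halves_equal(w)
               for w in itertools.product(alphabet, repeat=2 * m))

print(doubled_pattern_tests_equality(3))
```

Each text half plays the role of one party's input in the equality problem, which is where the Ω(m) bound per location comes from.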

Page 44

Another Application for Function Matching

Protein Folding detection:

[Figure: a folded chain with numbered positions 1 to 10]

P = 1 2 3 4 5 6 7 8 9 10 10 9 8 7 6 5 4 11 12 … 12 11 3 2 1

Page 45

Questions

1. Can Function Matching be solved deterministically in o(nm) time for big alphabets?

2. Are there special cases of Function Matching that are easier (other than Parameterized Matching and other trivial ones)?

3. Does 2-dimensional Parameterized Matching need to be solved with function matching?