Simple Algorithm for Sorting the Fibonacci String Rotations

Post on 06-Feb-2016

21 views 2 download

description

Simple Algorithm for Sorting the Fibonacci String Rotations. Manolis Christodoulakis King’s College London Joint work with Costas S. Iliopoulos Yoan Jos é Pinz ó n Ardila. Our Goal. What makes Fibonacci strings a best case input for the Burrows-Wheeler Transform (BWT)? - PowerPoint PPT Presentation

Transcript of Simple Algorithm for Sorting the Fibonacci String Rotations

Simple Algorithm for Sorting theFibonacci String Rotations

Manolis ChristodoulakisManolis Christodoulakis King’s College London

Joint work with Costas S. IliopoulosYoan José Pinzón Ardila

SOFSEM 2006 2

Our GoalOur Goal

What makes Fibonacci strings a best case input for the Burrows-Wheeler Transform (BWT)?

Relationship between different rotations of a Fibonacci string

What is their lexicographic order? Side effect: we can deduce the symbol

stored at any position of any Fibonacci string in constant time (without using , provided that the fn values are known)

SOFSEM 2006 3

Fibonacci Strings & NumbersFibonacci Strings & Numbers

The n-th Fibonacci stringFn = Fn-1Fn-2 n≥2 F0=b, F1=a

The n-th Fibonacci numberfn = fn-1+fn-2 n≥2 f0=1, f1=1

F2 a= b

F3 a= b a

F4 a= b a a b

F1 a=

F0 b= f0 1=

f1 1=f2 2=f3 3=f4 5=

SOFSEM 2006 4

NotationNotation

The i-th rotation of a string

where i is taken modulo n.

rank(i,x) = the rank of Ri(x) rot(ρ,x) = the rotation whose rank is ρ

0 1 … i-1 i …n-1x =

0 1 … i-1 i …n-1Ri(x)

=

SOFSEM 2006 5

Burrows-Wheeler Transform (BWT)Burrows-Wheeler Transform (BWT)

M.Burrows and D.J.Wheeler. 1994 Purpose: to make a string more

compressible BWT Algorithm:

1. Create list of all rotations2. Sort them3. Output last symbol of every rotation4. Output the rank of the 0-th rotation

SOFSEM 2006 6

BWT on Fibonacci StringsBWT on Fibonacci Strings

F5 = abaababa, f5 = 8

R0(F5) a= b a a b a b aR1(F5) b= a a b a b a aR2(F5) a= a b a b a a bR3(F5) a= b a b a a b aR4(F5) b= a b a a b a aR5(F5) a= b a a b a a bR6(F5) b= a a b a a b aR7(F5) a= a b a a b a b

R0(F5) a= b a a b a b a

R1(F5) b= a a b a b a a

R2(F5) a= a b a b a a b

R3(F5) a= b a b a a b a

R4(F5) b= a b a a b a a

R5(F5) a= b a a b a a b

R6(F5) b= a a b a a b a

R7(F5) a= a b a a b a b

SOFSEM 2006 7

Properties of Fibonacci StringsProperties of Fibonacci Strings

The number of ‘b’ in Fn is fn-2

Proof: By induction.

C.S.Iliopoulos, D.W.Moore and W.F.Smyth. 1997Fn = Fn-2Fn-3…F1un, un = ba (n odd)

un = ab (n even)

Let’s call this the IMS formula.

SOFSEM 2006 8

Similarities in RotationsSimilarities in Rotations

R0(Fn) differs from Rfn-2(Fn) in 2 symbols Proof:

R0(Fn) = Fn-2Fn-3…F1un

Rfn-2(Fn) = Fn-3…F1unFn-2 (1)

R0(Fn) = Fn-1Fn-2

= Fn-3…F1un-1Fn-2 (2) Ri(Fn) differs from Ri+fn-2(Fn) in 2 symbols Proof:

Ri(Fn) = Ri(R0(Fn))

Ri+fn-2(Fn) = Ri(Rfn-2(Fn))

SOFSEM 2006 9

Relative Order of RotationsRelative Order of Rotations

Ri(Fn) < Ri+fn-2(Fn) for n odd, i fn-1-1 Proof:

R0(Fn) = Fn-3…F1un-1Fn-2

Rfn-2(Fn) = Fn-3…F1un Fn-2

For i=fn-1-1:

Ri(Fn) = bFn-2Fn-3…F1a

Ri+fn-2(Fn)= aFn-2Fn-3…F1b

Similarly, Ri(Fn) > Ri+fn-2(Fn) for n even, i fn-1-1

= Fn-3 … F1 ab Fn-2

= Fn-3 … F1 ba Fn-2

SOFSEM 2006 10

Sorted List of RotationsSorted List of Rotations

We proved (n odd):Ri(Fn) < Ri+fn-2(Fn) i fn-1-1 (3)

We will now prove that there is no j s.t.Ri(Fn) < Rj(Fn) < Ri+fn-2(Fn)

Proof: (constructive)Start at i=fn-1 and construct the partial list

Ri Ri+fn-2 Ri+2fn-2 Ri+3fn-2 … Ri+kfn-2 …

for as long asi+kfn-2 fn-1-1 (mod fn) kfn-1

I.e. the list is complete!

SOFSEM 2006 11

Identify Rotation Identify Rotation (i)(i) by Rank by Rank ((ρρ))

Therefore, for n odd:rot(ρ,Fn) = fn-1

= (ρfn-2-1) mod fn

Similarly, for n even, the sorted list is constructed bottom-up giving

rot(ρ,Fn) = (-(ρ+1)fn-2-1) mod fn

+ρfn-2) mod fn(

SOFSEM 2006 12

Identify Rank Identify Rank ((ρρ)) of a Rotation of a Rotation (i)(i)

This is simply the inverse of the previous function

n oddrank(i,Fn) = ((i+1)fn-2) mod fn

n evenrank(i,Fn) = ((i+1)fn-2-1) mod fn

SOFSEM 2006 13

Symbols of Fibonacci StringsSymbols of Fibonacci Strings

Fn[i] = ? Observe that

Fn[i] = Ri(Fn)[0]

In the sorted list of rotations, the first fn-1 rotations start with ‘a’, the rest with ‘b’

Thus Fn[i] can be deduced from rank(i,Fn)

If rank(i,Fn) ≤ fn-1 then Fn[i]=a else b.

SOFSEM 2006 14

BWT & Fibonacci ― The Quick WayBWT & Fibonacci ― The Quick Way

The first fn-2 symbols of BWT are ‘b’ Proof: (n odd)

We proved the first fn-2 rotations have index

(ρ·fn-2-1)modfn for 0 ≤ ρ < fn-2

The last symbol of these rotations isFn[ (ρ·fn-2-1 )modfn ]

Which for 0 ≤ ρ < fn-2 is ‘b’

The next fn-1 symbols of BWT are ‘a’ Proof: Consequence of previous lemma

+fn-1