Suffix arrays
Uploaded by strand-life-sciences-pvt-ltd
Transcript of Suffix arrays
The Text
C G A C G
[Figure: suffix tree of the text; leaf labels 3 1 4 6 2 5 7]
Suffix Tree
Suffix Array: Sorted List of Suffixes
Task
Sort these suffixes lexicographically.
A comparison sort needs O(n log n) comparisons, each taking up to O(n) time.
Obtain two arrays: f[i], the position of the ith suffix in sorted order, and g[i], the suffix that is ith in sorted order.
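A minimal sketch of the naive approach, to make the two arrays concrete. The function name and the 0-based indexing are assumptions made for illustration (the figure labels suffixes from 1), and Python's built-in sort stands in for whatever comparison sort is used.

```python
def naive_suffix_arrays(text):
    """Return (f, g) for `text`, using 0-based suffix start positions."""
    n = len(text)
    # g[k] = start position of the suffix that is k-th in sorted order.
    # Each comparison may scan up to n characters, hence the slow bound.
    g = sorted(range(n), key=lambda i: text[i:])
    # f[i] = position of suffix i in sorted order (inverse permutation of g).
    f = [0] * n
    for k, i in enumerate(g):
        f[i] = k
    return f, g


f, g = naive_suffix_arrays("banana")
print(g)  # [5, 3, 1, 0, 4, 2]
print(f)  # [3, 2, 5, 1, 4, 0]
```

f and g are inverse permutations of each other, so either one can be rebuilt from the other in linear time.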
String of length n with characters in the range 1..n
Sorting Even Suffixes
Pairs: A1A2, A3A4, ...
Sort these n/2 pairs and map them to single chars in the range 1..n/2.
This gives a new text of half the length; sort its suffixes recursively.
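The pair-renaming step can be sketched as follows. This is an illustration under stated assumptions: the text is a list of integers that are all at least 1, positions are 0-based so the "even" suffixes start at 0, 2, 4, ..., a 0 is used as padding when n is odd, and the name rename_pairs is hypothetical. Python's sorted is used for brevity; a linear-time version would use radix sort.

```python
def rename_pairs(text):
    """Replace each pair (text[2k], text[2k+1]) by its rank among the distinct pairs."""
    n = len(text)
    # Pad with 0 (smaller than every real character) if the last pair is incomplete.
    pairs = [(text[i], text[i + 1] if i + 1 < n else 0) for i in range(0, n, 2)]
    # Equal pairs get equal names; names start at 1 and respect pair order.
    names = {p: r for r, p in enumerate(sorted(set(pairs)), start=1)}
    return [names[p] for p in pairs]


print(rename_pairs([2, 1, 2, 1, 3]))  # [1, 1, 2]
```

Suffix k of the renamed text corresponds to suffix 2k of the original, so sorting the suffixes of the shorter text sorts the even suffixes.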
Sorting Odd Suffixes
Each odd suffix O is a single char A followed by an even suffix E:
O1 = (A1,E1), O2 = (A2,E2), O3 = (A3,E3), O4 = (A4,E4)
Sort these n/2 pairs; the E’s are the even suffixes, whose order we know.
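A small sketch of this step, assuming 0-based positions and that the ranks of the even suffixes are already available in a dict. The names sort_odd_suffixes and even_rank are hypothetical, and the empty suffix past the end of the text is given rank -1 so that it sorts first.

```python
def sort_odd_suffixes(text, even_rank):
    """Sort suffixes starting at odd positions, given ranks of the even ones.

    even_rank[i] is the rank of the suffix starting at even position i.
    """
    n = len(text)

    def key(i):
        # Suffix i = text[i] followed by suffix i+1, which starts at an even position.
        return (text[i], even_rank.get(i + 1, -1))

    return sorted(range(1, n, 2), key=key)


# Even suffixes of "banana": banana (0) < na (4) < nana (2).
print(sort_odd_suffixes("banana", {0: 0, 4: 1, 2: 2}))  # [5, 3, 1]
```

Every key is a pair of small values, so the sort takes constant time per comparison, and a radix sort would make the whole step linear.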
Merging
Do we have any info to determine
the relative order of an odd suffix and
an even one?
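An odd suffix is (A,E) and an even suffix is (B,O).
If A = B we are left comparing E with O, which is again an even suffix against an odd one.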
The Trick (Sanders, Kärkkäinen)
Split suffixes into 3 groups instead of 2: start positions 0 mod 3, 1 mod 3 and 2 mod 3.
Sorting 0 and 1 Together
Example text: A B C D E F G H I J K L
Sort these 2n/3 triplets and map them to single chars.
This gives a new text of length 2n/3; sort its suffixes recursively.
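One way to build the new text is sketched below, under several assumptions that are not on the slides: positions are 0-based, the text is a list of integers that are all at least 1, triplets running past the end are padded with 0, and an extra all-zero triplet is added when n is a multiple of 3 so that the 0 mod 3 group ends with a padded triplet and acts as a separator before the 1 mod 3 group. The name sample_text is hypothetical.

```python
def sample_text(text):
    """Return (positions, new_text) for the suffixes starting at 0 mod 3 and 1 mod 3."""
    n = len(text)
    padded = list(text) + [0, 0, 0]
    # Group 0 first, then group 1. When n % 3 == 0 the dummy position n is included.
    group0 = list(range(0, n + (1 if n % 3 == 0 else 0), 3))
    group1 = list(range(1, n, 3))
    positions = group0 + group1
    triples = [tuple(padded[i:i + 3]) for i in positions]
    # Name each distinct triplet by its rank (sorted() here; radix sort for linear time).
    names = {t: r for r, t in enumerate(sorted(set(triples)), start=1)}
    return positions, [names[t] for t in triples]


text = [2, 1, 3, 1, 3, 1]  # "banana" with a=1, b=2, n=3
positions, new_text = sample_text(text)
print(positions)  # [0, 3, 6, 1, 4]
print(new_text)   # [3, 2, 1, 2, 4]
```

Sorting the suffixes of new_text and mapping index k back to positions[k] (dropping the dummy position n) gives the order of the sampled suffixes, here 3, 1, 0, 4.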
Sorting Suffixes in 2
Each mod 2 suffix is a single char A followed by a mod 0 suffix:
21 = (A1,01), 22 = (A2,02), 23 = (A3,03), 24 = (A4,04)
Sort these n/3 pairs; the 0’s are the mod 0 suffixes, whose order we know.
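This step might look like the following sketch, assuming 0-based positions and a dict of ranks for the already sorted suffixes. The names sort_mod2_suffixes and rank are hypothetical, and the empty suffix past the end of the text gets rank -1.

```python
def sort_mod2_suffixes(text, rank):
    """Sort suffixes starting at positions 2 mod 3.

    rank[i] is the rank of the suffix at position i, known for i % 3 == 0.
    """
    n = len(text)
    # Suffix i = text[i] followed by suffix i+1, and (i+1) % 3 == 0.
    return sorted(range(2, n, 3), key=lambda i: (text[i], rank.get(i + 1, -1)))


# For "banana" the sampled suffixes sort as 3 (ana), 1 (anana), 0 (banana), 4 (na).
print(sort_mod2_suffixes("banana", {3: 0, 1: 1, 0: 2, 4: 3}))  # [5, 2]
```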
Generalization
Use period v (block boundaries at v, 2v, 3v, ...) and a set D of indices mod v.
The new string has size |D|n/v.
Time taken to create this string is O(n|D|).
Sorting the suffixes of this string gives the sorted order of all suffixes which begin at indices j such that j mod v is in D.
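To make the sizes concrete, here is a sketch that lists the sampled start positions and the v-tuples that become the characters of the new string. The function names, the 0-based indexing and the padding with 0 are assumptions for illustration; the sketch does not perform the renaming or the recursion.

```python
def sample_positions(n, v, D):
    """Start positions j in 0..n-1 with j mod v in D; about |D| * n / v of them."""
    return [j for j in range(n) if j % v in D]


def sample_tuples(text, v, D):
    """Map each sampled position to its v-tuple of characters (0-padded at the end)."""
    padded = list(text) + [0] * v
    # |D| * n / v tuples of length v each: O(n |D|) work in total.
    return {j: tuple(padded[j:j + v]) for j in sample_positions(len(text), v, D)}


print(sample_positions(12, 3, {0, 1}))               # [0, 1, 3, 4, 6, 7, 9, 10]
print(sample_tuples([1, 2, 3, 1, 2, 3], 3, {0, 1})[4])  # (2, 3, 0)
```

With v = 3 and D = {0, 1} this reproduces the 2n/3 triplets of the three-group scheme.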
Key Property of D
For any 2 indices i and j, (i-j) mod v is the distance between some two beads in D, so there is a shift x<v for which both i+x and j+x land on beads of D.
D is a Difference Cover if the distances between beads in D generate 0,1,…,v-1.
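Both statements can be checked by brute force. The sketch below is illustrative only; the names is_difference_cover and common_shift are hypothetical.

```python
def is_difference_cover(D, v):
    """True if the differences (a - b) mod v over a, b in D give every value 0..v-1."""
    diffs = {(a - b) % v for a in D for b in D}
    return diffs == set(range(v))


def common_shift(i, j, D, v):
    """Smallest x < v with (i + x) mod v and (j + x) mod v both in D, or None."""
    for x in range(v):
        if (i + x) % v in D and (j + x) % v in D:
            return x
    return None


print(is_difference_cover({0, 1}, 3))     # True
print(is_difference_cover({0, 1, 3}, 7))  # True
print(is_difference_cover({0, 1}, 4))     # False
print(common_shift(2, 5, {0, 1}, 3))      # 1
```

Such a shift lets any two suffixes i and j be compared by looking at their first x characters and then at the known ranks of suffixes i+x and j+x.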