Improved hit criteria for DNA local alignment
JOBIM 2004 Montréal - June 28th
Laurent Noé, Gregory Kucherov
LORIA, Nancy, France

Plan
Introduction
– Local alignment
– Heuristic methods
Hit criteria
– Seed models and proposed extension
– Single/multiple hit strategies and proposed extension
Experiments
Conclusion
– Extensions

Local alignment methods
Why be interested in local alignment methods?
– Improvement needed: #sequences, #users, (budget)
Dynamic programming (Smith-Waterman)
– Gives an exact solution
– Quadratic cost (best optimization in [Crochemore et al 02])
Heuristic algorithms (in practice)
– FASTA, BLAST, PatternHunter, BLASTZ, YASS, …
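To make the "exact solution, quadratic cost" point concrete, here is a minimal sketch of Smith-Waterman scoring. The scoring values (`match`, `mismatch`, `gap`) and the linear gap cost are assumptions chosen for illustration; they do not come from the slides.

```python
def smith_waterman(a, b, match=1, mismatch=-1, gap=-2):
    """Return the best local alignment score between strings a and b.

    Fills the dynamic programming table row by row, so the running time
    is proportional to len(a) * len(b) (the quadratic cost).
    Scoring values are illustrative assumptions.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            cur[j] = max(
                0,                  # start a new local alignment
                prev[j - 1] + sub,  # align a[i-1] with b[j-1]
                prev[j] + gap,      # gap in b
                cur[j - 1] + gap,   # gap in a
            )
            best = max(best, cur[j])
        prev = cur
    return best


print(smith_waterman("GGATCAGTCC", "TTATCAGTAA"))
```

Every cell of the table is visited once, which is why heuristic filters are preferred in practice for long sequences.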

Seed filtering
Start with small, conserved, and easily detected fragments (seeds).
Then extend the seeds to build possible alignments.
[Figure: dot plot of two sequences showing the detected seeds and the detected alignment]

Two questions usually asked
1. Seed model: what can serve as a seed?
2. Hit criterion: what is the criterion that witnesses a potential alignment?
[Figure: dot plot; detected seeds → 1. Seed model; detected alignment → 2. Hit criterion]

1. What can serve as a seed
Contiguous seed
Exact similarity:
ATCAGT
||||||
ATCAGT
Seed pattern: ######
Example:
ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
[Animation: the pattern ###### is slid along the alignment.]

Spaced seed model [Ma et al. 02: PATTERNHUNTER]
Seed pattern: ###--#-##
‘#’ : obligatory match position
‘-’ : joker position (“don’t care” position)
Weight: 6 [number of ‘#’]
Span: 9 [number of all symbols]
Example: ###--#-##
ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
[Animation: the pattern ###--#-## is slid along the alignment.]
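The sliding of a seed pattern along an alignment can be sketched as follows. This is a minimal illustration, assuming the two sequences are already aligned without gaps; the function name `seed_hits` is hypothetical.

```python
def seed_hits(pattern, seq1, seq2):
    """Return the 0-based start positions where the seed pattern hits.

    '#' requires a match at that position; '-' accepts anything.
    Assumes seq1 and seq2 are aligned without gaps (same length).
    """
    required = [k for k, c in enumerate(pattern) if c == "#"]
    hits = []
    for start in range(len(seq1) - len(pattern) + 1):
        if all(seq1[start + k] == seq2[start + k] for k in required):
            hits.append(start)
    return hits


S1 = "ATCAGTGCAATGCTCAAGA"
S2 = "ATCAGCGCGATGCGCAAGA"
print(seed_hits("######", S1, S2))     # → []
print(seed_hits("###--#-##", S1, S2))  # → [2, 9, 10]
```

On this example alignment the contiguous seed of weight 6 finds no hit, while the spaced seed of the same weight hits at three positions.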

Spaced seeds
Some probabilistic observations:
For spaced seeds, hits at subsequent positions are more independent events.
For contiguous vs. spaced seeds of the same weight, the expected number of hits is (basically) the same, but the probabilities of having at least one hit are very different.
[Figure: two overlapping occurrences of ###### and of ###--#-## on a run of matches]

Some probabilistic observations:
[Animation: the seeds ###--#-## and ###### are slid along four alignments with zero, one, two, and three substitutions.]
ATCAGTGCAATGCTCAAGA
|||||||||||||||||||
ATCAGTGCAATGCTCAAGA

ATCAGTGCAATGCTCAAGA
|||||.|||||||||||||
ATCAGCGCAATGCTCAAGA

ATCAGTGCAATGCTCAAGA
|||||.|||||||:|||||
ATCAGCGCAATGCGCAAGA

ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
For contiguous vs. spaced seeds of the same weight, the expected number of hits is (basically) the same, but the probabilities of having at least one hit are very different.
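One way to check this observation numerically is sketched below. It assumes a Bernoulli model in which every alignment column is a match with probability `p`, independently of the others; the function names and the chosen length and probability are illustrative assumptions, not values taken from this slide. The exact hit probability is computed by a small dynamic program over the last span-1 columns, which is a standard technique and not necessarily the one used by the authors.

```python
def expected_hits(pattern, length, p):
    """Expected number of seed hits on a Bernoulli(p) alignment."""
    weight = pattern.count("#")
    return (length - len(pattern) + 1) * p ** weight


def hit_probability(pattern, length, p):
    """Exact probability of at least one hit on a Bernoulli(p) alignment.

    The state is the match/mismatch history of the last span-1 columns.
    The history starts as all mismatches, which is only valid when the
    pattern starts with '#'.
    """
    if not pattern.startswith("#"):
        raise ValueError("pattern must start with '#'")
    span = len(pattern)
    # Most significant bit of the mask corresponds to pattern[0].
    mask = int("".join("1" if c == "#" else "0" for c in pattern), 2)
    keep = (1 << (span - 1)) - 1
    states = {0: 1.0}  # history -> probability of being there with no hit yet
    for _ in range(length):
        nxt = {}
        for hist, prob in states.items():
            for bit, q in ((1, p), (0, 1.0 - p)):
                window = (hist << 1) | bit
                if window & mask == mask:
                    continue  # a hit occurred: leave the no-hit mass
                key = window & keep
                nxt[key] = nxt.get(key, 0.0) + prob * q
        states = nxt
    return 1.0 - sum(states.values())


for pattern in ("######", "###--#-##"):
    print(pattern,
          expected_hits(pattern, 64, 0.7),
          hit_probability(pattern, 64, 0.7))
```

The expected number of hits depends only on the weight and on the number of window positions, whereas the hit probability also depends on how hits overlap; how large the gap is depends on the seed, the match probability, and the alignment length.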

Spaced seeds
The spaced seed model is generally more sensitive than the contiguous seed model.
Extend the spaced seed model by taking into account the specificity of DNA substitutions.

Mutational events
Biological properties
– Transitions are usually over-represented.
– Regularity phenomenon in coding sequences.
Use those properties to extend the spaced seed model.
[Figure: the four nucleotides A, G, C, T linked by transitions (A↔G, C↔T, marked ‘.’) and transversions (marked ‘:’)]
ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
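The annotation line used in these alignments can be reproduced with a short sketch. It assumes the convention visible in the example: ‘|’ for a match, ‘.’ for a transition, ‘:’ for a transversion. The function names are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(x, y):
    """Return '|' for a match, '.' for a transition, ':' for a transversion."""
    if x == y:
        return "|"
    if {x, y} <= PURINES or {x, y} <= PYRIMIDINES:
        return "."  # purine<->purine or pyrimidine<->pyrimidine
    return ":"      # purine<->pyrimidine


def midline(seq1, seq2):
    """Build the annotation line of an ungapped alignment."""
    return "".join(classify(a, b) for a, b in zip(seq1, seq2))


print(midline("ATCAGTGCAATGCTCAAGA", "ATCAGCGCGATGCGCAAGA"))
# → |||||.||.||||:|||||
```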

BLASTZ model [Schwartz et al. 03]
A spaced seed that allows one possible transition substitution over its ‘#’ positions.
Problem: running time. A seed of large weight is needed to obtain reasonable speed.
Example: ###-#--##--#-#--#--##
ATCAGGCATGCTAAGATCGGATCCTCAATGGCTCA
|||.|||:|||.|||||.||:||||||:||.||||
ATCGGGCTTGCCAAGATTGGTTCCTCATTGCCTCA
[Animation: the pattern ###-#--##--#-#--#--## is slid along the alignment.]
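The criterion described on this slide (a spaced seed tolerating at most one transition over its ‘#’ positions) can be sketched as below. This illustrates the stated rule only and is not the BLASTZ implementation; the function names are hypothetical, and the short alignment and the pattern ###--#-## are reused from the earlier slides purely for illustration.

```python
def is_transition(x, y):
    """True for A<->G or C<->T substitutions."""
    return {x, y} in ({"A", "G"}, {"C", "T"})


def one_transition_hits(pattern, seq1, seq2):
    """Start positions where every '#' is a match, except for at most
    one '#' position that may hold a transition."""
    required = [k for k, c in enumerate(pattern) if c == "#"]
    hits = []
    for start in range(len(seq1) - len(pattern) + 1):
        transitions = 0
        ok = True
        for k in required:
            a, b = seq1[start + k], seq2[start + k]
            if a == b:
                continue
            if is_transition(a, b) and transitions == 0:
                transitions = 1  # use up the single allowed transition
            else:
                ok = False       # transversion, or a second transition
                break
        if ok:
            hits.append(start)
    return hits


S1 = "ATCAGTGCAATGCTCAAGA"
S2 = "ATCAGCGCGATGCGCAAGA"
print(one_transition_hits("###--#-##", S1, S2))  # → [1, 2, 4, 7, 9, 10]
```

Tolerating a transition produces more hits than exact matching with the same pattern, which is why a seed of larger weight is needed to keep the number of candidate hits manageable.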

YASS model: Transition Constrained Seeds
Seed pattern: ##@#-#@-###
‘#’ : obligatory match position
‘-’ : joker position (“don’t care” position)
‘@’ : transition-constrained position
Transition-constrained position: a position that corresponds to either a match or a transition.
Example: ##@#-#@-###
ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
[Animation: the pattern ##@#-#@-### is slid along the alignment.]
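A minimal sketch of matching a transition-constrained seed against an ungapped alignment follows. The function name `tc_seed_hits` is hypothetical, and this is an illustration of the three symbol rules above, not the YASS source code.

```python
TRANSITIONS = ({"A", "G"}, {"C", "T"})


def tc_seed_hits(pattern, seq1, seq2):
    """Start positions where a transition-constrained seed hits.

    '#' requires a match, '@' accepts a match or a transition,
    '-' accepts anything.
    """
    hits = []
    for start in range(len(seq1) - len(pattern) + 1):
        for k, c in enumerate(pattern):
            a, b = seq1[start + k], seq2[start + k]
            if c == "#" and a != b:
                break
            if c == "@" and a != b and {a, b} not in TRANSITIONS:
                break
        else:
            hits.append(start)  # no position violated the pattern
    return hits


S1 = "ATCAGTGCAATGCTCAAGA"
S2 = "ATCAGCGCGATGCGCAAGA"
print(tc_seed_hits("##@#-#@-###", S1, S2))  # → [1, 6]
```

The hit at position 6 relies on the ‘@’ symbol: the transition at alignment position 8 falls on a transition-constrained position of the seed.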

Transition Constrained Seeds
Seed pattern: ##@#-#@-###
‘#’ : obligatory match position
‘-’ : joker position (“don’t care” position)
‘@’ : transition-constrained position
Weight: 8 [number of ‘#’ + half the number of ‘@’]
‘@’ carries 1 bit of information whereas ‘#’ carries 2 bits.
‘@’ is adapted to GC-rich/poor genomes.
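The weight definition can be written down directly. The helper names below are hypothetical; the rule itself (each ‘#’ counts 1, each ‘@’ counts one half, 2 bits versus 1 bit) is the one stated on the slide.

```python
def seed_span(pattern):
    """Number of all symbols in the pattern."""
    return len(pattern)


def seed_weight(pattern):
    """Number of '#' plus half the number of '@'."""
    return pattern.count("#") + pattern.count("@") / 2


def seed_bits(pattern):
    """Information carried by the seed: 2 bits per '#', 1 bit per '@'."""
    return 2 * pattern.count("#") + pattern.count("@")


print(seed_weight("##@#-#@-###"), seed_span("##@#-#@-###"))  # → 8.0 11
```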

Spaced seeds and Transition-Constrained Seeds
Seed pattern (why ##@#-#@-### and not #@-#-#-#@# ?)
– Not chosen randomly → need to:
• define an alignment model.
• search for the best (or at least a good) seed pattern according to this model. (Sensitivity: probability to detect any alignment given by the model.)
– The chosen model can drastically change the seed shape…
Example
Bernoulli model ##@-#@#--#-#-###
Markov model ##@##-##@##

Probabilistic alignment models
– Bernoulli [Keich et al 02]
– Markov [Buhler et al 03]
– Automata (M3/M8) and HMMs [Brejova et al 03]
– Homogeneous alignments [Kucherov et al 04]
Example alignment and its encoding (2: match, 1: transition, 0: transversion):
ATCAGTGCAATGCTCAAGA
|||||.||.||||:|||||
ATCAGCGCGATGCGCAAGA
2222212212222022222
P(’2’) = 0.7, P(’1’) = 0.15, P(’0’) = 0.15
222221221222 X
Each transition has an emission probability for each symbol.
Ex: P(’2’) = 0.8, P(’1’) = 0.10, P(’0’) = 0.10
“HSP”: alignments found by heuristic algorithms
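As an illustration of the simplest of these models, the sketch below encodes an alignment annotation line over the symbols 2, 1, 0 and computes its probability under a Bernoulli model with the probabilities shown on the slide. The function names are hypothetical, and the mapping of ‘|’, ‘.’, ‘:’ to 2, 1, 0 is read off the slide's example.

```python
import math

# Annotation symbol -> encoded symbol (2: match, 1: transition, 0: transversion)
CODE = {"|": "2", ".": "1", ":": "0"}


def encode(annotation):
    """Encode an alignment annotation line over the alphabet {2, 1, 0}."""
    return "".join(CODE[c] for c in annotation)


def bernoulli_log_prob(encoded, probs):
    """Log-probability of an encoded alignment when each column is drawn
    independently with the given symbol probabilities."""
    return sum(math.log(probs[c]) for c in encoded)


PROBS = {"2": 0.7, "1": 0.15, "0": 0.15}
encoded = encode("|||||.||.||||:|||||")
print(encoded)  # → 2222212212222022222
print(math.exp(bernoulli_log_prob(encoded, PROBS)))
```

Working in log space avoids underflow for long alignments; a Markov or HMM model would replace the fixed `PROBS` table with probabilities that depend on the preceding symbols or on a hidden state.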

Seed design
Alignment model: Bernoulli
– P(match) = 0.7, P(transition) = 0.15, P(transversion) = 0.15
– alignment length = 64

Seed design
Alignment model: Markov
– 5th order, obtained on N. meningitidis, S. cerevisiae, Drosophila, and human sequences.

Experiments
S. cerevisiae / Neisseria sequences

To summarize ...
We have presented several seed models (contiguous, “classic” spaced seeds, BLASTZ).
We introduced transition-constrained seeds and showed how they improve the sensitivity.
From detected seeds to detected alignments

2. Hit criterion
What is the criterion that witnesses a potential alignment?
Restriction: only the information about seeds is available.
[Figure: dot plot; detected seeds and detected alignment → 2. Hit criterion]

Several methods have been proposed
FASTA:
– Several small seeds on proximal diagonals
BLAST: (single hit)
– One “large” seed
Gapped-BLAST: (double hit)
– Two seeds on the same diagonal
To define a good criterion, we first have to define the class of similarities we want to detect: a mutation model.
[Figures: dot plots of the example sequences]

Mutation effect on seeds
Mutation effect
– Substitutions: “suppressing seeds”
– Indels: “diagonal shifts”
Remaining seeds
– Estimation of inter-seed distances
• via a waiting time distribution
– Estimation of diagonal shifts
• via a random walk model
[Figure: dot plot]
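The two estimations can be sketched under strong simplifying assumptions, all of them hypothetical: indels occur independently at each column with rate `indel_rate`, insertions and deletions are equally likely and shift the diagonal by one, and seed hits at successive positions are treated as independent with probability `hit_prob` (so the waiting time is geometric). The actual statistical models are not given on the slide.

```python
def shift_distribution(n, indel_rate):
    """Distribution of the diagonal shift after n columns, modelled as a
    lazy symmetric random walk (+1, -1 with probability indel_rate/2 each)."""
    step = {-1: indel_rate / 2, 0: 1.0 - indel_rate, 1: indel_rate / 2}
    dist = {0: 1.0}
    for _ in range(n):
        nxt = {}
        for d, p in dist.items():
            for s, q in step.items():
                nxt[d + s] = nxt.get(d + s, 0.0) + p * q
        dist = nxt
    return dist


def shift_bound(n, indel_rate, coverage=0.95):
    """Smallest bound b such that P(|shift| <= b) >= coverage."""
    dist = shift_distribution(n, indel_rate)
    bound = 0
    while sum(p for d, p in dist.items() if abs(d) <= bound) < coverage:
        bound += 1
    return bound


def distance_bound(hit_prob, coverage=0.95):
    """Smallest distance d such that the next hit occurs within d positions
    with probability >= coverage (geometric waiting time, hit_prob > 0)."""
    d, miss = 0, 1.0
    while 1.0 - miss < coverage:
        d += 1
        miss *= 1.0 - hit_prob
    return d


print(shift_bound(100, 0.05), distance_bound(0.1))
```

Bounds of this kind tell how far apart, and on how distant a diagonal, two seeds may lie while still plausibly belonging to the same alignment.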

YASS hit criterion
According to these parameters, YASS proposes:
– An intermediate criterion between the BLAST single-hit and Gapped-BLAST double-hit criteria.
– Overlap-controlled multi-hits
[Figure: two overlapping hits of ###### and two overlapping hits of ###--#-## on an alignment, with the labels 7 and 9]
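One possible reading of overlap-controlled multi-hits is to count the distinct matched positions covered by a group of seeds on one diagonal, so that overlapping seeds are not counted twice. The sketch below follows that reading; the function names and the threshold `min_group_size` are assumptions for illustration, and the slide does not spell out the exact rule.

```python
def group_size(pattern, starts):
    """Number of distinct '#' positions covered by seed occurrences that
    start at the given positions on the same diagonal."""
    covered = set()
    for s in starts:
        covered.update(s + k for k, c in enumerate(pattern) if c == "#")
    return len(covered)


def is_hit(pattern, starts, min_group_size):
    """Accept a group of seeds when it covers enough distinct matches."""
    return group_size(pattern, starts) >= min_group_size


print(group_size("######", [0, 1]))     # → 7
print(group_size("###--#-##", [0, 1]))  # → 9
```

Two occurrences shifted by one position cover 7 distinct matches for the contiguous seed and 9 for the spaced seed, so a threshold on the group size sits between a single-hit and a non-overlapping double-hit requirement.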

Sensitivity
Comparison of BLASTn/Gapped-BLAST/YASS hit criteria
score 25

Sensitivity (cont)
Comparison of BLASTn/Gapped-BLAST/YASS hit criteria
score 35

YASS criterion mixed with spaced seeds

Experiments
Local alignment sensitivity
– YASS software / BLASTn (2.2.6 package)
M.t : M. tuberculosis CDC1551
S.s : Synechocystis sp. PCC 6803
V.p : Vibrio p. RIMD 2210633 I
Y.p : Yersinia pestis KIM

Ads
YASS web page
http://www.loria.fr/projects/YASS
YASS can be queried online
http://yass.loria.fr
YASS is Open Source

Conclusions
Two improvements:
– Transition-constrained spaced seeds
– Hit criterion combining statistical models and the advantages of single/multi-hit strategies.
A tool that implements both of them

Extensions
To be done
– Multi-seed approach [Li03, Buhler04, Noe04]
– Seed design on the fly (not necessarily static seeds).
– and others …

Questions
|95550 |95540 |95530 |95520 |95510 |95500 |95490 |95480 CAAGTTTATTTCTGTAGAGAGTGTAGAAGACAGTTCGATTTTAGCCTTTTCAGCGGCTTCTCTTATTCTTTGGACAGCC||.|||:|||||:||::.::|::..::.||||.|||||||||.||.|||||.||.||||||.|.|::|:|||:|:||||CAGGTTAATTTCGGTTTGCTGGCCGCTGGACAATTCGATTTTGGCTTTTTCGGCAGCTTCTTTCAGGCGTTGTAGAGCC |583630 |583640 |583650 |583660 |583670 |583680 |583690 |583700
|95470 |95460 |95450 |95440 |95430 |95420 |95410 |95400ATACGGTCATTACTCAAATCGATACCGGTTTCTTTCTTGAAATGAGAAATAATTTCTTGCAACAAATAAATGTCAAAAT||:::|||:|::.|||||||.||.||:::||||||.|||||:|:.|:.||.||:|::|::|::|....:::|||.||.|ATCACGTCTTGTTTCAAATCAATGCCTTGTTCTTTTTTGAACTCGGCGATGATGTGGTCGATGAGGCGTTGGTCGAAGT |583710 |583720 |583730 |583740 |583750 |583760 |583770
|95390 |95380 |95370 |95360 |95350 |95340 CTTCGCCACCCAAATGGGTGTCACCATTGGTAGATTTAACCTCAAA------GATACC------GTTATCGATGTCCAG||||.||.|||||.:.|||.||.||.|||||:|:.:::||.||.|| |:..|| |||.:||||:||:|:CTTCACCGCCCAAGAAGGTATCGCCGTTGGTTGCCAATACTTCGAATTGTTTGTCGCCGTCGAGGTTGGCGATTTCGAT |583790 |583800 |583810 |583820 |583830 |583840 |583850
|95320 |95310 |95300 |95290 |95280 |95270 |95260GATTGAAATATCGAAAGTACCACCGCCCAAGTCGAAAACAGCAAT------GACTTTTGGCTCTGATTTATCTAGACCG|||:|||||||||||||||||.|||||||||||.:|:||.||:|. |:||||::::||:::|||.||.|:||||GATGGAAATATCGAAAGTACCGCCGCCCAAGTCATATACGGCTACTTTGCGGTCTTTGTTGTCGCCTTTGTCCATACCG |583870 |583880 |583890 |583900 |583910 |583920 |583930
|95250 |95240 |95230 |95220 |95210 |95200 |95190 TAAGCTAGGGCAGCAGCTGTTGGTTCGTTGACAACACGTAATACATTAAGCCCAATAATTTGTCCTGCGTCTTTAGTAG:|:||.|..||.||:||:||.||.|||||||..|..|||::.||.|.:|.:||....||:.|:|||.|||||||.||.|AATGCCAAAGCGGCTGCGGTCGGCTCGTTGATGATGCGTTTCACGTCCAAACCGGCGATACGGCCTACGTCTTTGGTGG |583950 |583960 |583970 |583980 |583990 |584000 |584010
|95170 |95160 |95150 |95140 |95130 |95120 |95110 CTTGTCTTTGGGCATCATTGAAGTAAGCAGGAACGGTGACAACAGCATTTTTGACGCTCTTCGCTAAGTAAGCCTCCGC||||:|:||||:..||.||||||||.|||||.|||||.|.:||.||:|.::|:||:.|.|.::|.||||||||.||:||CTTGACGTTGGCTGTCGTTGAAGTAGGCAGGGACGGTAATCACGGCTTCGGTTACTTTTTCGCCCAAGTAAGCTTCGGC |584030 |584040 |584050 |584060 |584070 |584080 |584090
|95090 |95080 |95070 |95060 |95050 |95040 |95030 TGTTTCCTTCATTTTATTTAAGATAAAACCTCCTATTTGGGCGGGGGAGTACGTTCTGTTTCTAGCCTCTACCCAGGCA:|.|||.|||||||||.:.|.||.:::::|::::|||||.|:.||.||::.|:.|.||..|.::||.|.||||||:||.GGCTTCTTTCATTTTACGCAGGACTTCTGCGGAAATTTGAGGAGGAGACAGCTCTTTGCCTTGTGCTTTTACCCATGCG |584110 |584120 |584130 |584140 |584150 |584160 |584170
|95010 |95000 |94990 |94980 |94970 |94960 |94950 TCTCCATTAGAATGCTTGACGATTTTGAAAGGAACCTGATTAATATCTCTTTGGACTTCAGCGTCCTCGAAACGGCGGC||:||.||.::.::.||||.|||||.||||||:|.::.:|..||.||:|:|||||||||::.|||.||.||:.:|.|||TCGCCGTTGTTGGCTTTGATGATTTCGAAAGGCATAGATTCGATGTCGCGTTGGACTTCTTTGTCTTCAAATTTGTGGC |584190 |584200 |584210 |584220 |584230 |584240 |584250
|94930 |94920 |94910 |94900 |94890 |94880 |94870 CGATTAAACGCTTAGTAGCAAACAAAGTGTTTTCTGAGTTTATGACGGATTGTCGTTTGGCTGGCTCACCAACTAAACG||||.|||||.||.|..||.:|:|:||||||||.:|:|||:.|:||:|:|||:||||||||:|||:||||.||:|..::CGATCAAACGTTTGGCGGCGTAAATAGTGTTTTTGGCGTTGGTTACCGCTTGGCGTTTGGCAGGCGCACCGACGAGGAT|584260 |584270 |584280 |584290 |584300 |584310 |584320 |584330
|94850 |94840 |94830 |94820 |94810 |94800 |94790 TTCTCCGTCTTTAGTGAAAGCCACTACAGACGGAGTAGTTCTTGAGCCTTCTGCATTTTCGATAATTCTCGGAACTTTA|||:|||.|:|.:.:.:||||:|.:||.|||||:||.||:|:||:|||||||||.||||||||:|.|.|:|:::::...TTCGCCGCCGTCCAAATAAGCGATAACGGACGGCGTGGTGCGTGCGCCTTCTGCGTTTTCGATCACTTTGGTTTGACCG |584340 |584350 |584360 |584370 |584380 |584390 |584400 |584410
|94780 |94770 |94760 |94750 |94740 CCTTCCATAATAGCTACCGCAGAATTGGTAGTACCTAAATCAATACCGATAAC..|||:.:|||.||.|::::|||.|||||:||||||||.||.||||||||:||TTTTCGGAAATGGCCAAACAAGAGTTGGTTGTACCTAAGTCGATACCGATTAC |584420 |584430 |584440 |584450 |584460
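*(96264-94728)(582917-584471) Ev: 0 s: 1537/1555 r
* S.cerevisiae.V (reverse complementary strand) / gi|12057208|(forward strand)
* score = 1073 : bitscore = 491.92
* mutations per triplet 347, 108, 152 (1.79e-36) | ts : 272 tv : 335
|96260 |96250 |96240 |96230 |96220 |96210 |96200 |96190TTCCGCTTCATTAACCATTCGATCAATCTCCGTATCAGATAGCCCAGACGCTCCGGCAACAGTGATGGAAGAGTCTTTG|||:||:||:||:|||||:||:||.||.||.:.:||.::.|.:||:||:|::||:::.|..||||||::.|:::||||.TTCGGCATCTTTCACCATGCGTTCGATTTCTTCTTCGCTCAAACCTGAAGAACCTTGGATGGTGATGTTGGCTGCTTTA |582920 |582930 |582940 |582950 |582960 |582970 |582980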
|96260 |96250 |96240 |96230 |96220 |96210 |96200 |96190TTCCGCTTCATTAACCATTCGATCAATCTCCGTATCAGATAGCCCAGACGCTCCGGCAACAGTGATGGAAGAGTCTTTG|||:||:||:||:|||||:||:||.||.||.:.:||.::.|.:||:||:|::||:::.|..||||||::.|:::||||.TTCGGCATCTTTCACCATGCGTTCGATTTCTTCTTCGCTCAAACCTGAAGAACCTTGGATGGTGATGTTGGCTGCTTTA |582920 |582930 |582940 |582950 |582960 |582970 |582980
|96180 |96170 |96160 |96150 |96140 |96130 |96120 TGGCTGGCGAGATCTTTTGCTGAAACGTTGATGATGCCGTTCGCATCGATATCAAAAGTGACTTCAATTTGTGGGGTAC.:|:||:|:::.|||||:||:|||||||::|:|||||||||:||.|||||.||.||.||:|||||.|||||.||:.|||CCGGTGCCTTTGTCTTTGGCGGAAACGTGCAGGATGCCGTTGGCGTCGATGTCGAAGGTTACTTCGATTTGCGGCATAC |583000 |583010 |583020 |583030 |583040 |583050 |583060
|96100 |96090 |96080 |96070 |96060 |96050 |96040 CTTTTGGAGCTGGAGGAATGCCCGCAAGAGTAAAATTACCTATTAATTTGTTATCCTTGACTAACTCCCTCTCACCTTG|:.:.||:||:||:|:.|||.|::|:|..:|.||:|:|||.|::.|||||||.:|:::..|::..||:|:.||.|||||CGCGCGGTGCAGGTGCGATGTCGCCCAAGTTGAACTGACCCAAAGATTTGTTGGCAGAAGCGCGTTCGCGTTCGCCTTG |583080 |583090 |583100 |583110 |583120 |583130 |583140
|96020 |96010 |96000 |95990 |95980 |95970 |95960 GAAAACTTTAACTTCCACCGATGTTTGACCTGATGCCGCAGTTGAAAAAATTTGAGATTTCTTATTGGGAATTGTAGAA:|.:||:|:.|.::..||.|:::||||...:::|:|:||.||:||.||:|.|||:||.:..||.:|:||.||:||.|:.CAGTACGTGGATGGTTACTGCGCTTTGGTTGTCTTCGGCGGTAGAGAACACTTGCGACGCTTTGGTCGGGATGGTGGTG |583160 |583170 |583180 |583190 |583200 |583210 |583220
|95940 |95930 |95920 |95910 |95900 |95890 |95880 TTTCTTGGGATTAATTTTGTAAAAACTCCTCCTAAAGTTTCAATACCCAATGATAGGGGAGTGACATCTAGCAACAAAA||..|.:|.||.|.|||:||:|::||:||:||.|:.|||||.||||||||:||.||.|||||:||.||.||.|.|||:|TTCTTCTGAATCAGTTTGGTCATCACGCCGCCCATGGTTTCGATACCCAAAGACAGAGGAGTTACGTCCAGTAGCAATA |583240 |583250 |583260 |583270 |583280 |583290 |583300
|95860 |95850 |95840 |95830 |95820 |95810 |95800 CATCGGTAACTTCACCAGACAAGACCGCAGCCTGTATAGCGGCCCCTAAAGCGACTGCTTCATCAGGGTTAACAGCTTT|.|||:|.:::.|.||.::|||:||.:|.:|.||:||:||:||:||||:.||.||:|||||.||||||||:||.:||||CGTCGCTGCGGCCGCCGCTCAATACTTCGCCTTGGATCGCTGCGCCTACGGCAACGGCTTCGTCAGGGTTCACGTCTTT |583320 |583330 |583340 |583350 |583360 |583370 |583380
|95780 |95770 |95760 |95750 |95740 |95730 |95720 TGATGCATCCTTACCGAATAATTTCTTTACAGTATCTGCAACCTTGGGCATCCTTGACATACCACCAACTAATAAAACA::..|::||.||.|||||:||::..||:||.|.:|||:::||.||:|||||:|::|||:::||.||.||.||:|::||.GCGCGGTTCTTTGCCGAAGAAGGCTTTAACGGCTTCTTGTACTTTCGGCATACGGGACTGCCCGCCGACCAAGATTACG |583400 |583410 |583420 |583430 |583440 |583450 |583460
|95710 |95700 |95690 |95680 |95670 |95660 |95650 |95640 TCCGATATATCTGAGGCGGTAATTCTTGCGTCTTTCAGTGCTTTTTTGACAGGATCAACCGTTCTATCAATCAATGGGG||::::||.||:::||.|:|:|::|.:||.|||||||.|||::|||||::|||:||.|.:|::|:.:.|||||.:::::TCGTCGATGTCGCCGGTGCTCAAGCCGGCATCTTTCAATGCAATTTTGCAAGGTTCGATAGAGCGGGTAATCAGGTCTT|583470 |583480 |583490 |583500 |583510 |583520 |583530 |583540
|95630 |95620 |95610 |95600 |95590 |95580 |95570 |95560 CGGTTATATTCTCAAGCTGAACCCTAGAAAAGGGCATACGAATATGCTTTGGGCCTGCAGCATCAGCAGTTATGAAAGG|....|:..|.||.|..|:..|:|:.|:||::::|||::::|:.||.||.|||||:|.:||.||:...||:|||:|:||CAACCAGGCTTTCGAATTTGGCGCGGGTAATTTTCATCGCCAAGTGTTTCGGGCCGGTTGCGTCCATGGTGATGTACGG |583550 |583560 |583570 |583580 |583590 |583600 |583610 |583620
![Page 36: Improved hit criteria for DNA local alignment JOBIM 2004 Montréal - June 28th Laurent Noé, Gregory Kucherov LORIA, Nancy France.](https://reader035.fdocuments.in/reader035/viewer/2022070306/55199d605503464d068b4a61/html5/thumbnails/36.jpg)
36
|95550 |95540 |95530 |95520 |95510 |95500 |95490 |95480
CAAGTTTATTTCTGTAGAGAGTGTAGAAGACAGTTCGATTTTAGCCTTTTCAGCGGCTTCTCTTATTCTTTGGACAGCC
||.|||:|||||:||::.::|::..::.||||.|||||||||.||.|||||.||.||||||.|.|::|:|||:|:||||
CAGGTTAATTTCGGTTTGCTGGCCGCTGGACAATTCGATTTTGGCTTTTTCGGCAGCTTCTTTCAGGCGTTGTAGAGCC
|583630 |583640 |583650 |583660 |583670 |583680 |583690 |583700

|95470 |95460 |95450 |95440 |95430 |95420 |95410 |95400
ATACGGTCATTACTCAAATCGATACCGGTTTCTTTCTTGAAATGAGAAATAATTTCTTGCAACAAATAAATGTCAAAAT
||:::|||:|::.|||||||.||.||:::||||||.|||||:|:.|:.||.||:|::|::|::|....:::|||.||.|
ATCACGTCTTGTTTCAAATCAATGCCTTGTTCTTTTTTGAACTCGGCGATGATGTGGTCGATGAGGCGTTGGTCGAAGT
|583710 |583720 |583730 |583740 |583750 |583760 |583770

|95390 |95380 |95370 |95360 |95350 |95340
CTTCGCCACCCAAATGGGTGTCACCATTGGTAGATTTAACCTCAAA------GATACC------GTTATCGATGTCCAG
||||.||.|||||.:.|||.||.||.|||||:|:.:::||.||.||      |:..||      |||.:||||:||:|:
CTTCACCGCCCAAGAAGGTATCGCCGTTGGTTGCCAATACTTCGAATTGTTTGTCGCCGTCGAGGTTGGCGATTTCGAT
|583790 |583800 |583810 |583820 |583830 |583840 |583850

|95320 |95310 |95300 |95290 |95280 |95270 |95260
GATTGAAATATCGAAAGTACCACCGCCCAAGTCGAAAACAGCAAT------GACTTTTGGCTCTGATTTATCTAGACCG
|||:|||||||||||||||||.|||||||||||.:|:||.||:|.      |:||||::::||:::|||.||.|:||||
GATGGAAATATCGAAAGTACCGCCGCCCAAGTCATATACGGCTACTTTGCGGTCTTTGTTGTCGCCTTTGTCCATACCG
|583870 |583880 |583890 |583900 |583910 |583920 |583930

|95250 |95240 |95230 |95220 |95210 |95200 |95190
TAAGCTAGGGCAGCAGCTGTTGGTTCGTTGACAACACGTAATACATTAAGCCCAATAATTTGTCCTGCGTCTTTAGTAG
:|:||.|..||.||:||:||.||.|||||||..|..|||::.||.|.:|.:||....||:.|:|||.|||||||.||.|
AATGCCAAAGCGGCTGCGGTCGGCTCGTTGATGATGCGTTTCACGTCCAAACCGGCGATACGGCCTACGTCTTTGGTGG
|583950 |583960 |583970 |583980 |583990 |584000 |584010

|95170 |95160 |95150 |95140 |95130 |95120 |95110
CTTGTCTTTGGGCATCATTGAAGTAAGCAGGAACGGTGACAACAGCATTTTTGACGCTCTTCGCTAAGTAAGCCTCCGC
||||:|:||||:..||.||||||||.|||||.|||||.|.:||.||:|.::|:||:.|.|.::|.||||||||.||:||
CTTGACGTTGGCTGTCGTTGAAGTAGGCAGGGACGGTAATCACGGCTTCGGTTACTTTTTCGCCCAAGTAAGCTTCGGC
|584030 |584040 |584050 |584060 |584070 |584080 |584090

|95090 |95080 |95070 |95060 |95050 |95040 |95030
TGTTTCCTTCATTTTATTTAAGATAAAACCTCCTATTTGGGCGGGGGAGTACGTTCTGTTTCTAGCCTCTACCCAGGCA
:|.|||.|||||||||.:.|.||.:::::|::::|||||.|:.||.||::.|:.|.||..|.::||.|.||||||:||.
GGCTTCTTTCATTTTACGCAGGACTTCTGCGGAAATTTGAGGAGGAGACAGCTCTTTGCCTTGTGCTTTTACCCATGCG
|584110 |584120 |584130 |584140 |584150 |584160 |584170

|95010 |95000 |94990 |94980 |94970 |94960 |94950
TCTCCATTAGAATGCTTGACGATTTTGAAAGGAACCTGATTAATATCTCTTTGGACTTCAGCGTCCTCGAAACGGCGGC
||:||.||.::.::.||||.|||||.||||||:|.::.:|..||.||:|:|||||||||::.|||.||.||:.:|.|||
TCGCCGTTGTTGGCTTTGATGATTTCGAAAGGCATAGATTCGATGTCGCGTTGGACTTCTTTGTCTTCAAATTTGTGGC
|584190 |584200 |584210 |584220 |584230 |584240 |584250

|94930 |94920 |94910 |94900 |94890 |94880 |94870
CGATTAAACGCTTAGTAGCAAACAAAGTGTTTTCTGAGTTTATGACGGATTGTCGTTTGGCTGGCTCACCAACTAAACG
||||.|||||.||.|..||.:|:|:||||||||.:|:|||:.|:||:|:|||:||||||||:|||:||||.||:|..::
CGATCAAACGTTTGGCGGCGTAAATAGTGTTTTTGGCGTTGGTTACCGCTTGGCGTTTGGCAGGCGCACCGACGAGGAT
|584260 |584270 |584280 |584290 |584300 |584310 |584320 |584330

|94850 |94840 |94830 |94820 |94810 |94800 |94790
TTCTCCGTCTTTAGTGAAAGCCACTACAGACGGAGTAGTTCTTGAGCCTTCTGCATTTTCGATAATTCTCGGAACTTTA
|||:|||.|:|.:.:.:||||:|.:||.|||||:||.||:|:||:|||||||||.||||||||:|.|.|:|:::::...
TTCGCCGCCGTCCAAATAAGCGATAACGGACGGCGTGGTGCGTGCGCCTTCTGCGTTTTCGATCACTTTGGTTTGACCG
|584340 |584350 |584360 |584370 |584380 |584390 |584400 |584410

|94780 |94770 |94760 |94750 |94740
CCTTCCATAATAGCTACCGCAGAATTGGTAGTACCTAAATCAATACCGATAAC
..|||:.:|||.||.|::::|||.|||||:||||||||.||.||||||||:||
TTTTCGGAAATGGCCAAACAAGAGTTGGTTGTACCTAAGTCGATACCGATTAC
|584420 |584430 |584440 |584450 |584460
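The marker line between the two sequences can be recomputed from the sequences themselves. The sketch below is an illustration only, not the tool's own code. It assumes the convention that the columns shown here appear to follow: `|` for a match, `.` for a transition (A/G or C/T), `:` for a transversion, and a blank under a gap. The "per triplet" tally simply groups mismatches by column index modulo 3; the phase and gap handling behind the `mutations per triplet` figures in the alignment summary are not stated in the slides, so that part is an assumption. The names `classify` and `summarize` are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(a: str, b: str) -> str:
    """Return the marker character for one alignment column."""
    if a == "-" or b == "-":
        return " "  # gap column: no marker
    if a == b:
        return "|"  # identical bases
    # Both purines or both pyrimidines: transition; otherwise transversion.
    same_class = {a, b} <= PURINES or {a, b} <= PYRIMIDINES
    return "." if same_class else ":"


def summarize(top: str, bottom: str):
    """Return (marker line, #transitions, #transversions, mismatches per column phase)."""
    if len(top) != len(bottom):
        raise ValueError("aligned sequences must have equal length")
    markers = "".join(classify(a, b) for a, b in zip(top, bottom))
    ts = markers.count(".")
    tv = markers.count(":")
    per_phase = [0, 0, 0]
    for i, m in enumerate(markers):
        if m in ".:":
            per_phase[i % 3] += 1  # assumed phase: column index modulo 3
    return markers, ts, tv, per_phase


# Last block of the alignment shown above.
top = "CCTTCCATAATAGCTACCGCAGAATTGGTAGTACCTAAATCAATACCGATAAC"
bottom = "TTTTCGGAAATGGCCAAACAAGAGTTGGTTGTACCTAAGTCGATACCGATTAC"
markers, ts, tv, per_phase = summarize(top, bottom)
print(top)
print(markers)
print(bottom)
print("ts:", ts, "tv:", tv, "per phase:", per_phase)
```

Running the same function over every block and summing the counts gives totals that can be compared with the `ts` and `tv` values of the alignment summary.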
*(96264-94728)(582917-584471) Ev: 0 s: 1537/1555 r
* S.cerevisiae.V (reverse complementary strand) / gi|12057208|(forward strand)
* score = 1073 : bitscore = 491.92
* mutations per triplet 347, 108, 152 (1.79e-36) | ts : 272 tv : 335
|96260 |96250 |96240 |96230 |96220 |96210 |96200 |96190
TTCCGCTTCATTAACCATTCGATCAATCTCCGTATCAGATAGCCCAGACGCTCCGGCAACAGTGATGGAAGAGTCTTTG
|||:||:||:||:|||||:||:||.||.||.:.:||.::.|.:||:||:|::||:::.|..||||||::.|:::||||.
TTCGGCATCTTTCACCATGCGTTCGATTTCTTCTTCGCTCAAACCTGAAGAACCTTGGATGGTGATGTTGGCTGCTTTA
|582920 |582930 |582940 |582950 |582960 |582970 |582980

|96180 |96170 |96160 |96150 |96140 |96130 |96120
TGGCTGGCGAGATCTTTTGCTGAAACGTTGATGATGCCGTTCGCATCGATATCAAAAGTGACTTCAATTTGTGGGGTAC
.:|:||:|:::.|||||:||:|||||||::|:|||||||||:||.|||||.||.||.||:|||||.|||||.||:.|||
CCGGTGCCTTTGTCTTTGGCGGAAACGTGCAGGATGCCGTTGGCGTCGATGTCGAAGGTTACTTCGATTTGCGGCATAC
|583000 |583010 |583020 |583030 |583040 |583050 |583060

|96100 |96090 |96080 |96070 |96060 |96050 |96040
CTTTTGGAGCTGGAGGAATGCCCGCAAGAGTAAAATTACCTATTAATTTGTTATCCTTGACTAACTCCCTCTCACCTTG
|:.:.||:||:||:|:.|||.|::|:|..:|.||:|:|||.|::.|||||||.:|:::..|::..||:|:.||.|||||
CGCGCGGTGCAGGTGCGATGTCGCCCAAGTTGAACTGACCCAAAGATTTGTTGGCAGAAGCGCGTTCGCGTTCGCCTTG
|583080 |583090 |583100 |583110 |583120 |583130 |583140

|96020 |96010 |96000 |95990 |95980 |95970 |95960
GAAAACTTTAACTTCCACCGATGTTTGACCTGATGCCGCAGTTGAAAAAATTTGAGATTTCTTATTGGGAATTGTAGAA
:|.:||:|:.|.::..||.|:::||||...:::|:|:||.||:||.||:|.|||:||.:..||.:|:||.||:||.|:.
CAGTACGTGGATGGTTACTGCGCTTTGGTTGTCTTCGGCGGTAGAGAACACTTGCGACGCTTTGGTCGGGATGGTGGTG
|583160 |583170 |583180 |583190 |583200 |583210 |583220

|95940 |95930 |95920 |95910 |95900 |95890 |95880
TTTCTTGGGATTAATTTTGTAAAAACTCCTCCTAAAGTTTCAATACCCAATGATAGGGGAGTGACATCTAGCAACAAAA
||..|.:|.||.|.|||:||:|::||:||:||.|:.|||||.||||||||:||.||.|||||:||.||.||.|.|||:|
TTCTTCTGAATCAGTTTGGTCATCACGCCGCCCATGGTTTCGATACCCAAAGACAGAGGAGTTACGTCCAGTAGCAATA
|583240 |583250 |583260 |583270 |583280 |583290 |583300

|95860 |95850 |95840 |95830 |95820 |95810 |95800
CATCGGTAACTTCACCAGACAAGACCGCAGCCTGTATAGCGGCCCCTAAAGCGACTGCTTCATCAGGGTTAACAGCTTT
|.|||:|.:::.|.||.::|||:||.:|.:|.||:||:||:||:||||:.||.||:|||||.||||||||:||.:||||
CGTCGCTGCGGCCGCCGCTCAATACTTCGCCTTGGATCGCTGCGCCTACGGCAACGGCTTCGTCAGGGTTCACGTCTTT
|583320 |583330 |583340 |583350 |583360 |583370 |583380

|95780 |95770 |95760 |95750 |95740 |95730 |95720
TGATGCATCCTTACCGAATAATTTCTTTACAGTATCTGCAACCTTGGGCATCCTTGACATACCACCAACTAATAAAACA
::..|::||.||.|||||:||::..||:||.|.:|||:::||.||:|||||:|::|||:::||.||.||.||:|::||.
GCGCGGTTCTTTGCCGAAGAAGGCTTTAACGGCTTCTTGTACTTTCGGCATACGGGACTGCCCGCCGACCAAGATTACG
|583400 |583410 |583420 |583430 |583440 |583450 |583460

|95710 |95700 |95690 |95680 |95670 |95660 |95650 |95640
TCCGATATATCTGAGGCGGTAATTCTTGCGTCTTTCAGTGCTTTTTTGACAGGATCAACCGTTCTATCAATCAATGGGG
||::::||.||:::||.|:|:|::|.:||.|||||||.|||::|||||::|||:||.|.:|::|:.:.|||||.:::::
TCGTCGATGTCGCCGGTGCTCAAGCCGGCATCTTTCAATGCAATTTTGCAAGGTTCGATAGAGCGGGTAATCAGGTCTT
|583470 |583480 |583490 |583500 |583510 |583520 |583530 |583540

|95630 |95620 |95610 |95600 |95590 |95580 |95570 |95560
CGGTTATATTCTCAAGCTGAACCCTAGAAAAGGGCATACGAATATGCTTTGGGCCTGCAGCATCAGCAGTTATGAAAGG
|....|:..|.||.|..|:..|:|:.|:||::::|||::::|:.||.||.|||||:|.:||.||:...||:|||:|:||
CAACCAGGCTTTCGAATTTGGCGCGGGTAATTTTCATCGCCAAGTGTTTCGGGCCGGTTGCGTCCATGGTGATGTACGG
|583550 |583560 |583570 |583580 |583590 |583600 |583610 |583620