CZ5225: Modeling and Simulation in Biology Lecture 9: Next Generation Sequencing Prof. Chen Yu Zong...
CZ5225: Modeling and Simulation in Biology
Lecture 9: Next Generation Sequencing
Prof. Chen Yu Zong
Tel: 6516-6877
Email: phacyz@nus.edu.sg
http://bidd.nus.edu.sg
Room 08-14, level 8, S16, NUS
Outline
• First generation sequencing
• Next generation sequencing
• Third generation sequencing
• Analysis challenges
Sanger Sequencing
• DNA is fragmented
• Cloned to a plasmid vector
• Cyclic sequencing reaction
• Separation by electrophoresis
• Readout with fluorescent tags
Steps to Assemble a Genome
1. Find overlapping reads
2. Merge some "good" pairs of reads into longer contigs
3. Link contigs to form supercontigs
4. Derive consensus sequence ..ACGATTACAATAGGTT..
Some Terminology
read: a 500-900 base long word that comes out of the sequencer
mate pair: a pair of reads from the two ends of the same insert fragment
contig: a contiguous sequence formed by several overlapping reads with no gaps
supercontig (scaffold): an ordered and oriented set of contigs, usually linked by mate pairs
consensus sequence: the sequence derived from the multiple alignment of the reads in a contig
Sequencing Types and Applications
Cyclic-Array Methods
• DNA is fragmented
• Adaptors ligated to fragments
• Several possible protocols yield an array of PCR colonies.
• Enzymatic extension with fluorescently tagged nucleotides.
• Cyclic readout by imaging the array.
Emulsion PCR
• Fragments, with adaptors, are PCR amplified within a water drop in oil.
• One primer is attached to the surface of a bead.
• Used by 454, Polonator and SOLiD.
Bridge PCR
• DNA fragments are flanked with adaptors.
• A flat surface is coated with two types of primers, corresponding to the adaptors.
• Amplification proceeds in cycles, with one end of each bridge tethered to the surface.
• Used by Solexa.
Comparison of Existing Methods
Genome Assembly: Find Overlapping Reads
[Figure: the reads (e.g. aaactgcagtacggatct, gggcccaaactgcagtac, gtacggatctactacaca) are cut into overlapping words; the words are first tabulated as (read, pos., word, orient.) and then re-sorted as (word, read, orient., pos.)]
Genome Assembly: Find Overlapping Reads
• Find pairs of reads sharing a k-mer, k ~ 24
• Extend to full alignment; throw away if not >98% similar
[Figure: two reads containing TAGATTACACAGATTAC aligned base by base; the flanking bases (TAGA / TACA / TAGT) match only partially]
• Caveat: repeats
  – A k-mer that occurs N times causes O(N²) read/read comparisons
  – ALU k-mers could cause up to 1,000,000² comparisons
• Solution: discard all k-mers that occur "too often"
  – Set the cutoff to balance the sensitivity/speed tradeoff, according to the genome at hand and the computing resources available
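The k-mer filter described on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `candidate_overlaps` and the parameter `max_occurrences` are hypothetical, reverse-complement orientation is ignored, and "occurs too often" is approximated by the number of distinct reads that contain the k-mer. The full alignment and the >98% similarity check are not shown.

```python
from collections import defaultdict
from itertools import combinations


def candidate_overlaps(reads, k=24, max_occurrences=50):
    """Return pairs of read indices that share at least one k-mer.

    k-mers found in more than `max_occurrences` reads are skipped,
    mimicking the "discard k-mers that occur too often" rule.
    """
    index = defaultdict(set)  # k-mer -> indices of reads containing it
    for read_id, read in enumerate(reads):
        for pos in range(len(read) - k + 1):
            index[read[pos:pos + k]].add(read_id)

    pairs = set()
    for read_ids in index.values():
        if len(read_ids) > max_occurrences:
            continue  # probably a repeat: would trigger O(N^2) comparisons
        pairs.update(combinations(sorted(read_ids), 2))
    return pairs


# Toy reads (short, so k is reduced to 9 for the example).
reads = ["aaactgcagtacggatct", "gcagtacggatctactaca", "gggcccaaactgcagtac"]
print(sorted(candidate_overlaps(reads, k=9)))  # [(0, 1), (0, 2)]
```

Lowering `max_occurrences` trades sensitivity for speed: fewer candidate pairs reach the expensive alignment step, but true overlaps that are only supported by frequent k-mers are missed.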
Genome Assembly: Find Overlapping Reads
Create local multiple alignments from the overlapping reads
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG TTACACAGATTATTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG TTACACAGATTATTGA
TAGATTACACAGATTACTGA
Genome Assembly: Find Overlapping Reads
• Correct errors using multiple alignment
Isolated errors (insert A; replace T with C):
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTATTGA
TAGATTACACAGATTACTGA
TAG-TTACACAGATTACTGA
Correlated errors, probably caused by repeats (disentangle overlaps):
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAG-TTACACAGATTATTGA
TAGATTACACAGATTACTGA
TAG-TTACACAGATTATTGA
After disentangling:
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
TAGATTACACAGATTACTGA
and
TAG-TTACACAGATTATTGA
TAG-TTACACAGATTATTGA
In practice, error correction removes up to 98% of the errors
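For the isolated-error case, one simple way to picture the correction is a per-column majority vote over the aligned reads. The sketch below is an assumption-laden illustration, not the method used by any particular assembler; the function name `column_consensus` is hypothetical, and the reads are assumed to be already aligned and gap-padded to equal length.

```python
from collections import Counter


def column_consensus(aligned_reads):
    """Majority symbol in each column of equally long, gap-padded reads."""
    consensus = []
    for column in zip(*aligned_reads):
        symbol, _count = Counter(column).most_common(1)[0]
        consensus.append(symbol)
    return "".join(consensus)


# The five reads of the isolated-error example on the slide.
reads = [
    "TAGATTACACAGATTACTGA",
    "TAGATTACACAGATTACTGA",
    "TAGATTACACAGATTATTGA",  # T where the other reads have C
    "TAGATTACACAGATTACTGA",
    "TAG-TTACACAGATTACTGA",  # missing A
]
print(column_consensus(reads))  # TAGATTACACAGATTACTGA
```

A plain vote like this would also overwrite the two minority reads in the correlated-error example, even though they probably come from a different repeat copy. That is why the slide separates the reads into two groups before correcting.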
Genome Assembly: Merge Reads into Contigs
• Overlap graph:
  – Nodes: reads r1 ... rn
  – Edges: overlaps (ri, rj, shift, orientation, score)
[Figure: reads that come from two regions of the genome (blue and red) that contain the same repeat. Note: of course, we don't know the "color" of these nodes.]
Genome Assembly: Merge Reads into Contigs
We want to merge reads up to potential repeat boundaries
[Figure labels: repeat region, Unique Contig, Overcollapsed Contig]
Genome Assembly: Merge Reads into Contigs
• Ignore non-maximal reads
• Merge only maximal reads into contigs
[Figure label: repeat region]
Read Length and Pairing
• Short reads are problematic, because short sequences do not map uniquely to the genome.
• Solution #1: Get longer reads.
• Solution #2: Get paired reads.
[Figure: a read pair, ACTTAAGGCTGACTAGC ... TCGTACCGATATGCTG]
Third Generation Sequencing
• Nanopore sequencing
  – Nucleic acids driven through a nanopore.
  – Differences in conductance of the pore provide the readout.
• Real-time monitoring of PCR activity
  – Read-out by fluorescence resonance energy transfer between polymerase and nucleotides, or
  – Waveguides allow direct observation of polymerase and fluorescently labeled nucleotides
Nanopore sequencing
Deamer, D. W., and Akeson, M. 'Nanopores and Nucleic Acids: prospects for ultrarapid sequencing'. Tibtech.
Meller, A. J. Phys.: Condens. Matter 15 (2003) R581–R607
Earlier Findings
– Transmembrane voltage drives RNA through the protein nanopore α-hemolysin.
– Passage of RNA through the pore reduces the ionic current
– Blockage current is modulated by base identity
  • PolyC: I_block = 5 pA
  • PolyA: I_block = 20 pA
– Translocation rate depends on base identity
  • PolyC: v = 3 µs/base
  • PolyA: v = 20 µs/base
Automated Rapid DNA Sequencing with Nanopores
Church, George M. ‘Genomes for All’ Scientific American, Jan 2006, pp. 47-54.
Sequencing will require a better understanding of the physics of the interaction between DNA and protein pore during translocation.
Modeling of ssDNA Translocation
• $F = zeV/a$
  – $ze$ = effective charge / base
  – $V$ = applied voltage
  – $a$ = base-to-base distance
• $F = (1)(1.6 \times 10^{-19}\,\mathrm{C})(0.125\,\mathrm{V}) / (0.4 \times 10^{-9}\,\mathrm{m}) \sim 5\,k_B T / a \sim 44$ pN
• Basis for modeling
  – $P(\text{forward or backward}) \sim \exp(Fa / k_B T)$
  – Averaged over all monomers
• Model Assumptions:
  – Length of polymer = L >> pore length
  – With short polymers, membrane has 0 thickness
D. K. Lubensky and D. R. Nelson, Biophys. J. 77, 1824 (1999).
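The force estimate can be checked numerically. The sketch below assumes the reading F = zeV/a (charge times voltage divided by the base spacing, which is the dimensionally consistent form) and a temperature of about 293 K for the thermal energy; the temperature, the constant names and the function name `driving_force` are assumptions made for this illustration only.

```python
ELEMENTARY_CHARGE = 1.602e-19  # coulombs
BOLTZMANN = 1.381e-23          # joules per kelvin


def driving_force(z, voltage, spacing):
    """Force on one base: effective charge z*e times V, divided by spacing a."""
    return z * ELEMENTARY_CHARGE * voltage / spacing


z, voltage, spacing = 1, 0.125, 0.4e-9   # values quoted on the slide
temperature = 293.0                      # assumed room temperature, in kelvin

force = driving_force(z, voltage, spacing)
work_per_base = force * spacing          # F*a = z*e*V, energy gained per step
ratio = work_per_base / (BOLTZMANN * temperature)

print(f"F = {force * 1e12:.0f} pN, F*a = {ratio:.1f} kBT")
```

With these inputs the arithmetic gives roughly 50 pN and F·a of about 5 kBT, the same order of magnitude as the ~44 pN quoted on the slide.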
Experiment of ssDNA Translocation
Conditions
– Temp: 2 °C
– Electrolyte solution: 1 M KCl, 1 mM Tris-EDTA buffer, pH 8.5
– Polymer: polydeoxyadenylic acid (poly(dA)), length 4–100 bases
– Driving voltage: 70–300 mV
Meller, A., L. Nivon, D. Branton, 2001. Voltage-Driven DNA Translocations Through a Nanopore, Phys. Rev. Lett., 86, 3435-39
Sequence Alignment as a Mathematical Problem
Example:
Sequence a: ATTCTTGC
Sequence b: ATCCTATTCTAGC
Best Alignment (one gap):
ATTCTTGC
ATCCTATTCTAGC
Bad Alignment (two gaps):
AT TCTT GC
ATCCTATTCTAGC
What is a good alignment?
How to rate an alignment?
• Match: +8 (w(x, y) = 8, if x = y)
• Mismatch: -5 (w(x, y) = -5, if x ≠ y)
• Each gap symbol: -3 (w(-, x) = w(x, -) = -3)
Pairwise Alignment
Sequence a: CTTAACT
Sequence b: CGGATCAT
An alignment of a and b:
C---TTAACT
CGGATCA--T
[Figure labels: Insertion gap, Match, Mismatch, Deletion gap]
Alignment Graph
Sequence a: CTTAACT
Sequence b: CGGATCAT
[Figure: grid with sequence b (C G G A T C A T) across the columns and sequence a (C T T A A C T) down the rows; insertion gaps and deletion gaps of the alignment below are marked on the grid]
C---TTAACT
CGGATCA--T
Graphic representation of an alignment
Sequence a: CTTAACT
Sequence b: CGGATCAT
[Figure, built up over five slides: the alignment below is traced step by step as a path through the grid, with sequence b (C G G A T C A T) across the columns and sequence a (C T T A A C T) down the rows]
C---TTAACT
CGGATCA--T
Pathway of an alignment
Sequence a: CTTAACT
Sequence b: CGGATCAT
[Figure: the complete path of the alignment below through the grid, with sequence b across the columns and sequence a down the rows]
C---TTAACT
CGGATCA--T
Alignment Score
Sequence a: CTTAACT
Sequence b: CGGATCAT
C---TTAACT
CGGATCA--T
Running score along the alignment path (match +8, mismatch -5, gap -3):
8, 5, 2, -1 (C/C match, then three gap symbols)
-1 + 8 = 7 (T/T match)
7 - 5 = 2 (T/C mismatch)
2 + 8 = 10 (A/A match)
10 - 3 = 7 (gap)
7 - 3 = 4 (gap)
4 + 8 = 12 (T/T match)
Alignment score: 12
An optimal alignment: the alignment of maximum score
• Let $A = a_1 a_2 \ldots a_m$ and $B = b_1 b_2 \ldots b_n$.
• $S_{i,j}$: the score of an optimal alignment between $a_1 a_2 \ldots a_i$ and $b_1 b_2 \ldots b_j$
• With proper initializations, $S_{i,j}$ can be computed as follows.
$$
S_{i,j} = \max \left\{
\begin{array}{l}
S_{i-1,j} + w(a_i, -) \\
S_{i,j-1} + w(-, b_j) \\
S_{i-1,j-1} + w(a_i, b_j)
\end{array}
\right.
$$
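The recurrence translates directly into a small dynamic-programming routine. The sketch below is a minimal illustration using the scoring values of this lecture (match +8, mismatch -5, gap -3); the names `score_pair` and `alignment_matrix` are hypothetical, and the first row and column are initialized with cumulative gap scores, which is one natural reading of "proper initializations".

```python
def score_pair(x, y, match=8, mismatch=-5, gap=-3):
    """w(x, y): score of aligning symbol x with symbol y ('-' is a gap)."""
    if x == "-" or y == "-":
        return gap
    return match if x == y else mismatch


def alignment_matrix(a, b):
    """Fill S[i][j] = best score for aligning a[:i] with b[:j]."""
    m, n = len(a), len(b)
    S = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):          # first column: a[:i] against gaps only
        S[i][0] = S[i - 1][0] + score_pair(a[i - 1], "-")
    for j in range(1, n + 1):          # first row: b[:j] against gaps only
        S[0][j] = S[0][j - 1] + score_pair("-", b[j - 1])
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            S[i][j] = max(
                S[i - 1][j] + score_pair(a[i - 1], "-"),        # a_i vs gap
                S[i][j - 1] + score_pair("-", b[j - 1]),        # gap vs b_j
                S[i - 1][j - 1] + score_pair(a[i - 1], b[j - 1]),  # a_i vs b_j
            )
    return S


S = alignment_matrix("CTTAACT", "CGGATCAT")
print(S[7][8])  # 14, the optimal score S_{m,n}
```

The table has (m+1)(n+1) cells and each cell looks at three neighbours, so the running time and memory are both proportional to m·n.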
Computing $S_{i,j}$
[Figure: cell (i, j) of the matrix is reached by a vertical edge of weight $w(a_i, -)$, a horizontal edge of weight $w(-, b_j)$ and a diagonal edge of weight $w(a_i, b_j)$; the optimal score is $S_{m,n}$ in the bottom-right corner]
Initializations
$S_{0,0} = 0$
$S_{0,1} = -3$, $S_{0,2} = -6$, $S_{0,3} = -9$, $S_{0,4} = -12$, $S_{0,5} = -15$, $S_{0,6} = -18$, $S_{0,7} = -21$, $S_{0,8} = -24$
$S_{1,0} = -3$, $S_{2,0} = -6$, $S_{3,0} = -9$, $S_{4,0} = -12$, $S_{5,0} = -15$, $S_{6,0} = -18$, $S_{7,0} = -21$
          C    G    G    A    T    C    A    T
     0   -3   -6   -9  -12  -15  -18  -21  -24
C   -3
T   -6
T   -9
A  -12
A  -15
C  -18
T  -21
Gap symbol: -3
$S_{1,1}$ = ?
Option 1: $S_{1,1} = S_{0,0} + w(a_1, b_1) = 0 + 8 = 8$
Option 2: $S_{1,1} = S_{0,1} + w(a_1, -) = -3 - 3 = -6$
Option 3: $S_{1,1} = S_{1,0} + w(-, b_1) = -3 - 3 = -6$
Optimal: $S_{1,1} = 8$
[Table: matrix with row 0 and column 0 initialized; cell (1, 1) is being computed]
Match: 8, Mismatch: -5, Gap symbol: -3
$S_{1,2}$ = ?
Option 1: $S_{1,2} = S_{0,1} + w(a_1, b_2) = -3 - 5 = -8$
Option 2: $S_{1,2} = S_{0,2} + w(a_1, -) = -6 - 3 = -9$
Option 3: $S_{1,2} = S_{1,1} + w(-, b_2) = 8 - 3 = 5$
Optimal: $S_{1,2} = 5$
[Table: matrix with $S_{1,1} = 8$ filled in; cell (1, 2) is being computed]
Match: 8, Mismatch: -5, Gap symbol: -3
$S_{2,1}$ = ?
Option 1: $S_{2,1} = S_{1,0} + w(a_2, b_1) = -3 - 5 = -8$
Option 2: $S_{2,1} = S_{1,1} + w(a_2, -) = 8 - 3 = 5$
Option 3: $S_{2,1} = S_{2,0} + w(-, b_1) = -6 - 3 = -9$
Optimal: $S_{2,1} = 5$
[Table: matrix with $S_{1,1} = 8$ and $S_{1,2} = 5$ filled in; cell (2, 1) is being computed]
Match: 8, Mismatch: -5, Gap symbol: -3
$S_{2,2}$ = ?
Option 1: $S_{2,2} = S_{1,1} + w(a_2, b_2) = 8 - 5 = 3$
Option 2: $S_{2,2} = S_{1,2} + w(a_2, -) = 5 - 3 = 2$
Option 3: $S_{2,2} = S_{2,1} + w(-, b_2) = 5 - 3 = 2$
Optimal: $S_{2,2} = 3$
[Table: matrix with $S_{1,1} = 8$, $S_{1,2} = 5$ and $S_{2,1} = 5$ filled in; cell (2, 2) is being computed]
Match: 8, Mismatch: -5, Gap symbol: -3
$S_{3,5}$ = ?
          C    G    G    A    T    C    A    T
     0   -3   -6   -9  -12  -15  -18  -21  -24
C   -3    8    5    2   -1   -4   -7  -10  -13
T   -6    5    3    0   -3    7    4    1   -2
T   -9    2    0   -2   -5    ?
A  -12
A  -15
C  -18
T  -21
$S_{3,5} = 5$; completing the table with the same recurrence:
          C    G    G    A    T    C    A    T
     0   -3   -6   -9  -12  -15  -18  -21  -24
C   -3    8    5    2   -1   -4   -7  -10  -13
T   -6    5    3    0   -3    7    4    1   -2
T   -9    2    0   -2   -5    5    2   -1    9
A  -12   -1   -3   -5    6    3    0   10    7
A  -15   -4   -6   -8    3    1   -2    8    5
C  -18   -7   -9  -11    0   -2    9    6    3
T  -21  -10  -12  -14   -3    8    6    4   14
Optimal score: $S_{7,8} = 14$
Optimal alignment, read off by tracing back from the bottom-right cell of the table on the previous slide:
C T T A A C - T
C G G A T C A T
8 - 5 - 5 + 8 - 5 + 8 - 3 + 8 = 14
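The column-by-column sum can be reproduced with a short helper. This is a sketch with a hypothetical function name, `score_alignment`, using the lecture's scoring values (match +8, mismatch -5, gap -3) and assuming the two rows are already gap-padded to equal length.

```python
def score_alignment(row_a, row_b, match=8, mismatch=-5, gap=-3):
    """Sum the column scores of a gapped pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have the same length")
    total = 0
    for x, y in zip(row_a, row_b):
        if x == "-" or y == "-":
            total += gap        # one gap symbol in this column
        elif x == y:
            total += match
        else:
            total += mismatch
    return total


print(score_alignment("CTTAAC-T", "CGGATCAT"))  # 14
```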
Multiple sequence alignment (MSA)
How to score an MSA?
• Sum-of-Pairs (SP-score)
MSA:
GC-TC
A---C
G-ATC
Score(MSA) = Score(GC-TC, A---C) + Score(GC-TC, G-ATC) + Score(A---C, G-ATC)
Score(GC-TC, A---C) = -5 - 3 + 8 - 3 + 8 = 5
Score(GC-TC, G-ATC) = 8 - 3 - 3 + 8 + 8 = 18
Score(A---C, G-ATC) = -5 + 8 - 3 - 3 + 8 = 5
SP-score = 5 + 18 + 5 = 28
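The sum-of-pairs computation can be sketched as a double loop over row pairs and columns. The function names below are hypothetical. One convention is inferred from the slide's arithmetic rather than stated on it: a column where both rows have a gap is scored +8, like any other pair of identical symbols. Other treatments of SP-scores often score such columns as 0, so this choice is an assumption.

```python
from itertools import combinations


def column_score(x, y, match=8, mismatch=-5, gap=-3):
    """Score one column of a pairwise projection of the MSA."""
    if x == y:
        return match      # identical symbols, including gap/gap (see lead-in)
    if x == "-" or y == "-":
        return gap
    return mismatch


def sp_score(msa):
    """Sum of the pairwise scores over all pairs of rows in the MSA."""
    return sum(
        column_score(x, y)
        for row_a, row_b in combinations(msa, 2)
        for x, y in zip(row_a, row_b)
    )


msa = ["GC-TC", "A---C", "G-ATC"]
print(sp_score(msa))  # 28
```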