Maximizing hidden stop codons in gene design
![Page 1: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/1.jpg)
Authors: Phan, V., Saha, S., Pandey, A., Wong, T-Y
Published in: Intl. Journal of Data Mining and Bioinformatics
Vol. 4, No. 4, 2010
Presented by:
Khaled Monsoor
Bioinformatics Masters Program
The University of Memphis
Mail: [email protected]
Date: Nov 05, 2010
![Page 2: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/2.jpg)
What ?
Why ?
How ?
Result ?
Conclusion
Synthetic gene design with a large number of hidden stops
![Page 3: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/3.jpg)
![Page 4: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/4.jpg)
Like him …
Sleeping is a waste of precious time
![Page 5: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/5.jpg)
What are hidden stops in genes?
Can we “redesign” genes to include more hidden stops?
How can clever computer algorithms help us?
![Page 6: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/6.jpg)
![Page 7: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/7.jpg)
It is now feasible to construct artificial genomes.
Researchers at the J. Craig Venter Institute chemically synthesized the genome of Mycoplasma genitalium (reported in 2008) and, in 2010, reported a cell controlled by a synthetic Mycoplasma mycoides genome
… So how can we increase the efficiency of protein synthesis in ‘designed’ genes?
Hidden stops can protect against frame shifts by terminating the shifted translation early
Without hidden stops, frame shifts can produce very long non-functional proteins
![Page 8: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/8.jpg)
Dictates what a protein is composed of
Has evolved through millions of years
A protein is a sequence of amino acids
Contains 20 (twenty) amino acids
![Page 9: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/9.jpg)
![Page 10: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/10.jpg)
mRNA (written with T in place of U):
ATGTCCAAACCT
Protein:
M S K P
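A minimal sketch of codon-by-codon translation, assuming the standard genetic code. The helper name `translate` and the compact table layout are illustrative choices, not something taken from the slides. Reading the example above in triplets gives ATG = M, TCC = S, AAA = K, CCT = P.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate a coding sequence codon by codon (hypothetical helper)."""
    seq = seq.upper().replace("U", "T").replace(".", "")
    # Read complete triplets only; a trailing partial codon is ignored.
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


print(translate("ATGTCCAAACCT"))  # MSKP
```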
![Page 11: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/11.jpg)
CCT, CCC, CCA, CCG all represent P (Proline)
A mutation in the 3rd position of such a codon does not change the amino acid
![Page 12: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/12.jpg)
A deletion creates a frame shift, which changes the entire downstream sequence
RNA: … CAT.CAT.CAT.CAT …
Protein: … HHHH … (a chain of histidines)
Deletion of the 3rd character (T): CAC.ATC.ATC.AT
Protein: HII
... something totally different!
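The frame shift above can be reproduced in a few lines. This is only a sketch under the standard genetic code; `translate` and `delete_base` are hypothetical helper names, and positions are 0-based, so the 3rd character is index 2.

```python
from itertools import product

# Standard genetic code in the usual TCAG ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(seq: str) -> str:
    """Translate complete triplets; a trailing partial codon is ignored."""
    seq = seq.upper().replace(".", "")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq) - 2, 3))


def delete_base(seq: str, pos: int) -> str:
    """Return seq with the base at 0-based index pos removed."""
    return seq[:pos] + seq[pos + 1:]


original = "CATCATCATCAT"
shifted = delete_base(original, 2)   # drop the 3rd character (T)

print(translate(original))           # HHHH
print(shifted, translate(shifted))   # CACATCATCAT HII
```

Every codon after the deletion point is read differently, which is why a single missing base changes the whole downstream protein.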
![Page 13: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/13.jpg)
:-(
![Page 14: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/14.jpg)
(start) (codon)^k (stop)
Start – ATG
Stop – TAA, TAG, TGA
Codon – any triplet not equal to TAA, TAG, or TGA
Example: ATG.ACC.AAT.CGG.TAA
The TGA spanning the first two codons (A[TG.A]CC) is a stop codon, but hidden: it is read only after a frame shift
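The gene pattern above (a start codon, k ordinary codons, then a stop codon) can be checked with a regular expression. This is a sketch only: the name `is_gene` is made up for illustration, and it assumes k may be zero and that the sequence uses the DNA alphabet.

```python
import re

# ATG, then any number of non-stop triplets, then exactly one stop codon.
GENE = re.compile(r"ATG(?:(?!TAA|TAG|TGA)[ACGT]{3})*(?:TAA|TAG|TGA)")


def is_gene(seq: str) -> bool:
    """True if seq has the form (start)(codon)^k(stop) in frame 0."""
    seq = seq.upper().replace(".", "")
    return GENE.fullmatch(seq) is not None


print(is_gene("ATG.ACC.AAT.CGG.TAA"))  # True
print(is_gene("ATG.TAA.ACC.TAA"))      # False: in-frame stop before the end
```

The negative lookahead only guards the reading frame, so off-frame (hidden) stops such as the TGA in ATG.ACC are still allowed.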
![Page 15: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/15.jpg)
Hidden stops can protect against frame shifts by terminating the frame-shifted translation early
Without hidden stops, frame shifts can produce very long non-functional proteins. This not only wastes time, amino acids (money), and ATP (energy), but may also produce a deadly toxin
Ref: Seligmann and Pollock, DNA and Cell Biology, 2004
![Page 16: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/16.jpg)
![Page 17: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/17.jpg)
•Design genes with the maximum number of hidden stops
•Constraints:
1. none,
2. matching GC content, and
3. matching codon usage
![Page 18: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/18.jpg)
![Page 19: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/19.jpg)
Suppose the protein is MSDSKED
Both sequences encode this protein:
1. ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA
2. ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA
Sequence (1) is better! It has 4 hidden stops, while sequence (2) has none
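The counts for the two sequences can be verified by scanning the +1 and +2 reading frames for stop triplets. This is a small sketch; `hidden_stops` is a hypothetical helper, and it reports each hit as (frame, 0-based start index, triplet).

```python
STOPS = {"TAA", "TAG", "TGA"}


def hidden_stops(seq: str) -> list[tuple[int, int, str]]:
    """List stop triplets found in the two off-frame reading frames."""
    seq = seq.upper().replace(".", "")
    hits = []
    for frame in (1, 2):
        for i in range(frame, len(seq) - 2, 3):
            triplet = seq[i:i + 3]
            if triplet in STOPS:
                hits.append((frame, i, triplet))
    return hits


seq1 = "ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA"
seq2 = "ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA"

print(hidden_stops(seq1))       # [(1, 1, 'TGA'), (2, 5, 'TGA'), (2, 8, 'TAG'), (2, 11, 'TAA')]
print(len(hidden_stops(seq2)))  # 0
```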
![Page 20: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/20.jpg)
Goal:
• Given a protein, design a DNA sequence that encodes the protein with the maximum number of hidden stops
![Page 21: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/21.jpg)
Idea:
Optimal design of whole sequence is based on optimal design of partial sequences
$H(i, j)$ = maximum number of hidden stops in a design of the first $i$ amino acids in which the $i$th amino acid, $A_i$, is coded by its $j$th codon
![Page 22: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/22.jpg)
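This quantity can be computed recursively, in time linear in the protein length, $O(n)$, because each amino acid has at most six codons:
$H(i, j) = \max_k \{ H(i-1, k) + I_{kj} \}$
The maximum is taken over all codons $k$ that code for the previous amino acid, $A_{i-1}$
$I_{kj} = 1$ if the $k$th codon of $A_{i-1}$ followed by the $j$th codon of $A_i$ contains a hidden (off-frame) stop codon, and 0 otherwise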
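The recurrence can be sketched as a short dynamic program. This is an illustrative reading of the slide, not the authors' code: the names `junction_stop` and `design` are made up, the protein is assumed non-empty and written in one-letter codes, ties are broken arbitrarily, and the terminal stop codon is left out.

```python
from itertools import product

# Standard genetic code in TCAG ordering, grouped as amino acid -> codons.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS_FOR: dict[str, list[str]] = {}
for c, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS_FOR.setdefault(aa, []).append("".join(c))
STOPS = {"TAA", "TAG", "TGA"}


def junction_stop(prev: str, cur: str) -> int:
    """I_kj: 1 if the junction of two adjacent codons holds an off-frame stop."""
    pair = prev + cur
    return int(pair[1:4] in STOPS or pair[2:5] in STOPS)


def design(protein: str) -> tuple[str, int]:
    """Return (codon sequence, hidden-stop count) maximizing hidden stops."""
    # H[codon] = best count for the prefix whose last amino acid uses codon.
    H = {codon: 0 for codon in CODONS_FOR[protein[0]]}
    back = []  # one dict per later position: chosen codon -> best previous codon
    for aa in protein[1:]:
        new_H, ptr = {}, {}
        for cur in CODONS_FOR[aa]:
            prev = max(H, key=lambda p: H[p] + junction_stop(p, cur))
            new_H[cur] = H[prev] + junction_stop(prev, cur)
            ptr[cur] = prev
        H = new_H
        back.append(ptr)
    last = max(H, key=H.get)
    best = H[last]
    codons = [last]
    for ptr in reversed(back):  # walk the back-pointers to recover the design
        codons.append(ptr[codons[-1]])
    return ".".join(reversed(codons)), best


sequence, count = design("MSDSKED")
print(sequence, count)  # the count is 4 for this protein
```

Each position compares at most 6 × 6 codon pairs, so the total work grows linearly with the protein length.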
![Page 23: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/23.jpg)
Protein → DNA: this is a one-to-many mapping
Back translation should:
1. Satisfy constraints imposed by host genomes,
2. Serve specific design purpose
![Page 24: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/24.jpg)
![Page 25: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/25.jpg)
GC content = proportion of G and C bases in the sequence
GC content relates to the stability of DNA
Algorithm’s objectives: 1. maximize the number of hidden stops; 2. then match the GC content of the host genome
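The ordering of the two objectives can be illustrated on a hand-made candidate list. This is not the paper's GC-matching algorithm, only a sketch of "hidden stops first, GC content second": the helper names, the second candidate, and the target value 0.40 are all assumptions made up for the example.

```python
STOPS = {"TAA", "TAG", "TGA"}


def clean(seq: str) -> str:
    return seq.upper().replace(".", "")


def gc_content(seq: str) -> float:
    """Fraction of bases that are G or C."""
    seq = clean(seq)
    return sum(base in "GC" for base in seq) / len(seq)


def count_hidden_stops(seq: str) -> int:
    """Number of stop triplets in the +1 and +2 reading frames."""
    seq = clean(seq)
    return sum(seq[i:i + 3] in STOPS
               for frame in (1, 2)
               for i in range(frame, len(seq) - 2, 3))


def pick(candidates: list[str], target_gc: float) -> str:
    """Most hidden stops first; among ties, GC content closest to the target."""
    return min(candidates,
               key=lambda s: (-count_hidden_stops(s), abs(gc_content(s) - target_gc)))


candidates = [
    "ATG.AGT.GAT.AGT.AAA.GAA.GAC.TAA",  # 4 hidden stops, GC = 7/24
    "ATG.AGT.GAT.AGT.AAG.GAG.GAC.TAA",  # 4 hidden stops, GC = 9/24 (made-up variant)
    "ATG.TCC.GAT.TCG.AAA.GAA.GAC.TAA",  # 0 hidden stops, GC = 9/24
]
TARGET_GC = 0.40  # hypothetical host GC content

print(pick(candidates, TARGET_GC))  # the second candidate
```

The third candidate has the same GC content as the second but loses, because the hidden-stop count is compared before GC content.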
![Page 26: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/26.jpg)
![Page 27: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/27.jpg)
Algorithm:
Construct the sequence with the maximum number of hidden stops
“Fit” this sequence to the required codon usage
Result:
Cannot both maximize hidden stops and match codon usage
The fitted design is still “better” (more hidden stops) than wild-type genes
![Page 28: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/28.jpg)
For leucine, the codon CUG is used 51% of the time in E. coli.
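Codon usage in this sense is the share of each codon among the synonymous codons of its amino acid. A minimal sketch of how such percentages can be computed from a coding sequence follows; `codon_usage` is a hypothetical helper, and the toy input is invented rather than taken from E. coli.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code in the usual TCAG ordering; '*' marks a stop codon.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def codon_usage(seq: str) -> dict[str, float]:
    """Relative frequency of each codon among codons for the same amino acid."""
    seq = seq.upper().replace("U", "T").replace(".", "")
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    per_amino_acid = defaultdict(int)
    for codon, n in counts.items():
        per_amino_acid[CODON_TABLE[codon]] += n
    return {codon: n / per_amino_acid[CODON_TABLE[codon]]
            for codon, n in counts.items()}


# Toy sequence: four leucine codons (CUG twice, CUA once, UUA once).
print(codon_usage("CUG.CUG.CUA.UUA"))  # {'CTG': 0.5, 'CTA': 0.25, 'TTA': 0.25}
```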
![Page 29: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/29.jpg)
![Page 30: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/30.jpg)
![Page 31: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/31.jpg)
1. “Wild type” (genes from NCBI)
2. Random gene (constrained by the codon usage of the wild type)
3. “Optimal” – designed with no constraint (maximum number of hidden stops)
4. Designed under the GC content constraint of the wild type
5. Designed under the codon usage constraint of the wild type
![Page 32: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/32.jpg)
![Page 33: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/33.jpg)
[Figure: vertical axis shows the number of hidden stop codons]
![Page 34: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/34.jpg)
Conclusion
![Page 35: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/35.jpg)
While maintaining the GC content and codon usage of the wild types, the algorithms can propose genes with approximately 10% more hidden stops
With both constraints maintained, the distribution curves of ‘wild-type’ and ‘designed’ genes keep nearly the same shape, with a Pearson correlation of 98%
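For readers unfamiliar with the measure, a Pearson correlation between two distributions can be computed as below. This is a generic sketch: the function name and the two frequency vectors are made up for illustration and are not data from the paper.

```python
from math import sqrt


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length samples."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical frequency profiles for a wild-type and a designed gene set.
wild_type = [0.10, 0.25, 0.40, 0.25]
designed = [0.12, 0.24, 0.38, 0.26]

print(round(pearson(wild_type, designed), 3))
```

A value near 1 means the two curves rise and fall together, which is what "same shape" means here.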
![Page 36: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/36.jpg)
As a lagging grad student,
I’ll try my best to answer
…
![Page 37: Maximizing hidden stop codon on gene design](https://reader036.fdocuments.in/reader036/viewer/2022062514/5597732a1a28ab80508b4573/html5/thumbnails/37.jpg)
Thank you for attending this boring presentation … oh