Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All...
![Page 1: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/1.jpg)
Computational Biology, Part 8
Protein Coding Regions
Robert F. Murphy
Copyright 1996-2009.
All rights reserved.
![Page 2: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/2.jpg)
Sequence Analysis Tasks
Finding protein coding regions
![Page 3: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/3.jpg)
Goal
Given a DNA or RNA sequence, find those regions that code for protein(s)
Direct approach: Look for stretches that can be interpreted as protein using the genetic code
Statistical approaches: Use other knowledge about likely coding regions
![Page 4: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/4.jpg)
Direct Approach
![Page 5: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/5.jpg)
Genetic codes
The set of tRNAs that an organism possesses defines its genetic code(s)
The universal genetic code is common to nearly all organisms
Prokaryotes, mitochondria and chloroplasts often use slightly different genetic codes
More than one tRNA may be present for a given codon, allowing more than one possible translation product
![Page 6: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/6.jpg)
Genetic codes
Differences among genetic codes occur mainly in start and stop codons
Alternate initiation codons: codons that encode amino acids but can also be used to start translation (GUG, UUG, AUA, UUA, CUG)
Suppressor tRNA codons: codons that normally stop translation but are translated as amino acids (UAG, UGA, UAA)
![Page 7: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/7.jpg)
Genetic codes
![Page 8: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/8.jpg)
Genetic codes
![Page 9: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/9.jpg)
Genetic codes
Note additional start codons: UUA, UUG, CUG
Note conversion of stop codon UGA (opal) to Trp
![Page 10: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/10.jpg)
Reading Frames
Since nucleotide sequences are “read” three bases at a time, there are three possible “frames” in which a given nucleotide sequence can be “read” (in the forward direction)
Taking the complement of the sequence and reading it in the reverse direction (the reverse complement) gives three more reading frames
![Page 11: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/11.jpg)
Reading frames
      TTC TCA TGT TTG ACA GCT
RF1   Phe Ser Cys Leu Thr Ala>
RF2    Ser His Val *** Gln Leu>
RF3     Leu Met Phe Asp Ser>
      AAG AGT ACA AAC TGT CGA
RF4   <Glu *** Thr Gln Cys Ser
RF5     <Glu His Lys Val Ala
RF6    <Arg Met Asn Ser Leu
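The six translations on this slide can be reproduced with a short sketch. It assumes the standard genetic code written in DNA letters, drops any trailing partial codon (the slide instead reads the final CT of RF2 as Leu, since every CTN codon is Leu), and reports RF4 to RF6 in the 5'-to-3' direction of the reverse strand, so each is the mirror image of the right-to-left rows above. The function names are illustrative only.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; a trailing partial codon is dropped."""
    return "".join(CODON_TABLE[seq[i:i + 3]]
                   for i in range(0, len(seq) - 2, 3))


def six_frames(seq):
    """Return one-letter translations for RF1-RF3 (forward) and RF4-RF6 (reverse)."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"RF{offset + 1}"] = translate(seq[offset:])
        frames[f"RF{offset + 4}"] = translate(rc[offset:])
    return frames


frames = six_frames("TTCTCATGTTTGACAGCT")
for name in sorted(frames):
    print(name, frames[name])
```

Reading the RF4 result backwards gives Glu, stop, Thr, Gln, Cys, Ser, which matches the slide.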
![Page 12: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/12.jpg)
Open Reading Frames (ORF)
Concept: Region of DNA or RNA sequence that could be translated into a peptide sequence (open refers to absence of stop codons)
Prerequisite: A specific genetic code
Definition: (start codon) (amino acid coding codon)n (stop codon), where n is the number of amino acid coding codons
Note: Not all ORFs are actually used
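The definition above maps almost directly onto a regular expression. The sketch below is one way to write it, assuming DNA letters, ATG as the only start codon, and the three standard stop codons; a different genetic code would need different alternatives. The lazy quantifier stops at the first in-frame stop codon, and the lookahead lets overlapping ORFs in other frames be reported as well.

```python
import re

# (start codon)(amino acid coding codon)n(stop codon), forward strand only.
# Assumed genetic code: start = ATG; stops = TAA, TAG, TGA.
ORF_PATTERN = re.compile(r"(?=(ATG(?:[ACGT]{3})*?(?:TAA|TAG|TGA)))")


def orfs_by_pattern(seq):
    """Return (start, end, frame) tuples with 1-based inclusive coordinates.

    Every ATG that is followed by an in-frame stop codon is reported, so
    nested ORFs that share a stop codon appear as separate hits.
    """
    seq = seq.upper()
    hits = []
    for match in ORF_PATTERN.finditer(seq):
        start = match.start(1)          # 0-based index of the A in ATG
        end = match.end(1)              # 0-based, exclusive
        hits.append((start + 1, end, start % 3 + 1))
    return hits


print(orfs_by_pattern("CCATGAAATAGCC"))
```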
![Page 13: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/13.jpg)
![Page 14: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/14.jpg)
Open Reading Frames
Click boxes for List ORFS and ORF map
![Page 15: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/15.jpg)
Check reading frame: mod(696,3)=0 -> RF3
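The frame check is simple modular arithmetic. The helper below is a hypothetical illustration that assumes 696 is a 1-based start coordinate and that position 1 begins RF1, which is the convention the slide's result implies.

```python
def reading_frame(position):
    """Forward reading frame (1, 2 or 3) for a 1-based start position.

    A remainder of 1 means RF1, 2 means RF2, and 0 means RF3.
    """
    remainder = position % 3
    return 3 if remainder == 0 else remainder


print(reading_frame(696))
```

An equivalent form is `(position - 1) % 3 + 1`.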
![Page 16: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/16.jpg)
EMBOSS plotorf
![Page 17: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/17.jpg)
![Page 18: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/18.jpg)
![Page 19: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/19.jpg)
Splicing ORFs
For eukaryotes, which have interrupted genes, ORFs in different reading frames may be spliced together to generate final product
ORFs from forward and reverse directions cannot be combined
![Page 20: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/20.jpg)
Block Diagram for Search for ORFs
Inputs: Sequence to be searched; Genetic code
Options: Both strands? Ends start/stop?
Process: Search Engine
Output: List of ORF positions
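The block diagram can be sketched as a single function whose parameters mirror the boxes: the sequence, the genetic code (reduced here to its start and stop codons), and the two options. This is a minimal sketch, not the search engine of any particular package. It reads "Ends start/stop?" as "must an ORF begin with a start codon and end with a stop codon?", which is an assumption, and `min_codons` is an extra hypothetical parameter.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def find_orfs(seq, starts=("ATG",), stops=("TAA", "TAG", "TGA"),
              both_strands=True, require_start_stop=True, min_codons=2):
    """Return (strand, frame, start, end) tuples.

    Coordinates are 1-based, inclusive, and relative to the strand scanned.
    With require_start_stop=True an ORF runs from a start codon through the
    next in-frame stop codon. Otherwise any stop-free stretch between stop
    codons (or sequence ends) is reported, excluding the stop itself.
    min_codons should be at least 1.
    """
    seq = seq.upper()
    strands = [("+", seq)]
    if both_strands:
        strands.append(("-", seq.translate(COMPLEMENT)[::-1]))
    orfs = []
    for strand, s in strands:
        for frame in range(3):
            # Index where the current open stretch began, or None if closed.
            open_at = None if require_start_stop else frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if codon in stops:
                    if open_at is not None:
                        end = i + 3 if require_start_stop else i
                        if (end - open_at) // 3 >= min_codons:
                            orfs.append((strand, frame + 1, open_at + 1, end))
                    open_at = None if require_start_stop else i + 3
                elif open_at is None and codon in starts:
                    open_at = i
            # A stretch still open at the end only counts in the relaxed mode.
            if not require_start_stop and open_at is not None:
                end = open_at + (len(s) - open_at) // 3 * 3
                if (end - open_at) // 3 >= min_codons:
                    orfs.append((strand, frame + 1, open_at + 1, end))
    return orfs


print(find_orfs("CCATGAAATAGCC"))
```

Passing a different `starts` or `stops` tuple is how an alternative genetic code would be supplied in this sketch.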
![Page 21: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/21.jpg)
Statistical Approaches
![Page 22: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/22.jpg)
Calculation Windows
Many sequence analyses require calculating some statistic over a long sequence looking for regions where the statistic is unusually high or low
To do this, we define a window size to be the width of the region over which each calculation is to be done
Example: %AT
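A minimal sketch of the %AT example follows. The `step` parameter (how far the window moves each time) is an added assumption; the slide only defines the window size.

```python
def percent_at(seq, window, step=1):
    """Return (start, %AT) pairs for each full window; start is 1-based."""
    seq = seq.upper()
    results = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        at_count = chunk.count("A") + chunk.count("T")
        results.append((start + 1, 100.0 * at_count / window))
    return results


print(percent_at("AATTGGCC", window=4, step=2))
```

Larger windows give smoother curves but blur the boundaries of the regions being sought, so the window size is a trade-off rather than a fixed choice.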
![Page 23: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/23.jpg)
Base Composition Bias
For a protein with a roughly “normal” amino acid composition, the first 2 positions of all codons will be about 50% GC
If an organism has a high GC content overall, the third position of all codons must be mostly GC
Useful for prokaryotes
Not useful for eukaryotes due to large amount of noncoding DNA
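The quantity behind this argument is the GC fraction at each of the three codon positions. The sketch below computes it for a sequence that is assumed to start in frame at codon position 1; the function name is illustrative.

```python
def gc_by_codon_position(coding_seq):
    """Return the GC fraction at codon positions 1, 2 and 3.

    The sequence is assumed to start in frame; trailing bases that do not
    complete a codon are ignored.
    """
    seq = coding_seq.upper()
    usable = len(seq) - len(seq) % 3
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]
        gc = bases.count("G") + bases.count("C")
        fractions.append(gc / len(bases) if bases else 0.0)
    return tuple(fractions)


# Codons ATG GCC AAG: only the third position is consistently G or C.
print(gc_by_codon_position("ATGGCCAAG"))
```

In a GC-rich organism, a candidate frame whose third-position value is much higher than the other two is the pattern the slide describes.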
![Page 24: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/24.jpg)
Fickett’s statistic
Also called TestCode analysis
Looks for asymmetry of base composition
Strong statistical basis for calculations
Method:
  For each window on the sequence, calculate the base composition of nucleotides 1, 4, 7..., then of 2, 5, 8..., and then of 3, 6, 9...
  Calculate statistic from resulting three numbers
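The first step of the method can be sketched as below. The asymmetry measure used here, the largest of the three position counts divided by the smallest plus one, is how the position parameter is commonly described for TestCode. The published method then converts these values into coding probabilities using lookup tables and weights that are not given on the slide and are not reproduced here, so this is only the counting stage.

```python
def fickett_parameters(window_seq):
    """Per-base position asymmetry and content for one window.

    For each base, counts are taken over nucleotides 1, 4, 7..., then
    2, 5, 8..., then 3, 6, 9... The "position" value is
    max(counts) / (min(counts) + 1); the "content" value is the overall
    fraction of that base in the window.
    """
    seq = window_seq.upper()
    length = len(seq)
    params = {}
    for base in "ACGT":
        counts = [seq[offset::3].count(base) for offset in range(3)]
        params[base + "_position"] = max(counts) / (min(counts) + 1)
        params[base + "_content"] = sum(counts) / length if length else 0.0
    return params


# A perfectly periodic toy window: every A falls at positions 1, 4, 7.
print(fickett_parameters("ATGATGATG"))
```

A window with no periodicity gives position values near 1, while strongly periodic base usage, as expected in coding regions, pushes them higher.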
![Page 25: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/25.jpg)
Codon Bias (Codon Preference)
Principle
  Different levels of expression of different tRNAs for a given amino acid lead to pressure on coding regions to “conform” to the preferred codon usage
  Non-coding regions, on the other hand, feel no selective pressure and can drift
![Page 26: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/26.jpg)
Codon Bias (Codon Preference)
Starting point: Table of observed codon frequencies in known genes from a given organism
  best to use highly expressed genes
Method
  Calculate “coding potential” within a moving window for all three reading frames
  Look for ORFs with high scores
![Page 27: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/27.jpg)
Codon Bias (Codon Preference)
Works best for prokaryotes or unicellular eukaryotes because, for multicellular eukaryotes, different pools of tRNA may be expressed at different stages of development and in different tissues
  may have to group genes into sets
Codon bias can also be used to estimate protein expression level
![Page 28: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/28.jpg)
Portion of D. melanogaster codon frequency table
Amino Acid   Codon   Number   Freq/1000   Fraction
Gly          GGG     11       2.60        0.03
Gly          GGA     92       21.74       0.28
Gly          GGT     86       20.33       0.26
Gly          GGC     142      33.56       0.43
Glu          GAG     212      50.11       0.75
Glu          GAA     69       16.31       0.25
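The two derived columns follow from the counts: Freq/1000 is the count per thousand codons in the whole data set, and Fraction is the share among codons for the same amino acid. The sketch below recomputes them. The total of 4231 codons is not stated on the slide; it is inferred here from Number divided by Freq/1000 and should be treated as an assumption.

```python
# Counts copied from the table above.
counts = {
    "Gly": {"GGG": 11, "GGA": 92, "GGT": 86, "GGC": 142},
    "Glu": {"GAG": 212, "GAA": 69},
}
# Assumption: total codons in the data set, inferred from the table
# (for example 11 / 2.60 * 1000), not given in the source.
TOTAL_CODONS = 4231


def codon_usage(counts, total_codons):
    """Return (amino acid, codon, number, freq per 1000, fraction) rows."""
    rows = []
    for amino_acid, codons in counts.items():
        aa_total = sum(codons.values())
        for codon, number in codons.items():
            per_thousand = round(1000 * number / total_codons, 2)
            fraction = round(number / aa_total, 2)
            rows.append((amino_acid, codon, number, per_thousand, fraction))
    return rows


for row in codon_usage(counts, TOTAL_CODONS):
    print(row)
```

The Fraction column is the per-amino-acid normalization, while Freq/1000 divided by 1000 is the normalization over all codons.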
![Page 29: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/29.jpg)
Comparison of Glycine codon frequencies
Codon   E. coli   D. melanogaster
GGG     0.02      0.03
GGA     0.00      0.28
GGT     0.59      0.26
GGC     0.38      0.43
![Page 30: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/30.jpg)
Illustration of Codon Bias Plots
Use Entrez via MacVector to get sequence of lexA
  under “Database” select “Internet Entrez Search”
  Select gene=lexA AND organism=Escherichia
  Pick one (e.g., region from 89.2 to 92.8)
Under “Analyze” select “Codon Preference Plots”
  Choose Escherichia coli codon bias file
  Choose gene region corresponding to lexA
  Click on Staden codon bias and Gribskov codon bias
![Page 31: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/31.jpg)
![Page 32: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/32.jpg)
[Figure: codon preference plots over sequence positions 122400 to 122900, window = 40 codons. Staden Codon Preference panels for frames +1, +2, +3, -1, -2 and -3 (y-axis from -28 to 18); Gribskov Codon Preference panels for frames +1, +2 and +3 (y-axis from 0.70 to 1.30).]
![Page 33: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/33.jpg)
[Figure: codon preference plots over sequence positions 122000 to 123200, window = 40 codons. Staden Codon Preference panels for frames +1, +2, +3, -1, -2 and -3 (y-axis from -22 to 14); Gribskov Codon Preference panels for frames +1, +2 and +3 (y-axis from 0.70 to 1.30).]
![Page 34: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/34.jpg)
Codon Preference Algorithms
The Staden method (from Staden & McLachlan, 1982) uses a codon usage table directly in identifying coding regions. The codon usage table is normalized so that the sum of all 64 codons is 1. The usages for each codon in each reading frame in each window are multiplied together and normalized by the sum of the probabilities in all three frame positions to generate a relative coding probability.
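Written out, if $P_i$ is the product of the table frequencies of the codons read in frame $i$ of the window, the relative coding probability is $P_i / (P_1 + P_2 + P_3)$. The sketch below follows that description under a few assumptions: sums of logarithms replace the raw product to avoid underflow, each frame uses the same number of codons, and codons missing from the table get a small hypothetical `pseudo` frequency. The table in the usage example is a toy, not real codon usage data.

```python
import math


def staden_frame_probabilities(window_seq, codon_freq, pseudo=1e-6):
    """Relative coding probability of the three forward frames of one window.

    codon_freq maps codons to frequencies normalized to sum to 1 over all
    64 codons. Returns [p1, p2, p3], which sum to 1.
    """
    seq = window_seq.upper()
    n_codons = (len(seq) - 2) // 3      # same codon count in every frame
    log_p = []
    for frame in range(3):
        total = 0.0
        for k in range(n_codons):
            i = frame + 3 * k
            total += math.log(codon_freq.get(seq[i:i + 3], pseudo))
        log_p.append(total)
    top = max(log_p)                    # subtract the maximum for stability
    weights = [math.exp(value - top) for value in log_p]
    norm = sum(weights)
    return [w / norm for w in weights]


# Toy table (assumption for illustration only).
toy_table = {"GCT": 0.5, "CTG": 0.25, "TGC": 0.25}
print(staden_frame_probabilities("GCTGCTGCTGC", toy_table))
```

Sliding this calculation along the sequence gives one curve per frame, which is how plots of this kind are usually built.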
![Page 35: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/35.jpg)
Codon Preference Algorithms
The Gribskov method uses a codon usage table normalized so that the sum of the alternatives for each amino acid add to 1. The values for each codon for each reading frame in each window are multiplied together and normalized by the random probability expected for that codon given the mononucleotide frequencies of the target sequence. It is the most commonly used method.
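One plausible reading of this description is sketched below for a single window in a single frame. For each codon, the synonymous fraction from the usage table is divided by the fraction expected at random from the window's base composition, and the ratios are combined as a geometric mean so that a score near 1 means no preference. Two details are assumptions not stated on the slide: the random expectation is renormalized within each synonymous family, and a small `floor` replaces zero table entries. The sequence is assumed to contain only A, C, G and T.

```python
import math
from collections import Counter
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa
               for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def gribskov_score(window_seq, synonymous_fraction, floor=1e-3):
    """Geometric-mean preference score for frame 1 of one window.

    synonymous_fraction maps codons to fractions that sum to 1 within each
    amino acid. Stop codons are skipped.
    """
    seq = window_seq.upper()
    base_freq = {b: n / len(seq) for b, n in Counter(seq).items()}

    def random_prob(codon):
        # Probability of the codon from mononucleotide frequencies alone.
        return math.prod(base_freq.get(b, 0.0) for b in codon)

    family_random = Counter()
    for codon, amino_acid in CODON_TABLE.items():
        family_random[amino_acid] += random_prob(codon)

    log_sum, n = 0.0, 0
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        amino_acid = CODON_TABLE[codon]
        if amino_acid == "*":
            continue
        observed = max(synonymous_fraction.get(codon, 0.0), floor)
        expected = random_prob(codon) / family_random[amino_acid]
        log_sum += math.log(observed / expected)
        n += 1
    return math.exp(log_sum / n) if n else float("nan")


# E. coli glycine fractions from the comparison table earlier in the deck.
ecoli_gly = {"GGG": 0.02, "GGA": 0.00, "GGT": 0.59, "GGC": 0.38}
print(gribskov_score("GGCGGT", ecoli_gly))
```

To score the other frames, pass `window_seq[1:]` or `window_seq[2:]`, or the reverse complement for the reverse frames.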
![Page 36: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/36.jpg)
![Page 37: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/37.jpg)
![Page 38: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/38.jpg)
Plot from syco
![Page 39: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/39.jpg)
Summary
Translation of nucleic acid sequences into hypothetical protein sequences requires a genetic code
Translation can occur in three forward and three reverse reading frames
Open reading frames are regions that can be translated without encountering a stop codon
![Page 40: Computational Biology, Part 8 Protein Coding Regions Robert F. Murphy Copyright 1996-2009. All rights reserved.](https://reader036.fdocuments.in/reader036/viewer/2022070307/551aa580550346e0158b5bad/html5/thumbnails/40.jpg)
Summary
The likelihood that a particular open reading frame is in fact a coding region (actually made into protein) can be estimated using third-codon base composition or codon preference tables
This can be used to scan long sequences for possible coding regions