MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.
![Page 1: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/1.jpg)
microRNA Prediction with SCFG and MFE Structure Annotation

Tim Shaw, Ying Zheng, and Bram Sebastian
![Page 2: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/2.jpg)
Goal of the Presentation

Introduction to miRNA
Survey of computational and experimental approaches to identify microRNA
CYK Algorithm
Our Methodology
Result/Discussion
Future Direction
![Page 3: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/3.jpg)
Computers vs Genetics
![Page 4: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/4.jpg)
Background on microRNA and its Classical Definition

Found in eukaryotes (706 identified in human)
Genome-encoded stem-loop precursor
Generally processed by a Dicer and a helicase
Mature microRNA is approximately 22 nucleotides (nt)
Recognizes target mRNA by base-pairing
◦ Acts primarily as a gene silencer
◦ Some cases of gene enhancing
![Page 5: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/5.jpg)
Diagram for miRNA
![Page 6: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/6.jpg)
Problems with miRNA Hunting through Lab Experiments

Biology = network of cause and effect
miRNA expression might be tied to certain environmental triggers
Hard to detect expression of certain microRNA sequences
Some miRNA may have physical properties that make them hard to clone, including sequence composition or post-transcriptional modification
![Page 7: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/7.jpg)
Problems with miRNA Hunting through Computational Approaches

Stem-loop structures are common in eukaryotes
Eukaryotic genomes are long, and most computational approaches are not practical for scanning through an entire genome
![Page 8: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/8.jpg)
Computation-Driven Approaches

Structure information (thermodynamics)
◦ RNAz
Homology: conservation of structure (ERPIN, MiRscan, srnaloop)
◦ Stem
◦ Loop
◦ Target sequence
Machine learning (miRFinder, microPred)
◦ Feature selection based on sequence and structural information
![Page 9: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/9.jpg)
Tests Done on Those Methodologies

ERPIN (2001) (homology)
◦ Good results, but very limited by the availability of training data. Capable of detecting only 66 of the 706 miRNA; if we remove the human training sequences, it detects only 36 miRNA.
miRFinder (2007) (ab initio)
◦ Human
  Specificity: 84.46% (1320 of 8494 pseudo-miRNA misclassified)
  Sensitivity: 84.84% (599 of 706 miRNA detected)
◦ Mouse
  Specificity: 82.78% (1759 of 10213 pseudo-miRNA misclassified)
  Sensitivity: 82.27% (450 of 547 miRNA detected)
microPred (2009) (ab initio)
◦ Found a bug and reported it to the author; it is currently being fixed.
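
The miRFinder percentages can be reproduced from the counts on this slide if the specificity fraction is read as the share of pseudo-miRNA that were wrongly called miRNA (an interpretation, since the slide does not label the numerator). A minimal sketch under that assumption; the function names are illustrative:

```python
def sensitivity(true_positives, positives):
    """Fraction of real miRNA that were detected."""
    return true_positives / positives


def specificity(false_positives, negatives):
    """Fraction of pseudo-miRNA that were correctly rejected.

    Assumes the slide's numerator (e.g. 1320) counts misclassified
    negatives, which is the reading that reproduces 84.46%.
    """
    return 1 - false_positives / negatives


# Counts taken from the slide.
human_spec = 100 * specificity(1320, 8494)
human_sens = 100 * sensitivity(599, 706)
mouse_spec = 100 * specificity(1759, 10213)
mouse_sens = 100 * sensitivity(450, 547)

print(f"human: specificity {human_spec:.2f}%, sensitivity {human_sens:.2f}%")
print(f"mouse: specificity {mouse_spec:.2f}%, sensitivity {mouse_sens:.2f}%")
```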
![Page 10: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/10.jpg)
Negative Set Generation

Sequences were obtained from the CDS regions of the genome
◦ Implementation of a CDS extractor for ccdsgenes.txt files from the UCSC Genome Browser
CDS means coding region
◦ (Sequence that codes for protein)
Need to implement a new parser based on the cds.txt from the UCSC Genome Browser
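
A CDS extractor of this kind could look roughly like the sketch below. It is not the authors' parser: it assumes a tab-separated genePred-style row with a leading bin column (bin, name, chrom, strand, txStart, txEnd, cdsStart, cdsEnd, exonCount, exonStarts, exonEnds, ...) and 0-based, half-open coordinates, and the function names and the toy row are made up for illustration.

```python
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def parse_cds_blocks(line):
    """Return (name, chrom, strand, blocks) for one genePred-style row.

    Column layout is an assumption (see the lead-in); blocks are the
    exon intervals clipped to the cdsStart..cdsEnd range.
    """
    fields = line.rstrip("\n").split("\t")
    name, chrom, strand = fields[1], fields[2], fields[3]
    cds_start, cds_end = int(fields[6]), int(fields[7])
    starts = [int(x) for x in fields[9].split(",") if x]
    ends = [int(x) for x in fields[10].split(",") if x]
    blocks = []
    for start, end in zip(starts, ends):
        start, end = max(start, cds_start), min(end, cds_end)
        if start < end:  # skip exons that lie entirely in the UTR
            blocks.append((start, end))
    return name, chrom, strand, blocks


def extract_cds(chrom_seq, blocks, strand):
    """Concatenate the coding blocks; reverse-complement on the minus strand."""
    seq = "".join(chrom_seq[start:end] for start, end in blocks)
    if strand == "-":
        seq = seq.translate(COMPLEMENT)[::-1]
    return seq


# Toy row with a hypothetical identifier, not a real CCDS entry.
row = "585\tTOY0001.1\tchr1\t+\t100\t500\t150\t450\t3\t100,250,400,\t200,300,500,\t0"
print(parse_cds_blocks(row))
```

The extracted coding sequences can then be cut into windows to serve as pseudo-miRNA negatives.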
![Page 11: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/11.jpg)
Positive Set

Downloaded from miRBase: 706 human and 547 mouse miRNA sequences
![Page 12: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/12.jpg)
Algorithms for SCFG

CYK algorithm
◦ Calculates the optimal alignment of a sequence to an SCFG with ambiguity
Inside algorithm
◦ Calculates the probability of a sequence given an SCFG
Inside-outside algorithm
◦ Estimates optimal probability parameters for an SCFG given a set of example sequences
![Page 13: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/13.jpg)
Advantages of CYK

A relatively fast algorithm, O(n^3); if we take advantage of the dynamic programming table, we can scan through the sequence in O(n^2)
We can quickly compute multiple windows at the same time
It is able to force an RNA to fold into a specific structure that we specify
![Page 14: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/14.jpg)
Introduction to the Modified CYK Algorithm

Given X = x1…xn and an SCFG G,
◦ Find the optimal parse of X
◦ Dynamic programming
γ(i, j, V): likelihood of the most likely parse of xi…xj, rooted at nonterminal V
![Page 15: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/15.jpg)
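Stochastic CYK

Initialization:

$$\gamma(i, i-1) = \log P(\varepsilon)$$

Iteration: for i = N down to 1, for j = i to N (so that every γ(i+1, ·) is already filled):

$$
\gamma(i, j) = \max
\begin{cases}
\gamma(i+1, j-1) + \log P(x_i\, S\, x_j) \\
\gamma(i, j-1) + \log P(S\, x_j) \\
\gamma(i+1, j) + \log P(x_i\, S) \\
\max_{i < k < j} \; \gamma(i, k) + \gamma(k+1, j) + \log P(S\, S)
\end{cases}
$$

The first case applies only when j > i.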
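
The recurrence on this slide can be sketched in a few lines of Python. This is an illustration, not the authors' implementation: every probability below is a made-up placeholder, the second case is read as the right emission S → S x_j, and the table is filled with i descending so that row i+1 is available when row i is computed. The `window_scores` helper shows how one shared table yields the score of every window.

```python
import math

NEG_INF = float("-inf")

# Hypothetical rule probabilities, chosen only for illustration.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
LOG_PAIR = math.log(0.12)      # S -> x_i S x_j, canonical or wobble pair
LOG_MISPAIR = math.log(1e-5)   # S -> x_i S x_j, any other pair
LOG_LEFT = math.log(0.02)      # S -> x_i S
LOG_RIGHT = math.log(0.02)     # S -> S x_j
LOG_BIF = math.log(0.01)       # S -> S S
LOG_END = math.log(0.05)       # S -> epsilon


def cyk_table(seq):
    """Fill gamma[i][j] (1-based, inclusive) for every span of seq."""
    n = len(seq)
    gamma = [[NEG_INF] * (n + 1) for _ in range(n + 2)]
    for i in range(1, n + 2):
        gamma[i][i - 1] = LOG_END          # empty span
    for i in range(n, 0, -1):              # descending i: row i+1 is ready
        for j in range(i, n + 1):
            best = max(gamma[i][j - 1] + LOG_RIGHT,
                       gamma[i + 1][j] + LOG_LEFT)
            if j > i:                      # pair emission needs two bases
                pair = (seq[i - 1], seq[j - 1])
                log_p = LOG_PAIR if pair in CANONICAL else LOG_MISPAIR
                best = max(best, gamma[i + 1][j - 1] + log_p)
            for k in range(i + 1, j):      # bifurcation, i < k < j
                best = max(best, gamma[i][k] + gamma[k + 1][j] + LOG_BIF)
            gamma[i][j] = best
    return gamma


def cyk_score(seq):
    """Log-likelihood of the best parse of the whole sequence."""
    return cyk_table(seq)[1][len(seq)]


def window_scores(seq, width):
    """Score every window of the given width from one shared table."""
    gamma = cyk_table(seq)
    return [gamma[i][i + width - 1] for i in range(1, len(seq) - width + 2)]


print(cyk_score("GGGGAAAACCCC"), cyk_score("A" * 12))
```

The inner loop over k is what makes the fill cubic in the sequence length; reading window scores out of the finished table costs only a lookup per window.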
![Page 16: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/16.jpg)
Weight Estimation of Each Non-terminal Emission

57 let-7 miRNA sequences obtained from Rfam
Used R-Coffee to estimate the lengths of the hairpin loop, stem, and bulge
The parameters that we estimated seem to work well for the majority of the microRNA cases
![Page 17: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/17.jpg)
Result for CYK

Insert Plot Here
![Page 18: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/18.jpg)
RNAfold

Most commonly used tool for predicting RNA secondary structure
All the ab initio approaches and hairpin-loop finders currently use RNAfold to estimate a microRNA's structure and its MFE
We use RNAfold's MFE as a measuring stick and use some of its structural features to assist our routine
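
To use RNAfold's MFE and structure in a downstream routine, its text output has to be parsed. The sketch below assumes the usual layout of a sequence line followed by a dot-bracket line ending in the energy in parentheses; the function name is illustrative and the sample text is invented, not a real RNAfold run.

```python
import re

# Dot-bracket structure, whitespace, then "(energy)" possibly padded with spaces.
STRUCT_LINE = re.compile(r"^([().]+)\s+\(\s*(-?\d+(?:\.\d+)?)\)$")


def parse_rnafold_output(text):
    """Return a list of (sequence, structure, mfe) tuples.

    Assumes each record is a sequence line followed by a structure line;
    FASTA header lines starting with '>' are skipped.
    """
    records = []
    sequence = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(">"):
            continue
        match = STRUCT_LINE.match(line)
        if match:
            records.append((sequence, match.group(1), float(match.group(2))))
        else:
            sequence = line
    return records


# Invented sample in the assumed layout; the energy value is a placeholder.
sample = ">toy\nGGGGAAAACCCC\n((((....)))) ( -2.90)\n"
print(parse_rnafold_output(sample))
```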
![Page 19: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/19.jpg)
Result for RNAfold

Insert Plot Here
![Page 20: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/20.jpg)
CYK RNAfold Hybrid

I use the following formula:
[CYK] * 2 + [MFE] = CombinedScore
During the calculation, if RNAfold predicts a structure with two or more hairpin loops, we penalize the CYK score
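
The combination rule can be written down directly. In the sketch below the formula is the one from the slide, but the size and form of the penalty are not given in the source, so `penalty` is a hypothetical parameter that is simply subtracted from the CYK score; hairpin loops are counted as an opening bracket, a run of dots, and a closing bracket in the dot-bracket string.

```python
import re

# An innermost "(....)" in dot-bracket notation closes one hairpin loop.
HAIRPIN = re.compile(r"\(\.+\)")


def count_hairpin_loops(structure):
    """Count hairpin loops in a dot-bracket structure string."""
    return len(HAIRPIN.findall(structure))


def combined_score(cyk, mfe, structure, penalty=5.0):
    """[CYK] * 2 + [MFE], with the CYK score penalized for 2+ hairpins.

    The default penalty is a placeholder, not a value from the slides.
    """
    if count_hairpin_loops(structure) >= 2:
        cyk -= penalty
    return 2 * cyk + mfe


print(combined_score(-20.0, -30.0, "((((....))))"))
print(combined_score(-20.0, -30.0, "((...))..((....))"))
```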
![Page 21: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/21.jpg)
Z score calculation

In order to combine the features of the MFE and the CYK score, we randomly sampled 20,000 sequences from the human genome and calculated the MFE and CYK score of each
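
One way to put the two scores on a common scale is to standardize each against its background sample. The sketch below assumes a plain z score (value minus background mean, divided by the background sample standard deviation); the slides do not spell out the exact formula, and the tiny background list is a stand-in for the 20,000 sampled sequences.

```python
import statistics


def make_z_scorer(background):
    """Return a function that maps a raw score to its z score.

    The mean and standard deviation are computed once from the
    background sample, so scoring many candidates stays cheap.
    """
    mean = statistics.mean(background)
    sd = statistics.stdev(background)

    def z_score(value):
        return (value - mean) / sd

    return z_score


# Stand-in background; in practice this would hold the scores of the
# randomly sampled genomic sequences.
background_mfe = [1.0, 2.0, 3.0, 4.0, 5.0]
mfe_z = make_z_scorer(background_mfe)
print(mfe_z(3.0), mfe_z(5.0))
```

A separate scorer would be built for the CYK scores from the same sampled sequences.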
![Page 22: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/22.jpg)
CYK RNAfold Hybrid Result

Insert Plot Here
![Page 23: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/23.jpg)
Optimized Sensitivity Specificity Comparison

| Method | Human specificity test (8494 pseudo-miRNA) | Human sensitivity test (706 miRNA) | Mouse specificity test (10213 pseudo-miRNA) | Mouse sensitivity test (547 miRNA) |
|---|---|---|---|---|
| MFE | 73.15% | 73.07% | 65.83% | 66.97% |
| CYK | 79.09% | 78.60% | 72.19% | 72.47% |
| CYK-Hybrid | 81.05% | 81.08% | 72.17% | 71.93% |
| miRFinder | 84.46% | 84.84% | 82.78% | 82.27% |
![Page 24: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/24.jpg)
Disadvantage of Our Program

Limited by its structural accuracy
![Page 25: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/25.jpg)
To Do List

Possibly test the accuracy of CYK in terms of its ability to predict the structure of the microRNA
Need to run through the
![Page 26: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/26.jpg)
Summary

We currently have a routine that is capable of identifying microRNA at 82% sensitivity and specificity based solely on structure
Currently communicating with a student from the UK who published microPred to see whether we can use our program to retrain their SVM and get a better result
![Page 27: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/27.jpg)
See Website for More Details

http://128.192.76.177/ProjectUpdate/microRNA.html
http://128.192.76.177/CYK.html for testing out the grammar
![Page 28: MicroRNA Prediction with SCFG and MFE Structure Annotation Tim Shaw, Ying Zheng, and Bram Sebastian.](https://reader034.fdocuments.in/reader034/viewer/2022051201/5a4d1b5a7f8b9ab0599aab3f/html5/thumbnails/28.jpg)
References