stem Loops


  • 8/13/2019 stem Loops

    1/24

    12BBI321-Artificial Intelligence

    Seminar

    on

    rRNA Gene Finding: Stem Loops as Signals

    By

    USHA B BIRADAR

    1RV12BBI11, III Semester, M.Tech Bioinformatics,

    Department of Biotechnology

    RVCE, Bangalore-79.

  • 8/13/2019 stem Loops

    2/24

    RNA Sequences

    Composition: A, U, G, C

    RNA sequences

    Coding: mRNA -> proteins

    Non-coding (functional RNAs): rRNA, tRNA, siRNA, ...

    Subclass of functional RNAs: structural RNAs

    November 8, 2013 Dept of Biotechnology[M.TechBioinformatics]


  • 8/13/2019 stem Loops

    3/24

    RNA Secondary Structures

    Virtue: Hydrogen bonds with complementary bases

    Implication: RNA folds upon itself and forms stable secondary structures


  • 8/13/2019 stem Loops

    4/24

    Stem Loops

    Logical expressions [1]

    RNA sequence: n bases

    Indexed: (1, 2, 3, 4, ..., e, i, k, p, t, n) where 1

  • 8/13/2019 stem Loops

    5/24

    Internal structures


    Some of the RNA secondary structure components [1,3]

  • 8/13/2019 stem Loops

    6/24

    Stem Loop Probabilities [1]

    Stem loop: P(stem loop)

    Nucleotides: P(A), P(U), P(C), P(G)

    Allowed pairing: A-U, G-C, G-U

    Example: if the upstream strand is (AAGG), the possible downstream strands, written base-for-base against the upstream strand (3' to 5'), are (UUCC), (UUCU), (UUUC) and (UUUU)
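The enumeration in the example can be checked with a short sketch. It assumes independent bases and uses hypothetical helper names (`downstream_options`, `random_pair_probability`) that do not come from the slides; the pairing table is just the allowed pairs listed above.

```python
from itertools import product

# Partners each base may pair with (Watson-Crick pairs plus the G-U wobble).
PARTNERS = {"A": "U", "U": "AG", "G": "CU", "C": "G"}

def downstream_options(upstream):
    """List every downstream strand that can pair with `upstream`.

    Results are written base-for-base against the upstream strand
    (i.e. 3'->5'), the same notation as the slide example.
    """
    return ["".join(p) for p in product(*(PARTNERS[b] for b in upstream))]

def random_pair_probability(p):
    """Probability that two independently drawn bases form an allowed pair.

    `p` maps each base to its frequency; independence is an assumption.
    """
    return 2 * (p["A"] * p["U"] + p["G"] * p["C"] + p["G"] * p["U"])

print(downstream_options("AAGG"))      # ['UUCC', 'UUCU', 'UUUC', 'UUUU']
uniform = {b: 0.25 for b in "AUGC"}
print(random_pair_probability(uniform))  # 0.375
```

With equal base frequencies, any two positions pair with probability 0.375, so under the same independence assumption a stem of k consecutive pairs arises by chance with probability 0.375^k.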


  • 8/13/2019 stem Loops

    7/24

    Example


  • 8/13/2019 stem Loops

    8/24

    Structural RNA Gene Finders

    Thermodynamic models

    Gibbs free energy (ΔG)

    Le et al. and Chen et al.: Z scores

    Many later models are based on ΔG, G+C content, etc.

    Drawbacks: dependence on sequence length, strong evidence for suboptimal ΔG, computational complexity O(n^3)
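A Z score of this kind is commonly computed by comparing the energy of the native sequence with the energies of shuffled copies of it. The sketch below illustrates only that comparison. `toy_energy` is a made-up stand-in, not a thermodynamic model, and in practice `energy_fn` would be replaced by a real ΔG calculation from a folding package.

```python
import random
import statistics

PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def toy_energy(seq):
    """Stand-in for a real free-energy calculation (NOT a thermodynamic model).

    Folds the sequence in half and scores -1 for every position whose
    mirror partner forms an allowed pair.
    """
    half = len(seq) // 2
    return -sum((seq[i], seq[-1 - i]) in PAIRS for i in range(half))

def z_score(seq, energy_fn=toy_energy, n_shuffles=100, seed=0):
    """Z = (E_native - mean(E_shuffled)) / sd(E_shuffled)."""
    rng = random.Random(seed)
    background = []
    for _ in range(n_shuffles):
        bases = list(seq)
        rng.shuffle(bases)  # keeps base composition, destroys base order
        background.append(energy_fn("".join(bases)))
    sd = statistics.pstdev(background)
    if sd == 0:
        return 0.0  # no spread in the background, so the score is undefined
    return (energy_fn(seq) - statistics.mean(background)) / sd

# A strongly negative value means the native order is more stable than chance.
print(z_score("GGGGGAAAACCCCC"))
```

Each shuffle needs its own energy evaluation, so with an O(n^3) folding step the cost grows quickly, which is one reason the slide lists computational complexity as a drawback.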


  • 8/13/2019 stem Loops

    9/24

    RNAFold


  • 8/13/2019 stem Loops

    10/24


    RNAFold

  • 8/13/2019 stem Loops

    11/24

    Stem Loop Centered Approach

    RNA genes conserve structure before sequence!!

    Stem loops (RNA structures) are analogous to helices and sheets (protein structures)

    Pairing rules

    Directionality: 5' -> 3'

    Stem loops found in genomic regions coding for structural RNAs occur at higher densities and are longer than those in the rest of the genome

    No window partitioning required


  • 8/13/2019 stem Loops

    12/24

    Methods

    DNA -------transcribe-----> RNA

    Valid pairing : A-U, U-A, G-C, C-G, G-U, U-G

    High G-C pair composition favored

    Artificial Intelligence technique

    Search algorithm implemented and extended to complex techniques like HMMs and Neural Networks (for classification purposes)
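The first steps on this slide can be sketched as small helpers. The function names are hypothetical, and `transcribe` assumes the input is the coding (sense) DNA strand, so that transcription amounts to replacing T with U.

```python
# The six valid pairings listed on the slide, as ordered (5' base, 3' base) tuples.
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def transcribe(dna):
    """Coding-strand DNA to RNA: upper-case the input and replace T with U."""
    return dna.upper().replace("T", "U")

def can_pair(x, y):
    """True if bases x and y form one of the valid pairings."""
    return (x, y) in VALID_PAIRS

def gc_fraction(pairs):
    """Fraction of G-C / C-G pairs in a list of (base, base) pairs."""
    if not pairs:
        return 0.0
    gc = sum(1 for p in pairs if set(p) == {"G", "C"})
    return gc / len(pairs)

print(transcribe("GATTACA"))                  # GAUUACA
print(can_pair("G", "U"), can_pair("A", "G"))  # True False
```

`gc_fraction` gives one simple way to express the preference for G-C rich stems, for example as a filter or a score on candidate stems.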


  • 8/13/2019 stem Loops

    13/24

    1. Basic Stem Loop finding algorithm

    Problem:

    Find the positions and number of stem loops in the given RNA sequence

    Given: RNA sequence of length n

    Allowed base pairs

    Other parameters

    Output:

    Each stem loop is stored as a tuple


  • 8/13/2019 stem Loops

    14/24

    Search Algorithm [1]: Finds tetraloops
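The algorithm itself is not reproduced in this transcript. As an illustration only, a minimal search for tetraloop hairpins might look like the sketch below. The tuple layout `(start, end, stem_len)`, the `min_stem` threshold and the function name are assumptions, not the cited authors' definitions.

```python
VALID_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def find_tetraloops(rna, min_stem=3, loop_len=4):
    """Return (start, end, stem_len) tuples with 0-based inclusive coordinates.

    Every possible loop position is tried; the stem is grown outwards from
    the loop for as long as the flanking bases form a valid pair.
    Bulges and internal loops are not handled.
    """
    hits = []
    n = len(rna)
    for loop_start in range(1, n - loop_len):
        left = loop_start - 1          # base just before the loop
        right = loop_start + loop_len  # base just after the loop
        stem = 0
        while left >= 0 and right < n and (rna[left], rna[right]) in VALID_PAIRS:
            stem += 1
            left -= 1
            right += 1
        if stem >= min_stem:
            hits.append((left + 1, right - 1, stem))
    return hits

print(find_tetraloops("GGGCGAAAGCCC"))  # [(0, 11, 4)]
```

Passing a different `loop_len` searches for other hairpin loop sizes. Each loop position can extend at most to the ends of the sequence, which gives a quadratic worst case, while short stems keep the typical cost close to linear.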


  • 8/13/2019 stem Loops

    15/24

    Methods continued

    2. Analysis with random sequences

    3. rRNA stem loop characteristics

    Studies on model organisms: E. coli, Saccharomyces spp., etc.

    Outcomes:

    Size of hairpin loop: 3-20 bases

    Internal loop size: ~7 bases

    Bulges: 1-4 bases

    %GC bp: ~30-40%

  • 8/13/2019 stem Loops

    16/24

    Extended Stem Loop Finding

    Algorithm

    Incorporate stem loops (SL) of different sizes

    Find internal loops and bulges (alignment method, scores assigned)

    COMPLEXITY OF THE ALGORITHM

    Worst case: O(n^2)

    Real case: O(n), since each search is bounded by the length of the stem loop

  • 8/13/2019 stem Loops

    17/24

    Stem Loop Statistics [1]

    bps

    Span

    cSpacing

    fSpacing

    Combined metrics:

    [(cSpacing OR fSpacing) * bps]
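The slide names the metrics without defining them. The sketch below therefore rests on assumed definitions that should be checked against reference [1]: bps is the number of base pairs in the stem, span is the length from the first to the last paired base, cSpacing is the centre-to-centre distance to the preceding stem loop, and fSpacing is the number of bases between the two stem loops.

```python
def stem_loop_metrics(stem_loops):
    """Compute per-stem-loop statistics under the assumed definitions.

    stem_loops: list of (start, end, bps) tuples sorted by start, with
    0-based inclusive coordinates (a hypothetical layout).
    Spacing is measured from the preceding stem loop, so the first
    stem loop has no spacing values and produces no row.
    """
    rows = []
    for (p_start, p_end, _), (start, end, bps) in zip(stem_loops, stem_loops[1:]):
        c_spacing = (start + end) / 2 - (p_start + p_end) / 2  # centre to centre
        f_spacing = start - p_end - 1                          # bases between the two stem loops
        rows.append({
            "bps": bps,
            "span": end - start + 1,
            "cSpacing": c_spacing,
            "fSpacing": f_spacing,
            "cSpacing*bps": c_spacing * bps,  # combined metric, cSpacing variant
            "fSpacing*bps": f_spacing * bps,  # combined metric, fSpacing variant
        })
    return rows

print(stem_loop_metrics([(0, 11, 4), (20, 33, 5)]))
```

The combined metric multiplies one of the two spacing measures by bps, matching the expression on the slide.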


  • 8/13/2019 stem Loops

    18/24

    Statistics

    1. Mean metric domains for a given genomic domain ( )

    2. Hypothesis tests

    Finding rRNA genes

    Calculate all statistical values for the unknown gene.

    Compare against standard with appropriate confidence limits
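One simple way to make this comparison is a confidence interval around the mean of a metric measured in the unknown region. The sketch below uses a normal approximation with z = 1.96 for roughly 95% limits. That choice, the function names and the sample values are all assumptions, and a t-distribution would be more appropriate for small samples.

```python
import math
import statistics

def mean_confidence_interval(values, z=1.96):
    """Return (low, high) limits for the mean, using a normal approximation."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error of the mean
    return m - z * se, m + z * se

def consistent_with_reference(values, reference_mean, z=1.96):
    """True if the reference mean lies inside the confidence limits."""
    low, high = mean_confidence_interval(values, z)
    return low <= reference_mean <= high

# Hypothetical bps values measured in an unknown region.
observed = [10, 12, 11, 13, 9, 11, 12, 10]
print(mean_confidence_interval(observed))
print(consistent_with_reference(observed, 11.5))  # True
print(consistent_with_reference(observed, 15))    # False
```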


  • 8/13/2019 stem Loops

    19/24

    Finding rRNA Genes

    Each metric individually used

    Complex methods: apply HMMs or Neural Networks with different metrics as inputs (appropriate weights assigned)

    Improvements: accuracy, efficiency, parameter optimization considerations.


  • 8/13/2019 stem Loops

    20/24

    Neural Network Model

    [Figure: neural network. Input nodes: valid pairing, interior bulge, length, cSpacing, fSpacing. Connections carry weights Wij. Output: rRNA yes/no.]
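The forward pass of such a network can be sketched as follows. One hidden layer, sigmoid activations, no bias terms, and all weight and feature values here are illustrative assumptions; the slide does not specify the architecture beyond its inputs, weights and output.

```python
import math

# Input features in the order shown in the figure.
FEATURES = ["valid_pairing", "interior_bulge", "length", "cSpacing", "fSpacing"]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, w_hidden, w_out):
    """Forward pass: inputs -> one hidden layer -> single output in (0, 1).

    w_hidden holds one weight row per hidden node; w_out holds one
    weight per hidden node. Bias terms are omitted for brevity.
    """
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    return sigmoid(sum(w * h for w, h in zip(w_out, hidden)))

def is_rrna(x, w_hidden, w_out, threshold=0.5):
    """Map the network output to the yes/no decision."""
    return forward(x, w_hidden, w_out) >= threshold

# Hypothetical, untrained weights and a made-up (already scaled) feature vector.
x = [1.0, 0.0, 0.6, 0.3, 0.2]
w_hidden = [[0.5, -0.4, 0.8, 0.1, 0.1], [0.2, -0.9, 0.3, 0.4, 0.0]]
w_out = [1.2, 0.7]
print(is_rrna(x, w_hidden, w_out))
```

In a real classifier the weights would be learned from stem loop metrics of known rRNA and non-rRNA regions rather than set by hand.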

  • 8/13/2019 stem Loops

    21/24

    Tools

    Vienna RNA package

    MFOLD

    SPECIFIC rRNA tools:

    SILVA rRNA database project (Max Planck Institute for Marine Microbiology, Bremen, Germany): provides comprehensive, quality-checked and regularly updated datasets of aligned small (16S/18S, SSU) and large subunit (23S/28S, LSU) ribosomal RNA (rRNA) sequences for all three domains of life (Bacteria, Archaea and Eukarya).

    RNAmmer 1.2: predicts 5s/8s, 16s/18s, and 23s/28s ribosomal RNA in full genome sequences.

    The Ribosomal Database Project (RDP; Michigan State University Centre for Microbial Ecology, U.S.A.): provides ribosome-related data and services, including online data analysis and aligned and annotated Bacterial and Archaeal small-subunit 16S rRNA sequences.

    Ridom: ribosomal RNA analysis for clinically relevant bacteria (University of Würzburg, Germany).

    RIFLE (Universität Bielefeld, Germany): compares restriction patterns of 16S rDNA amplicons against a database of theoretical restriction patterns generated from a 16S rDNA database.


  • 8/13/2019 stem Loops

    22/24

    CASE STUDIES

    Sanjuán, Rafael, and Antonio V. Bordería (2011). "Interplay between RNA structure and protein evolution in HIV-1." Molecular Biology and Evolution 28.4: 1333-1338.

    Yu, Chien-Hung, et al. (2011). "Stem-loop structures can effectively substitute for an RNA pseudoknot in -1 ribosomal frameshifting." Nucleic Acids Research 39.20: 8952-8959.


  • 8/13/2019 stem Loops

    23/24

    References

    1. Kirt M. Noel, Kay C. Wiese (2008). Considering Stem-Loops as Sequence Signals for Finding Ribosomal RNA Genes. Computational Intelligence in Biomedicine and Bioinformatics, Studies in Computational Intelligence, Volume 151.

    2. Ronny Lorenz et al. (2011). ViennaRNA Package 2.0. Algorithms Mol Biol., vol. 6: 26.

    3. Yoon, Byung-Jun (2008). Effective annotation of noncoding RNA families using profile context-sensitive HMMs. Communications, Control and Signal Processing, 2008. ISCCSP 2008. 3rd International Symposium on. IEEE, 2008.

    4. Yoon, Byung-Jun (2009). Hidden Markov models and their applications in biological sequence analysis. Current Genomics 10.6: 402.

    5. Carter, Richard J., Inna Dubchak, and Stephen R. Holbrook (2001). A computational approach to identify genes for functional RNAs in genomic sequences. Nucleic Acids Research 29.19: 3928-3938.


  • 8/13/2019 stem Loops

    24/24

    THANK

    YOU!!