RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf ·...
![Page 1: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/1.jpg)
RNA secondary structure
• Functions
• Representations
• Predictions
Many slides courtesy of M. Zuker, RPI Math
Sequence Analysis '16 -- Lecture 13
![Page 2: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/2.jpg)
When RNA secondary structure matters
mRNA --> protein
ssRNA
protein
Strong secondary structure can block translation.
![Page 3: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/3.jpg)
RBS
Registry of Standard Biological Parts: http://parts.igem.org/
[Figure: mRNA drawn 5' to 3', with the ribosome binding site upstream of the AUG start codon (fMet) and base-paired to the 16S rRNA.]
especially sensitive is the...
species specific!
Anderson RBS family -- bacterial.
![Page 4: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/4.jpg)
microRNA (miR)
Found in 3' UTRs, introns, and exons.
[Figure: microRNA processing, from nucleus to cytoplasm. Labels: transcription, folding; primary microRNA; endonuclease digestion; microRNA duplex; Dicer; passenger strand degraded; Argonaute proteins; microRNA; blocks translation, deadenylation.]
![Page 5: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/5.jpg)
Ambiguous bases
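The ambiguity codes can be written down as a small lookup table. This is a minimal sketch using the standard IUPAC nucleotide codes in their RNA form (U instead of T); the helper names `matches` and `expand` are hypothetical, not from the lecture.

```python
from itertools import product

# IUPAC nucleotide ambiguity codes, RNA alphabet (U in place of T).
IUPAC = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG",
    "N": "ACGU",
}

def matches(code, base):
    """True if the concrete base is one of those allowed by the IUPAC code."""
    return base in IUPAC[code]

def expand(pattern):
    """List every concrete sequence that an ambiguous pattern can stand for."""
    return ["".join(p) for p in product(*(IUPAC[c] for c in pattern))]
```

For example, `expand("NNAUG")` lists the 16 concrete sequences covered by a pattern with two fully ambiguous positions.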
![Page 6: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/6.jpg)
RNA secondary structure is base pairing (i•j)
Rules for normal SS
![Page 7: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/7.jpg)
Not just Watson-Crick...
Different ways of base-pairing allow RNA to adopt duplex structures beyond the A and B helices.
If any base-pair is possible, how do we predict pairings?
![Page 8: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/8.jpg)
Structure types within RNA sec struct
[Figure: 2D plot of a secondary structure with loops labeled E (exterior), H (hairpin), I (interior), B (bulge), and M (multi-branch).]
![Page 9: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/9.jpg)
[Figure: secondary structure diagram with loops labeled H, I, B, M, and E.]
Note: no base-pair lines cross.
![Page 10: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/10.jpg)
Converting from 2D plot sec struct to circle/tree.
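One way to make the conversion concrete is to reduce the structure to a list of base pairs (i, j); each pair is then one arc in the circle plot. The sketch below assumes the structure is given in dot-bracket notation, a common text encoding that is not part of these slides, and uses 0-based positions; the function name is hypothetical.

```python
def dot_bracket_to_pairs(structure):
    """Convert a dot-bracket string into a sorted list of (i, j) base pairs."""
    stack = []   # positions of '(' still waiting for a partner
    pairs = []
    for pos, ch in enumerate(structure):
        if ch == "(":
            stack.append(pos)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % pos)
            pairs.append((stack.pop(), pos))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)
```

Because a stack is used, nested helices come out as nested pairs, which is also the parent/child relation needed for the tree view.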
![Page 11: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/11.jpg)
Pseudoknots
[Figure: a pseudoknotted sequence drawn as a 2D plot and as a circle plot.]
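In a circle plot, a pseudoknot shows up as two base-pair lines that cross. A minimal sketch of that test, assuming the structure is given as a list of (i, j) position pairs (the function name is hypothetical):

```python
def crossing_pairs(pairs):
    """Return every couple of base pairs (i, j), (k, l) with i < k < j < l."""
    ordered = sorted((min(p), max(p)) for p in pairs)
    crossings = []
    for a in range(len(ordered)):
        i, j = ordered[a]
        for b in range(a + 1, len(ordered)):
            k, l = ordered[b]
            # The second pair opens inside the first one and closes outside it.
            if i < k < j < l:
                crossings.append(((i, j), (k, l)))
    return crossings
```

An empty result means the structure is nested (no lines cross); any returned couple marks a pseudoknot.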
![Page 12: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/12.jpg)
![Page 13: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/13.jpg)
![Page 14: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/14.jpg)
![Page 15: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/15.jpg)
![Page 16: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/16.jpg)
Prediction of RNA secondary structure
Free energy calculations
• Dot plot
  - Easy.
  - Can be done on a single sequence.
  - Cannot find non-canonical base pairs.
Comparative methods, phylogenetics
• Mutual information (MI)
  - Requires a deep multiple sequence alignment.
  - Can find non-canonical base pairs.
![Page 17: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/17.jpg)
Comparative modeling: assume conserved structure between homologs.
![Page 18: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/18.jpg)
RNA structure by energy minimization
Forward summation of energy matrix E:
Assumes energy is the sum of base pairs.
e(i,j) = 0 if j - i < 4
Cases in the recursion:
• add to loop (two cases)
• start a helix, or add a base pair
• join helices in multi-loop
Everywhere is a potential hairpin.
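A recursion consistent with these annotations, assuming the usual Nussinov/Zuker-style form (the exact formula on the slide may differ), would be:

```latex
E_{i,j} = \min
\begin{cases}
E_{i+1,j} & \text{add to loop} \\
E_{i,j-1} & \text{add to loop} \\
E_{i+1,j-1} + e(i,j) & \text{start a helix, or add a base pair} \\
\min\limits_{i<k<j-1} \left( E_{i,k} + E_{k+1,j} \right) & \text{join helices in multi-loop}
\end{cases}
```

Here $e(i,j)$ is the energy of pairing bases $i$ and $j$ (negative for a favorable pair, and 0 when $j - i < 4$), and $E_{i,j}$ is the lowest energy of any structure on the segment from $i$ to $j$.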
![Page 19: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/19.jpg)
RNA structure by energy minimization
1. The energy matrix is initialized starting from the diagonal.
2. Base-pairing is found by tracing back from (1,n).
Traceback step: find k such that $E_{i,j} = E_{i,k} + E_{k+1,j}$. Push (i,k) and (k+1,j) onto Stack B. Stop with an error if no such k exists.
Stack A becomes the answer: a list of base pairs. Stack B is a list of unfinished segments.
[Figure: energy matrix]
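The fill and traceback steps can be sketched in a few lines. This is a simplified illustration, not the lecture's implementation: the pair energies (-3, -2, -1) are made-up integer values, the recursion is the Nussinov/Zuker-style form assumed from the slide annotations, and indices are 0-based, so the traceback starts from (0, n-1) rather than (1, n).

```python
# Hypothetical pair energies; real methods use measured stacking and loop energies.
PAIR_ENERGY = {("G", "C"): -3, ("C", "G"): -3,
               ("A", "U"): -2, ("U", "A"): -2,
               ("G", "U"): -1, ("U", "G"): -1}

def e(seq, i, j):
    """Pair energy; 0 if the bases are too close (j - i < 4) or cannot pair."""
    if j - i < 4:
        return 0
    return PAIR_ENERGY.get((seq[i], seq[j]), 0)

def fill(seq):
    """Fill the energy matrix outward from the diagonal (short segments first)."""
    n = len(seq)
    E = [[0] * n for _ in range(n)]
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = min(E[i + 1][j],                     # add i to a loop
                       E[i][j - 1],                     # add j to a loop
                       E[i + 1][j - 1] + e(seq, i, j))  # pair i with j
            for k in range(i + 1, j - 1):               # join two helices
                best = min(best, E[i][k] + E[k + 1][j])
            E[i][j] = best
    return E

def traceback(seq, E):
    """Recover base pairs: stack A is the answer, stack B holds open segments."""
    stack_a = []
    stack_b = [(0, len(seq) - 1)]
    while stack_b:
        i, j = stack_b.pop()
        if i >= j:
            continue
        if E[i][j] == E[i + 1][j]:
            stack_b.append((i + 1, j))
        elif E[i][j] == E[i][j - 1]:
            stack_b.append((i, j - 1))
        elif e(seq, i, j) < 0 and E[i][j] == E[i + 1][j - 1] + e(seq, i, j):
            stack_a.append((i, j))
            stack_b.append((i + 1, j - 1))
        else:
            for k in range(i + 1, j - 1):
                if E[i][j] == E[i][k] + E[k + 1][j]:
                    stack_b.append((i, k))
                    stack_b.append((k + 1, j))
                    break
            else:
                raise ValueError("no k splits segment (%d, %d)" % (i, j))
    return sorted(stack_a)
```

Integer energies are used so that the equality tests in the traceback are exact; with floating-point energies a tolerance would be needed.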
![Page 20: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/20.jpg)
RNA structure by Dot Plot
![Page 21: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/21.jpg)
RNA structure by Dot Plot
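A dot plot for structure prediction can be sketched as a matrix with a dot wherever base i could pair with base j; a helix then appears as a run of dots along an anti-diagonal (i increasing while j decreases). The sketch below assumes that Watson-Crick pairs plus the G-U wobble count as canonical and that a pair needs j - i > 3; both are assumptions, as are the function names.

```python
# Pairs treated as canonical here; including G-U wobble is an assumption.
CANONICAL = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"),
             ("G", "U"), ("U", "G")}

def dot_plot(seq, min_loop=3):
    """dots[i][j] is True if i and j can pair with more than min_loop bases apart."""
    n = len(seq)
    return [[(j - i > min_loop) and ((seq[i], seq[j]) in CANONICAL)
             for j in range(n)] for i in range(n)]

def stems(seq, min_len=3):
    """Find anti-diagonal runs of dots: (i, j, length) with pairs (i+t, j-t)."""
    dots = dot_plot(seq)
    n = len(seq)
    found = []
    for i in range(n):
        for j in range(n):
            if not dots[i][j]:
                continue
            # Skip dots that continue a run started further out.
            if i > 0 and j < n - 1 and dots[i - 1][j + 1]:
                continue
            length = 0
            while i + length < n and j - length >= 0 and dots[i + length][j - length]:
                length += 1
            if length >= min_len:
                found.append((i, j, length))
    return found
```

This uses a single sequence only, and by construction it can never report a pair outside the `CANONICAL` set.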
![Page 22: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/22.jpg)
How-to: RNA structure by MI
1) Run BLAST search to get homologs.
2) Prune sequences to remove redundancy.*
3) Prune columns to remove uninformative data (conserved positions tell you nothing).
4) Calculate mutual information (M(i,j)) for all pairs of positions (i,j).
[Figure: plot with both axes labeled "position".]
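The two pruning steps can be sketched as follows. This is a simplified illustration: redundancy is reduced here only by dropping exact duplicate sequences (a real pipeline would more likely use an identity threshold), and a column counts as uninformative only if it is perfectly conserved. The function names are hypothetical, and the MSA is assumed to be a list of equal-length strings.

```python
def prune_redundant(msa):
    """Drop exact duplicate sequences, keeping the first copy of each."""
    seen = set()
    kept = []
    for s in msa:
        if s not in seen:
            seen.add(s)
            kept.append(s)
    return kept

def prune_conserved_columns(msa):
    """Drop columns where every sequence has the same character.

    Returns the kept column indices and the reduced alignment.
    """
    if not msa:
        return [], []
    keep = [i for i in range(len(msa[0])) if len({s[i] for s in msa}) > 1]
    return keep, ["".join(s[i] for i in keep) for s in msa]
```

Returning the kept column indices makes it possible to map high-MI column pairs back to positions in the original alignment.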
![Page 23: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/23.jpg)
Mutual information, in general
A measure of the surprisingness of a pair of events.

$$M = \sum_{a,b \in \{\text{events}\}} f(a,b)\,\log_2\!\left(\frac{f(a,b)}{f(a)\,f(b)}\right)$$

• The sum is over all event pair types.
• f(a,b): frequency of a pair of events observed together = N(a,b events)/N(total events).
• f(a)f(b): the expected frequency of a,b events together is the product of the frequencies of the events separately.
• M is in bits, because we used log2.
![Page 24: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/24.jpg)
Mutual information for base-pairs
A measure of the surprisingness of the evolution of two positions in the sequence.

$$M(i,j) = \sum_{B_1,B_2 \in \{A,C,G,T\}} f_{i,j}(B_1,B_2)\,\log_2\!\left(\frac{f_{i,j}(B_1,B_2)}{f_i(B_1)\,f_j(B_2)}\right)$$

• (i,j): pair of sequence positions.
• The sum is over all base-pair types.
• f_{i,j}(B1,B2): frequency of base-pair (B1,B2) at positions (i,j) = N(B1,B2)/N(total sequences).
• f_i(B1) f_j(B2): expected frequency of bases B1, B2 together.

Exercise: Calculate M(i,j) = ___________
[Figure: example alignment for the exercise, with axes labeled "species" and "position", numbered 1 to 12.]
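The calculation of M(i,j) for two alignment columns can be sketched directly from the formula. This is a minimal sketch, assuming the alignment is a list of equal-length strings with 0-based columns; the function names are hypothetical. Pairs that are never observed have f = 0 and are skipped, since they contribute nothing to the sum.

```python
from collections import Counter
from math import log2

def mutual_information(col_i, col_j):
    """M(i,j) in bits for two alignment columns given as equal-length sequences."""
    n = len(col_i)
    pair_counts = Counter(zip(col_i, col_j))
    count_i = Counter(col_i)
    count_j = Counter(col_j)
    mi = 0.0
    for (b1, b2), c in pair_counts.items():
        f_pair = c / n                                     # observed frequency
        f_expected = (count_i[b1] / n) * (count_j[b2] / n)  # product of marginals
        mi += f_pair * log2(f_pair / f_expected)
    return mi

def mi_matrix(msa):
    """M(i,j) for every pair of columns i < j, as a dict keyed by (i, j)."""
    length = len(msa[0])
    cols = [[s[i] for s in msa] for i in range(length)]
    return {(i, j): mutual_information(cols[i], cols[j])
            for i in range(length) for j in range(i + 1, length)}
```

With the toy alignment `["GAC", "CAG", "AAU", "UAA"]`, columns 0 and 2 covary perfectly and give 2 bits, while the conserved middle column gives 0 bits with either neighbor, which is why conserved positions tell you nothing.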
![Page 25: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/25.jpg)
Can you find pairs of positions with high MI?
W-C non-canonical
![Page 26: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/26.jpg)
Mutual information matrix for 20 aligned sequences
Very noisy.
Take-home message: you need lots of sequences to do mutual information analysis. And this is for RNA, where the signal is strong. Try protein: you'll need thousands of sequences.
![Page 27: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/27.jpg)
Same RNA, M(i,j) for 302 aligned sequences
![Page 28: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/28.jpg)
Helices now appear as straight lines of dots, after re-numbering to remove gaps in the MSA.
Corresponding structure.
[Figure: structure with loops labeled H, I, M, I, H.]
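The re-numbering step can be sketched as a map from alignment columns to positions in one ungapped sequence. This is a minimal sketch, assuming gaps are written as "-" and that one row of the MSA is chosen as the reference; the function name is hypothetical.

```python
def ungapped_index(aligned_ref, gap="-"):
    """Map each non-gap alignment column to its 0-based position in the ungapped sequence."""
    mapping = {}
    pos = 0
    for col, ch in enumerate(aligned_ref):
        if ch != gap:
            mapping[col] = pos
            pos += 1
    return mapping
```

A high-MI column pair (i, j) is then plotted at (mapping[i], mapping[j]); column pairs that fall on a gap in the reference have no entry and are dropped.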
![Page 29: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/29.jpg)
Prediction of RNA secondary structure
Free energy calculations
• Dot plot
  - Easy.
  - Can be done on a single sequence.
  - Cannot find non-canonical base pairs.
Comparative methods, phylogenetics
• Mutual information (MI)
  - Requires a deep multiple sequence alignment.
  - Can find non-canonical base pairs.
![Page 30: RNA secondary structure - bioinfo.rpi.edu › bystrc › courses › biol4540 › lecture13.pdf · 6. What is expressed in a dot plot? 7. What is expressed in a circle plot? 8. What](https://reader036.fdocuments.in/reader036/viewer/2022070802/5f02ecf57e708231d406b115/html5/thumbnails/30.jpg)
Review questions
1. What are the IUPAC codes?
2. What are the different types of RNA structure?
3. What is mutual information? How is it calculated?
4. What are the sources of energy for RNA structure?
5. What algorithm is used to calculate the energy over all RNA structures?
6. What is expressed in a dot plot?
7. What is expressed in a circle plot?
8. What is a pseudoknot?
9. What is a non-canonical base pair?
10. Can you convert a dot plot into a graph?
11. Can you convert a graph into a circle plot?
12. Can you see a pseudoknot in a graph, circle plot, or dot plot?