17.02.2005 Fast and effective prediction of miRNA targets Marc Rehmsmeier CeBiTec, Bielefeld University, Germany Junior Research Group Bioinformatics of Regulation

17.02.2005

Fast and effective prediction of miRNA targets

Marc Rehmsmeier

CeBiTec, Bielefeld University, Germany

Junior Research Group Bioinformatics of Regulation

Small interfering RNAs versus small temporal RNAs

Hannon. Nature. 418:244-251, 2002.

miRNA/target duplexes

Grosshans and Slack. The Journal of Cell Biology, 156(1):17-21, 2002.

A direct approach

Given a miRNA and a potential target: What are the energetically most favourable binding sites?

Calculation of multiple minimum free energy (mfe) secondary structure duplexes

The language of RNA duplexes

hybrid = nil ><< tt (region,region) |||
         unpaired_left_top |||
         closed ... h

unpaired_left_top = ult <<< tt (base,empty) ~~~ unpaired_left_top |||
                    unpaired_left_bot ... h

unpaired_left_bot = ulb <<< tt (empty,base) ~~~ unpaired_left_bot |||
                    edangle ... h

edangle = eds <<< tt (base,base) ~~~ closed |||
          edt <<< tt (base,emptybase) ~~~ closed |||
          edb <<< tt (emptybase,base) ~~~ closed ... h

The language of RNA duplexes (continued)

closed = stacking_region ||| bulge_top ||| bulge_bottom |||
         internal_loop ||| end_loop ... h

stacking_region = sr <<< basepair ~~~ closed

bulge_top = (bt <<< basepair ~~~ tt (uregion,empty)) `topbound` closed

bulge_bottom = (bb <<< basepair ~~~ tt (empty,uregion)) `botbound` closed

internal_loop = (il <<< basepair ~~~ tt (uregion,uregion)) `symbound` closed

end_loop = el <<< basepair ~~~ tt (region,region)

Dynamic Programming recurrences

Time/memory complexity: linear in target length
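
To make the linear-in-target-length claim concrete, here is a minimal sketch of a duplex search by dynamic programming. It is not the RNAhybrid energy model: the scores (each complementary pair, including G-U, contributes -1.0; each unpaired base inside the duplex costs +0.5) and the function name `toy_duplex_score` are assumptions chosen only for illustration.

```python
# Toy duplex search: NOT the RNAhybrid energy model.
# Assumed scores: a complementary pair (incl. G-U) contributes -1.0,
# an unpaired base inside the duplex costs +0.5.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def toy_duplex_score(mirna, target, pair=-1.0, unpaired=0.5):
    """Return (best_score, end_position_in_target) of the best local duplex.

    The miRNA is read 3'->5' against the target read 5'->3' (antiparallel).
    Each target position costs one pass over the miRNA, so for a fixed
    miRNA the running time is linear in the target length.
    """
    q = mirna[::-1]                      # miRNA in 3'->5' direction
    best, best_end = 0.0, 0
    prev = [0.0] * (len(q) + 1)          # scores for the previous target position
    for j, t in enumerate(target, start=1):
        cur = [0.0] * (len(q) + 1)
        for i, m in enumerate(q, start=1):
            candidates = [0.0,                     # start a new duplex here
                          prev[i] + unpaired,      # unpaired target base
                          cur[i - 1] + unpaired]   # unpaired miRNA base
            if (m, t) in PAIRS:
                candidates.append(prev[i - 1] + pair)
            cur[i] = min(candidates)
            if cur[i] < best:
                best, best_end = cur[i], j
        prev = cur
    return best, best_end
```

For example, `toy_duplex_score("AAAA", "GGGUUUUGGG")` finds the four A-U pairs ending at target position 7. A real implementation would replace the toy scores with nearest-neighbour stacking, bulge, and loop energies.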

let-7/lin-41 binding sites

position: 688, mfe: -28.0 kcal/mol

position: 737, mfe: -29.0 kcal/mol

Requirements

For prediction of miRNA targets in large databases we need:

• A fast program

• Good statistics

Length normalisation of minimum free energies

$e_n = \frac{e}{\log(mn)}$
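
A minimal sketch of this normalisation, assuming that e is the minimum free energy of a duplex and that m and n are the lengths of the two sequences. The function name and the use of the natural logarithm are assumptions; the slide does not state the base.

```python
import math


def normalised_energy(mfe, m, n):
    """Length-normalised minimum free energy: e_n = e / log(m * n).

    Assumes the natural logarithm; m and n are the two sequence lengths.
    """
    if m * n <= 1:
        raise ValueError("m * n must exceed 1 so that log(m * n) is positive")
    return mfe / math.log(m * n)
```

With hypothetical lengths, `normalised_energy(-29.0, 22, 1000)` would normalise an mfe of -29.0 kcal/mol for a 22-nt miRNA against a 1000-nt target.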

p-values of individual binding sites

Poisson statistics of multiple binding sites

Probability of k binding sites:

$P[N = k] = \frac{\lambda^k \exp(-\lambda)}{k!}$

with

$\lambda = E[N]$

For small p-values:

$E[N] \approx p$

The probability of at least k binding sites:

$P[N \ge k] = 1 - \sum_{i=0}^{k-1} P[N = i]$
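
The tail probability can be computed directly from the Poisson formula. The following is a small sketch using only the standard library; the function names are illustrative, and `lam` stands for the expected number of binding sites E[N].

```python
import math


def poisson_pmf(k, lam):
    """P[N = k] for a Poisson distribution with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def prob_at_least(k, lam):
    """P[N >= k] = 1 - sum_{i=0}^{k-1} P[N = i]."""
    return 1.0 - sum(poisson_pmf(i, lam) for i in range(k))
```

For a single site, `prob_at_least(1, lam)` equals `1 - exp(-lam)`, which is close to `lam` itself when `lam` is small.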

Comparative analysis of orthologous targets

Multi-species p-values

Poisson p-values: $p_1, p_2, p_3$

multi-species p-value:

$P[P_1 \le p_1, P_2 \le p_2, P_3 \le p_3] \le (\max\{p_1, \ldots, p_3\})^3$

General case: k species
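
As a sketch of the general case, the multi-species p-value can be taken as the largest of the per-species Poisson p-values raised to the power k. The function name and the optional `k` argument (which defaults to the number of species) are assumptions made for this example.

```python
def multi_species_p_value(poisson_p_values, k=None):
    """Largest per-species p-value raised to the power k.

    k defaults to the number of species; a different (possibly
    non-integer) exponent can be supplied explicitly.
    """
    p_max = max(poisson_p_values)
    if k is None:
        k = len(poisson_p_values)
    return p_max ** k
```

For instance, hypothetical p-values of 0.01, 0.001 and 0.005 in three species give 0.01 cubed, that is about 1e-06.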

A dependence problem

We should see a p-value as often as it says (blue curve), but we don't (red curve).
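
The effect can be reproduced with constructed numbers. The sketch below uses an idealised grid of null p-values, not data from the talk, and compares two species that are independent with two species that always report the same p-value. All names are illustrative.

```python
def observed_frequency(p_values, threshold):
    """Fraction of p-values that are <= threshold."""
    return sum(p <= threshold for p in p_values) / len(p_values)


n = 200
grid = [(i + 0.5) / n for i in range(n)]      # evenly spread null p-values

# Independent species: every combination of two p-values occurs.
independent = [max(a, b) ** 2 for a in grid for b in grid]
# Fully dependent species: both always report the same p-value.
dependent = [max(a, a) ** 2 for a in grid]

t = 0.01
freq_independent = observed_frequency(independent, t)   # matches t
freq_dependent = observed_frequency(dependent, t)       # ten times too frequent
```

With independent species a multi-species p-value of 0.01 is seen in 1% of cases, as it should be. With fully dependent species it is seen in 10% of cases, so squaring the maximum overstates the significance.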

let-7b/NME4 (human/mouse) binding sites

-GGCTCAAGCTGCCCTTACCACCCCATCCCCCACGCAGGACCAACTACCTCCGTCAGCAAGAACCCAAGCCCACATCCAAACCTGCCTGTCCCAAACCAC
GGGCTTGCACTGCCTTCTGCACTTCAGGTCT-ACCCATGACCTACTACCTCTGTCAACAAGAAGTCAAGCCCCCATGC---TTCCCATGTCCCCAAAC--
 ****    ***** *   ***  **   *  ** ** **** ******** **** ******  ******* *** *       * ****** ** *

TTACTTCCCTGTTCACCTCTGCCCCACCCCAGCCCAGAGGAGTTTGAGCCACCAACTTCAGTGCCTTTCTGTACCCCAAGCCAGCACAAGATTGGACCAA
-CACTCCCTACTCCCGCTCTACCCAACTCCAGCCCAGGGGAGTCTAAGCCTCAACTCTATGTGCCTTTTTGTATCCTAAGTCAATACAATATTGGACCAT
  *** **   * *  **** *** ** ********* ***** * **** * *   *  ******** **** ** *** **  **** *********

TCCTTTTTGCACCAAAGTGCCGGACAACCTTTGTGGTGGGGGGGGGTCTTCACATTATCATAACCTCTCCTCTAAAGGGGAGGCATTAAAATTCACTGTG
GTCCTTGTGTACAAAAGTGCCAGACAACCTTTG--------GGGCATTGTCA-AAGGTGACTTCACCTGCCTCAAAGGAGAGACATTAAAATTT--TATG
  * ** ** ** ******** ***********        ***  *  *** *   * *   *  ** *   ***** *** **********   * **

CCCAGCACATGGGTGGTACACTAATTATGACTTCCCCCAGCTCTGAGGTAGAAATGACGCCTTTATGCAAGTTGTAAGGAGTTGAACAGTAAAGAGGAAG
CTTAAAAT--------------------------------------------------------------------------------------------
*  *  *

Multi-species p-value with k = 2: 1.5e-08

Multi-species p-value with k = 1.1: 5.0e-05

k = 1.1 is the effective k

Effective number of orthologous targets

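$k_{\mathrm{eff}} = \operatorname{argmin}_{k'} \sum_{(x,y) \in F} \frac{1}{x} \left( y - x^{k'} \right)^2$

$P[P_1 \le p_1, P_2 \le p_2] \le (\max\{p_1, p_2\})^{k_{\mathrm{eff}}}, \quad 1 \le k_{\mathrm{eff}} \le k$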
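
One way to estimate an effective k is a least-squares fit of the curve y = x^k' to observed points. The sketch below assumes that each point (x, y) pairs a nominal p-value x with the frequency y at which it is actually observed, that squared deviations are weighted by 1/x, and that a simple grid search over k' is accurate enough. These choices and the function name are assumptions, not a description of the RNAhybrid implementation.

```python
def fit_effective_k(points, k_max, steps=1000):
    """Grid search for k' in [1, k_max] minimising sum((y - x**k')**2 / x).

    points: iterable of (x, y) pairs with 0 < x <= 1, where x is a nominal
    p-value and y the frequency with which it is observed (assumed meaning).
    """
    points = list(points)
    best_k, best_err = 1.0, float("inf")
    for s in range(steps + 1):
        k = 1.0 + (k_max - 1.0) * s / steps
        err = sum((y - x ** k) ** 2 / x for x, y in points)
        if err < best_err:
            best_k, best_err = k, err
    return best_k
```

Points that follow y = x exactly (full dependence between two species) yield an effective k of 1, and points that follow y = x squared (independence) yield 2.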

Requirements

For prediction of miRNA targets in large databases we need:

• A fast program

• Good statistics

True and false positives and negatives

             Classified as positives   Classified as negatives
Positives    TP                        FN
Negatives    FP                        TN

Sensitivity and specificity

$\mathrm{Sens} = \frac{TP}{TP + FN}$

$\mathrm{Sel} = 1 - \frac{FP}{TP + FP}$

p-values control specificity (Spec)
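
A short sketch of the two measures computed from counts. It assumes sensitivity is TP / (TP + FN) and selectivity is 1 - FP / (TP + FP); the function names and the counts in the usage note are hypothetical.

```python
def sensitivity(tp, fn):
    """Fraction of true positives that are classified as positives."""
    return tp / (tp + fn)


def selectivity(tp, fp):
    """Fraction of predicted positives that are true: 1 - FP / (TP + FP)."""
    return 1.0 - fp / (tp + fp)
```

With 8 true positives, 2 false negatives and 2 false positives, both measures come out at 0.8.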

Target prediction workflow

target db, miRNA registry -> RNAhybrid -> individual p-values -> Poisson p-values -> multi-species p-values

bantam

target gene   E-value       #sites Dm   #sites Dp   #sites Ag
CG13906       0.000141369   2           1           1
CG3629        0.029351532   2           2           0
CG17136       0.047489474   2           0           1
CG5123        0.048580874   2           2           0
CG13761       0.120263377   0           2           2
CG11624       0.605310610   0           3           0
CG1142        0.677123716   0           0           1
CG13333       0.714171923   2           0           0

Prediction of Drosophila miRNA targets

• 78 miRNAs

• 28,645 3' UTRs (1/3 each from D. melanogaster, D. pseudoobscura and A. gambiae)

Bantam hits

target              E-value   #sites Dm   #sites Dp   #sites Ag
nervous fingers 1   0.00014   2           1           1
Distal-less         0.029     2           2           0
Wrinkled (Hid)      0.049     2           2           0

miR-7 hits

target                         E-value    #sites Dm   #sites Dp   #sites Ag
CG8394                         0.000095   2           3           3
Twin of m4                     0.00014    2           2           0
E(spl) region transcript m3    0.0083     1           1           0
E(spl) region transcript m     0.094      1           2           0
CG7342                         0.21       1           1           0
CG10444                        0.27       1           1           1
Him                            0.30       1           2           0
CG11132                        0.86       1           1           0
Arginine methyltransferase 1   0.87       1           1           0

miR-2 hits

E-value (#sites) per target and miRNA:

reaper: miR-2a 0.00061 (1 1 0), miR-2b 0.11 (1 1 0), miR-2c 0.0095 (1 1 0)

grim: miR-2a 0.014 (1 1 0), miR-2b 0.071 (1 2 0), miR-2c 0.045 (1 1 0)

sickle: 0.054 (2 2 0)

plus a number of others

RNAhybrid functionality

• length normalisation

• Poisson statistics

• web server

• seed/loop constraints

• miRNA specific statistics

• effective k

• comparative analysis

• multiple binding sites

miRNA target selection

• surprise

• rank based

• p-values, E-values

• user guidance

p-values indicate not only biochemical possibility, but also biological function.

Acknowledgements

• Peter Steffen, Robert Giegerich, Jan Krüger

• Matthias Höchsmann

• Alexander Stark, Julius Brennecke, Stephen M. Cohen

• Sven Rahmann

• Gregor Obernosterer

• Robert Heinen

• Leonie Ringrose

References

Rehmsmeier M, Steffen P, Höchsmann M and Giegerich R. Fast and effective prediction of microRNA/target duplexes. RNA, 10:1507-1517, 2004.

bibiserv.techfak.uni-bielefeld.de/rnahybrid