Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens
![Page 1: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/1.jpg)
Statistics on big biomedical data
Methods and pitfalls when analyzing high-throughput screens
Lars Juhl Jensen
![Page 2: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/2.jpg)
![Page 3: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/3.jpg)
t-test
![Page 4: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/4.jpg)
ANOVA
![Page 5: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/5.jpg)
normal distribution
![Page 6: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/6.jpg)
useful tests
![Page 7: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/7.jpg)
counts
![Page 8: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/8.jpg)
contingency table
![Page 9: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/9.jpg)
Jensen et al., Nature Reviews Genetics, 2012
![Page 10: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/10.jpg)
Fisher’s exact test
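For count data arranged in a 2x2 contingency table, the test can be sketched with the standard library alone. This is a minimal illustration, not the speaker's code: the function name `fisher_exact_greater` and the cell layout `[[a, b], [c, d]]` are assumptions, and only the one-sided (enrichment) p-value is computed.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Returns the probability of seeing a count of at least `a` in the
    top-left cell when all row and column totals are held fixed
    (hypergeometric distribution).
    """
    row1 = a + b          # e.g. proteins in the hit list
    col1 = a + c          # e.g. proteins carrying the annotation
    n = a + b + c + d     # all proteins considered
    total = comb(n, col1)
    upper = min(row1, col1)
    # Sum the hypergeometric probabilities of tables at least as extreme.
    tail = sum(comb(row1, k) * comb(n - row1, col1 - k)
               for k in range(a, upper + 1))
    return tail / total


if __name__ == "__main__":
    # Hypothetical counts: 3 of 4 hits annotated, 1 of 4 non-hits annotated.
    print(fisher_exact_greater(3, 1, 1, 3))
```

The arithmetic uses exact integers until the final division, so the result is not affected by rounding even for large tables.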
![Page 11: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/11.jpg)
real numbers
![Page 12: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/12.jpg)
no theoretical distribution
![Page 13: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/13.jpg)
non-parametric statistics
![Page 14: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/14.jpg)
do the medians differ?
![Page 15: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/15.jpg)
Mann–Whitney U test
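A rank-based comparison of two samples might look like the sketch below. It is an illustration under stated assumptions: the name `mann_whitney_u` is made up here, the p-value uses the large-sample normal approximation, and no tie or continuity correction is applied to the variance, so it should not be trusted for very small or heavily tied samples.

```python
from math import erfc, sqrt


def mann_whitney_u(x, y):
    """Return (U statistic for x, approximate two-sided p-value)."""
    n, m = len(x), len(y)
    # U counts the pairs in which the x value exceeds the y value;
    # tied pairs count one half.
    u = sum(1.0 if a > b else 0.5 if a == b else 0.0
            for a in x for b in y)
    mean_u = n * m / 2.0
    sd_u = sqrt(n * m * (n + m + 1) / 12.0)  # assumption: no tie correction
    z = (u - mean_u) / sd_u
    # erfc(|z| / sqrt(2)) is the two-sided tail area of the standard normal.
    p = erfc(abs(z) / sqrt(2.0))
    return u, p


if __name__ == "__main__":
    print(mann_whitney_u([1, 2, 3], [4, 5, 6]))
```

The pairwise count costs O(n·m) time; a rank-sum formulation would be preferable for large screens.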
![Page 16: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/16.jpg)
medians can mislead you
![Page 17: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/17.jpg)
do the distributions differ?
![Page 18: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/18.jpg)
Kolmogorov–Smirnov test
![Page 19: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/19.jpg)
![Page 20: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/20.jpg)
does not tell how they differ
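The two-sample version of the test compares the empirical cumulative distributions and reports their largest vertical gap. The sketch below is an assumption-laden illustration: `ks_two_sample` is a hypothetical name, and the p-value comes from the asymptotic Kolmogorov series, which is only approximate for small samples.

```python
from bisect import bisect_right
from math import exp, sqrt


def ks_two_sample(x, y):
    """Return (D, approximate p-value) for the two-sample KS test."""
    xs, ys = sorted(x), sorted(y)
    n, m = len(xs), len(ys)
    # D is the largest gap between the two empirical CDFs, checked at
    # every observed value.
    d = max(abs(bisect_right(xs, v) / n - bisect_right(ys, v) / m)
            for v in xs + ys)
    lam = d * sqrt(n * m / (n + m))
    if lam < 0.2:
        # The series below converges poorly here and the p-value is ~1.
        return d, 1.0
    p = 2.0 * sum((-1) ** (j - 1) * exp(-2.0 * j * j * lam * lam)
                  for j in range(1, 101))
    return d, min(1.0, max(0.0, p))


if __name__ == "__main__":
    # Same median (0), very different spread.
    narrow = [-1, -0.5, 0, 0.5, 1]
    wide = [-10, -5, 0, 5, 10]
    print(ks_two_sample(narrow, wide))
```

As the slide notes, D says only that the distributions differ somewhere; it carries no information about whether location, spread, or shape is responsible.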
![Page 21: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/21.jpg)
resampling
![Page 22: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/22.jpg)
Monte Carlo testing
![Page 23: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/23.jpg)
![Page 24: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/24.jpg)
always applicable
![Page 25: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/25.jpg)
compute intensive
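A Monte Carlo test by label permutation can be sketched as follows. The choice of test statistic (absolute difference in means), the function name `permutation_test`, and the default of 10,000 shuffles are all assumptions made for illustration; any statistic can be substituted, which is what makes the approach so widely applicable, and the loop over shuffles is what makes it compute intensive.

```python
import random


def permutation_test(x, y, n_perm=10000, seed=0):
    """Two-sided Monte Carlo p-value for a difference in means."""
    rng = random.Random(seed)
    n = len(x)
    pooled = list(x) + list(y)

    def mean_diff(values):
        a, b = values[:n], values[n:]
        return sum(a) / len(a) - sum(b) / len(b)

    observed = abs(mean_diff(pooled))
    hits = 0
    for _ in range(n_perm):
        # Shuffling the pooled values randomly reassigns group labels.
        rng.shuffle(pooled)
        if abs(mean_diff(pooled)) >= observed - 1e-12:
            hits += 1
    # Add one to numerator and denominator so the estimate is never zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    print(permutation_test([10, 11, 12, 13, 14], [1, 2, 3, 4, 5]))
```

The smallest p-value this can report is 1 / (n_perm + 1), so the number of shuffles has to grow with the stringency required after multiple-testing correction.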
![Page 26: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/26.jpg)
multiple testing
![Page 27: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/27.jpg)
xkcd.com
![Page 28: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/28.jpg)
xkcd.com
![Page 29: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/29.jpg)
xkcd.com
![Page 30: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/30.jpg)
xkcd.com
![Page 31: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/31.jpg)
compare multiple conditions
![Page 32: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/32.jpg)
Gene Ontology enrichment
![Page 33: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/33.jpg)
Bonferroni
![Page 34: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/34.jpg)
avoid making any errors
![Page 35: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/35.jpg)
too conservative
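The reasoning behind the correction, and why it becomes conservative, can be shown with a few lines of arithmetic. In this sketch the function names are hypothetical, and the family-wise error rate formula assumes the tests are independent.

```python
def family_wise_error_rate(alpha, m):
    """Chance of at least one false positive among m independent tests."""
    return 1.0 - (1.0 - alpha) ** m


def bonferroni(p_values):
    """Bonferroni-adjusted p-values: multiply by the number of tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


if __name__ == "__main__":
    # Twenty uncorrected tests at alpha = 0.05.
    print(family_wise_error_rate(0.05, 20))
    # With 20,000 tests, a raw p-value must be below 2.5e-6 to survive.
    print(0.05 / 20000)
```

Because the threshold shrinks in proportion to the number of tests, genuine but moderate effects are easily lost in a genome-scale screen.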
![Page 36: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/36.jpg)
Benjamini–Hochberg
![Page 37: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/37.jpg)
control false discovery rate
![Page 38: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/38.jpg)
assumes independence
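The step-up procedure can be sketched as adjusted p-values (often called q-values). The function name `benjamini_hochberg` is an assumption for this example; the procedure itself is the standard one, in which the i-th smallest of m p-values is scaled by m / i and monotonicity is then enforced from the largest p-value downwards.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values, from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value never exceeds the one ranked above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Calling everything with an adjusted value below 0.05 a hit then aims for at most about 5% false discoveries among the hits, rather than no false positives at all.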
![Page 39: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/39.jpg)
resampling
![Page 40: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/40.jpg)
negative set
![Page 41: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/41.jpg)
systematic biases
![Page 42: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/42.jpg)
Huang et al., Journal of Proteome Research, 2014
![Page 43: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/43.jpg)
studiedness bias
![Page 44: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/44.jpg)
we study disease proteins
![Page 45: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/45.jpg)
thus we know many PTMs
![Page 46: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/46.jpg)
abundance bias
![Page 47: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/47.jpg)
more highly expressed
![Page 48: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/48.jpg)
easier to detect in assays
![Page 49: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/49.jpg)
better characterized
![Page 50: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/50.jpg)
matched background
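One possible way to build such a background is to draw, for every protein of interest, a random protein from the same abundance bin. The sketch below is purely illustrative: the function name, the `abundance` dictionary (assumed to hold something like log-scale expression values), and the fixed `bin_width` are all hypothetical choices, not a method taken from the talk.

```python
import random
from collections import defaultdict
from math import floor


def matched_background(foreground, candidates, abundance,
                       bin_width=1.0, seed=0):
    """Pick one abundance-matched background item per foreground item."""
    rng = random.Random(seed)
    fg = set(foreground)
    # Group the non-foreground candidates by abundance bin.
    bins = defaultdict(list)
    for name in candidates:
        if name not in fg:
            bins[floor(abundance[name] / bin_width)].append(name)
    for members in bins.values():
        rng.shuffle(members)
    background = []
    for name in foreground:
        pool = bins[floor(abundance[name] / bin_width)]
        if pool:
            # Sampling without replacement; unmatched items are skipped.
            background.append(pool.pop())
    return background


if __name__ == "__main__":
    abundance = {"a": 5.2, "b": 5.7, "c": 1.0}
    print(matched_background(["a"], ["a", "b", "c"], abundance))
```

The same idea extends to matching on studiedness, for example by binning on publication counts instead of abundance.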
![Page 51: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/51.jpg)
the big data effect
![Page 52: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/52.jpg)
if you have enough data
![Page 53: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/53.jpg)
any difference is significant
![Page 54: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/54.jpg)
but maybe not relevant
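The effect is easy to reproduce numerically. The sketch below assumes a two-sample z-test with a known, common standard deviation, and the function name is made up for the example; the point is only that a fixed, tiny difference moves from clearly non-significant to overwhelmingly significant as the sample size grows.

```python
from math import erfc, sqrt


def two_sample_z_p_value(mean_diff, sd, n_per_group):
    """Two-sided p-value for a difference in means with known sd."""
    se = sd * sqrt(2.0 / n_per_group)  # standard error of the difference
    z = mean_diff / se
    return erfc(abs(z) / sqrt(2.0))


if __name__ == "__main__":
    # A difference of 1% of a standard deviation at increasing sample sizes.
    for n in (100, 10_000, 1_000_000):
        print(n, two_sample_z_p_value(0.01, 1.0, n))
```

The effect size is identical in all three rows; only the amount of data changes.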
![Page 55: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/55.jpg)
“significant”
![Page 56: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/56.jpg)
statistical significance
![Page 57: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/57.jpg)
p-value
![Page 58: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/58.jpg)
biological relevance
![Page 59: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/59.jpg)
fold change
![Page 60: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/60.jpg)
relative risk
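Both effect-size measures are simple ratios. In this sketch the function names and the 2x2 cell layout (`a`, `b` for the exposed group with and without the outcome; `c`, `d` for the unexposed group) are assumptions chosen for illustration.

```python
from math import log2


def log2_fold_change(treated, control):
    """log2 of the ratio of mean treated to mean control intensity."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return log2(mean_treated / mean_control)


def relative_risk(a, b, c, d):
    """Risk in the exposed group divided by risk in the unexposed group."""
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return risk_exposed / risk_unexposed


if __name__ == "__main__":
    print(log2_fold_change([4.0, 4.0], [1.0, 1.0]))
    print(relative_risk(20, 80, 10, 90))
```

Unlike a p-value, neither number shrinks or grows just because more samples were measured.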
![Page 61: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/61.jpg)
significant and relevant
![Page 62: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/62.jpg)
volcano plots
![Page 63: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/63.jpg)
Lundby et al., Science Signaling, 2013
![Page 64: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/64.jpg)
rather ad hoc
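The coordinates of a volcano plot, and the usual two-threshold hit call, can be sketched as below. The cut-offs (`min_abs_log2_fc=1.0`, `alpha=0.05`) and the function name are assumptions for the example; they are exactly the kind of arbitrary choices the "ad hoc" remark refers to.

```python
from math import log2, log10


def volcano_points(rows, min_abs_log2_fc=1.0, alpha=0.05):
    """Turn (name, fold_change, p_value) rows into volcano-plot points.

    Each point is (name, log2 fold change, -log10 p-value, is_hit), where
    a hit must pass both the effect-size and the significance threshold.
    """
    points = []
    for name, fold_change, p_value in rows:
        x = log2(fold_change)   # horizontal axis: effect size
        y = -log10(p_value)     # vertical axis: significance
        is_hit = abs(x) >= min_abs_log2_fc and p_value <= alpha
        points.append((name, x, y, is_hit))
    return points


if __name__ == "__main__":
    rows = [("A", 4.0, 0.001),   # large and significant
            ("B", 1.1, 1e-10),   # significant but tiny effect
            ("C", 0.25, 0.5)]    # large effect but not significant
    for point in volcano_points(rows):
        print(point)
```

In practice the p-values fed into such a plot would first be corrected for multiple testing.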
![Page 65: Statistics on big biomedical data - Methods and pitfalls when analyzing high-throughput screens](https://reader036.fdocuments.in/reader036/viewer/2022062418/554e89f1b4c90573338b4991/html5/thumbnails/65.jpg)
questions?