Text and data mining
Transcript of Text and data mining
![Page 1: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/1.jpg)
Text and data mining
Lars Juhl Jensen
![Page 2: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/2.jpg)
Part 1: text mining
![Page 3: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/3.jpg)
exponential growth
![Page 4: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/4.jpg)
![Page 5: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/5.jpg)
![Page 6: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/6.jpg)
some things are constant
![Page 7: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/7.jpg)
![Page 8: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/8.jpg)
~45 seconds per paper
![Page 9: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/9.jpg)
computer
![Page 10: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/10.jpg)
as smart as a dog
![Page 11: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/11.jpg)
teach it specific tricks
![Page 12: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/12.jpg)
![Page 13: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/13.jpg)
![Page 14: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/14.jpg)
named entity identification
![Page 15: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/15.jpg)
Reflect
![Page 16: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/16.jpg)
Pafilis, O’Donoghue, Jensen et al., Nature Biotechnology, 2009
![Page 17: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/17.jpg)
comprehensive lexicon
![Page 18: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/18.jpg)
orthographic variation
![Page 19: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/19.jpg)
“black list”
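
The three ingredients named on the preceding slides (a comprehensive lexicon, tolerance for orthographic variation, and a black list) can be put together in a small dictionary-based tagger. The following is only a minimal sketch under stated assumptions, not the Reflect implementation: the lexicon entries, the identifiers, the black-listed name, and the function names `normalize`, `build_index`, and `tag` are all hypothetical and chosen for illustration.

```python
import re

# Hypothetical mini-lexicon: surface name -> identifier (illustrative values only).
LEXICON = {
    "cdc2": "ENTITY_1",
    "cyclin b": "ENTITY_2",
    "sds": "ENTITY_3",
}

# Hypothetical black list: names considered too ambiguous to tag.
BLACK_LIST = {"sds"}


def normalize(name):
    """Collapse orthographic variation in case, hyphenation, and spacing."""
    return re.sub(r"[\s\-]+", "", name.lower())


def build_index(lexicon, black_list):
    """Index the lexicon by normalized name, leaving out black-listed names."""
    blocked = {normalize(n) for n in black_list}
    return {
        normalize(n): ident
        for n, ident in lexicon.items()
        if normalize(n) not in blocked
    }


def tag(text, index, max_words=3):
    """Return (matched text, identifier) pairs, longest match first at each position."""
    tokens = re.findall(r"[A-Za-z0-9\-]+", text)
    hits = []
    i = 0
    while i < len(tokens):
        for n in range(min(max_words, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            ident = index.get(normalize(candidate))
            if ident is not None:
                hits.append((candidate, ident))
                i += n
                break
        else:
            # No lexicon name starts at this token; move on.
            i += 1
    return hits


index = build_index(LEXICON, BLACK_LIST)
print(tag("Cdc-2 binds Cyclin-B in SDS buffer", index))
```

In this sketch "Cdc-2" and "Cyclin-B" match their lexicon entries despite the differences in case and hyphenation, while "SDS" is skipped because it is black-listed.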
![Page 20: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/20.jpg)
information extraction
![Page 21: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/21.jpg)
no access
![Page 22: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/22.jpg)
![Page 23: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/23.jpg)
collaboration
![Page 24: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/24.jpg)
![Page 25: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/25.jpg)
![Page 26: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/26.jpg)
Part 2: protein networks
![Page 27: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/27.jpg)
guilt by association
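
One simple way to read "guilt by association" is that a protein of unknown function is assigned the function most common among its network neighbours. The sketch below illustrates only that idea and is an assumption rather than anything shown in the talk: the network, the annotations, and the function name `predict_function` are hypothetical.

```python
from collections import Counter


def predict_function(protein, network, annotations):
    """Return the annotation term most common among a protein's direct neighbours."""
    votes = Counter()
    for neighbour in network.get(protein, ()):
        for term in annotations.get(neighbour, ()):
            votes[term] += 1
    if not votes:
        return None  # no annotated neighbours, so no prediction
    return votes.most_common(1)[0][0]


# Hypothetical toy data.
network = {"X": ["A", "B", "C"]}
annotations = {
    "A": {"cell cycle"},
    "B": {"cell cycle"},
    "C": {"transport"},
}
print(predict_function("X", network, annotations))
```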
![Page 28: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/28.jpg)
![Page 29: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/29.jpg)
STRING
![Page 30: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/30.jpg)
Szklarczyk, Franceschini et al., Nucleic Acids Research, 2011
![Page 31: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/31.jpg)
genomic context
![Page 32: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/32.jpg)
gene fusion
![Page 33: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/33.jpg)
Korbel et al., Nature Biotechnology, 2004
![Page 34: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/34.jpg)
experimental data
![Page 35: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/35.jpg)
physical interactions
![Page 36: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/36.jpg)
Jensen & Bork, Science, 2008
![Page 37: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/37.jpg)
gene coexpression
![Page 38: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/38.jpg)
genetic interactions
![Page 39: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/39.jpg)
Beyer et al., Nature Reviews Genetics, 2007
![Page 40: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/40.jpg)
![Page 41: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/41.jpg)
curated knowledge
![Page 42: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/42.jpg)
pathways
![Page 43: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/43.jpg)
Letunic & Bork, Trends in Biochemical Sciences, 2008
![Page 44: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/44.jpg)
text mining
![Page 45: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/45.jpg)
![Page 46: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/46.jpg)
many data types
![Page 47: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/47.jpg)
many databases
![Page 48: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/48.jpg)
different formats
![Page 49: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/49.jpg)
different identifiers
![Page 50: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/50.jpg)
variable quality
![Page 51: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/51.jpg)
quality scores
![Page 52: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/52.jpg)
calibrate vs. gold standard
![Page 53: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/53.jpg)
von Mering et al., Nucleic Acids Research, 2005
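
Calibrating raw quality scores against a gold standard can be sketched as follows: bin the raw scores, then report for each bin the fraction of scored pairs that the gold standard confirms. This is a simplified illustration, not the procedure from the cited paper; the bin edges, the toy pairs, and the function name `calibrate` are assumptions.

```python
import bisect


def calibrate(scored_pairs, gold_positive, bin_edges):
    """Map each raw-score bin to the fraction of its pairs found in the gold standard.

    scored_pairs:  iterable of (pair, raw_score)
    gold_positive: set of pairs that the gold standard counts as true
    bin_edges:     ascending lower edges of the score bins
    """
    totals = [0] * len(bin_edges)
    correct = [0] * len(bin_edges)
    for pair, score in scored_pairs:
        # Index of the last bin whose lower edge is <= score.
        b = bisect.bisect_right(bin_edges, score) - 1
        if b < 0:
            continue  # score lies below the lowest bin edge
        totals[b] += 1
        if pair in gold_positive:
            correct[b] += 1
    return [c / t if t else None for c, t in zip(correct, totals)]


# Hypothetical toy data.
scored = [
    (("a", "b"), 0.2),
    (("a", "c"), 0.3),
    (("b", "c"), 0.7),
    (("c", "d"), 0.9),
]
gold = {("a", "b"), ("b", "c"), ("c", "d")}
print(calibrate(scored, gold, [0.0, 0.5]))
```

Once every evidence type is expressed as such a fraction, scores from different data types and databases become comparable on a single scale.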
![Page 54: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/54.jpg)
orthology transfer
![Page 55: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/55.jpg)
Frishman et al., Modern Genome Annotation, 2009
![Page 56: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/56.jpg)
Part 3: drug networks
![Page 57: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/57.jpg)
new uses for old drugs
![Page 58: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/58.jpg)
shared target(s)
![Page 59: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/59.jpg)
chemical similarity
![Page 60: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/60.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 61: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/61.jpg)
similar drugs share targets
![Page 62: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/62.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 63: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/63.jpg)
only trivial predictions
![Page 64: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/64.jpg)
phenotypic similarity
![Page 65: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/65.jpg)
chemical perturbations
![Page 66: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/66.jpg)
phenotypic readouts
![Page 67: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/67.jpg)
drug treatment
![Page 68: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/68.jpg)
side effects
![Page 69: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/69.jpg)
no database
![Page 70: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/70.jpg)
package inserts
![Page 71: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/71.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 72: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/72.jpg)
text mining
![Page 73: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/73.jpg)
manual validation
![Page 74: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/74.jpg)
side-effect correlations
![Page 75: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/75.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 76: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/76.jpg)
side-effect frequencies
![Page 77: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/77.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 78: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/78.jpg)
side-effect similarity
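
As a rough illustration of how two drugs might be compared by their side effects, the sketch below computes a weighted Jaccard index in which rare side effects count for more than common ones. It is not the similarity measure from the cited paper; the weighting by negative log frequency, the toy frequencies, and the function name `side_effect_similarity` are assumptions made for this example.

```python
import math


def side_effect_similarity(effects_a, effects_b, frequency):
    """Weighted Jaccard index over two sets of side effects.

    frequency maps each side effect to the fraction of drugs that list it
    (a value in (0, 1]); rarer side effects receive a larger weight.
    """
    def weight(effect):
        return -math.log(frequency[effect])

    shared = sum(weight(e) for e in effects_a & effects_b)
    union = sum(weight(e) for e in effects_a | effects_b)
    return shared / union if union else 0.0


# Hypothetical toy data.
frequency = {"nausea": 0.5, "headache": 0.5, "hiccups": 0.125}
drug_a = {"nausea", "hiccups"}
drug_b = {"headache", "hiccups"}
print(side_effect_similarity(drug_a, drug_b, frequency))
```

Here the shared rare side effect dominates the score, so the two drugs come out as fairly similar even though each also has one common side effect the other lacks.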
![Page 79: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/79.jpg)
chemical similarity
![Page 80: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/80.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 81: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/81.jpg)
categorization
![Page 82: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/82.jpg)
Campillos & Kuhn et al., Science, 2008
![Page 83: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/83.jpg)
20 drug–drug pairs
![Page 84: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/84.jpg)
in vitro binding assays
![Page 85: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/85.jpg)
Ki < 10 µM for 11 of 20
![Page 86: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/86.jpg)
cell assays
![Page 87: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/87.jpg)
9 of 9 showed activity
![Page 88: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/88.jpg)
Acknowledgments
reflect.ws
Sune Frankild
Heiko Horn
Evangelos Pafilis
Michael Kuhn
Reinhardt Schneider
Sean O’Donoghue
sideeffects.embl.de
Monica Campillos
Michael Kuhn
Anne-Claude Gavin
Peer Bork
string-db.org
Damian Szklarczyk
Andrea Franceschini
Michael Kuhn
Milan Simonovic
Alexander Roth
Pablo Minguez
Tobias Doerks
Manuel Stark
Jean Muller
Peer Bork
Christian von Mering
![Page 89: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/89.jpg)
larsjuhljensen
![Page 90: Text and data mining](https://reader033.fdocuments.in/reader033/viewer/2022042814/554e92eab4c90573338b4f09/html5/thumbnails/90.jpg)