One tagger, many uses: Simple text-mining strategies for biomedicine
![Page 1: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/1.jpg)
Lars Juhl Jensen (@larsjuhljensen)
One tagger, many uses: Simple text-mining strategies for biomedicine
![Page 2: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/2.jpg)
>10 km
![Page 3: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/3.jpg)
too much to read
![Page 4: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/4.jpg)
computer
![Page 5: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/5.jpg)
as smart as a dog
![Page 6: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/6.jpg)
teach it specific tricks
![Page 7: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/7.jpg)
![Page 8: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/8.jpg)
![Page 9: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/9.jpg)
named entity recognition
![Page 10: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/10.jpg)
dictionary
![Page 11: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/11.jpg)
genes / proteins
![Page 12: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/12.jpg)
chemical compounds
![Page 13: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/13.jpg)
diseases
![Page 14: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/14.jpg)
organisms
![Page 15: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/15.jpg)
environments
![Page 16: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/16.jpg)
not comprehensive
![Page 17: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/17.jpg)
expansion rules
![Page 18: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/18.jpg)
prefixes and suffixes
![Page 19: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/19.jpg)
curated blacklist
![Page 20: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/20.jpg)
SDS
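
The slides above (dictionary, expansion rules, prefixes and suffixes, curated blacklist) can be illustrated with a minimal Python sketch. It is not the C++ tagger from the talk. The dictionary entries, identifiers, the plural-suffix rule and the function names `lookup` and `tag` are all hypothetical. `SDS` is blacklisted here only on the assumption that the slide shows it as an example of an ambiguous name.

```python
import re

# Hypothetical toy dictionary: normalized name -> (entity type, made-up identifier).
DICTIONARY = {
    "tp53": ("gene", "GENE:1"),
    "p53": ("gene", "GENE:1"),
    "sds": ("gene", "GENE:2"),
    "aspirin": ("chemical", "CHEM:1"),
    "breast cancer": ("disease", "DIS:1"),
}
BLACKLIST = {"sds"}   # assumed example of a name that is usually a false positive
SUFFIXES = ("s",)     # assumed expansion rule: accept simple plurals
MAX_WORDS = 3         # longest dictionary name, in tokens

TOKEN = re.compile(r"[A-Za-z0-9]+")


def lookup(name):
    """Return the dictionary entry for a name, applying blacklist and suffix rules."""
    key = name.lower()
    if key in BLACKLIST:
        return None
    if key in DICTIONARY:
        return DICTIONARY[key]
    for suffix in SUFFIXES:
        if key.endswith(suffix):
            stem = key[: -len(suffix)]
            if stem in DICTIONARY and stem not in BLACKLIST:
                return DICTIONARY[stem]
    return None


def tag(text):
    """Greedy longest-match tagging; returns (start, end, matched text, type, id)."""
    tokens = [(m.start(), m.end(), m.group()) for m in TOKEN.finditer(text)]
    matches = []
    i = 0
    while i < len(tokens):
        hit = None
        for n in range(min(MAX_WORDS, len(tokens) - i), 0, -1):
            window = tokens[i : i + n]
            entry = lookup(" ".join(tok for _, _, tok in window))
            if entry:
                start, end = window[0][0], window[-1][1]
                hit = (start, end, text[start:end], entry[0], entry[1])
                break
        if hit:
            matches.append(hit)
            i += n  # skip past the matched tokens
        else:
            i += 1
    return matches


for match in tag("TP53 mutations in breast cancers were run on SDS gels with aspirin."):
    print(match)
```

In this sketch the plural rule lets "breast cancers" match the dictionary name "breast cancer", while the blacklist suppresses "SDS" even though it is in the dictionary.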
![Page 21: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/21.jpg)
software
![Page 22: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/22.jpg)
C++ tagger
![Page 23: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/23.jpg)
>1000 abstracts / second
![Page 24: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/24.jpg)
inherently thread-safe
![Page 25: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/25.jpg)
70–80% recall
![Page 26: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/26.jpg)
80–90% precision
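
For reference, recall and precision as quoted on these slides are conventionally computed against a manually annotated gold standard. The short sketch below shows that arithmetic on made-up data; the function name `precision_recall` and the example numbers are illustrative and not results from the talk.

```python
def precision_recall(predicted, gold):
    """Compare predicted annotations with gold-standard annotations (as sets)."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


# Made-up example: 10 gold mentions, 8 predictions, 7 of them correct.
gold = set(range(10))
predicted = {0, 1, 2, 3, 4, 5, 6, 100}
precision, recall = precision_recall(predicted, gold)
print(precision, recall)
```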
![Page 27: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/27.jpg)
open source: bitbucket.org/larsjuhljensen/tagger/
![Page 28: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/28.jpg)
Python module
![Page 29: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/29.jpg)
Docker: hub.docker.com/r/larsjuhljensen/tagger/
![Page 30: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/30.jpg)
web service: tagger.jensenlab.org
![Page 31: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/31.jpg)
Extract: extract.jensenlab.org
![Page 32: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/32.jpg)
![Page 33: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/33.jpg)
community resources
![Page 34: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/34.jpg)
STRING
![Page 35: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/35.jpg)
string-db.org
![Page 36: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/36.jpg)
functional associations
![Page 37: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/37.jpg)
DISEASES
![Page 38: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/38.jpg)
disease–gene associations
![Page 39: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/39.jpg)
Cytoscape
![Page 40: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/40.jpg)
![Page 41: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/41.jpg)
curated knowledge
![Page 42: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/42.jpg)
experimental data
![Page 43: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/43.jpg)
computational predictions
![Page 44: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/44.jpg)
co-occurrence text mining
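
Co-occurrence text mining can be sketched as counting how often two tagged entities appear in the same document and comparing that with what chance would predict. The example below is a simplified illustration. The observed-over-expected ratio is an assumed stand-in and may differ from the scoring used by the resources named in the talk. The identifiers and function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(tagged_documents):
    """Count single entities and unordered entity pairs, once per document."""
    pair_counts = Counter()
    entity_counts = Counter()
    for entities in tagged_documents:
        entities = set(entities)
        entity_counts.update(entities)
        for a, b in combinations(sorted(entities), 2):
            pair_counts[(a, b)] += 1
    return pair_counts, entity_counts


def observed_over_expected(pair, pair_counts, entity_counts, n_docs):
    """Ratio of observed co-mentions to the count expected if mentions were independent."""
    a, b = pair
    expected = entity_counts[a] * entity_counts[b] / n_docs
    return pair_counts[pair] / expected


# Each set holds the entity identifiers tagged in one abstract (made-up data).
documents = [{"A", "B"}, {"A", "B"}, {"A"}, {"C"}]
pairs, singles = cooccurrence_counts(documents)
print(observed_over_expected(("A", "B"), pairs, singles, len(documents)))
```

A ratio above 1 means the pair is mentioned together more often than independence would suggest.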
![Page 45: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/45.jpg)
Medline abstracts
![Page 46: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/46.jpg)
only abstracts
![Page 47: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/47.jpg)
<1 km
![Page 48: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/48.jpg)
access restrictions
![Page 49: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/49.jpg)
are abstracts sufficient?
![Page 50: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/50.jpg)
15 million full-text articles
![Page 51: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/51.jpg)
Westergaard et al., BioRxiv, 2017
![Page 52: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/52.jpg)
~50% more associations
![Page 53: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/53.jpg)
electronic health records
![Page 54: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/54.jpg)
Jensen et al., Nature Reviews Genetics, 2012
![Page 55: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/55.jpg)
![Page 56: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/56.jpg)
in Danish
![Page 57: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/57.jpg)
dictionary
![Page 58: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/58.jpg)
drugs
![Page 59: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/59.jpg)
adverse events
![Page 60: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/60.jpg)
in Danish
![Page 61: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/61.jpg)
named entity recognition
![Page 62: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/62.jpg)
temporal correlations
![Page 63: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/63.jpg)
[Figure labels: Drug introduction; Drug discontinuation; Adverse event; Negative modifier; Indication; Pre-existing condition; Adverse drug reaction; Possible adverse drug reaction; ADR of additional drug; Identification start]
Eriksson et al., Drug Safety, 2014
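
The categories named on this slide suggest a rule-based filter over dated mentions in the health record. The sketch below is one hypothetical way to order such rules and is not the method of Eriksson et al. The `Mention` class, the `classify` function, the rule order and the example terms are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Mention:
    term: str              # adverse event term found by the tagger
    day: date              # date of the record note
    negated: bool = False  # True if a negative modifier precedes the term


def classify(mention, drug_start, drug_stop, indications, prior_terms):
    """Assign one category to an adverse event mention (assumed rule order)."""
    if mention.negated:
        return "negative modifier"
    if mention.term in indications:
        return "indication"
    if mention.term in prior_terms or mention.day < drug_start:
        return "pre-existing condition"
    if drug_start <= mention.day <= drug_stop:
        return "possible adverse drug reaction"
    return "adverse event"  # seen after discontinuation, not attributed to the drug


start, stop = date(2010, 1, 10), date(2010, 2, 10)
label = classify(
    Mention("nausea", date(2010, 1, 15)),
    start,
    stop,
    indications={"depression"},
    prior_terms={"headache"},
)
print(label)
```

Only mentions that survive the first three filters and fall between drug introduction and discontinuation are kept as candidates in this sketch.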
![Page 64: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/64.jpg)
find novel associations
![Page 65: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/65.jpg)
summary
![Page 66: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/66.jpg)
broadly applicable
![Page 67: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/67.jpg)
keep it simple
![Page 68: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/68.jpg)
free tools
![Page 69: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/69.jpg)
Acknowledgments
Evangelos Pafilis
Sune Pletscher-Frankild
Nadezhda Doncheva
Damian Szklarczyk
Michael Kuhn
Robert Eriksson
Peter Bjødstrup Jensen
John “Scooter” Morris
Christian von Mering
Peer Bork
Christos Arvanitidis
Søren Brunak
![Page 70: One tagger, many uses: Simple text-mining strategies for biomedicine](https://reader033.fdocuments.in/reader033/viewer/2022051710/5a6d52bc7f8b9af8418b535f/html5/thumbnails/70.jpg)