Representation Learning in Medical Documents
![Page 1: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/1.jpg)
Computer as a Doctor? Representation Learning in Medical Documents
Irene Li (1) and Mark Hughes (2)
(1) Dublin Institute of Technology, Ireland
(2) IBM Watson Health, Ireland
![Page 2: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/2.jpg)
Motivation
▪ Medicare domain dataset: limited, costly
▪ Domain experts: dependency
▪ Application requirements (use case on next page):
• Predictions
• Classification
• Summarization
![Page 3: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/3.jpg)
Use case: Sentence-Level Note Classification
(A 75-y-o woman) with sudden onset back pain last night while lifting turkey from oven. The pain is worse with movement or deep breath, better with rest. No symptoms in legs, no fever or chills. No chest pain, cough, wheezing, abdominal pain, headache… Married. Two children. No smoking.
Sentence-Level Categorization
Watson Smart Notes
Free-written texts/chats:
• Various topics
• Messy
• Irrelevant
![Page 4: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/4.jpg)
Representation Learning
▪ Under the heading of "Deep Learning" or "Feature Learning"
• DL algorithms attempt to learn more complex features: multiple levels of representation
▪ Why?
• Get rid of "hand-designed" features and representations.
• Unsupervised feature learning.
• Everything into the same space.
Example: lengths of sentences.
Representation Learning Tutorial, Yoshua Bengio, 2012 http://www.iro.umontreal.ca/~bengioy/talks/icml2012-YB-tutorial.pdf
![Page 5: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/5.jpg)
![Page 6: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/6.jpg)
Related Work: RL in NLP
Distributed representations for words:
• Word2vec [1]: neural word embeddings (each word is a vector)
• Doc2vec [2, 3]: neural document/paragraph/sentence embeddings (each sentence is a vector)
[1] Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al. 2013
[2] Distributed Representations of Sentences and Documents, Quoc V. Le et al. 2014
[3] Gensim: https://radimrehurek.com/gensim/models/doc2vec.html
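To make "each word is a vector" concrete, here is a minimal sketch of how similarity between embedded words is usually measured. The 4-d vectors are hand-written and hypothetical (real word2vec vectors are learned from a corpus), and the helper names `cosine_similarity` and `most_similar` are illustrative, not part of any library.

```python
import math

# Hypothetical 4-d embeddings written by hand for illustration only;
# real word2vec vectors are learned from text and are much larger.
toy_vectors = {
    "fever":  [0.9, 0.1, 0.0, 0.2],
    "chills": [0.8, 0.2, 0.1, 0.3],
    "oven":   [0.0, 0.9, 0.7, 0.1],
}

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def most_similar(word, vectors):
    """Return the other word whose vector is closest to `word`."""
    return max(
        (w for w in vectors if w != word),
        key=lambda w: cosine_similarity(vectors[word], vectors[w]),
    )

print(most_similar("fever", toy_vectors))
```

With these toy values the symptom words sit close together and the kitchen word sits far away, which is the kind of structure the cluster plots on the following slides visualize.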
![Page 7: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/7.jpg)
Word Clusters: Captures Semantic Meanings
Visualization using t-SNE.
![Page 8: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/8.jpg)
Visualization using t-SNE.
![Page 9: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/9.jpg)
Document Clusters
Visualization using t-SNE.
Picture from Dai, Andrew M., Christopher Olah, and Quoc V. Le. "Document embedding with paragraph vectors." (2015).
● 4,490,000 Wikipedia English articles
● 915,715 unique words
![Page 10: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/10.jpg)
Approach (1): Sentence to Image
Sentence: "Conducted to examine different features associated with NPEV..."
Sentence → Word Embeddings → 2-D Image
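The sentence-to-image step can be sketched as stacking one embedding per word into a matrix. This is a minimal sketch under stated assumptions: the slide does not say how padding, truncation, or unknown words are handled, so the zero-vector padding, the `max_len` cut-off, and the tiny 3-d embeddings below are all hypothetical.

```python
def sentence_to_image(tokens, embeddings, max_len, dim):
    """Stack one embedding per token into a max_len x dim matrix (list of rows).

    Assumptions (not from the slides): unknown words and padding rows are
    all-zero vectors, and sentences longer than max_len are truncated.
    """
    zero = [0.0] * dim
    rows = [list(embeddings.get(tok, zero)) for tok in tokens[:max_len]]
    rows += [list(zero) for _ in range(max_len - len(rows))]
    return rows

# Hypothetical 3-d embeddings; the results slide mentions 100-d vectors.
embeddings = {
    "conducted": [0.1, 0.2, 0.3],
    "to":        [0.0, 0.1, 0.0],
    "examine":   [0.4, 0.5, 0.6],
    "features":  [0.7, 0.8, 0.9],
}

tokens = "conducted to examine different features".split()
image = sentence_to_image(tokens, embeddings, max_len=6, dim=3)
for row in image:
    print(row)
```

Each row is one word and each column one embedding dimension, so the result has the same 2-D layout as a single-channel image and can be fed to a convolutional network.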
![Page 11: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/11.jpg)
Approach (2): Model
• Conv layers: 64 filters; 5x5
• Pooling layers: 2x2
• Hidden layer: 128 units
• Output: 13 units
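As a rough check on how the layer sizes interact, the sketch below traces the feature-map shape through 5x5 convolutions and 2x2 pooling. Several values are assumptions not stated on the slide: two conv/pool blocks, unpadded ("valid") convolutions with stride 1, non-overlapping pooling, and a 64 x 100 input (64 tokens by 100-d embeddings).

```python
def conv_valid(size, kernel):
    """Output length of an unpadded, stride-1 convolution along one axis."""
    return size - kernel + 1

def pool(size, window):
    """Output length of non-overlapping pooling along one axis."""
    return size // window

def feature_map_shape(height, width, n_blocks, kernel=5, window=2):
    """Shape after n_blocks of (5x5 conv -> 2x2 pool); block count is assumed."""
    for _ in range(n_blocks):
        height, width = conv_valid(height, kernel), conv_valid(width, kernel)
        height, width = pool(height, window), pool(width, window)
    return height, width

# Assumed input: 64 tokens x 100 embedding dimensions, two conv/pool blocks.
h, w = feature_map_shape(height=64, width=100, n_blocks=2)
flattened = h * w * 64  # 64 filters, as listed on the slide
print((h, w), flattened)
```

Under these assumptions the flattened vector is what the 128-unit hidden layer would receive before the 13-unit output; a different input size or block count changes the numbers but not the arithmetic.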
![Page 12: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/12.jpg)
Results (1): Dataset
Corpus:
• 3,879 publications from PubMed [1]
• 27.4 million raw words
• 181,550 words in vocabulary
• 13 classes by topic/journal
[1]: US National Library of Medicine National Institutes of Health Search database http://www.ncbi.nlm.nih.gov/pubmed
![Page 13: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/13.jpg)
Results (1): Dataset
27.4 million word occurrence distribution
![Page 14: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/14.jpg)
Results (1): Dataset
13 classes by topic/journal
Plot by https://tagul.com/cloud/2
![Page 15: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/15.jpg)
Results (2): R-Square Scores in Classification
100-d
![Page 16: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/16.jpg)
Discussions
▪ CNNs: ability to learn distributed representations.
▪ Pre-processing (stop-words, stemming, etc.): accuracy drops because information is lost.
Example: "studying", "studies" -> "studi"
▪ Training set:
• Arbitrarily chosen by journals: overlaps
• Noisy contents: irrelevant sentences
Example: "We examined a patient who had salad..."
• No "best case"/baselines for the system
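The stemming example from the discussion can be reproduced with a few suffix rules. This is a toy sketch, not the stemmer used in the experiments (the slides do not name one); `toy_stem` is a hypothetical helper that only covers the suffixes needed here.

```python
def toy_stem(word):
    """Very rough suffix stripping; a toy stand-in for a real stemmer."""
    word = word.lower()
    if word.endswith("ies"):
        return word[:-3] + "i"       # "studies" -> "studi"
    if word.endswith("ing") and len(word) > 5:
        word = word[:-3]             # "studying" -> "study"
    if word.endswith("y") and len(word) > 2:
        return word[:-1] + "i"       # "study" -> "studi"
    return word

for w in ("studying", "studies", "study"):
    print(w, "->", toy_stem(w))
```

All three forms collapse to the same token, so the distinction between the verb form and the plural noun is no longer available to the model, which is the information loss the slide refers to.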
![Page 17: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/17.jpg)
Future Work
▪ Dataset
• In-domain knowledge: papers, books, etc.
• For specific tasks: well-labeled
▪ Representation
• CNN model: more complex (layers)
• Other models: Long Short-Term Memory (LSTM), etc.
▪ Potential applications
• Notes classification
• Patient2vec (use case on next page): representation learning on individual patients
![Page 18: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/18.jpg)
Patient2Vec: Every patient is a vector
Feature extraction from everything: gender, age, body conditions, treatment history, …
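One simple way to picture "every patient is a vector" is a hand-built encoding of the listed attributes. This is only an illustrative baseline: the slides do not describe how Patient2Vec would encode or learn its features, and the condition vocabulary, the age scaling, and the `patient_to_vector` helper are all hypothetical.

```python
# Hypothetical condition vocabulary; a real system would use a much larger one.
CONDITIONS = ["back pain", "fever", "cough"]

def patient_to_vector(patient):
    """Encode a patient record as [female, male, age/100, one flag per condition]."""
    gender = [1.0 if patient["gender"] == g else 0.0 for g in ("female", "male")]
    age = [patient["age"] / 100.0]  # crude scaling, assumed for illustration
    conditions = [1.0 if c in patient["conditions"] else 0.0 for c in CONDITIONS]
    return gender + age + conditions

# Loosely based on the note in the use-case slide (75-year-old woman, back pain).
patient = {"gender": "female", "age": 75, "conditions": {"back pain"}}
vector = patient_to_vector(patient)
print(vector)
```

A learned representation would replace these hand-designed columns with dense features, but an encoding like this could serve as the raw input or as a comparison point.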
![Page 19: Representation Learning in Medical Documents](https://reader031.fdocuments.in/reader031/viewer/2022030213/589cdd341a28abf86d8b473f/html5/thumbnails/19.jpg)
Special thanks to Spyros Kotoulas (1) and Toyotaro Suzumura (2) for support and help.
(1) IBM Watson Health, Dublin, Ireland
(2) IBM T.J. Watson Research Center, New York, USA
Thanks! Q&A
ireneli.eu