SAMS: Data and Text Mining for Early Detection of Alzheimer’s Disease November, 2016 Dr Christopher Bull

Page 1: SAMS: Data and Text Mining for Early Detection of ...ucrel.lancs.ac.uk/crs/attachments/UCRELCRS-2016-12-01-Bull-Christ… · SAMS: Data and Text Mining for Early Detection of Alzheimer’s

SAMS: Data and Text Mining for Early Detection of Alzheimer’s Disease
November, 2016
Dr Christopher Bull

Page 2:

Aim of talk

• What is SAMS
• Data Capture
  – Problems and solutions to acquiring this type of text/data
• NLP
  – Tools used
    • Existing
    • Bespoke
• Reflections

Page 3:

Who am I?

Dr Christopher Bull

[email protected]
@ChrisBull88

[Insert dashing photo here]

• 2011 – PhD
• 2014 – SAMS (PDRA)
• 2016 – Mobile Age (PDRA)
------------------------------------------
• Software Engineering
• Education / Pedagogy
• Digital Health Technologies

Page 4:

SAMS Overview

Page 5:

Problem

• National Dementia Strategy (2009): early (‘timely’) diagnosis
• Only about 50% of people with dementia currently receive a diagnosis
• Diagnosis is often late (moderate or severe stages)

Page 6:

What is Alzheimer’s Disease?

• Alzheimer’s is the most common cause of dementia (estimated 60%-80% of cases)
  – Dementia “describes symptoms that occur when the brain is affected by certain diseases or conditions”
• Symptoms include:
  – memory loss
  – difficulties with:
    • thinking
    • problem-solving
    • language
• Ultimately fatal

Source: Alzheimer’s Society

Page 7:

SAMS

Goal: Explore technology-dependent proxy markers of Alzheimer’s Disease

Aims:
• Non-intrusive capture of computer use
• Mine the data for trends and patterns
• Infer longitudinal changes in cognitive health

Page 8:

Team

Professor Pete Sawyer, School of Computing and Communications, Lancaster University
Dr Paul Rayson, School of Computing and Communications, Lancaster University
Dr Christopher Bull, School of Computing and Communications, Lancaster University
Professor Alistair Sutcliffe, School of Computing and Communications, Lancaster University
Professor Alistair Burns, National Clinical Director for Dementia in England; Institute of Brain, Behaviour and Mental Health, University of Manchester
Dr Iracema Leroi, Institute of Brain, Behaviour and Mental Health, University of Manchester
Gemma Stringer, Institute of Brain, Behaviour and Mental Health, University of Manchester
Dr Samuel Couth, Institute of Brain, Behaviour and Mental Health, University of Manchester
Professor John Keane, School of Computer Science, University of Manchester
Dr Ann Gledson, School of Computer Science, University of Manchester
Professor Clive Ballard, Wolfson Centre for Age-Related Diseases, King's College London

Page 9:

Data Flows

Page 10:

Current Status

• Project funding ended September 2016
• On-going analysis

Page 11:

My Role in SAMS

…and Data Collection

Page 12:

My Role

• Data capture software
  – Software design/implementation
    • SAMS Manager
    • Browser extensions
  – Maintenance (obviously)
• Text Mining
  – Text extraction (reconstruction)
  – Reusing existing NLP pipeline (Wmatrix; UCREL)
  – Implementing extensions to pipeline for specific heuristics
• General Project Support (Team & Participants)
• Consider challenges

Page 13:

Challenges

• Volatility of participant computers
  – Unexpected updates
  – Varying shutdown procedures
  – Various software setups (anti-virus etc.)
• Weak-performing computers (monitoring must not monopolise valuable resources)
  – Again, various hardware/software setups
• Ethical challenges
  – Privacy/Security
• Novel monitoring approaches
• Internet Explorer *sigh*
• Win 10 roll-out mid-project

Page 14:

Abstract Architecture (Data Collection)

[Diagram components: Browser Extensions; Desktop/Application Monitor Processes; Encrypt Logs; Manager Process; Secure SAMS Server]

Collecting context, not just raw data

Page 15:

Desktop/Application Monitor Processes

• C# input event listeners
• Variety of mouse, keyboard.
• Windows Automation API: UI Automation (UIA)
• Observe UI elements (and properties) a user interacts with.
• Provides context behind events.

[Diagram label: Desktop/App Monitor]

* Work of Dr Ann Gledson, Manchester
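To make "context behind events" concrete, here is a minimal sketch of what a logged event carrying UI context might look like. The slide says the real monitor is written in C# on top of UI Automation; this Python version is illustrative only, and every field name (`control_type`, `name`, `window_title`, `kind`) is an assumption rather than the SAMS log format.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class UiContext:
    """Properties of the UI element the user interacted with (hypothetical fields)."""
    control_type: str
    name: str
    window_title: str


@dataclass
class InputEvent:
    """A raw input event paired with the UI context it occurred in."""
    kind: str          # e.g. "mouse_click" or "key_down" (illustrative values)
    timestamp: float   # seconds since the epoch
    context: UiContext


def to_log_line(event: InputEvent) -> str:
    """Serialise the event, including its nested context, as one JSON line."""
    return json.dumps(asdict(event), sort_keys=True)


event = InputEvent("mouse_click", 1480000000.0, UiContext("Button", "Send", "Mail"))
line = to_log_line(event)
```

Keeping the context inside the event record means a later analysis step can tell a click on a "Send" button from a click on empty desktop space without replaying the session.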

Page 16:

Browser Extensions

[Diagram: stages inside the Browser Extension]
Web page black/whitelist (e.g. no https:// unless predefined)
JS DOM parsing (text fields and interactive elements)
JS event listeners & context identifier (Click, Mouse-Move, Focus etc.)
Log message caching (volatile)
Encryption
Write log files
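The black/whitelist stage can be sketched as a small URL filter. This is a sketch under stated assumptions, not the extension's actual code (which runs as JavaScript in the browser): the host names are invented, and the rule "blacklist always wins, https only if whitelisted, plain http otherwise allowed" is one reading of "no https:// unless predefined".

```python
from urllib.parse import urlsplit

# Hypothetical lists; the real SAMS lists are not given on the slide.
WHITELIST = {"www.example.org"}    # HTTPS hosts predefined as safe to monitor
BLACKLIST = {"bank.example.com"}   # hosts that are never monitored


def should_monitor(url: str) -> bool:
    """Decide whether events on this page may be logged."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    if host in BLACKLIST:
        return False
    if parts.scheme == "https":
        # Secure pages are skipped unless explicitly predefined.
        return host in WHITELIST
    # Only ordinary web pages are considered; other schemes are ignored.
    return parts.scheme == "http"
```

Defaulting to "do not monitor" for https pages keeps banking, webmail and similar sessions out of the logs unless a host has been deliberately approved.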

Page 17:

Browser Monitoring - Challenges

• Context to events
• Constantly changing or dynamic DOM

Page 18:

Manager/Uploader

• Process management
• Server communication
• Remote updating
• Log message caching and encryption
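As an illustration of log message caching followed by encryption, the sketch below buffers messages in memory and passes each full batch through an encryption step before writing it out. The class name, the batch-size policy and the callback interface are all assumptions. `placeholder_encrypt` deliberately does nothing and stands in for whatever cipher the real Manager uses, which the slides do not specify.

```python
from typing import Callable, List


def placeholder_encrypt(data: bytes) -> bytes:
    """Stand-in only: returns the data unchanged. This is NOT real encryption."""
    return data


class LogCache:
    """Buffer log messages in memory and hand them on in encrypted batches."""

    def __init__(self, flush_size: int,
                 encrypt: Callable[[bytes], bytes],
                 sink: Callable[[bytes], None]) -> None:
        self._flush_size = flush_size
        self._encrypt = encrypt
        self._sink = sink            # e.g. a function that appends to a log file
        self._pending: List[str] = []

    def add(self, message: str) -> None:
        """Cache one message; flush automatically once the batch is full."""
        self._pending.append(message)
        if len(self._pending) >= self._flush_size:
            self.flush()

    def flush(self) -> None:
        """Encrypt and write out whatever is cached, then empty the cache."""
        if not self._pending:
            return
        payload = "\n".join(self._pending).encode("utf-8")
        self._sink(self._encrypt(payload))
        self._pending.clear()
```

Because the cache is volatile, anything not yet flushed is lost on an unexpected shutdown, so a small batch size trades some overhead for less data loss on unreliable participant machines.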

Page 19:

Manager (2)

Early UI

Page 20:

Project Support

• Participant Status Checker
  – For clinical & tech teams
  – + Android app
• Phone support
  – Clinical team
  – Participants
• Participant visits (installs)

Page 21:

Existing Study(s)

Nun Study:
• Measures obtained from autobiographies
• Written over a 60-year span (age 22 to 83).

                         No dementia                           Dementia
Grammatical complexity   mean 4.78;                            mean 3.86;
                         declined .04 units per year           declined .03 units per year
Idea density             mean 5.35 propositions per 10 words;  mean 4.34 propositions per 10 words;
                         declined .03 units per year           declined .02 units per year

Page 22:

Propositional Idea Density (P-density)

• “Idea density […] is the number of expressed propositions divided by the number of words. In terms of semantics, idea density is a measure of the extent to which the speaker is making assertions (or asking questions) rather than just referring to entities”
  – “Automatic measurement of propositional idea density from part-of-speech tagging” (Brown et al., 2008)
• Existing implementation
  – CPIDR (Computerized Propositional Idea Density Rater)
  – (pronounced “spider”)
  – only tool to automate this*

* At time of starting SAMS
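The definition quoted above (propositions divided by words) can be sketched from part-of-speech tags. This is a deliberately crude approximation, not CPIDR or SNOWCAT: it treats every verb, adjective, adverb, preposition and conjunction as one proposition and applies none of the adjustment rules a real rater uses. The coarse tag names and the hand-tagged sentence are assumptions made for the example.

```python
from typing import List, Tuple

# Coarse tags assumed to signal a proposition (a simplification).
PROPOSITION_TAGS = {"VERB", "ADJ", "ADV", "PREP", "CONJ"}


def idea_density(tagged: List[Tuple[str, str]]) -> float:
    """Return propositions per word for a list of (word, tag) pairs."""
    words = len(tagged)
    if words == 0:
        return 0.0
    propositions = sum(1 for _, tag in tagged if tag in PROPOSITION_TAGS)
    return propositions / words


# Hand-tagged example sentence (tags assigned manually, not by CLAWS).
sample = [
    ("The", "DET"), ("old", "ADJ"), ("dog", "NOUN"), ("barked", "VERB"),
    ("loudly", "ADV"), ("at", "PREP"), ("the", "DET"), ("postman", "NOUN"),
]
density = idea_density(sample)  # 4 propositions / 8 words
```

The result is expressed per word; multiplying by 10 gives the "propositions per 10 words" scale used for the Nun Study figures.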

Page 23:

Kusari (Toolchain manager)

“Toolchain and data dependency manager for use with conventional NLP toolchains”

Dr Steve Wattam
https://delta.lancs.ac.uk/Steve/kusari
https://delta.lancs.ac.uk/Steve/kusari-links

Page 24:

Toolchain

Stage                    Tool      URL                              Language
Spelling Variation       VARD      ucrel.lancs.ac.uk/vard/          Java
Part Of Speech Tagger    CLAWS     ucrel.lancs.ac.uk/claws/         C
Semantic Tagger          USAS      ucrel.lancs.ac.uk/usas/          C
Frequency Lists          Tmatrix   ucrel.lancs.ac.uk/wmatrix/       C
SAMS software            SNOWCAT   delta.lancs.ac.uk/SAMS/SNOWCAT   Java
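One way to picture what a toolchain dependency manager has to work out is a topological sort over the tools listed above. The sketch below uses Python's standard `graphlib` and is not Kusari's API. The dependency edges are assumptions: a simple chain in the order the slide lists the tools, plus SNOWCAT reading both Tmatrix and USAS output.

```python
from graphlib import TopologicalSorter

# Each tool maps to the set of tools whose output it needs (assumed edges).
DEPENDS_ON = {
    "VARD": set(),
    "CLAWS": {"VARD"},
    "USAS": {"CLAWS"},
    "Tmatrix": {"USAS"},
    "SNOWCAT": {"Tmatrix", "USAS"},
}

# static_order() yields every tool after all of its dependencies.
run_order = list(TopologicalSorter(DEPENDS_ON).static_order())
```

With these edges only one valid order exists, so `run_order` lists the tools from VARD through to SNOWCAT. Adding a circular dependency would make `static_order()` raise `graphlib.CycleError`.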

Page 25:

SNOWCAT

Sams aNalysis of Output from Wmatrix for the Cognitive Assessment of Text

• Input
  – Tmatrix (FQLs)
  – USAS (Sem)
• Output
  – CSV of metrics

Page 26:

SNOWCAT: Sample Output (1/2)

• Total Words (MWE), 26278
• Total Words, 27787
• Vocabulary size (MWE), 3533
• Vocabulary size, 3444
• Type:Token (ratio; MWE), 0.134
• Type:Token (ratio), 0.124
• Type:Token (normalised ratio), 0.403
• Words occurring once (MWE), 1842
• Adjective (total; MWE), 1288
• Adjective (ratio; MWE), 0.049
• Noun (total; MWE), 4280
• Noun (ratio; MWE), 0.163
• …
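The lexical counts in this output can be illustrated with a short sketch that derives total words, vocabulary size, type:token ratio and words occurring once from a token list. It is not the SNOWCAT implementation: tokenisation here is a plain whitespace split, multi-word expressions (MWE) are not handled, and the normalised ratio is not reproduced because the slide does not say how it is normalised.

```python
from collections import Counter
from typing import Dict, List


def lexical_summary(tokens: List[str]) -> Dict[str, float]:
    """Compute simple vocabulary metrics over case-folded tokens."""
    counts = Counter(token.lower() for token in tokens)
    total = sum(counts.values())
    vocabulary = len(counts)
    return {
        "total_words": total,
        "vocabulary_size": vocabulary,
        "type_token_ratio": vocabulary / total if total else 0.0,
        "words_occurring_once": sum(1 for c in counts.values() if c == 1),
    }


summary = lexical_summary("the cat sat on the mat".split())

# The two plain type:token ratios on the slide match vocabulary size / total words.
slide_ttr_mwe = round(3533 / 26278, 3)
slide_ttr = round(3444 / 27787, 3)
```

For the toy sentence, six tokens yield five distinct types, four of which occur once. The same division applied to the slide's own counts gives the reported 0.134 and 0.124.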

Page 27:

SNOWCAT: Sample Output (2/2)

• Pronoun (total; MWE), 2672
• Pronoun (ratio; MWE), 0.102
• Verb (total; MWE), 6135
• Verb (ratio; MWE), 0.233
• Content words (total; MWE), 13757
• Content words (ratio; MWE), 0.524
• Filler words (total; MWE), 183
• Filler words (ratio; MWE), 0.007
• Noun:Verb (ratio; MWE), 0.698
• Mean Length of Utterance, 27.653
• VARD Variant (total), 69
• VARD Variant (ratio), 0.003
• Propositional Idea Density, 0.565
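The ratio rows in the sample output are consistent with dividing each total by Total Words (MWE) = 26278, with Noun:Verb being the noun total over the verb total. The sketch below checks that arithmetic using only numbers from the two sample-output slides. That SNOWCAT computes the ratios this way, and that the VARD Variant ratio uses the MWE total as its denominator, is an inference from the figures and not something the slides state.

```python
TOTAL_WORDS_MWE = 26278

# Totals copied from the sample output slides.
TOTALS = {
    "Adjective": 1288,
    "Noun": 4280,
    "Pronoun": 2672,
    "Verb": 6135,
    "Content words": 13757,
    "Filler words": 183,
    "VARD Variant": 69,   # denominator for this row is an assumption
}


def ratio(count: int, total: int) -> float:
    """Ratio rounded to three decimals, as in the sample output."""
    return round(count / total, 3)


ratios = {name: ratio(count, TOTAL_WORDS_MWE) for name, count in TOTALS.items()}
noun_verb_ratio = ratio(TOTALS["Noun"], TOTALS["Verb"])
```

Every value in `ratios`, and `noun_verb_ratio`, agrees with the corresponding ratio row on the slides to three decimal places.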

Page 28:

Early (unpublished) Results

• Validate P-Density (comparison to CPIDR tool)
• Uses novelist study to explore usefulness of SNOWCAT metrics
• [Show spreadsheet of early (unpublished) results]

Page 29:

Charts

Page 30:

What’s next?

• Continue NLP analysis
• Correlate Data and Text Mining analyses
• …SAMS 2.0

Page 31:

Lessons Learnt

• Ethical process
  – Affects fundamental design decisions
• Complexity of data collection outside of a “lab setting”
• Validating other studies/claims is important

Page 32:

Thank you

November, 2016
Dr Christopher Bull

http://ucrel.lancs.ac.uk/sams/
[email protected]
@ChrisBull88

Page 33:

Publications

ucrel.lancs.ac.uk/sams/papers.php

• Combining data mining and text mining for detection of early stage dementia: the SAMS framework. Bull, C., Asfiandy, D., Gledson, A., Mellor, J., Couth, S., Stringer, G., Rayson, P., Sutcliffe, A., Keane, J., Zeng, X., Burns, A., Leroi, I., Ballard, C., & Sawyer, P. (2016). In LREC-2016 Workshop: RaPID-2016.

• From Click to Cognition: Detecting cognitive decline through daily computer use. Stringer, G., Sawyer, P., Sutcliffe, A., & Leroi, I. (2015). In D. Bruno (Ed.), The Preservation of Memory: Theory and Practice for Clinical and Non-Clinical Populations (pp. 93-103). Hove, UK: Psychology Press.

• Dementia and Social Sustainability: Challenges for Software Engineering. Sawyer, P., Sutcliffe, A., Rayson, P., & Bull, C. (2015). In 37th International Conference on Software Engineering (ICSE'15) (pp. 527-530). Florence, Italy: IEEE. DOI: 10.1109/ICSE.2015.188

• Discovering affect-laden requirements to achieve system acceptance. Sutcliffe, A., Rayson, P., Bull, C., & Sawyer, P. (2014). In 22nd IEEE International Requirements Engineering Conference (RE'14) (pp. 173-182). IEEE. DOI: 10.1109/RE.2014.6912259