Detecting activity patterns on accelerometer data: a Deep Learning approach using Google Cloud

Kai-Tao Yang
Data Science Team at CentERdata
GDG DevFest NL
November 18th, 2017, Amsterdam, Netherlands

Contact Information:
Email: ykaitao.hotmail.com
LinkedIn: https://www.linkedin.com/in/kaitaoyang/
GitHub: https://github.com/ykaitao
Website: www.dlapplied.com

Collaborators: Lennard Kuijten, Pradeep Kumar, Marcia den Uijl, Patricia Prüfer, Eric Balster

Outline
• What is the problem?
• Why is it important?
• How was the problem solved?
• What are the future directions?

Outline
• What is the problem?
    ➢ Accelerometer device
    ➢ Accelerometer data
    ➢ Accelerometer data in LISS panel
• Why is it important?
• How was the problem solved?
• What are the future directions?

Accelerometer recording
[Figure: an accelerometer recording along a route through Office, Tilburg Universiteit station, Eindhoven station, and Home, with segments labeled walking, sitting on the train, and cycling.]

Accelerometer data in LISS panel
• In 2013, 805 accelerometer recordings were collected from 805 LISS panel members. Each recording is 10 to 13 days long, sampled at 60 Hz (each recording is about 3 Gb).
• The LISS (Longitudinal Internet Studies for the Social sciences) panel, maintained by CentERdata, consists of about 5000 households comprising 8000 individuals, based on a true probability sample of households drawn from the population register by Statistics Netherlands.
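
For a sense of scale, the recording length and sampling rate quoted above imply tens of millions of samples per axis in each recording. The sketch below works this out; the function name `samples_per_recording` is illustrative, and the calculation assumes uninterrupted sampling over the whole period.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def samples_per_recording(days, rate_hz=60):
    """Number of samples per axis, assuming continuous sampling."""
    return days * SECONDS_PER_DAY * rate_hz


shortest = samples_per_recording(10)  # 10-day recording
longest = samples_per_recording(13)   # 13-day recording
print(shortest, longest)
```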

Accelerometer data in LISS panel (continued)
• On a monthly basis, LISS panel members complete online questionnaires, resulting in rich data such as gender, age, income, living conditions, education level, health status, and political views.
• All LISS panel data are publicly available, for research purposes only, through the LISS data archive: https://www.dataarchive.lissdata.nl/.

Outline
• What is the problem?
• Why is it important?
    ➢ Goals of this study.
    ➢ Advantages of accelerometer measurement.
• How was the problem solved?
• What are the future directions?

Importance of this study
• Goals of this study:
    • Detect activity patterns from the accelerometer data of the LISS panel.
    • Find the relationship between activity patterns and background variables (e.g., the jogging pattern and health status).
• Advantages of accelerometer measurement:
    • More objective than questionnaires.
    • Less invasive of privacy and less expensive than video surveillance.

Outline
• What is the problem?
• Why is it important?
• How was the problem solved?
    ➢ Approaches in the literature
        ▪ Conventional approaches
        ▪ Deep learning approaches
    ➢ Our approach
• What are the future directions?

Approaches in the literature (conventional)
[Figure adopted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017]

Approaches in the literature (deep learning)
[Figure adopted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017]

Summary of deep learning approaches
[Table adopted from: Wang et al., Deep Learning for Sensor-based Activity Recognition: A Survey, Pattern Recognition Letters, 2017]
Abbreviations:
• CNN: Convolutional Neural Network
• DNN: Deep Neural Network
• DBN: Deep Belief Network
• RNN: Recurrent Neural Network
• LSTM: Long Short-Term Memory
• SdA: Stacked de-noising Auto-encoder

CNN in computer vision community
Name | Year | Layers | Honor | Achievements
LeNet | 1990 | 5 | - | The first successful applications of CNN (to read zip codes, digits, etc.).
AlexNet | 2012 | 8 | Winner in ILSVRC | The first work that popularized Convolutional Networks in Computer Vision (used NVIDIA GTX 580 GPUs to reduce training time). Top-5 error of 16%, compared to the runner-up with 26% error.
ZFNet | 2013 | 8 | Winner in ILSVRC | Top-5 error rate of 11.2%.
VGGNet | 2014 | 19 | Runner-up in ILSVRC | Top-5 error rate of 7.5% for VGGNet-19.
GoogLeNet (Inception V2, V3, V4) | 2014 | 22 | Winner in ILSVRC | Top-5 error rate of 6.67%, 5.6%, 5.0% for V2, V3, V4, respectively [1].
ResNet | 2015 | 152 | Winner in ILSVRC | Top-5 error rate of 3.57% for ResNet-152.
ILSVRC = ImageNet Large Scale Visual Recognition Challenge

Outline
• What is the problem?
• Why is it important?
• How was the problem solved?
    ➢ Approaches in the literature
    ➢ Our approach
        ▪ Overview
            ◦ Preparing training data
            ◦ Building models
            ◦ Training models
            ◦ Testing models
        ▪ Details
• What are the future directions?

Our approach (overview)
• Six activities:
    • cycling,
    • driving car,
    • jogging,
    • sitting on the train,
    • sleeping,
    • walking
• We adapted our model from VGGNet-19 [1].
• Trained 3 models (with different segment lengths: about 5, 10, and 20 seconds) using the GPUs of the Dutch supercomputer.
• Tested the models on 805 LISS panel recordings using Google Cloud services.
[1] https://github.com/fchollet/keras/blob/master/keras/applications/vgg19.py
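
To make the segment lengths concrete: at a 60 Hz sampling rate, segments of about 5, 10, and 20 seconds correspond to roughly 300, 600, and 1200 samples. The following is a minimal sketch of cutting one axis of a recording into fixed-length segments. The helper `split_into_segments` and the choice of non-overlapping windows are assumptions for illustration, not the procedure described in the talk.

```python
SAMPLING_RATE_HZ = 60


def split_into_segments(signal, seconds, rate_hz=SAMPLING_RATE_HZ):
    """Cut a 1D signal into non-overlapping segments; drop the incomplete tail."""
    seg_len = int(seconds * rate_hz)
    n_segments = len(signal) // seg_len
    return [signal[i * seg_len:(i + 1) * seg_len] for i in range(n_segments)]


# One minute of dummy single-axis data (3600 samples at 60 Hz).
recording = [0.0] * (60 * SAMPLING_RATE_HZ)
for seconds in (5, 10, 20):
    segments = split_into_segments(recording, seconds)
    print(seconds, "s:", len(segments), "segments of", len(segments[0]), "samples")
```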

Training using Dutch supercomputer (Cartesius)
Job.sh:
    #!/bin/bash
    # Number of nodes
    #SBATCH -N 1
    # Expected execution time (e.g., 120 hours)
    #SBATCH -t 120:00:00
    # Partition (e.g., gpu)
    #SBATCH -p gpu
    # Unload module
    module unload mpi
    # Load modules
    module load mpi/openmpi/2.0.1-cuda80
    module load cuda/8.0.44
    module load cudnn/8.0-v5.1
    module load python/2.7.11
    # Execute Python application
    srun python acc_keras_vgg19_small.py relu 5 0.1

Outline
• What is the problem?
• Why is it important?
• How was the problem solved?
    ➢ Approaches in the literature
    ➢ Our approach
        ▪ Overview
        ▪ Details
            ◦ Neuron and Neural Network
            ◦ Convolutional Neural Network and VGGNet
            ◦ Adapting VGGNet-19 for our application
            ◦ Data pre-processing
• What are the future directions?
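
Neural network (linear)
[Figure: a single-layer network. Input layer: x1, x2, x3. Output layer: y1, y2. Each output sums the weighted inputs (weights w11, w12, w21, w22, w31, w32) and adds a bias (b1, b2).]

$$y = xW + b$$

where

$$x = [x_1, x_2, x_3], \quad W = \begin{bmatrix} w_{11} & w_{12} \\ w_{21} & w_{22} \\ w_{31} & w_{32} \end{bmatrix}, \quad b = [b_1, b_2], \quad y = [y_1, y_2]$$

In general, x is 1-by-m, y is 1-by-n, W is m-by-n, and b is 1-by-n.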
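
The linear layer y = xW + b can be sketched in a few lines of plain Python. The function name `linear_layer` and the numeric values are illustrative assumptions; the shapes follow the slide (x is 1-by-m, W is m-by-n, b is 1-by-n).

```python
def linear_layer(x, W, b):
    """Compute y = xW + b for a 1-by-m input x, m-by-n weights W, 1-by-n bias b."""
    m, n = len(W), len(W[0])
    assert len(x) == m and len(b) == n
    return [sum(x[i] * W[i][j] for i in range(m)) + b[j] for j in range(n)]


x = [1.0, 2.0, 3.0]          # three inputs (m = 3)
W = [[1.0, 0.0],
     [0.0, 1.0],
     [1.0, 1.0]]             # 3-by-2 weights
b = [0.5, -0.5]              # two biases (n = 2)
y = linear_layer(x, W, b)
print(y)
```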

Neural network (non-linear)
[Figure: the same single-layer network with an activation function ƒ applied to each output sum. Input layer: x1, x2, x3. Output layer: y1, y2.]

$$y = f(xW + b)$$

Activation function: sigmoid

$$f(x) = \frac{1}{1 + e^{-x}}$$
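
Neural network (non-linear, multi-layers)
[Figure: a network with an input layer (x), a hidden layer (h), and an output layer (y). The hidden layer uses weights W_ih and bias b_h, the output layer uses weights W_ho and bias b_o, and the activation ƒ is applied after every sum.]

$$h = f(xW_{ih} + b_h)$$
$$y = f(hW_{ho} + b_o)$$
$$y = f(f(xW_{ih} + b_h)W_{ho} + b_o)$$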
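
The two-layer forward pass can be written directly from these equations. This is a minimal sketch in plain Python with the sigmoid as ƒ; the helper names `dense` and `forward` and all weight values are made up for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def dense(x, W, b, activation):
    """One layer: activation(xW + b), with x 1-by-m, W m-by-n, b 1-by-n."""
    return [activation(sum(xi * W[i][j] for i, xi in enumerate(x)) + b[j])
            for j in range(len(b))]


def forward(x, W_ih, b_h, W_ho, b_o):
    """y = f(f(x W_ih + b_h) W_ho + b_o)."""
    h = dense(x, W_ih, b_h, sigmoid)      # hidden layer
    return dense(h, W_ho, b_o, sigmoid)   # output layer


x = [1.0, 2.0, 3.0]
W_ih = [[0.1, -0.2], [0.3, 0.4], [-0.5, 0.6]]   # 3 inputs, 2 hidden units
b_h = [0.0, 0.1]
W_ho = [[0.7, -0.8], [0.9, 1.0]]                # 2 hidden units, 2 outputs
b_o = [-0.1, 0.2]
y = forward(x, W_ih, b_h, W_ho, b_o)
print(y)
```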

What is CNN?
[Figure adopted from: https://devblogs.nvidia.com/parallelforall/deep-learning-nutshell-core-concepts/]

Convolution (2D)
[Figure adopted from: http://deeplearning.stanford.edu/wiki/index.php/Feature_extraction_using_convolution]
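
Since the figure is not reproduced here, the following sketch shows what a 2D convolution computes: a small kernel slides over the image, and each output value is the sum of the element-wise products under the kernel. As in most deep learning libraries, the kernel is not flipped (strictly speaking, this is cross-correlation). The function name `conv2d_valid` and the example values are illustrative assumptions; no padding and a stride of 1 are assumed.

```python
def conv2d_valid(image, kernel):
    """2D convolution without padding, stride 1, kernel not flipped."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


image = [[1, 1, 1, 0, 0],
         [0, 1, 1, 1, 0],
         [0, 0, 1, 1, 1],
         [0, 0, 1, 1, 0],
         [0, 1, 1, 0, 0]]
kernel = [[1, 0, 1],
          [0, 1, 0],
          [1, 0, 1]]
feature_map = conv2d_valid(image, kernel)   # 5x5 image, 3x3 kernel: 3x3 output
for row in feature_map:
    print(row)
```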

Our approach (detailed): kernels
[Code adapted from: https://github.com/fchollet/keras/blob/master/keras/applications/vgg19.py]
Thin and deep
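
Moving from 2D images to 1D accelerometer signals mostly means that the kernel slides along a single axis (time) while still spanning all channels. The sketch below illustrates this with a 3-axis signal; the function name `conv1d_valid`, the kernel values, and the data are assumptions for illustration and are not taken from the model in the talk.

```python
def conv1d_valid(signal, kernel, bias=0.0):
    """Slide a K-by-C kernel along a T-by-C signal (no padding, stride 1)."""
    K = len(kernel)
    out = []
    for t in range(len(signal) - K + 1):
        acc = bias
        for k in range(K):
            for c in range(len(kernel[k])):
                acc += signal[t + k][c] * kernel[k][c]
        out.append(acc)
    return out


# Five time steps of (x, y, z) acceleration; x grows linearly over time.
signal = [[0.0, 0.0, 1.0],
          [1.0, 0.0, 1.0],
          [2.0, 0.0, 1.0],
          [3.0, 0.0, 1.0],
          [4.0, 0.0, 1.0]]
# A width-3 kernel that measures the change along the x axis only.
kernel = [[-1.0, 0.0, 0.0],
          [0.0, 0.0, 0.0],
          [1.0, 0.0, 0.0]]
out = conv1d_valid(signal, kernel)
print(out)
```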

Our approach (detailed): activation function
[Code figure, annotated: Sigmoid in our model]
More options about activation: https://keras.io/activations/
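
Our approach (detailed): activation function

ReLU:
$$\sigma(y_i) = \begin{cases} y_i, & \text{for } y_i > 0 \\ 0, & \text{otherwise} \end{cases}$$

Sigmoid:
$$\sigma(y_i) = \frac{1}{1 + e^{-y_i}}$$

Softmax:
$$\sigma(y_i) = \frac{e^{y_i}}{\sum_{j} e^{y_j}}$$

[Plots: σ(y_i) and its derivative σ'(y_i) against y_i, for ReLU and sigmoid. The sigmoid passes through 0.5 at y_i = 0 and approaches 1; its derivative peaks at 0.25.]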
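
The three activation functions are short enough to write out by hand. This is a plain-Python sketch rather than the Keras implementation; the max-subtraction inside `softmax` is a common numerical-stability trick that is added here as an assumption and does not change the result.

```python
import math


def relu(y):
    return y if y > 0 else 0.0


def sigmoid(y):
    return 1.0 / (1.0 + math.exp(-y))


def sigmoid_derivative(y):
    s = sigmoid(y)
    return s * (1.0 - s)


def softmax(ys):
    shift = max(ys)                          # avoids overflow in exp
    exps = [math.exp(y - shift) for y in ys]
    total = sum(exps)
    return [e / total for e in exps]


print(relu(-2.0), relu(3.0))
print(sigmoid(0.0), sigmoid_derivative(0.0))
print(softmax([1.0, 2.0, 3.0]))
```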

Our approach (detailed): loss function

True probabilities:
cycling | driving car | jogging | sitting on the train | sleeping | walking
1 | 0 | 0 | 0 | 0 | 0
0 | 0 | 0 | 0 | 1 | 0
0 | 0 | 1 | 0 | 0 | 0
0 | 0 | 0 | 0 | 0 | 1
0 | 0 | 0 | 1 | 0 | 0
0 | 1 | 0 | 0 | 0 | 0

Predicted probabilities:
cycling | driving car | jogging | sitting on the train | sleeping | walking
0.98 | 0.004 | 0.001 | 0.019 | 0.0016 | 0.0114
0.0019 | 0.095 | 0.0044 | 0.052 | 0.96 | 0.051
0.087 | 0.058 | 0.93 | 0.036 | 0.0034 | 0.0047
0.028 | 0.004 | 0.003 | 0.037 | 0.054 | 0.97
0.012 | 0.00084 | 0.0048 | 0.95 | 0.021 | 0.0093
0.003 | 0.94 | 0.00025 | 0.0089 | 0.0082 | 0.076

• categorical_crossentropy: maximizes the values in the green cells (the predicted probabilities of the true classes).
• binary_crossentropy: maximizes the values in the green cells and minimizes the values in the yellow cells (the predicted probabilities of the other classes).

More options about loss functions: https://keras.io/losses/
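
The difference between the two losses shows up when they are computed by hand for the first row of the tables (true class: cycling). The sketch below is a plain-Python version for a single example, not the Keras code; the binary loss is averaged over the classes here, and the variable names are illustrative.

```python
import math


def categorical_crossentropy(y_true, y_pred):
    """Only the predicted probability of the true class contributes."""
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))


def binary_crossentropy(y_true, y_pred):
    """Every class contributes: true class via log(p), others via log(1 - p)."""
    terms = [t * math.log(p) + (1 - t) * math.log(1 - p)
             for t, p in zip(y_true, y_pred)]
    return -sum(terms) / len(terms)


y_true = [1, 0, 0, 0, 0, 0]
y_pred = [0.98, 0.004, 0.001, 0.019, 0.0016, 0.0114]
cce = categorical_crossentropy(y_true, y_pred)
bce = binary_crossentropy(y_true, y_pred)
print(cce, bce)
```

Raising a wrong-class probability (say, 0.019 to 0.5) leaves the categorical loss unchanged but increases the binary loss, which is the "minimizing the other cells" effect described above.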

Our approach (detailed): data pre-processing, streaming
    while not terminate:
        X, Y = get_batch(dir_name, class_names, seg_len=seg_len)
        loss = model.train_on_batch(X, Y)
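
The streaming loop only ever needs one batch at a time, so the full recordings never have to be turned into one large training array. The sketch below illustrates that idea with stand-ins: this `get_batch` draws random segments from in-memory lists rather than from files in `dir_name`, and `StubModel` replaces the Keras model. Both are hypothetical and are not the code used in the talk.

```python
import random


def get_batch(recordings, class_names, seg_len, batch_size=4, rng=random):
    """Draw random segments; recordings maps class name to a 1D list of samples."""
    X, Y = [], []
    for _ in range(batch_size):
        label = rng.randrange(len(class_names))
        signal = recordings[class_names[label]]
        start = rng.randrange(len(signal) - seg_len + 1)
        X.append(signal[start:start + seg_len])
        one_hot = [0] * len(class_names)
        one_hot[label] = 1
        Y.append(one_hot)
    return X, Y


class StubModel:
    """Stand-in for a Keras model; it only counts the batches it receives."""

    def __init__(self):
        self.batches_seen = 0

    def train_on_batch(self, X, Y):
        self.batches_seen += 1
        return 0.0  # placeholder loss


class_names = ["cycling", "walking"]
recordings = {name: [float(i) for i in range(1000)] for name in class_names}
model = StubModel()
rng = random.Random(0)
for step in range(10):  # a fixed number of steps instead of a terminate flag
    X, Y = get_batch(recordings, class_names, seg_len=300, rng=rng)
    loss = model.train_on_batch(X, Y)
print(model.batches_seen, len(X), len(X[0]))
```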

Our approach (detailed): data pre-processing, random shuffling
[Figure: two panels, "Without random shuffling" and "With random shuffling". Red dots represent the data points selected in each batch.]
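
The effect in the figure can be reproduced with indices alone: without shuffling, each batch covers one contiguous stretch of the data, whereas with shuffling the batch members are spread over the whole data set. The helper `batch_indices` and its fixed seed are illustrative assumptions.

```python
import random


def batch_indices(n_samples, batch_size, shuffle, seed=0):
    """Split sample indices into batches, optionally shuffling them first."""
    indices = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(indices)
    return [indices[i:i + batch_size] for i in range(0, n_samples, batch_size)]


sequential = batch_indices(12, 4, shuffle=False)
shuffled = batch_indices(12, 4, shuffle=True)
print("without shuffling:", sequential)
print("with shuffling:   ", shuffled)
```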

Outline
• What is the problem?
• Why is it important?
• How was the problem solved?
• What are the future directions?
    ➢ Get more labeled data
    ➢ Model
    ➢ Learning strategies
    ➢ Applications

Future directions
• Get more labeled data
    • Publicly available data.
    • Crowd-sourcing: take advantage of the crowd to annotate the unlabeled activities.
• Model
    • Implement other CNN models (e.g., GoogLeNet, ResNet, Xception, and MobileNet).
    • Light-weight deep models.
    • Fine-tune hyper-parameters (e.g., number of layers, size of kernels, activation functions, loss functions).
• Learning strategies
    • Active learning.
    • Incremental learning.
• Applications
    • Assistant: computing systems are aware of the activities of the user, such that the computing systems can proactively assist users.

Thank You!
Take-home messages:
• CNNs (e.g., LeNet, AlexNet, VGGNet, GoogLeNet, ResNet, Xception, and MobileNet) for 2D data can be easily modified to process 1D data.
• LISS panel data (rich information, publicly available for research purposes).
• GPU computing; Dutch supercomputer; Google Cloud services.
• Activation functions (ReLU, Softmax, Sigmoid).
• Loss functions (categorical_crossentropy, binary_crossentropy).
• Data pre-processing (streaming, random shuffling, augmentation).