Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... ·...
![Page 1: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/1.jpg)
Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks
Tianfan Xue*, Jiajun Wu*, Katie Bouman, Bill Freeman
NIPS 2016
VGG Reading Group, 24 Feb 2017, Ankush Gupta
![Page 2: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/2.jpg)
Frame 2
Task: future frame prediction
![Page 3: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/3.jpg)
Frame 1
Frame 2
Deterministic neural network
Deterministic predictions fail to model uncertainty
![Page 4: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/4.jpg)
Frame 1
Deterministic neural network
Deterministic predictions fail to model uncertainty
Prediction
What is the problem?
![Page 5: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/5.jpg)
![Page 6: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/6.jpg)
Synthesis network
Input frame
Sampled future frame
Sample different future frames
Outline: Main idea, Network structure, What the network learns, Result
Input random motion vector $z \sim p_z(z)$
![Page 7: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/7.jpg)
Synthesis network
Input frame
Sample different future frames
Input random motion vector $z \sim p_z(z)$
Sampled future frame
![Page 8: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/8.jpg)
Input frame
Another sampled future frame
Segments
Transformed segments
Input random motion vector $z \sim p_z(z)$
Synthesize using different transformations
![Page 9: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/9.jpg)
Sampled future frame
Motion vector $z$
Synthesis network
Input frame
Encoding network
Future frame (ground truth)
Training
![Page 10: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/10.jpg)
Motion vector $z$
Encoding network
Synthesis network
Future frame (prediction)
Training samples (label-free)
Training
Input frame
Future frame (ground truth)
![Page 11: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/11.jpg)
Future frame $I_{\mathrm{pred}}$ (prediction)
Motion vector $z$
Encoding network
Synthesis network
Training
Future frame $I_{\mathrm{gt}}$ (ground truth)
Input frame
Objective function: $\lVert I_{\mathrm{pred}} - I_{\mathrm{gt}} \rVert + D_{\mathrm{KL}}(z \,\|\, \mathcal{N}(0, I))$
Reconstruction loss: $\lVert I_{\mathrm{pred}} - I_{\mathrm{gt}} \rVert$
![Page 12: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/12.jpg)
Future frame $I_{\mathrm{pred}}$ (prediction)
Future frame $I_{\mathrm{gt}}$ (ground truth)
Input frame
Encoding network
Synthesis network
Training
Objective function: $\lVert I_{\mathrm{pred}} - I_{\mathrm{gt}} \rVert + D_{\mathrm{KL}}(z \,\|\, \mathcal{N}(0, I))$
KL-divergence loss: $D_{\mathrm{KL}}(z \,\|\, \mathcal{N}(0, I))$
Motion vector $z$
Variational Autoencoder [Kingma and Welling, 2014]
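The objective on this slide pairs a reconstruction term with a KL term, as in a variational autoencoder. A minimal sketch of such a loss follows. It assumes the encoding network outputs the mean and log-variance of a diagonal Gaussian over the motion vector (the usual VAE parameterization, not stated on the slide) and a squared L2 reconstruction norm (the slide does not say which norm is used). The function names and the `kl_weight` parameter are hypothetical.

```python
import math


def reconstruction_loss(pred, gt):
    """Squared L2 distance between two flattened frames.

    The choice of norm is an assumption; the slide only shows I_pred - I_gt.
    """
    return sum((p - g) ** 2 for p, g in zip(pred, gt))


def kl_to_standard_normal(mean, log_var):
    """Closed-form KL( N(mean, diag(exp(log_var))) || N(0, I) )."""
    return 0.5 * sum(
        m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mean, log_var)
    )


def objective(pred, gt, mean, log_var, kl_weight=1.0):
    """Reconstruction loss plus KL loss.

    kl_weight is a hypothetical knob; 1.0 gives the plain sum on the slide.
    """
    return reconstruction_loss(pred, gt) + kl_weight * kl_to_standard_normal(
        mean, log_var
    )
```

The KL term is zero exactly when the encoder predicts a standard normal, which is what lets the motion vector be drawn from $\mathcal{N}(0, I)$ at test time.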
![Page 13: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/13.jpg)
Future frame $I_{\mathrm{pred}}$ (prediction)
Synthesis network
Testing
Future frame $I_{\mathrm{gt}}$ (ground truth)
Encoding network
Input frame
Input random motion vector $z \sim p_z(z)$
Real output from our network
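At test time the motion vector is drawn at random instead of being produced by the encoding network. The sketch below shows that sampling loop, assuming the prior $p_z$ is the standard normal $\mathcal{N}(0, I)$ used in the KL term. The `synthesize` argument is a hypothetical stand-in for the trained synthesis network, supplied by the caller.

```python
import random


def sample_motion_vectors(num_samples, dim, seed=None):
    """Draw motion vectors z ~ N(0, I), one list of `dim` floats per sample."""
    rng = random.Random(seed)
    return [
        [rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_samples)
    ]


def sample_futures(input_frame, synthesize, num_samples, dim, seed=None):
    """Produce several candidate future frames for one input frame.

    `synthesize(input_frame, z)` is a placeholder for the synthesis network.
    """
    motion_vectors = sample_motion_vectors(num_samples, dim, seed)
    return [synthesize(input_frame, z) for z in motion_vectors]
```

Each call with a different random draw yields a different plausible future for the same input frame.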
![Page 14: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/14.jpg)
Input frame
Future frame
Transform segments
Find segments
Input random motion vector $z$
Synthesize by transforming segments
Image segments
Convolution
![Page 15: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/15.jpg)
$\begin{bmatrix} 0 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
$\begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
Movement can be synthesized through convolution
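The two kernels on this slide can be checked directly: convolving with a kernel whose single 1 is in the center leaves the image unchanged, while moving the 1 to the top-right corner shifts the content up and to the right. The sketch below uses a true convolution (kernel flipped) with zero padding. Many deep learning libraries implement cross-correlation under the name convolution, in which case the shift direction is reversed. The helper name `conv2d_same` is hypothetical.

```python
def conv2d_same(image, kernel):
    """True 2D convolution with zero padding and same-size output."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2  # kernel center
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for a in range(kh):
                for b in range(kw):
                    # Flipped indexing: this is convolution, not correlation.
                    y, x = i - (a - ch), j - (b - cw)
                    if 0 <= y < h and 0 <= x < w:
                        total += kernel[a][b] * image[y][x]
            out[i][j] = total
    return out


identity = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
up_right = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

image = [[0.0] * 5 for _ in range(5)]
image[2][2] = 1.0  # a single bright pixel in the center

unchanged = conv2d_same(image, identity)
shifted = conv2d_same(image, up_right)  # bright pixel moves from (2, 2) to (1, 3)
```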
![Page 16: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/16.jpg)
Image segments
Applying motion to each segment
Motion kernels
The decoding network generates a motion kernel for each corresponding segment
Decoding net
Motion vector $z$
[Brabandere et al. 2016] [Finn et al. 2016]
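Applying a separate motion kernel to each segment can be sketched as a per-segment convolution. The example below convolves each segment map with its own kernel and then sums the results into one frame. The final summation is an assumption made for illustration, since the slides do not say how the transformed segments are combined. The kernels here are written by hand rather than produced by a decoding network, and the function names are hypothetical.

```python
def conv2d_same(image, kernel):
    """True 2D convolution (kernel flipped), zero padding, same-size output."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    ch, cw = kh // 2, kw // 2
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            for a in range(kh):
                for b in range(kw):
                    y, x = i - (a - ch), j - (b - cw)
                    if 0 <= y < h and 0 <= x < w:
                        out[i][j] += kernel[a][b] * image[y][x]
    return out


def cross_convolve(segments, kernels):
    """Convolve each segment with its own kernel, then sum the results.

    Summing is an illustrative choice, not taken from the slides.
    """
    if len(segments) != len(kernels):
        raise ValueError("need exactly one kernel per segment")
    moved = [conv2d_same(s, k) for s, k in zip(segments, kernels)]
    h, w = len(moved[0]), len(moved[0][0])
    return [[sum(m[i][j] for m in moved) for j in range(w)] for i in range(h)]
```

With one kernel that keeps its segment in place and another that shifts its segment, the two segments move independently within the same output frame.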
![Page 17: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/17.jpg)
Motion vector $z$
Input frame
Future frame
Synthesis network
Future frame
What is encoded in the motion vector?
Encoding network
![Page 18: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/18.jpg)
Motion vector $z$
Upward motion when changing this dimension
Each dimension encodes a type of motion
![Page 19: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/19.jpg)
Motion vector $z$
Leg motion when changing this dimension
Each dimension encodes a type of motion
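The per-dimension inspection shown on these slides amounts to varying one coordinate of the motion vector while holding the others fixed, then synthesizing a frame for each variant. A minimal sketch of building those variants is given below. The function name and the choice of sweep values are hypothetical, and the synthesis step itself is left out.

```python
def traverse_dimension(z, dim, values):
    """Return copies of z with coordinate `dim` set to each value in turn."""
    variants = []
    for value in values:
        z_new = list(z)  # copy so the base vector is left untouched
        z_new[dim] = value
        variants.append(z_new)
    return variants


base = [0.0, 0.0, 0.0]
sweep = traverse_dimension(base, dim=1, values=[-2.0, 0.0, 2.0])
```

Feeding each vector in `sweep` to the synthesis network with the same input frame would show what kind of motion that dimension controls.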
![Page 20: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/20.jpg)
• Simulated shapes
• Training samples
Results: toy example
![Page 21: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/21.jpg)
Input
Learned segments
Network automatically detects segments
Triangles
Circles
![Page 22: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/22.jpg)
Input
Sampled next frame
Ground truth distribution
Sample distribution
Network learns the correlation between appearance and motion
![Page 23: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/23.jpg)
Input
Sampled future frames
Results: real-world images
![Page 24: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/24.jpg)
Challenge: large motion
Input
Two sampled future frames
Artifacts appear when motion is large
![Page 25: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/25.jpg)
Mechanical Turk study to assess synthesis quality
Labeled as real:
Baseline (transfer flow): 25.5%
Our method: 31.3%
An ideal synthesis algorithm achieves 50% (synthesized frames indistinguishable from real ones)
![Page 26: Visual Dynamics: Probabilistic Future Frame Synthesis …vgg/rg/slides/vgg_rg_23_feb_2017... · Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks](https://reader031.fdocuments.in/reader031/viewer/2022022609/5b92258d09d3f204338d58a8/html5/thumbnails/26.jpg)
• Sample multiple future frames that are consistent with the input
• Synthesize frames by transforming segments
• Learn a motion representation without supervision
…
Contributions