FlowNet - Learning Optical Flow with...
![Page 1: FlowNet - Learning Optical Flow with CNNscseweb.ucsd.edu/~mkchandraker/classes/CSE291/Winter2018/Lectures/... · What is Motion ... There are two possible answers. 1. First, physical](https://reader035.fdocuments.in/reader035/viewer/2022062413/5bf3bc6909d3f256398cc24c/html5/thumbnails/1.jpg)
FlowNet - Learning Optical Flow with CNNs
Prahal Arora
Local maxima
What is Motion ...
There are two possible answers.
1. First, the physical movement of pixels; motion therefore has to be measured in a physical way.
2. Second, motion is a human percept: it is what we perceive in our brain, something we can sense and communicate.
What do you observe?
Same video in slow motion
What is Optical Flow ...
Apparent motion of objects as perceived by an observer (eye or camera)
Measures the displacement of objects/pixels across 2 consecutive video frames/images
Optical flow
Why is it so important to study optical flow?
● Video compression
● Tracking of objects
● Driving assistance systems
Visualizing optical flow field
● Flow can be visualized using vectors
  ○ visualization quickly becomes unreadable
● Use HSV (Hue Saturation Value) components
  ○ codify the direction using color (Hue)
  ○ codify the magnitude of the movement by color intensity
Visualizing optical flow field
● Stationary object (white)
● Moving object (color = direction)
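The color coding can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the slides: it assumes the flow field is a nested list of (u, v) tuples, maps direction to hue and magnitude to saturation with value fixed at 1, and the names flow_to_rgb and flow_field_to_image are hypothetical.

```python
import colorsys
import math


def flow_to_rgb(u, v, max_magnitude):
    """Map one flow vector (u, v) to an RGB triple with components in [0, 1]."""
    magnitude = math.hypot(u, v)
    # Direction -> hue: angle in [-pi, pi] rescaled to [0, 1];
    # the modulo below folds 1.0 back onto 0.0.
    hue = (math.atan2(v, u) + math.pi) / (2.0 * math.pi)
    # Magnitude -> saturation (clipped to 1), so zero motion comes out white.
    saturation = min(magnitude / max_magnitude, 1.0) if max_magnitude > 0 else 0.0
    return colorsys.hsv_to_rgb(hue % 1.0, saturation, 1.0)


def flow_field_to_image(flow):
    """Convert a 2-D grid of (u, v) tuples into a grid of RGB triples."""
    # Normalise by the largest magnitude in the field (an assumption of this sketch).
    max_magnitude = max(math.hypot(u, v) for row in flow for (u, v) in row)
    return [[flow_to_rgb(u, v, max_magnitude) for (u, v) in row] for row in flow]
```

With this mapping a zero vector always becomes white (1.0, 1.0, 1.0), which matches the stationary-object convention above, while opposite directions receive different hues.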
FlowNet
● Input: 2 images
● Output: Optical Flow Field
● Trained end to end
Background
● Goal: Estimate Optical Flow (Motion) from images
● Related work:
  ○ DeepMatching - Interpolate dense flow fields, preserve image boundary [Revaud 2015]
  ○ DeepFlow - Aggregate features from fine to coarse [Weinzaepfel 2013]
  ○ EpicFlow - Focus on sparse matching (subset of pixels) [Revaud 2015]
  ○ FlowNet - Real-time, fast, and correlated features learning [Fischer 2015]
The contracting part of the network extracts a rich feature representation.
Two architectures are proposed for the contracting part.
In the simpler one, the two input images are stacked together.
Alternatively, the two images are processed separately, and their features are then correlated and processed further.
Correlation layer
- Measures patch similarity in a neighborhood
- Convolves one feature map with the other instead of with a learned filter, so it has no weights
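A minimal sketch of what such a layer computes for one pair of locations. It assumes feature maps stored as nested Python lists indexed f[y][x][channel] and a patch half-width k, and it treats positions outside the map as zeros; the layout, the zero padding, and the name patch_correlation are choices made for this illustration only.

```python
def patch_correlation(f1, f2, x1, y1, x2, y2, k=0):
    """Correlate the (2k+1)x(2k+1) patch of f1 centred at (x1, y1)
    with the patch of f2 centred at (x2, y2).

    f1 and f2 are feature maps indexed as f[y][x][channel] with equal shapes.
    Offsets that fall outside either map contribute zero (sketch assumption).
    """
    height, width = len(f1), len(f1[0])
    total = 0.0
    for dy in range(-k, k + 1):
        for dx in range(-k, k + 1):
            ax, ay = x1 + dx, y1 + dy
            bx, by = x2 + dx, y2 + dy
            inside = (0 <= ax < width and 0 <= ay < height
                      and 0 <= bx < width and 0 <= by < height)
            if inside:
                # Plain dot product over channels: data against data, no weights.
                total += sum(a * b for a, b in zip(f1[ay][ax], f2[by][bx]))
    return total
```

A large value means the two patches have similar features, which is the signal the later layers can use to infer displacement.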
Optimization trick
- Compute correlation only in a neighborhood of size D = 2d + 1
- Reduce total computation from c · w² · h² to c · w · h · D²
Where,
D: Neighborhood size
d: Maximum displacement
w: Width
h: Height
c: Number of channels
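The saving can be made concrete with a small sketch. It assumes single-pixel patches, feature maps stored as nested lists indexed f[y][x][channel], and zeros for out-of-range positions; local_correlation and multiplication_counts are hypothetical names used only here.

```python
def local_correlation(f1, f2, d):
    """For every position in f1, correlate with f2 at displacements in [-d, d]^2.

    Returns out[y][x], a list of D*D scores with D = 2*d + 1, ordered row by
    row over (dy, dx). Positions outside f2 score 0.0 (sketch assumption).
    """
    height, width = len(f1), len(f1[0])
    out = []
    for y in range(height):
        row = []
        for x in range(width):
            scores = []
            for dy in range(-d, d + 1):
                for dx in range(-d, d + 1):
                    yy, xx = y + dy, x + dx
                    if 0 <= yy < height and 0 <= xx < width:
                        scores.append(
                            sum(a * b for a, b in zip(f1[y][x], f2[yy][xx])))
                    else:
                        scores.append(0.0)
            row.append(scores)
        out.append(row)
    return out


def multiplication_counts(c, w, h, d):
    """Return (all-pairs cost, restricted cost) using the formulas above."""
    D = 2 * d + 1
    return c * w ** 2 * h ** 2, c * w * h * D ** 2
```

The restricted cost grows linearly with the number of pixels instead of quadratically, so the gap widens as the feature maps get larger while D stays fixed.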
The expanding part of the network produces high-resolution flow from the coarser features.
It uses upconvolution layers (also called deconvolution or transposed convolution), whose outputs are concatenated with the feature maps from the downsampling (contracting) part.
Post-processing: Variational refinement
● For small motions (first row), the predicted flow is changed dramatically.
● For larger motions (second row), big errors are not corrected, but the flow field is smoothed, resulting in lower EPE.
Loss and performance metric used
● L: Loss / EPE (endpoint error)
● U, V: Ground Truth (horizontal, vertical displacement)
● U’, V’: Flow Estimate (horizontal, vertical displacement)
● N: Total Pixels in Image
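Assuming the usual definition of endpoint error, these symbols combine as the mean over the N pixels of the Euclidean distance between the ground-truth vector (U, V) and the estimate (U', V'). A minimal sketch with flow components stored as nested lists; the function name average_endpoint_error is hypothetical.

```python
import math


def average_endpoint_error(u_true, v_true, u_pred, v_pred):
    """Mean Euclidean distance between ground-truth and predicted flow vectors.

    Each argument is a 2-D grid (list of rows) of the same shape.
    """
    total = 0.0
    count = 0
    for ut_row, vt_row, up_row, vp_row in zip(u_true, v_true, u_pred, v_pred):
        for ut, vt, up, vp in zip(ut_row, vt_row, up_row, vp_row):
            # Per-pixel endpoint error: length of the difference vector.
            total += math.hypot(ut - up, vt - vp)
            count += 1
    return total / count
```

A perfect prediction gives 0; the error is measured in pixels, so lower is better.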
Datasets
● Middlebury (Moving, rigid scenes)
● KITTI (Street view scenes)
● Sintel (Animation movie)
● Flying Chairs (Background (from Flickr) + chair + transformation)
Flying Chairs dataset
● Random background images from Flickr
● Overlay segmented images of chairs
● Randomly sample affine transformation parameters for the background and the chairs
● Data have little in common with the real world
● Generate arbitrary amounts of samples
● Also, data is augmented
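Because each layer moves under a known affine transformation, the ground-truth flow comes for free. The sketch below shows this for a single layer only; it does not composite chairs over a background, and the parameter ranges and the names sample_affine and affine_flow are hypothetical, not the values used to build the dataset.

```python
import math
import random


def sample_affine(rng, max_shift=10.0, max_angle=0.3, max_log_scale=0.2):
    """Draw illustrative affine parameters: rotation, isotropic scale, translation."""
    angle = rng.uniform(-max_angle, max_angle)
    scale = math.exp(rng.uniform(-max_log_scale, max_log_scale))
    tx = rng.uniform(-max_shift, max_shift)
    ty = rng.uniform(-max_shift, max_shift)
    a, b = scale * math.cos(angle), -scale * math.sin(angle)
    c, d = scale * math.sin(angle), scale * math.cos(angle)
    # Row-major 2x3 matrix: x' = a*x + b*y + tx, y' = c*x + d*y + ty.
    return (a, b, tx, c, d, ty)


def affine_flow(params, width, height):
    """Ground-truth flow induced by moving every pixel with one affine map."""
    a, b, tx, c, d, ty = params
    flow = []
    for y in range(height):
        row = []
        for x in range(width):
            new_x = a * x + b * y + tx
            new_y = c * x + d * y + ty
            # Flow is the displacement from the old to the new position.
            row.append((new_x - x, new_y - y))
        flow.append(row)
    return flow
```

For example, affine_flow(sample_affine(random.Random(0)), 4, 3) yields a 3-row, 4-column grid of (u, v) displacements; a pure translation gives the same vector at every pixel.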
Results
[Results table. Legend: +v = variational refinement, +ft = fine-tuning; entries are marked "FlowNet worse" or "FlowNet better".]
Results (cont.)
Outperforms EpicFlow for small objects with large displacement
Results (cont.)
Pros
● Thorough comparison with EpicFlow and DeepFlow
● Fast: processes images at 5 to 10 frames per second
● Smoother flow (variational refinement)
● No handcrafted methods for aggregation, matching, and interpolation
● Captures fine details and small objects, as well as both small and large displacements
Pro 6.0
● FlowNet 2.0 is even better!
  ○ FlowNet 2.0, a newer version, was released just last year (2017).
  ○ Crisper, more stacking.
  ○ “FlowNet 2.0: Evolution of Optical Flow Estimation with Deep Networks” (http://openaccess.thecvf.com/content_cvpr_2017/papers/Ilg_FlowNet_2.0_Evolution_CVPR_2017_paper.pdf)
  ○ Marginal improvement over FlowNet 2015
Cons
● The correlation layer is not simple: it adds k and D as hyperparameters
● Unrealistic images from a synthetic dataset
● Performance is only competitive; it could not beat some state-of-the-art methods
Demo
Thank you