Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification...
Transcript of Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification...
![Page 1: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/1.jpg)
Video Object Segmentation with Re-identification
Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping ShiPing Luo, Chen Change Loy, Xiaoou Tang
The Chinese University of Hong Kong, SenseTime Group Limited
![Page 2: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/2.jpg)
Semi-supervised Segmentation• Input :Video sequence, ground-truth label of the first frame
• Output : Masks of all instances
![Page 3: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/3.jpg)
Challenge
• Instance Segmentation• Small objects and fine structures• Scale & pose-variations
• Tracking• Frequent occlusions
![Page 4: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/4.jpg)
• Instance Segmentation• Small objects and fine structures• Scale & pose-variations
• Tracking• Frequent occlusions
Mask Propagation Module
Re-identification Module
Short Term
Long Term
Challenge
![Page 5: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/5.jpg)
Video Object Segmentation with Re-identification (VS-ReID)
Iterative InferenceMask Initialization
Proposed Framework
Mask Propagation Module
Re-identification Module
Input Video Sequence
Mask Propagation Module
![Page 6: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/6.jpg)
Video Object Segmentation with Re-identification (VS-ReID)
Iterative InferenceMask Initialization
Mask Propagation Module
Re-identification Module
Input Video Sequence
Mask Propagation Module
Mask Propagation Module
![Page 7: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/7.jpg)
Mask Propagation Module
• Inspired by MSK[1] and LucidTracker[2]
• Use the temporal continuity property of the video sequence
• Propagate the mask from the previous frame to the current frame
[1] Perazzi F, Khoreva A, Benenson R, et al. Learning video object segmentation from static images[C]. CVPR, 2017. [2] Khoreva A, Benenson R, Ilg E, et al. Lucid Data Dreaming for Object Tracking[J]. arXiv preprint arXiv:1703.09554, 2017.
![Page 8: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/8.jpg)
Mask Propagation Module
Image
Optical Flow
RGB Branch
Flow Branch Prediction
Guided Probability Map
![Page 9: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/9.jpg)
Mask Propagation Module
RGB Branch
Flow Branch Prediction
Image
Optical Flow
Guided Probability Map
![Page 10: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/10.jpg)
Mask Propagation Module
Previous Frame Current Frame
Previous Mask
![Page 11: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/11.jpg)
Mask Propagation Module
Previous Frame Current Frame
FlowNet
Warping
Optical Flow
Guided Probability MapPrevious Mask
![Page 12: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/12.jpg)
Mask Propagation Module
Previous Frame
FlowNet
Warping
Previous Mask
Current FrameOptical Flow
Guided Probability Map
![Page 13: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/13.jpg)
Mask Propagation Module
RGB Branch
Flow Branch Prediction
Image
Optical Flow
Guided Probability Map
![Page 14: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/14.jpg)
Mask Propagation Module
Image
Optical Flow
Guided Probability Map
RGB Branch
Flow Branch Prediction
![Page 15: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/15.jpg)
Video Fram
eG
uided Probability M
apPrediction
Warping
Mask Propagation Module
![Page 16: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/16.jpg)
Mask Propagation Module• Deeper Backbone Network
• ResNet101
• RGB-branch• Pre-trained on the MS-COCO and PASCAL VOC dataset
• Augmented ground-truth label as the guided probability map• Fine-tuned on the DAVIS dataset
• Flow-branch• Initialized with RGB-Branch’s weights• Trained on the DAVIS dataset
• Multi-instance• Inference on each instance individually
![Page 17: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/17.jpg)
Mask Propagation Module
![Page 18: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/18.jpg)
Video Object Segmentation with Re-identification (VS-ReID)
Alternating InferenceMask Initialization
Proposed Framework
Input Video Sequence
Mask Propagation Module
Mask Propagation Module
Re-identification Module
![Page 19: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/19.jpg)
Re-identification Module• Detection and re-identification
First Frame Rest Frames
Template Candidate Bounding Boxes
Re-identification
Most Confident Candidate
![Page 20: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/20.jpg)
Re-identification Module• Recover the mask from a bounding box
Template
Most Confident Candidate & Flow
Guided Probability Map
Mask Propagation Module
Recovered Mask
![Page 21: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/21.jpg)
Re-identification Module
• Detection Model• Faster RCNN• Trained on the ImageNet
• Re-identification Model• ‘Identification Net’ in Person Search[1]• For the person category, we directly use the ‘Identification Net’ in Person
Search[1]• Trained on the ImageNet VID
• Retrieve an instance in a single frame each time[1] Xiao T, Li S, Wang B, et al. Joint detection and identification feature learning for person search[C] CVPR. 2017.
![Page 22: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/22.jpg)
Video Object Segmentation with Re-identification (VS-ReID)
Iterative InferenceMask Initialization
Mask Propagation Module
Re-identification Module
Input Video Sequence
Mask Propagation Module
Mask Propagation Module
![Page 23: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/23.jpg)
VS-ReID
1 8 20 37 52 64 82
Input Frames
InitializationM
ask
Prop
agat
ion
• Mask Initialization
![Page 24: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/24.jpg)
1 8 20 37 52 64 82Input Fram
esInitialization
Mas
k Pr
opag
atio
nR
e-Id
entif
icat
ion 1
stRound
21
![Page 25: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/25.jpg)
1 8 20 37 52 64 82Input Fram
es
21
Re-
Iden
tific
atio
n 1stR
ound
![Page 26: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/26.jpg)
1 8 20 37 52 64 82Input Fram
es
21
1stR
oundR
e-Id
entif
icat
ion
Mas
k Pr
opag
atio
n
𝒙"
![Page 27: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/27.jpg)
1 8 20 37 52 64 82Input Fram
es
80
2ndR
oundR
e-Id
entif
icat
ion
Mas
k Pr
opag
atio
n
𝒙"
![Page 28: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/28.jpg)
Performance
J Mean F Mean Global MeanVoigt 54.8 60.5 57.7
Haamo 59.8 63.2 61.5Vanta 61.5 66.2 63.8Apata 65.1 70.6 67.8Ours 67.9 71.9 69.9
(DAVIS 2017 Challenge test-challenge set)
![Page 29: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/29.jpg)
Visualization
![Page 30: Video Object Segmentation with Re-identificationVideo Object Segmentation with Re-identification Xiaoxiao Li, Yuankai Qi, Zhe Wang, Kai Chen, Ziwei Liu, Jianping Shi Ping Luo, Chen](https://reader033.fdocuments.in/reader033/viewer/2022052423/5f0527677e708231d4118a0b/html5/thumbnails/30.jpg)
Thanks!