YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection...
![Page 1: YOLO: You Only Look Once - UniFI · YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon,](https://reader031.fdocuments.in/reader031/viewer/2022022106/5be06bf609d3f2de4d8c2ba2/html5/thumbnails/1.jpg)
YOLO: You Only Look Once
Unified, Real-Time Object Detection
Slides by: Andrea Ferri
For: Computer Vision Reading Group (08/03/16)
Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi
INTRODUCTION
Current state-of-the-art approaches are architected as a multi-stage pipeline:
[Figure: conv layers → RPN (region proposals) → RoI pooling layer → FC layers → class scores / class probabilities]
This complex pipeline means that:
The pipeline is slow
The individual stages are hard to optimize
Each component must be trained separately
WHAT’S NEW? (In the architectural approach.)
Concepts
Developed as a Single Convolutional Network
Reasons Globally About the Entire Image
Learns Generalizable Representations
Easy & Fast
Detection as a Single Regression Problem
Unified Detection
Divide the image into an S×S grid.
If the center of an object falls into a grid cell, that cell is responsible for detecting the object.
Each grid cell predicts:
B bounding boxes;
B confidence scores, one per box, defined as Pr(Object) * IOU;
If no object exists in the cell, the confidence should be zero; otherwise it should equal the IOU between the predicted box and the ground truth box.
C conditional class probabilities, Pr(Class_i | Object).
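The grid assignment and the size of the prediction can be sketched in a few lines of Python. This is only an illustration, not the authors' code: the function names are hypothetical, and the values S=7, B=2, C=20, the 448×448 input and the five numbers per box (x, y, w, h, confidence) are taken from the PASCAL VOC setup in the paper rather than from these slides.

```python
def responsible_cell(cx, cy, img_w, img_h, S):
    """Return (row, col) of the grid cell that contains the object center.

    cx, cy are the center coordinates in pixels. The min() keeps a center
    lying exactly on the right or bottom image border inside the grid.
    """
    col = min(int(cx / img_w * S), S - 1)
    row = min(int(cy / img_h * S), S - 1)
    return row, col


def output_size(S, B, C):
    """Number of values predicted for one image.

    Assumes each box carries 5 numbers (x, y, w, h, confidence), as in the
    paper, and each cell carries C conditional class probabilities.
    """
    return S * S * (B * 5 + C)


# Assumed PASCAL VOC configuration from the paper: S=7, B=2, C=20.
S, B, C = 7, 2, 20
print(responsible_cell(cx=300, cy=120, img_w=448, img_h=448, S=S))  # (1, 4)
print(output_size(S, B, C))  # 1470, i.e. a 7 x 7 x 30 tensor
```

Only the cell returned by `responsible_cell` is trained to detect that object; the other cells are not penalized for missing it.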
We obtain the class-specific confidence score for each box as:
Pr(Class_i | Object) * Pr(Object) * IOU = Pr(Class_i) * IOU
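As a rough sketch of this product for a single grid cell, the snippet below multiplies each box confidence by the cell's conditional class probabilities. The function name and the numbers are made up for illustration; they do not come from the slides or the paper.

```python
def class_specific_scores(box_confidences, class_probs):
    """Combine per-box confidences with per-cell class probabilities.

    box_confidences: B values, each Pr(Object) * IOU for one predicted box.
    class_probs:     C values, Pr(Class_i | Object), shared by the whole cell.
    Returns a B x C nested list of Pr(Class_i) * IOU scores.
    """
    return [[conf * p for p in class_probs] for conf in box_confidences]


# Hypothetical cell with B=2 boxes and C=3 classes.
box_confidences = [0.8, 0.1]
class_probs = [0.7, 0.2, 0.1]

for b, row in enumerate(class_specific_scores(box_confidences, class_probs)):
    print(b, [f"{score:.2f}" for score in row])
```

Because the class probabilities are predicted once per cell, every box in the cell shares the same class ranking; only the scale of the scores differs between boxes.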
Design
Loss-Function
Limitations
Struggles with small objects.
Struggles with objects in new or unusual aspect ratios.
The loss function treats errors the same in small and large bounding boxes.
The loss function only approximates detection performance.
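To see why treating errors equally across box sizes is a limitation, consider the same 5-pixel localization error applied to a small and a large box. The sketch below uses a plain IOU helper written for this example; the box sizes and the shift are arbitrary illustrative values.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def shifted(box, dx):
    """Return the box moved dx pixels to the right."""
    return (box[0] + dx, box[1], box[2] + dx, box[3])


small = (0, 0, 20, 20)    # 20 x 20 pixel ground-truth box
large = (0, 0, 200, 200)  # 200 x 200 pixel ground-truth box

# The same absolute error of 5 pixels in both cases.
print(f"small box IOU: {iou(small, shifted(small, 5)):.3f}")  # 0.600
print(f"large box IOU: {iou(large, shifted(large, 5)):.3f}")  # 0.951
```

The identical coordinate error costs the small box far more IOU than the large one, while a loss that weights both errors equally does not reflect that difference.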
EXPERIMENTS (How does it perform?)
General Comparison
Fast R-CNN & YOLO
Fast R-CNN & YOLO
Using YOLO's accuracy on large objects to correct detection mistakes made by Fast R-CNN:
Fast R-CNN & YOLO
SUMMARY (Why it is an interesting approach.)
Pros
The fastest general-purpose object detector in the literature.
Trained on a loss function that directly corresponds to detection performance.
The entire model is trained jointly.
Runs detection at 45 fps or more.
References
• Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi. You Only Look Once: Unified, Real-Time Object Detection.
QUESTIONS?
THANKS !!!