You only look once: Unified, real-time object detection (UPC Reading Group)

21
YOLO: You Only Look Once Unified Real-Time Object Detection Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16) Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi [Website ] [Paper ] [arXiv ] [Reviews ]

Transcript of You only look once: Unified, real-time object detection (UPC Reading Group)

Page 1: You only look once: Unified, real-time object detection (UPC Reading Group)

YOLO: You Only Look

OnceUnified Real-Time Object

Detection

Slides by: Andrea Ferri For: Computer Vision Reading Group (08/03/16)

Joseph Redmon, Santosh Divvala, Ross Girshick, Ali Farhadi[Website] [Paper] [arXiv] [Reviews]

Page 2: You only look once: Unified, real-time object detection (UPC Reading Group)

INTRODUCTION

Page 3: You only look once: Unified, real-time object detection (UPC Reading Group)

Nowadays State of the Art approach, are so architected:

Conv Layer 5Co

nv

laye

rs

RPN RPN Proposals

RPN Proposals

Class probabilities

RoI pooling layerFC layersClass scores

Page 4: You only look once: Unified, real-time object detection (UPC Reading Group)

This complex pipeline means that:

Slow Pipeline

Single Pipelines Hard to Optimize

Need Parallel Training for Components

Page 5: You only look once: Unified, real-time object detection (UPC Reading Group)

WHAT’S NEW?(In the architecture approach.)

Page 6: You only look once: Unified, real-time object detection (UPC Reading Group)

Developed as Single Convolutional Network

Reason Globally on the Entire Image

Learns Generalizable Representations

Easy & Fast

Detection as Single Regression Problem

Concepts

Page 7: You only look once: Unified, real-time object detection (UPC Reading Group)

Unified Detection

Page 8: You only look once: Unified, real-time object detection (UPC Reading Group)

Divide the image into a SxS grid.If the center of an object fall into a grid cell, it will be the responsible for the object.

Each grid cell predict:B bounding boxes;B confidence scores as C=Pr(Obj)*IOU;

Confidence Prediction is obtained as IOU of predicted box and any ground truth box.

C cond. Class prob. as P=Pr(|Object);

Page 9: You only look once: Unified, real-time object detection (UPC Reading Group)

We obtain the class-specific confidence score as:

Pr(|Object)*Pr(Object)*IOU=

Pr()*IOU

Page 10: You only look once: Unified, real-time object detection (UPC Reading Group)

Design

Page 11: You only look once: Unified, real-time object detection (UPC Reading Group)

Loss-Function

Page 12: You only look once: Unified, real-time object detection (UPC Reading Group)

LimitationsStruggle with Small Object.

Loss function threats errors in different boxes ratio at the same.

Struggle with Different aspects and ratios of objects.Loss function is an approximation.

Page 13: You only look once: Unified, real-time object detection (UPC Reading Group)

EXPERIMENTS(How performs?.)

Page 14: You only look once: Unified, real-time object detection (UPC Reading Group)

General Comparison

Page 15: You only look once: Unified, real-time object detection (UPC Reading Group)

Fast R-CNN & YOLO

Page 16: You only look once: Unified, real-time object detection (UPC Reading Group)

Fast R-CNN & YOLOUsing YOLO accuracy for Big object to avoid detection mistakes into Fast R-CNN:

Page 17: You only look once: Unified, real-time object detection (UPC Reading Group)

Fast R-CNN & YOLO

Page 18: You only look once: Unified, real-time object detection (UPC Reading Group)

SUMMARY(Why is an interesting approach.)

Page 19: You only look once: Unified, real-time object detection (UPC Reading Group)

The fastest general-purpose object detector in the literature.

Trained on a loss function that directly corresponds to detection performance.The entire model is trained jointly.

At least detection at 45fps.

Pros

Page 20: You only look once: Unified, real-time object detection (UPC Reading Group)

• You Only Look Once: Unified, Real-Time Object Detection,Joseph Redmon, Santosh Divvala, Ross Girshick, Ali

Farhadi.

References

Page 21: You only look once: Unified, real-time object detection (UPC Reading Group)

QUESTIONS?

THANKS !!!