Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.
![Page 1: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/1.jpg)
Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection
CVPR 2013 POSTER
![Page 2: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/2.jpg)
Outline
Introduction
Method
Results
Discussion
![Page 3: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/3.jpg)
Introduction
Figure 1. Examples of sketch tokens learned from hand drawn sketches represented using their mean contour structure. Notice the variety and richness of the sketch tokens.
![Page 4: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/4.jpg)
Introduction
We propose a novel approach to both learning and detecting local edge-based mid-level features.
We typically utilize a few hundred tokens, which capture a majority of the commonly occurring edge structures.
The approach is efficient: it can compute per-pixel token labelings in about one second per image.
![Page 5: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/5.jpg)
Method
1. Defining sketch token classes
2. Detecting sketch tokens
![Page 6: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/6.jpg)
Defining sketch token classes
The sketch token classes include straight lines, T-junctions, Y-junctions, corners, curves, parallel lines, etc.
![Page 7: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/7.jpg)
Detecting sketch tokens
1. Feature extraction
2. Classification
![Page 8: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/8.jpg)
Feature extraction
Two types of features are then employed:
1. Features directly indexing into the channels
2. Self-similarity features
![Page 9: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/9.jpg)
1. Features directly indexing into the channels:
The channels are composed of color, gradient, and oriented gradient information in a patch extracted from a color image.
Pixels in the resulting channels serve as the first type of feature for our classifier.
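As a rough illustration of the channel features, the sketch below builds gradient-magnitude and oriented-gradient channels for a small patch and flattens every channel pixel into one feature vector. It is a simplified assumption, not the poster's implementation: it works on a single grayscale patch rather than a color image, and the function names, the central-difference gradient, and the choice of `n_orient=4` orientation bins are all hypothetical.

```python
import math

def gradient_channels(patch, n_orient=4):
    """Return [intensity, gradient magnitude, n_orient oriented-gradient channels].

    `patch` is a list of rows of numbers (grayscale; an illustrative simplification).
    """
    h, w = len(patch), len(patch[0])
    mag = [[0.0] * w for _ in range(h)]
    orient = [[[0.0] * w for _ in range(h)] for _ in range(n_orient)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the patch border.
            gx = (patch[y][min(x + 1, w - 1)] - patch[y][max(x - 1, 0)]) / 2.0
            gy = (patch[min(y + 1, h - 1)][x] - patch[max(y - 1, 0)][x]) / 2.0
            m = math.hypot(gx, gy)
            mag[y][x] = m
            # Unsigned orientation in [0, pi), quantised into n_orient bins.
            theta = math.atan2(gy, gx) % math.pi
            b = min(int(theta / math.pi * n_orient), n_orient - 1)
            orient[b][y][x] = m
    intensity = [[float(v) for v in row] for row in patch]
    return [intensity, mag] + orient

def channel_features(channels):
    """Flatten every pixel of every channel into a single feature vector."""
    return [v for ch in channels for row in ch for v in row]

# Usage: a 4x4 patch containing a vertical step edge.
patch = [[0, 0, 1, 1] for _ in range(4)]
channels = gradient_channels(patch)
features = channel_features(channels)
```

Each entry of `features` corresponds to one pixel of one channel, which is what "directly indexing into the channels" amounts to in this sketch.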
![Page 10: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/10.jpg)
2. Self-similarity features:
The self-similarity features capture the portions of an image patch that contain similar textures based on color or gradient information.
![Page 11: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/11.jpg)
For channel $k$ and grid cells $i$ and $j$, we define the self-similarity feature $f_{ijk}$ as:

$$f_{ijk} = s_{jk} - s_{ik}$$

where $s_{jk}$ is the sum of grid cell $j$ in channel $k$. An illustration of self-similarity features is shown in Fig. 3.
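The definition above can be sketched for a single channel as follows. This is a minimal illustration under stated assumptions: the patch is split into an `m` by `m` grid of roughly equal cells, one feature is emitted per unordered cell pair with i < j, and the function names are hypothetical rather than taken from the poster.

```python
from itertools import combinations

def grid_cell_sums(channel, m):
    """Sum the values of `channel` (a list of rows) inside each cell of an m x m grid."""
    h, w = len(channel), len(channel[0])
    sums = []
    for gy in range(m):
        for gx in range(m):
            y0, y1 = gy * h // m, (gy + 1) * h // m
            x0, x1 = gx * w // m, (gx + 1) * w // m
            sums.append(sum(channel[y][x]
                            for y in range(y0, y1)
                            for x in range(x0, x1)))
    return sums

def self_similarity(channel, m):
    """Return f_ij = s_j - s_i for every pair of grid cells i < j in one channel."""
    s = grid_cell_sums(channel, m)
    return [s[j] - s[i] for i, j in combinations(range(len(s)), 2)]

# Usage: a 4x4 channel split into a 2x2 grid.
channel = [[1, 1, 2, 2],
           [1, 1, 2, 2],
           [3, 3, 4, 4],
           [3, 3, 4, 4]]
print(self_similarity(channel, 2))  # [4, 8, 12, 4, 8, 4]
```

With an `m` by `m` grid there are m² cells and therefore m²(m² − 1)/2 features per channel; a 5 by 5 grid, for example, gives 300.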
![Page 12: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/12.jpg)
![Page 13: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/13.jpg)
Classification
Two considerations must be taken into account when choosing a classifier for labeling sketch tokens in image patches.
First, every pixel in the image must be labeled, so the classifier must be efficient.
Second, the number of potential classes for each patch runs into the hundreds.
![Page 14: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/14.jpg)
![Page 15: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/15.jpg)
![Page 16: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/16.jpg)
Results
1. Contour detection
2. Object detection
![Page 17: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/17.jpg)
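Contour detection
If $t_{ij}$ is the probability of patch $x_i$ belonging to token $j$, and $t_{i0}$ is the probability of belonging to the “no contour” class, the estimated probability of the patch’s center containing a contour is:

$$e_i = \sum_{j} t_{ij} = 1 - t_{i0}$$

where the sum runs over the token classes only, excluding the “no contour” class $j = 0$.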
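A small sketch of this computation for one patch, assuming the classifier returns a normalized probability vector whose entry 0 is the "no contour" class. The function name and the input layout are illustrative assumptions.

```python
def contour_probability(token_probs, tol=1e-6):
    """Return e_i = 1 - t_i0 for one patch.

    token_probs[0] is the "no contour" probability t_i0 and token_probs[1:]
    are the token-class probabilities t_ij (layout assumed for illustration).
    """
    if abs(sum(token_probs) - 1.0) > tol:
        raise ValueError("token probabilities must sum to 1")
    return 1.0 - token_probs[0]

# Usage: "no contour" has probability 0.7, two token classes share the rest.
probs = [0.7, 0.1, 0.2]
e = contour_probability(probs)
```

Because the probabilities sum to one, `e` equals the sum of the token-class entries `probs[1:]` up to floating-point rounding, which is the identity stated in the equation.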
![Page 18: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/18.jpg)
Contour detection results
We test our contour detector on the popular Berkeley Segmentation Dataset and Benchmark (BSDS500).
![Page 19: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/19.jpg)
![Page 20: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/20.jpg)
Object detection
1. INRIA pedestrian
2. PASCAL VOC 2007
![Page 21: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/21.jpg)
1. INRIA pedestrian:
For pedestrian detection we use an improved implementation of Dollár et al. that utilizes multiple image channels as features for a boosted detector.
![Page 22: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/22.jpg)
![Page 23: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/23.jpg)
![Page 24: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/24.jpg)
2. PASCAL VOC 2007:
Our final set of results uses the PASCAL VOC 2007 dataset. The dataset contains real-world images with 20 labeled object categories such as people, dogs, chairs, etc.
![Page 25: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/25.jpg)
![Page 26: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/26.jpg)
![Page 27: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/27.jpg)
![Page 28: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/28.jpg)
![Page 29: Sketch Tokens: A Learned Mid-level Representation for Contour and Object Detection CVPR2013 POSTER.](https://reader035.fdocuments.in/reader035/viewer/2022081515/56649d1c5503460f949f1989/html5/thumbnails/29.jpg)
Discussion
Discovering new sets of observable mid-level information that may be used for feature learning is an interesting and open question.
We’ve explored several tasks in this paper, but other tasks may also benefit.