OpenCV Tutorial

CS221 Artificial Intelligence: Principles & Techniques

Challenge Problem: Object Recognition and Tracking

Overview

- Challenge problem
  - Problem statement
  - Source code overview
  - Sliding-window detectors and Haar features
  - Milestone requirements
- OpenCV tutorial
  - Introduction and installation
  - Code samples
- Object recognition tips and tricks

Challenge Problem

Challenge Problem: Important Dates

- team
- milestone
- final submission

Source Code Overview

- classifier.cpp: Defines the CClassifier class. You are free to modify this file, but do not modify the interface to the loadState() and run() methods.
- CXMLParser.cpp: Class for parsing XML files. Used for replaying ground truth object labels.
- objects.cpp: Contains the data structures for annotated objects. Do not modify this file.
- replay.cpp: Contains code for replaying object labels (such as ground truth labels). Do not modify this file.
- test.cpp: Contains main() code for testing classifiers on videos. Do not modify this file.
- train.cpp: Contains training starter code. You are free to modify everything in this file.
- utils.cpp: Contains useful utility functions.

Command-line Options

train [<options>] <directory>

    <directory>      root directory containing (subdirectories of) all the training images
    -c <filename>    writes learned parameters to a file after training using CClassifier::saveState()
    -h               provides help
    -v               gives verbose output

test [<options>] <AVI filename>

    <AVI filename>   name of the video you want to test on (e.g. “easy.avi”)
    -c <filename>    configures the classifier with parameters from a file using CClassifier::loadState()
    -g <filename>    displays ground truth labels from an XML file
    -h               provides help
    -o <filename>    saves classifications to an XML file (same format as -g)
    -v               gives verbose output
    -x               disables display of the video (if you don’t have X-windows)

Training File Lists

    typedef struct _TTrainingFile {
        std::string filename;  // full path to image file
        std::string label;     // subdirectory name
    } TTrainingFile;

    typedef struct _TTrainingFileList {
        std::vector<TTrainingFile> files;  // list of files
        std::vector<std::string> classes;  // list of classes (subdirectories)
    } TTrainingFileList;

data/

mug/

other/

code/

CObject Class

    class CObject {
    public:
        CvRect rect;        // object's bounding box (x, y, width, height)
        std::string label;  // object's class

    public:
        // constructors
        CObject();
        CObject(const CObject&);
        CObject(const CvRect&, const std::string&);

        // destructor
        virtual ~CObject();

        // helper functions
        void writeAsXML(std::ostream&);
        void draw(IplImage *, CvScalar, CvFont *);
        CvRect intersect(const CObject&);
        int overlap(const CObject&);

        // operators
        CObject& operator=(const CObject&);
    };
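CObject declares intersect() and overlap(), but the slides do not show their bodies. The following is a minimal sketch of the rectangle arithmetic such helpers typically rely on. It assumes that overlap means the intersection area in pixels, which the slides do not confirm. Rect, intersectRects and overlapArea are hypothetical names used only for illustration; Rect stands in for CvRect so the sketch compiles without OpenCV.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for CvRect: top-left corner plus size.
struct Rect {
    int x, y, width, height;
};

// Intersection of two rectangles. If they do not overlap, the returned
// rectangle has zero width and/or zero height.
Rect intersectRects(const Rect &a, const Rect &b)
{
    assert(a.width >= 0 && a.height >= 0 && b.width >= 0 && b.height >= 0);
    const int x0 = std::max(a.x, b.x);
    const int y0 = std::max(a.y, b.y);
    const int x1 = std::min(a.x + a.width, b.x + b.width);
    const int y1 = std::min(a.y + a.height, b.y + b.height);
    Rect r = { x0, y0, std::max(0, x1 - x0), std::max(0, y1 - y0) };
    return r;
}

// Assumption: overlap is measured as the intersection area in pixels.
int overlapArea(const Rect &a, const Rect &b)
{
    const Rect r = intersectRects(a, b);
    return r.width * r.height;
}
```

A measure like this is useful when comparing a detection against a ground truth label: a large overlap relative to the two box areas suggests the detection is a hit.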

CClassifier Class

    class CClassifier {
    protected:
        CvRNG rng;
        CvMat *parameters;
        // TO DO: ADD YOUR MEMBER VARIABLES HERE

    public:
        // constructors and destructors
        CClassifier();
        virtual ~CClassifier();

        // load and save classifier configuration
        virtual bool loadState(const char *);
        virtual bool saveState(const char *);

        // run the classifier over a single frame
        virtual bool run(const IplImage *, CObjectList *);

        // train the classifier using given set of files
        virtual bool train(TTrainingFileList&);

    protected:
        // TO DO: ADD YOUR MEMBER FUNCTIONS HERE
    };

Sliding-window Object Detectors

e.g. task: find all coffee cups

Haar Features

- Compute difference of intensity over image regions
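To make "difference of intensity over image regions" concrete, here is a minimal sketch of one common Haar feature: the sum of the left half of a window minus the sum of the right half. It assumes a grayscale image stored row-major in a std::vector, and it sums pixels directly rather than using an integral image. The names regionSum and haarLeftRight are hypothetical, and the feature list supplied with the starter code may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Sum of intensities in the rectangle [x, x+w) by [y, y+h) of a
// row-major grayscale image with the given width.
long regionSum(const std::vector<unsigned char> &img, int imgWidth,
               int x, int y, int w, int h)
{
    assert(imgWidth > 0 && img.size() % static_cast<std::size_t>(imgWidth) == 0);
    const int imgHeight = static_cast<int>(img.size()) / imgWidth;
    assert(x >= 0 && y >= 0 && w >= 0 && h >= 0);
    assert(x + w <= imgWidth && y + h <= imgHeight);
    long sum = 0;
    for (int row = y; row < y + h; row++)
        for (int col = x; col < x + w; col++)
            sum += img[static_cast<std::size_t>(row) * imgWidth + col];
    return sum;
}

// Two-rectangle Haar feature: left half minus right half of the window.
// The window width must be even so both halves have the same area.
long haarLeftRight(const std::vector<unsigned char> &img, int imgWidth,
                   int x, int y, int w, int h)
{
    assert(w % 2 == 0);
    const int half = w / 2;
    return regionSum(img, imgWidth, x, y, half, h)
         - regionSum(img, imgWidth, x + half, y, half, h);
}
```

A strong positive or negative response indicates a vertical edge in the window; a uniform window gives zero.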

Milestone Requirements

- Build a decision tree classifier for the “mug” class using Haar features
  - Load positive and negative training images
  - Convert to grayscale
  - Resize to 64-by-64
  - Extract (given list of) Haar features
  - Train decision tree
- Implement runtime code to run classifier over all scales (you can assume height = width for the milestone) and shifts (in increments of 8 pixels) within each video frame
- Remember: after the milestone you are free to use whatever features and classifiers you like
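The runtime requirement above (all scales and shifts within a frame) can be sketched as a loop that enumerates square windows. This is only an illustration under stated assumptions: the slides fix the shift at 8 pixels and allow height = width, but they do not say how scales are sampled, so the multiplicative scaleStep parameter and the minimum window size are assumptions. Window and enumerateWindows are hypothetical names.

```cpp
#include <cassert>
#include <vector>

// A square candidate window: top-left corner and side length.
struct Window {
    int x, y, size;
};

// Enumerate square windows over all scales and shifts inside a frame.
// Sizes start at minSize and grow by scaleStep (an assumed scheme) until
// the window no longer fits; positions advance by `shift` pixels.
std::vector<Window> enumerateWindows(int frameWidth, int frameHeight,
                                     int minSize, double scaleStep, int shift)
{
    assert(minSize > 0 && shift > 0 && scaleStep > 1.0);
    std::vector<Window> windows;
    for (double s = minSize; s <= frameWidth && s <= frameHeight; s *= scaleStep) {
        const int size = static_cast<int>(s);
        for (int y = 0; y + size <= frameHeight; y += shift) {
            for (int x = 0; x + size <= frameWidth; x += shift) {
                Window w = { x, y, size };
                windows.push_back(w);
            }
        }
    }
    return windows;
}
```

Each window would then be clipped from the frame, resized to 64-by-64, converted to features, and passed to the classifier. The number of windows grows quickly with frame size, which is one reason integral images matter for speed.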

What is OpenCV?

- The Open Source Computer Vision Library is a collection of algorithms and sample code for various computer vision problems:
  - libcxcore: core data structures and linear algebra library
  - libcv: computer vision library
  - libhighgui: media and graphics I/O handling
  - libml: machine learning library (decision trees, boosting, neural networks)
- Originally developed by Intel; now supported by Willow Garage
- Wiki has lots of information and API documentation: http://opencvlibrary.sourceforge.net/

Installing OpenCV (Linux)

Install ffmpeg:

    svn checkout svn://svn.mplayerhq.hu/ffmpeg/trunk ffmpeg
    cd ffmpeg
    ./configure --prefix=<home> --enable-shared
    make && make install

Install opencv (download version 1.0.0 from http://sourceforge.net/projects/opencvlibrary/):

    tar -xzvf opencv-1.0.0.tar.gz
    cd opencv-1.0.0
    ./configure --prefix=<home> \
        CPPFLAGS="-I<home>/include" \
        LDFLAGS="-L<home>/lib"
    make && make install

Example 1: Loading and Displaying Images

    #include "cv.h"
    #include "cxcore.h"
    #include "highgui.h"

    #define WINDOW_NAME "MyWindow"

    int main(int argc, char *argv[])
    {
        IplImage *image;

        cvNamedWindow(WINDOW_NAME, CV_WINDOW_AUTOSIZE);
        for (int i = 1; i < argc; i++) {        // argv[argc] is NULL, so stop before it
            image = cvLoadImage(argv[i], 0);    // load from file (0 = load as grayscale)
            if (image == NULL) continue;        // skip files that fail to load
            cvShowImage(WINDOW_NAME, image);    // display on screen
            cvWaitKey(0);                       // wait for key press
            cvReleaseImage(&image);             // free memory
        }
        cvDestroyWindow(WINDOW_NAME);

        return 0;
    }

Example 2: Converting from Color to Grayscale

    IplImage *image;

    // acquire color image somehow (OpenCV stores color channels in BGR order)...

    // allocate memory for grayscale image
    IplImage *gray = cvCreateImage(
        cvGetSize(image),   // same size as original image
        IPL_DEPTH_8U,       // data type (8-bit unsigned)
        1);                 // grayscale has one channel

    // color convert the image (source, destination)
    cvCvtColor(image, gray, CV_BGR2GRAY);

    // do something with grayscale image...

    // free memory used by images
    cvReleaseImage(&gray);
    cvReleaseImage(&image);

Example 3: Resizing an Image

    IplImage *image;

    // acquire image somehow...

    // allocate memory for resized image
    IplImage *resizedImage = cvCreateImage(
        cvSize(64, 64),     // new size (width, height)
        image->depth,       // data type (e.g. 8-bit unsigned)
        image->nChannels);  // number of planes (e.g. RGB)

    // resize the image (source, destination)
    cvResize(image, resizedImage);

    // do something with resized image...

    // free memory used by images
    cvReleaseImage(&resizedImage);
    cvReleaseImage(&image);

Example 4: Clipping a Small Region out of an Image

    IplImage *image;

    ... // acquire image somehow

    // clip out 64-by-64 image patch at (4,8)
    CvRect region = cvRect(4, 8, 64, 64);
    IplImage *clippedImage = cvCreateImage(
        cvSize(region.width, region.height),
        image->depth, image->nChannels);

    cvSetImageROI(image, region);
    cvCopy(image, clippedImage, NULL);  // copies only the ROI of the source
    cvResetImageROI(image);

    ... // do something with clipped region

    // and always free memory
    cvReleaseImage(&clippedImage);
    cvReleaseImage(&image);

Example 5: Computing an Integral Image

    IplImage *image;
    CvRect r;

    ...

    // compute integral image on grayscale image
    // (the integral image is one pixel wider and taller than the source)
    IplImage *iImage = cvCreateImage(
        cvSize(image->width + 1, image->height + 1),
        IPL_DEPTH_32S, 1);

    cvIntegral(image, iImage);

    ...

    // compute sum of pixels in area r (could also use CV_IMAGE_ELEM)
    double value;
    value = cvGetReal2D(iImage, r.y, r.x);
    value += cvGetReal2D(iImage, r.y + r.height, r.x + r.width);
    value -= cvGetReal2D(iImage, r.y, r.x + r.width);
    value -= cvGetReal2D(iImage, r.y + r.height, r.x);

[Figure: image regions labeled A, B, C, D]
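To show what the integral image computes and why four lookups are enough for any rectangle sum, here is a minimal sketch in plain C++ with no OpenCV dependency. It uses the same layout as Example 5: the table is one pixel wider and taller than the image, with a zero first row and column. The names integralImage and rectSum are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Build a (height+1) x (width+1) integral image, stored row-major.
// Entry (y, x) holds the sum of all pixels above and to the left of (y, x),
// so the first row and first column are zero.
std::vector<long> integralImage(const std::vector<unsigned char> &img,
                                int width, int height)
{
    assert(width > 0 && height > 0);
    assert(img.size() == static_cast<std::size_t>(width) * height);
    const int W = width + 1;
    std::vector<long> ii(static_cast<std::size_t>(W) * (height + 1), 0);
    for (int y = 0; y < height; y++) {
        long rowSum = 0;  // running sum of the current row
        for (int x = 0; x < width; x++) {
            rowSum += img[y * width + x];
            ii[(y + 1) * W + (x + 1)] = ii[y * W + (x + 1)] + rowSum;
        }
    }
    return ii;
}

// Sum of pixels in the rectangle [x, x+w) by [y, y+h) using four lookups.
// `width` is the width of the original image.
long rectSum(const std::vector<long> &ii, int width, int x, int y, int w, int h)
{
    const int W = width + 1;
    return ii[y * W + x] + ii[(y + h) * W + (x + w)]
         - ii[y * W + (x + w)] - ii[(y + h) * W + x];
}
```

Once the table is built, the cost of a rectangle sum no longer depends on the rectangle's size, which is what makes evaluating many Haar features per window affordable.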

Example 6: Logistic Regression

    // X holds one sample per row; theta is a column vector of weights
    CvMat *logistic(const CvMat *X, const CvMat *theta)
    {
        assert(X->cols == theta->rows);
        CvMat *Y = cvCreateMat(X->rows, 1, CV_32FC1);

        for (int i = 0; i < X->rows; i++) {
            double sigma = 0.0;
            for (int j = 0; j < X->cols; j++) {
                sigma += cvmGet(X, i, j) * cvmGet(theta, j, 0);
            }
            cvmSet(Y, i, 0, 1.0 / (1.0 + exp(-1.0 * sigma)));
        }

        return Y;  // caller must free Y with cvReleaseMat
    }

$$ y^{(i)} = \frac{1}{1 + \exp\left(-\theta^{T} x^{(i)}\right)} $$

Example 7: Boosting (Train)

    CvMat *data = cvCreateMat(…, …, CV_32FC1);
    CvMat *labels = cvCreateMat(…, …, CV_32SC1);

    ... // acquire training data somehow

    // define variable types (one entry per feature, plus one for the label)
    CvMat *varType = cvCreateMat(data->width + 1, 1, CV_8UC1);
    for (int j = 0; j < data->width; j++)
        CV_MAT_ELEM(*varType, unsigned char, j, 0) = CV_VAR_NUMERICAL;
    CV_MAT_ELEM(*varType, unsigned char, data->width, 0) = CV_VAR_CATEGORICAL;

    // train
    CvBoostParams parameters(CvBoost::GENTLE, numRounds, 0.95, numSplits,
        false, NULL);
    parameters.split_criteria = CvBoost::DEFAULT;
    CvBoost *model = new CvBoost();
    model->train(data, CV_ROW_SAMPLE, labels, NULL, NULL, varType, NULL,
        parameters);

    // free memory
    ...

Example 7: Boosting (Test)

    CvBoost *model;
    CvMat *x = cvCreateMat(1, …, CV_32FC1);

    ... // acquire model and test sample somehow

    // allocate memory for weak learner output
    int length = cvSliceLength(CV_WHOLE_SEQ, model->get_weak_predictors());
    CvMat *weakResponses = cvCreateMat(length, 1, CV_32FC1);

    // test (y is prediction, score is log-probability)
    int y = model->predict(x, NULL, weakResponses, CV_WHOLE_SEQ);
    double score = cvSum(weakResponses).val[0];

    // free memory
    cvReleaseMat(&weakResponses);
    ...

Example 8: Optical Flow

    IplImage *bwImg1, *bwImg2;

    ... // acquire images somehow

    // allocate memory for optical flow vectors
    CvMat *dx = cvCreateMat(bwImg1->height, bwImg1->width, CV_32FC1);
    CvMat *dy = cvCreateMat(bwImg1->height, bwImg1->width, CV_32FC1);

    // compute dense Lucas-Kanade optical flow (previous, current)
    // (also see cvCalcOpticalFlowPyrLK for sparse optical flow)
    cvCalcOpticalFlowLK(bwImg1, bwImg2, cvSize(5, 5), dx, dy);

    double deltaX = cvAvg(dx).val[0];  // bulk x-direction motion
    double deltaY = cvAvg(dy).val[0];  // bulk y-direction motion

    // free memory
    cvReleaseMat(&dy);
    cvReleaseMat(&dx);
    cvReleaseImage(&bwImg2);
    cvReleaseImage(&bwImg1);

OpenCV Tips

- Don’t forget to free allocated memory
  - cvReleaseImage, cvReleaseMat
- Try to allocate and free memory outside of loops if possible
- Image size is width-by-height; matrix size is rows-by-columns
- Never set the region of interest (ROI) outside of the image/matrix
  - x ≥ 0, y ≥ 0, x + width ≤ image width, y + height ≤ image height
- Never operate on images of different types (image depth)
  - Make sure you convert first (e.g. RGB to grayscale)
- Check the return values (for NULL)
  - e.g., loading images and allocating images
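The ROI tip above can be enforced with a small guard before calling cvSetImageROI. The following is a minimal sketch; roiFitsImage is a hypothetical helper, not part of OpenCV or the starter code. It treats an ROI that ends exactly on the image border as valid and rejects empty regions.

```cpp
#include <cassert>

// Returns true if the rectangle (x, y, width, height) is non-empty and
// lies entirely inside an image of size imageWidth by imageHeight.
bool roiFitsImage(int x, int y, int width, int height,
                  int imageWidth, int imageHeight)
{
    assert(imageWidth >= 0 && imageHeight >= 0);
    return x >= 0 && y >= 0 && width > 0 && height > 0
        && x + width <= imageWidth && y + height <= imageHeight;
}
```

In a sliding-window loop, a check like this is cheap insurance against off-by-one errors at the right and bottom edges of the frame.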

Object Recognition Tips

- Read some papers for ideas on better features
  - [Viola and Jones, 2001], [Wolf, Serre and Poggio, 2004], [Lowe, 2004], [Dalal and Triggs, 2005], [Torralba, Murphy, and Freeman, 2007]
- The milestone uses a sliding-window based detector, but there are other techniques that you can try
  - template matching, chamfer matching/shape matching, bag-of-features with locality constraints, eigenspace representation, scene context, color
- Try different classifiers and training methods
  - logistic classifiers, support vector machines, boosted classifiers
- Try to normalize your features for intensity variation
- Filter the output of your classifiers and use motion estimation to predict where an object in frame n will be in frame n+1
- Anything that works!
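One way to act on the tip about normalizing for intensity variation is to rescale each patch to zero mean and unit variance before extracting features. The sketch below shows that idea only; it is an assumption about what normalization to use, not a method prescribed by the slides, and normalizePatch is a hypothetical name.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Normalize a grayscale patch to zero mean and unit variance.
// A perfectly flat patch (zero variance) is returned as all zeros.
std::vector<double> normalizePatch(const std::vector<unsigned char> &patch)
{
    assert(!patch.empty());
    const double n = static_cast<double>(patch.size());

    double mean = 0.0;
    for (std::size_t i = 0; i < patch.size(); i++)
        mean += patch[i];
    mean /= n;

    double variance = 0.0;
    for (std::size_t i = 0; i < patch.size(); i++)
        variance += (patch[i] - mean) * (patch[i] - mean);
    variance /= n;

    const double stddev = std::sqrt(variance);
    std::vector<double> out(patch.size(), 0.0);
    if (stddev < 1e-9)
        return out;  // avoid dividing by zero on flat patches
    for (std::size_t i = 0; i < patch.size(); i++)
        out[i] = (patch[i] - mean) / stddev;
    return out;
}
```

After this step, a patch and a brighter or higher-contrast copy of it produce the same values, so features computed on them no longer depend on overall lighting.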

Coding Tips

- Modular, general design
  - allows you to test lots of things
- Use OpenCV and other libraries, e.g., GSL, SVMLight, STAIR Vision Library (SVL)
  - make sure you cite external libraries
- Use source control (SVN or CVS)
- Your code must compile on CS machines

Design Process Tips

- Automate testing early
- Consider how you will avoid overfitting
- Test features without training an entire classifier (e.g., entropy function in Matlab)
- Visualization code usually pays off
- Profile your code and optimize bottlenecks (e.g., use integral images)
- Start early!
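The tip about testing features without training an entire classifier can be sketched as an information-gain check: split the training samples on a single feature threshold and measure how much the label entropy drops. This is a minimal sketch assuming binary labels (1 = object, 0 = other); entropy and informationGain are hypothetical helper names.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Entropy in bits of a set containing `pos` positives out of `total` samples.
double entropy(int pos, int total)
{
    assert(total >= 0 && pos >= 0 && pos <= total);
    if (total == 0 || pos == 0 || pos == total)
        return 0.0;  // empty or pure sets carry no uncertainty
    const double p = static_cast<double>(pos) / total;
    return -(p * std::log(p) + (1.0 - p) * std::log(1.0 - p)) / std::log(2.0);
}

// Information gain from splitting samples on (feature value < threshold).
// labels[i] must be 0 or 1.
double informationGain(const std::vector<double> &values,
                       const std::vector<int> &labels, double threshold)
{
    assert(values.size() == labels.size() && !values.empty());
    const int n = static_cast<int>(values.size());
    int pos = 0, leftN = 0, leftPos = 0;
    for (std::size_t i = 0; i < values.size(); i++) {
        assert(labels[i] == 0 || labels[i] == 1);
        pos += labels[i];
        if (values[i] < threshold) {
            leftN++;
            leftPos += labels[i];
        }
    }
    const int rightN = n - leftN;
    const int rightPos = pos - leftPos;
    return entropy(pos, n)
         - (static_cast<double>(leftN) / n) * entropy(leftPos, leftN)
         - (static_cast<double>(rightN) / n) * entropy(rightPos, rightN);
}
```

A feature whose best threshold yields a gain near zero is unlikely to help a decision tree, so a check like this can prune weak features before any full training run.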

Example Results

- A simple decision tree classifier trained using Haar features is not all that good, so don’t expect brilliant results

State-of-the-art Results (courtesy Ben Sapp)

Milestone Results

Finally…

Good luck!

(and have fun)
