A step-by-step guide to the use of the Intel OpenCV...

OpenCV Tutorial by R. Laganiere http://www.site.uottawa.ca/~laganier/tutorial/opencv+directshow/

1 von 33 11.04.2008 11:43

Programming computer vision applications:

A step-by-step guide to the use of the Intel OpenCV library and the Microsoft DirectShow technology

Robert Laganière, VIVA lab, University of Ottawa.

The objective of this page is to teach you how to use the Intel libraries to build applications in which images or sequences of images have to be processed. The DirectShow technology is also introduced; it is particularly useful for processing image sequences, including sequences captured with PC cameras.

Since this is a beginner's guide, efforts have been made to describe in detail all the steps necessary to obtain the results shown. In addition, all the source code used here has been made available. Note, however, that the goal was to keep these programs as simple and short as possible; as a consequence, the programming style is not always of good quality. In particular, a better adherence to the object-oriented paradigm would have considerably increased the quality of the programming.

The Intel Image Processing Library (IPL) can be found at:

- developer.intel.com/software/products/perflib/ipl/index.htm

However, the IPL is no longer available at this official site. This is not too problematic, since most functionalities are still available through OpenCV (including the IplImage data structure). The home page of the Open Computer Vision library is at:

- www.intel.com/research/mrl/research/opencv/

Finally, to use the DirectShow technology, you must download the Microsoft DirectX SDK, which can be found at:

- www.microsoft.com/windows/directx

The OpenCV beta 2.1 has been used to produce the examples below, with DirectX 8.1 and Visual C++ 6.0 service pack 5 under Windows 2000.

March 28, 2003: Note that the last section has been updated and that the OpenCV beta 3.1 has been used in these last examples.

February 18, 2003: All source codes have been updated to OpenCV beta 3.1 and any reference to the old IPL library has been removed. An additional example using an image iterator has been added.

Your inputs are welcome.

1. Creating a Dialog-based application

All applications presented here will be simple dialog-based applications. This kind of application can easily be created using the MFC application wizard. On your Visual C++ menu bar, select the File|New option. Then start the MFC AppWizard (exe). Choose a dialog-based application and select a name for the application (here it is called cvision). VC++ should create a simple OK/Cancel dialog for you. The class with a name ending in Dlg will contain the member functions that control the widgets of the dialog.

The first task will be to open and display an image. To do this, we will first add a button that will allow us to select the file that contains the image. Drag a button onto the dialog, then right-click on it and select the Properties option; this will allow you to change the caption to Open Image. Once this is done, double-click on the new button and change the corresponding member function name to OnOpen. The dialog now looks like this:



The CFileDialog class is the one to use in order to create a file dialog. The dialog will show up once the following code is added to the OnOpen member function:

    void CCvisionDlg::OnOpen()
    {
        CFileDialog dlg(TRUE, _T("*.bmp"), "",
            OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
            "image files (*.bmp; *.jpg) |*.bmp;*.jpg|"
            "AVI files (*.avi) |*.avi|All Files (*.*)|*.*||",
            NULL);
        char title[]= {"Open Image"};
        dlg.m_ofn.lpstrTitle= title;
        if (dlg.DoModal() == IDOK) {
            CString path= dlg.GetPathName();  // contains the selected filename
        }
    }

Note how the extensions of interest (here .bmp, .jpg and .avi) for the files to be opened are specified using the filter string, which is the fifth argument of the CFileDialog constructor. Now, by clicking on the Open Image button, the following dialog appears:
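The filter string passed to the CFileDialog constructor follows a simple convention: description and pattern alternate, each terminated by a '|', and the list ends with '||'. The following is a minimal sketch, in standard C++ without MFC, that splits such a string into (description, pattern) pairs. The parseFilter function is a hypothetical helper written for this illustration only; it is not part of MFC.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not an MFC function): splits a CFileDialog-style
// filter string "description|pattern|description|pattern||" into pairs.
std::vector<std::pair<std::string, std::string> >
parseFilter(const std::string& filter) {
    std::vector<std::string> fields;
    std::string current;
    for (std::string::size_type k = 0; k < filter.size(); ++k) {
        if (filter[k] == '|') {
            if (current.empty()) break;   // "||" marks the end of the list
            fields.push_back(current);
            current.clear();
        } else {
            current += filter[k];
        }
    }
    // Fields come in (description, pattern) pairs; an unpaired trailing
    // field is ignored.
    std::vector<std::pair<std::string, std::string> > pairs;
    for (std::size_t k = 0; k + 1 < fields.size(); k += 2)
        pairs.push_back(std::make_pair(fields[k], fields[k + 1]));
    assert(pairs.size() * 2 <= fields.size());
    return pairs;
}
```

Applied to the filter used in OnOpen, this yields three entries, the second of which carries the pattern *.avi.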

2. Loading and displaying an image

Now that we have learnt how to select a file, let's load and display the underlying image. The Intel libraries will help us to accomplish this task. In particular, the HighGui component of OpenCV will be used. It contains the functions required to load, save and display images under the Windows environment.

Since we will be using these libraries in all the examples to follow, we will first see how to set up our VC++ projects so that the libraries are linked to our application. Select the Project|Settings… option. A dialog will pop up. Select the C/C++ tab and the category Preprocessor. Add the following directories to additional include directories:

- C:\Program Files\Intel\plsuite\include
- C:\Program Files\Intel\opencv\cv\include
- C:\Program Files\Intel\opencv\otherlibs\highgui

Now select the Link tab, category Input. Add the following directories to additional library path:

- C:\Program Files\Intel\plsuite\lib\msvc
- C:\Program Files\Intel\opencv\lib

Finally, select the category General of the Link tab and add the following libraries to library modules:

    ipl.lib cv.lib highgui.lib

This setup is valid for the current project. It could be a good idea to add all these directories to the global search path of your VC++ so that they are active each time you create a new project. This can be done from the Tools|Options… menu. You then select the Directories tab. The following two screenshots show you the information that should be included there.

Note also that we have included the DirectX directory information (which is, in our case, C:\DXSDK\Lib) that we will use in later examples. This one should always be the first in the list to avoid incompatibilities with other libraries.

With these global settings, only the names of the library modules need to be specified when a new project is created:

Now add the following header file to the project, here called cvapp.h:

    #if !defined IMAGEPROCESSOR
    #define IMAGEPROCESSOR

    #include <stdio.h>
    #include <math.h>
    #include <string.h>
    #include "cv.h"       // include core library interface
    #include "highgui.h"  // include GUI library interface

    class ImageProcessor {

        IplImage* img;    // Declare IPL/OpenCV image pointer

      public:

        ImageProcessor(CString filename, bool display=true) {
            img = cvvLoadImage( filename );  // load image
            if (display) {
                // create a window
                cvvNamedWindow( "Original Image", 1 );
                // display the image on window
                cvvShowImage( "Original Image", img );
            }
        }

        ~ImageProcessor() {
            cvReleaseImage( &img );
        }
    };

    #endif

The function names starting with cvv are HighGui functions. To use the ImageProcessor class in the application, just include the header in the dialog's source file. Once a file has been selected, an ImageProcessor instance can be created as follows:

    void CCvisionDlg::OnOpen()
    {
        CFileDialog dlg(TRUE, _T("*.bmp"), "",
            OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
            "BMP files (*.bmp) |*.bmp|AVI files (*.avi) |*.avi|"
            "All Files (*.*)|*.*||",
            NULL);
        char title[]= {"Open Image"};
        dlg.m_ofn.lpstrTitle= title;
        if (dlg.DoModal() == IDOK) {
            CString path= dlg.GetPathName();
            ImageProcessor ip(path);  // load, create and display
        }
    }

Then, when you select an image, this window should appear:

3. Processing an image

Now let's try to call one of the OpenCV functions. We rewrite the header as follows:

    #if !defined IMAGEPROCESSOR
    #define IMAGEPROCESSOR

    #include <stdio.h>
    #include <math.h>
    #include <string.h>
    #include "cv.h"       // include core library interface
    #include "highgui.h"  // include GUI library interface

    class ImageProcessor {

        IplImage* img;    // Declare IPL/OpenCV image pointer

      public:

        ImageProcessor(CString filename, bool display=true) {
            img = cvvLoadImage( filename );  // load image
            if (display) {
                cvvNamedWindow( "Original Image", 1 );
                cvvShowImage( "Original Image", img );
            }
        }

        void display() {
            cvvNamedWindow( "Resulting Image", 1 );
            cvvShowImage( "Resulting Image", img );
        }

        void execute();

        ~ImageProcessor() {
            cvReleaseImage( &img );
        }
    };

    extern ImageProcessor *proc;

    #endif

and we add a C++ source file, here named cvapp.cpp, that contains the function that does the processing:

    #include "stdafx.h"
    #include "cvapp.h"

    // A global variable
    ImageProcessor *proc = 0;

    // the function that processes the image
    void process(void* img) {
        IplImage* image = reinterpret_cast<IplImage*>(img);
        cvErode( image, image, 0, 2 );
    }

    void ImageProcessor::execute() {
        process(img);
    }



The process function is the one that calls the OpenCV function that does the processing. In this example, the processing consists of a simple morphological erosion (cvErode). Obviously, all the processing could have been done directly inside the execute member function. Also, there is no justification, at this point, for using a void pointer as the parameter of the process function. This has been done just for consistency with the examples to follow, where the process function will become a callback function in the processing of a sequence. Note that, for simplicity, we have added a global variable that points to the ImageProcessor instance that this application uses. Let's now modify our dialog by adding another button, i.e.:

The member functions now become:

    void CCvisionDlg::OnOpen()
    {
        CFileDialog dlg(TRUE, _T("*.bmp"), "",
            OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
            "image files (*.bmp; *.jpg) |*.bmp;*.jpg|"
            "AVI files (*.avi) |*.avi|All Files (*.*)|*.*||",
            NULL);
        char title[]= {"Open Image"};
        dlg.m_ofn.lpstrTitle= title;
        if (dlg.DoModal() == IDOK) {
            CString path= dlg.GetPathName();
            if (proc != 0)
                delete proc;
            proc= new ImageProcessor(path);
        }
    }

    void CCvisionDlg::OnProcess()
    {
        if (proc != 0) {
            // process and display
            proc->execute();
            proc->display();
        }
    }

If you open an image and push the process button, then the result is:
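To make the effect of the erosion more concrete, here is a minimal sketch of what one pass of a 3x3 grey-level erosion computes: each output pixel becomes the minimum of its 3x3 neighborhood. This is an illustration in plain C++ on a row-major buffer, not the OpenCV implementation. The function name erode3x3 and the choice to ignore neighbors outside the image are assumptions of this sketch. The cvErode call in process passes 0 as structuring element (the default 3x3 rectangle) and 2 as the number of iterations, which corresponds to applying such a pass twice.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// One pass of 3x3 grey-level erosion on a row-major, single-channel,
// 8-bit image. Neighbors falling outside the image are ignored
// (an assumption of this sketch).
std::vector<unsigned char> erode3x3(const std::vector<unsigned char>& src,
                                    int width, int height) {
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<unsigned char> dst(src.size());
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            unsigned char m = 255;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    int yy = y + dy, xx = x + dx;
                    if (yy < 0 || yy >= height || xx < 0 || xx >= width)
                        continue;
                    m = std::min(m, src[yy * width + xx]);
                }
            }
            dst[y * width + x] = m;   // minimum over the neighborhood
        }
    }
    return dst;
}
```

A single dark pixel in a bright image therefore grows into a 3x3 dark square after one pass, which is why bright structures shrink in the eroded image.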



Check point #1: source code of the above example.

4. Creating an image and accessing its pixels

In the preceding example, the image has been created from a file. In many applications, it is also useful to create an image from scratch. This can be done using the IPL functions, in which case you must first create a header that specifies the image format. The following two examples show how to create a gray level image and a color image.

    // Creating a gray level image
    IplImage* gray= iplCreateImageHeader(1, 0, IPL_DEPTH_8U,
        "GRAY", "G",
        IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
        width, height,
        NULL, NULL, NULL, NULL);
    iplAllocateImage(gray, 1, 0);

    // Creating a color image
    IplImage* color= iplCreateImageHeader(3, 0, IPL_DEPTH_8U,
        "RGB", "BGR",
        IPL_DATA_ORDER_PIXEL, IPL_ORIGIN_TL, IPL_ALIGN_QWORD,
        width, height,
        NULL, NULL, NULL, NULL);
    iplAllocateImage(color, 1, 0);

The first parameter specifies the number of channels and the second is 0 if there is no alpha channel in the image (which is most often the case in computer vision). The third parameter defines the pixel type. An unsigned 8-bit pixel (IPL_DEPTH_8U) is the common choice, but 2-byte signed integer (IPL_DEPTH_16S) and 4-byte float (IPL_DEPTH_32F) are also very useful. The next parameters specify the color model (basically "GRAY" or "RGB") and the channel sequence (in case of a color image). The data order parameter specifies how the different color channels are ordered. Under IPL the choices are pixel-oriented, i.e. RGBRGBRGB…, or plane-oriented, i.e. RRRR…GGGG…BBBB… The origin is normally at the top left corner (IPL_ORIGIN_TL). For an efficient use of the MMX capabilities of the processor, the line length of an image should be a multiple of 8 bytes. This is guaranteed by choosing the quad-word alignment, each line being padded with dummy pixels if necessary. Finally, the width (number of columns) and the height (number of lines) of the image are specified. The last four parameters are usually NULL.

Once the header is created, memory must be allocated. This is the role of the iplAllocateImage function. An initial value for the pixel data can be specified; this is the last parameter. The middle parameter of this function must be set to 0 if no initialization is required. Do not forget to deallocate the images at the end of the process by calling iplDeallocate(image, IPL_IMAGE_ALL). Note that for floating point images, iplAllocateImageFP and iplDeallocateImageFP must be used instead.

An alternative way to create and allocate an image is to use the OpenCV equivalent function. Here only the size, the pixel depth and the number of channels need to be specified, e.g.:

    IplImage* color = cvCreateImage( cvSize(width,height), IPL_DEPTH_8U, 3);

To deallocate, you can then call cvReleaseImage(&image).

When manipulating images, it is common to sequentially access all pixels of an image. To this end, the iplPutPixel and iplGetPixel functions can be used. You just specify the pixel coordinates and an array containing the values, as follows:

    unsigned char values[3];  // 3 is for color image
    iplGetPixel(image, x, y, values);

But for a more efficient loop, it is possible to directly access the buffer containing the pixels. Caution must however be taken, because the way this loop must be executed depends on the exact image format. This is illustrated by the following process function, where an 8-bit RGB image with pixel-oriented data order is scanned.

    void process(void* img) {
        IplImage* image = reinterpret_cast<IplImage*>(img);
        int nl= image->height;
        int nc= image->width * image->nChannels;
        int step= image->widthStep;  // because of alignment
        // because imageData is a signed char*
        unsigned char *data=
            reinterpret_cast<unsigned char *>(image->imageData);
        for (int i=0; i<nl; i++) {
            for (int j=0; j<nc; j+= image->nChannels) {
                // 3 channels per pixel
                if (data[j+1] > data[j] && data[j+1] > data[j+2]) {
                    data[j]= 0xFF;  // 255
                    data[j+1]= 0xFF;
                    data[j+2]= 0xFF;
                }
            }
            data+= step;  // next line
        }
    }



The result is:
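The role of the line step in the scanning loop can be checked without the Intel libraries. The sketch below uses a small stand-in structure, MiniImage, which is a hypothetical type carrying only the few IplImage fields the loop needs. It computes a row length padded to a multiple of 8 bytes, as the quad-word alignment described above requires, and then runs the same test as the process function. The names alignedStep, makeImage and whitenGreen are likewise illustrative assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for the few IplImage fields used by the scanning loop
// (hypothetical type; the real structure has many more members).
struct MiniImage {
    int width, height, nChannels, widthStep;
    std::vector<unsigned char> buffer;   // height * widthStep bytes
};

// Row length in bytes, rounded up to a multiple of align (8 = quad-word).
int alignedStep(int width, int nChannels, int align) {
    int rowBytes = width * nChannels;    // 8-bit channels assumed
    return ((rowBytes + align - 1) / align) * align;
}

MiniImage makeImage(int width, int height, int nChannels) {
    MiniImage im;
    im.width = width; im.height = height; im.nChannels = nChannels;
    im.widthStep = alignedStep(width, nChannels, 8);
    im.buffer.assign(static_cast<std::size_t>(height) * im.widthStep, 0);
    return im;
}

// Same test as in process(): whiten pixels whose second channel dominates.
void whitenGreen(MiniImage& im) {
    assert(im.nChannels == 3);           // the loop reads 3 channels per pixel
    unsigned char* data = im.buffer.empty() ? 0 : &im.buffer[0];
    int nc = im.width * im.nChannels;
    for (int i = 0; i < im.height; ++i) {
        for (int j = 0; j < nc; j += im.nChannels) {
            if (data[j+1] > data[j] && data[j+1] > data[j+2]) {
                data[j] = data[j+1] = data[j+2] = 0xFF;
            }
        }
        data += im.widthStep;            // next line, skipping the padding
    }
}
```

For a 5-pixel-wide color image, a row holds 15 useful bytes but the step is 16; advancing the pointer by the width in bytes instead of the step would shift every subsequent line by one byte.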

Although this is the most efficient way to scan an image, this process can be error prone. In order to simplify this frequent task, an image iterator can be introduced. The role of this iterator template is to take care of the pointer manipulation involved in the processing of an image. The template is as follows:

    template <class PEL>
    class IplImageIterator {

        int i, i0, j;
        PEL* data;
        PEL* pix;
        int step;
        int nl, nc;
        int nch;

      public:

        /* constructor */
        IplImageIterator(IplImage* image,
                         int x=0, int y=0, int dx=0, int dy=0)
            : i(x), i0(0), j(y) {
            data= reinterpret_cast<PEL*>(image->imageData);
            step= image->widthStep / sizeof(PEL);
            nl= image->height;
            if ((y+dy)>0 && (y+dy)<nl) nl= y+dy;
            if (y<0) j=0;
            data+= step*j;
            nc= image->width;
            if ((x+dx)>0 && (x+dx)<nc) nc= x+dx;
            nc*= image->nChannels;
            if (x>0) i0= x*image->nChannels;
            i= i0;
            nch= image->nChannels;
            pix= new PEL[nch];
        }

        /* has next ? */
        bool operator!() const { return j < nl; }

        /* next pixel */
        IplImageIterator& operator++() {
            i++;
            if (i >= nc) { i=i0; j++; data+= step; }
            return *this;
        }
        IplImageIterator& operator+=(int s) {
            i+=s;
            if (i >= nc) { i=i0; j++; data+= step; }
            return *this;
        }

        /* pixel access */
        PEL& operator*() { return data[i]; }
        const PEL operator*() const { return data[i]; }
        const PEL neighbor(int dx, int dy) const
            { return *(data+dy*step+i+dx); }
        PEL* operator&() const { return data+i; }

        /* current pixel coordinates */
        int column() const { return i/nch; }
        int line() const { return j; }
    };

An iterator of this type can be declared by specifying the type of the pixels in the image and by giving a pointer to the IplImage as argument to the iterator constructor, e.g.:

    IplImageIterator<unsigned char> it(image);

Once the iterator is constructed, two operators can be used to iterate over an image. First, the ! operator returns true as long as the end of the image has not been reached; second, the * operator gives access to the current pixel. A typical loop will therefore look like this:

    while (!it) {
        if (*it < 10) {
            *it= 0xFF;  // 255
        }
        ++it;
    }

Note that if the image contains more than one channel, each iteration will give access to one of the channels of a pixel. This means that in the case of a color pixel, you have to iterate three times for each pixel. In order to access all components of a pixel, the operator & can be used. It returns a pointer to an array that contains the current pixel channel values. For example, the previous example will look like this (note how the iterator is incremented this time to make sure that we go from one pixel to another):

    void process(void* img) {
        IplImage* image = reinterpret_cast<IplImage*>(img);
        IplImageIterator<unsigned char> it(image);
        unsigned char* pixel;
        while (!it) {
            pixel= &it;
            if (pixel[1]>pixel[0] && pixel[1]>pixel[2]) {
                pixel[0]= 0xFF;  // 255
                pixel[1]= 0xFF;  // 255
                pixel[2]= 0xFF;  // 255
            }
            it+= 3;
        }
    }

The use of image iterators is as efficient as directly looping with pointers. This is true as long as you set the compiler to optimize for speed, i.e.:

When the processing involves more than one image, more than one iterator can be used. This is illustrated in the following example:

    void process(void* img) {
        IplImage* image = reinterpret_cast<IplImage*>(img);
        IplImage* tmp= cvCloneImage(image);
        IplImageIterator<unsigned char>
            src(tmp,1,1,tmp->width-2,tmp->height-2);
        IplImageIterator<unsigned char>
            res(image,1,1,image->width-2,image->height-2);
        while (!src) {
            *res= abs(*src - src.neighbor(-1,-1)
                      + src.neighbor(-1,0) - src.neighbor(0,-1));
            ++src;
            ++res;
        }
        cvReleaseImage(&tmp);
    }

Here the clone of the source image is used as input while the source image is modified inside the loop. Two iterators are therefore defined. Since the processing also involves the neighboring pixels, the neighbor method defined by the iterator is used. Also, in this case, a window is specified when creating the iterator (here it defines a 1-pixel strip around the image where no processing is undertaken). The resulting image is:
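The reason for reading from a clone can be seen on a plain buffer. The sketch below applies the same neighborhood expression to a row-major, single-channel image, reading only from the untouched source and writing to a separate result, with the same 1-pixel border left unprocessed. The function name diagonalDifference is an illustrative assumption, and unlike the listing above, which lets the value wrap when it is stored in an unsigned char, this sketch saturates the result at 255.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Computes |p(x,y) - p(x-1,y-1) + p(x-1,y) - p(x,y-1)| for every interior
// pixel, reading from src only so that already-processed pixels never
// influence their neighbors. Border pixels are copied unchanged.
std::vector<unsigned char>
diagonalDifference(const std::vector<unsigned char>& src,
                   int width, int height) {
    assert(src.size() == static_cast<std::size_t>(width) * height);
    std::vector<unsigned char> dst(src);
    for (int y = 1; y < height - 1; ++y) {
        for (int x = 1; x < width - 1; ++x) {
            int v = src[y * width + x]
                  - src[(y - 1) * width + (x - 1)]
                  + src[y * width + (x - 1)]
                  - src[(y - 1) * width + x];
            v = std::abs(v);
            if (v > 255) v = 255;        // saturate (a choice of this sketch)
            dst[y * width + x] = static_cast<unsigned char>(v);
        }
    }
    return dst;
}
```

If the result were written back into the buffer being read, the pixel at (x-1, y-1) would already hold its processed value when (x, y) is computed, and the output would no longer depend on the original image alone.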



Check point #1b: source code of the above example.

5. Displaying an image sequence

In order to process image sequences (from files or from a camera), you have to use DirectShow. The DirectShow architecture, which is part of Microsoft DirectX, relies on a filter architecture. There are three types of filters: source filters that output video and/or audio signals, transform filters that process an input signal and produce one (or several) outputs, and finally rendering filters that display or save a media signal. The processing of a sequence is therefore done using a series of filters connected together, the output of one filter becoming the input of the next one (you can also have filters with multiple outputs). The first filter is usually a decompressor that reads a file stream and the last filter could be a renderer that displays the sequence in a window. In the DirectShow terminology, a series of filters is called a filter graph.

We will first try to process an AVI sequence. Let's first see if DirectX is working fine. To do so, just use the GraphEdit application. This is a very useful application included in the DirectX SDK that makes the building of filter graphs easy. It can be started from the Start|Programs|Microsoft DirectX 8.1 SDK|DirectX Utilities menu. The GraphEdit application window will pop up.
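The filter-graph idea described above (a source filter, zero or more transform filters, and a rendering filter chained together) can be pictured with a toy model. The following sketch is purely conceptual and does not use DirectShow: the Frame type, the ToyFilterGraph structure and its members are assumptions made for illustration, and it relies on std::function, which is more recent than the Visual C++ 6.0 compiler used in this guide.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

typedef std::vector<int> Frame;   // a "frame" is just a list of values here

// Toy model of a filter graph: one source, a chain of transforms and
// one renderer. Not the DirectShow API.
struct ToyFilterGraph {
    std::function<bool(Frame&)> source;                    // false = end of stream
    std::vector<std::function<void(Frame&)> > transforms;  // applied in order
    std::function<void(const Frame&)> renderer;            // consumes the frame

    // Pulls frames from the source until it is exhausted, pushing each
    // one through every transform and finally to the renderer.
    int run() {
        assert(source && renderer);
        int count = 0;
        Frame f;
        while (source(f)) {
            for (std::size_t k = 0; k < transforms.size(); ++k)
                transforms[k](f);
            renderer(f);
            ++count;
        }
        return count;   // number of frames rendered
    }
};
```

In this picture, inserting a processing step amounts to adding one more element to the chain of transforms; the source and the renderer are left untouched.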

Our objective is now to visualize the building blocks required to obtain an AVI renderer. Select Graph|Insert Filters… A window will display the list of available filters. Choose the DirectShow Filters tree and select the File Source (Async.) filter.



You will be asked to select an AVI file. The filter will appear in the GraphEdit window in the form of a box. Right-click on the output pin and select the Render Pin option. This is an intelligent option that will determine which filters are required to render the selected source file and will automatically assemble them together as shown here:

For an AVI sequence, the video renderer should be composed of 3 filters. The first one is the splitter that separates the video and audio components; this filter normally has two outputs (video and audio), but note that in the case of the selected sequence, no audio component was available. The second one is the appropriate decompressor that decodes the video sequence. Finally, the third filter is the renderer itself, which creates the window and displays the frame sequence in it. Just push the play button to execute the graph and the selected AVI sequence should be displayed in a window.

We can build the same filter graph using Visual C++. You first need to add the following include path to your project settings:

    C:\DXSDK\samples\Multimedia\DirectShow\BaseClasses



And the following library path:

    C:\DXSDK\lib

Finally add the following library:

    STRMBASE.LIB

DirectX is implemented using the Microsoft COM technology. This means that when you want to do something, you do it by using a given COM interface. In order to initialize the COM layer, you must call:

    CoInitialize(NULL);

And similarly, when you are done with COM, you need to uninitialize it:

    CoUninitialize();

A COM interface is an abstract class containing pure virtual functions (together forming the interface). Using a COM interface is the only way to communicate with a COM object. Interfaces are obtained by calling the appropriate API function. These functions return a value of type HRESULT representing an error code. The simplest way to verify whether a COM call failed or succeeded is to check the return value using the FAILED macro.

All COM interfaces derive from the IUnknown interface. A very important rule when you use an interface is to never forget to release it after you have finished using it; otherwise, resource leaks will result. This is done by calling the Release method of the IUnknown interface, which decrements the object's reference count by 1; when the count reaches 0, the object is deallocated. The safest way to call the Release method is to use the macro SAFE_RELEASE that can be found in dxutil.h, located in C:\DXSDK\samples\Multimedia\Common\include. This macro is simply defined as:

    #define SAFE_RELEASE(p) { if(p){(p)->Release();(p)=NULL;}}

To use a component of DirectX, you must first obtain its top-level interface. Components are identified by a CLSID identifier and each interface is identified by an IID. For example, to create a DirectShow filter graph (used to build a series of filters) you call:

    IGraphBuilder *pGraph;
    CoCreateInstance(CLSID_FilterGraph,  // object identifier
                     NULL, CLSCTX_INPROC,
                     IID_IGraphBuilder,  // interface identifier
                     (void **)&pGraph);  // pointer to the
                                         // top-level interface

To request the other interfaces of this object, you use the QueryInterface method. For example:

    pGraph->QueryInterface(
        IID_IMediaControl,         // interface identifier
        (void **)&pMediaControl);  // pointer to the interface

Once the filter graph is created, it becomes easy to create all the filters required to render an AVI file. This is done by calling

    pGraph->RenderFile(MediaFile, NULL);

This call does what the Render Pin option does in the GraphEdit application. To play the video, two more interfaces are required: IMediaControl, which is used to start the playback, and IMediaEvent, which is used to detect when the stream rendering has completed. Here is the complete class:



    class SequenceProcessor {

        IplImage* img;  // Declare IPL/OpenCV image pointer
        IGraphBuilder *pGraph;
        IMediaControl *pMediaControl;
        IMediaEvent *pEvent;

      public:

        SequenceProcessor(CString filename, bool display=true) {
            CoInitialize(NULL);
            img= 0;
            pGraph= 0;
            pMediaControl= 0;  // keeps SAFE_RELEASE valid if creation fails
            pEvent= 0;
            // Create the filter graph
            if (!FAILED(
                CoCreateInstance(CLSID_FilterGraph, NULL, CLSCTX_INPROC,
                                 IID_IGraphBuilder, (void **)&pGraph))) {
                // The two control interfaces
                pGraph->QueryInterface(IID_IMediaControl,
                                       (void **)&pMediaControl);
                pGraph->QueryInterface(IID_IMediaEvent,
                                       (void **)&pEvent);
                // Convert CString into WCHAR*
                WCHAR *MediaFile= new WCHAR[filename.GetLength()+1];
                MultiByteToWideChar(CP_ACP, 0, filename, -1,
                                    MediaFile, filename.GetLength()+1);
                // Create the filters
                pGraph->RenderFile(MediaFile, NULL);
                delete[] MediaFile;  // the converted name is no longer needed
                if (display) {
                    // Execute the filter
                    pMediaControl->Run();
                    // Wait for completion.
                    long evCode;
                    pEvent->WaitForCompletion(INFINITE, &evCode);
                }
            }
        }

        ~SequenceProcessor() {
            // Do not forget to release after use
            SAFE_RELEASE(pMediaControl);
            SAFE_RELEASE(pEvent);
            SAFE_RELEASE(pGraph);
            CoUninitialize();
        }



    };

When an AVI file is selected, a rendering filter graph is created and the sequence is displayed. To get an idea of which filters have been created, we can enumerate them by adding the following member function to our class:

    std::vector<CString> enumFilters() {
        IEnumFilters *pEnum = NULL;
        IBaseFilter *pFilter;
        ULONG cFetched;
        std::vector<CString> names;
        pGraph->EnumFilters(&pEnum);
        while(pEnum->Next(1, &pFilter, &cFetched) == S_OK) {
            FILTER_INFO FilterInfo;
            char szName[256];
            CString fname;
            pFilter->QueryFilterInfo(&FilterInfo);
            WideCharToMultiByte(CP_ACP, 0, FilterInfo.achName, -1,
                                szName, 256, 0, 0);
            fname= szName;
            names.push_back(fname);
            SAFE_RELEASE(FilterInfo.pGraph);
            SAFE_RELEASE(pFilter);
        }
        SAFE_RELEASE(pEnum);
        return names;
    }

This method simply creates a vector of strings (you have to include <vector>) containing the names of the filters associated with the generated filter graph. Each name is obtained by reading the FILTER_INFO structure. The enumeration is obtained by calling the EnumFilters method of the FilterGraph instance. Note how all interfaces are released, including the one indirectly obtained through FILTER_INFO, which also contains a pointer to the associated filter graph.

To display the filter names, we add a CListBox to the dialog. Do not forget to add a control member variable for this list. This can be done using the Class Wizard of the View menu. Select the Member Variables tab and then select the control ID that corresponds to the CListBox (the name should be IDC_LIST1). Click on the Add Variable… button and call the variable m_list; its Category must be Control. The m_list variable is now available as a member variable of the dialog class. The filter names are added to this list by changing the OnOpen method as follows:

    void CCvisionDlg::OnOpen()
    {
        CFileDialog dlg(TRUE, _T("*.bmp"), "",
            OFN_FILEMUSTEXIST|OFN_PATHMUSTEXIST|OFN_HIDEREADONLY,
            "image files (*.bmp; *.jpg) |*.bmp;*.jpg|"
            "AVI files (*.avi) |*.avi|All Files (*.*)|*.*||",
            NULL);
        char title[]= {"Open Image"};
        dlg.m_ofn.lpstrTitle= title;
        if (dlg.DoModal() == IDOK) {
            CString path= dlg.GetPathName();
            CString ext= dlg.GetFileExt();
            if (proc != 0)
                delete proc;
            if (procseq != 0)
                delete procseq;
            // CompareNoCase returns 0 on a match, so a non-zero result
            // means "not an AVI file" (".AVI" is accepted as well)
            if (ext.CompareNoCase("avi") != 0) {
                proc= new ImageProcessor(path);
                procseq= 0;  // the old sequence processor has been deleted
            } else {
                proc= 0;     // the old image processor has been deleted
                procseq= new SequenceProcessor(path);
                // Obtaining the list of filters
                std::vector<CString> names= procseq->enumFilters();
                m_list.ResetContent();
                for (int i=0; i<(int)names.size(); i++)
                    m_list.AddString(names[i]);
            }
        }
    }

and now if you open an AVI file, you can see the filter list:

Check point #2: source code of the above example.

6. Building a filter graph

The next step is to try to build the same filter graph ourselves without using the RenderFile method. Instead, we will create each filter and connect them together. This way we will be able to modify the graph by adding our own filters, and thus perform the processing we want. Filters are connected together using their pins; an output pin of a filter is connected to the input pin of the next filter. To obtain the pin of a filter, you have to use the EnumPins method. You then iterate through all the pins until you find the required one (either output or input). This is what the following function does:

    IPin *GetPin(IBaseFilter *pFilter, PIN_DIRECTION PinDir)
    {
        BOOL bFound = FALSE;
        IEnumPins *pEnum;
        IPin *pPin;
        pFilter->EnumPins(&pEnum);
        while(pEnum->Next(1, &pPin, 0) == S_OK) {
            PIN_DIRECTION PinDirThis;
            pPin->QueryDirection(&PinDirThis);
            // intentional assignment: remember whether this pin matches
            if (bFound = (PinDir == PinDirThis))
                break;
            pPin->Release();
        }
        pEnum->Release();
        return (bFound ? pPin : 0);
    }

The PIN_DIRECTION can be PINDIR_OUTPUT or PINDIR_INPUT. For example, to obtain a source filter and its output pin ready to be connected, we can do:

    IBaseFilter* pSource= NULL;
    // Add a source filter to the current graph
    pGraph->AddSourceFilter(mediaFile, 0, &pSource);
    // Obtain the output pin
    IPin* pSourceOut= GetPin(pSource, PINDIR_OUTPUT);

To add a filter (it must first be created) to the filter graph, we use the AddFilter method:

    // Add the pFilter to the current graph
    pGraph->AddFilter( pFilter, L"Name of the Filter");

The second argument is a name for the filter that must identify it uniquely in the filter graph (if you set it to NULL, the graph manager will generate one for you). To connect two pins together, we simply use the Connect method:

    // Connect pIn to pOut
    pGraph->Connect(pOut, pIn);

What filters do we need to display an AVI sequence? We know the answer from the results displayed in the filter list box or in the GraphEdit application:

1. a Source filter that reads the file
2. an AVI splitter that reads the stream and splits it into a video and an audio channel (we ignore the latter here)
3. an AVI video decompressor that decodes the video stream
4. a Video renderer that plays the video sequence in a window.

Note that for some filters, the pins are created dynamically. This is the case of the AVI splitter, which will create the required output pins (video and/or audio) only when the source is connected to its input. This makes sense since the format of the output of this kind of filter is known only when the type of its input is known. It must also be obvious that, to be connected together, the respective output and input pins of two filters must be of compatible types. The properties of a given pin (such as major type and subtype) can be obtained as follows:

    AM_MEDIA_TYPE amt;
    pPin->ConnectionMediaType(&amt);



The following member function will now create the complete filter graph. The procedure is simple: we first create the filter using CoCreateInstance (finding the right CLSID identifier is the key to obtaining the filter we want), add it to the filter graph, obtain its input pin and connect it to the output pin of the previous filter.

    bool createFilterGraph(CString filename) {

        WCHAR *mediaFile= new WCHAR[filename.GetLength()+1];
        MultiByteToWideChar(CP_ACP, 0, filename, -1,
                            mediaFile, filename.GetLength()+1);

        // Create a source filter specified by filename
        IBaseFilter* pSource= NULL;
        HRESULT hr= pGraph->AddSourceFilter(mediaFile, 0, &pSource);
        delete[] mediaFile;  // the converted name is no longer needed
        if (FAILED(hr)) {
            ::MessageBox( NULL, "Unable to create source filter",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        IPin* pSourceOut= GetPin(pSource, PINDIR_OUTPUT);
        if (!pSourceOut) {
            ::MessageBox( NULL, "Unable to obtain source pin",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Create an AVI splitter filter
        IBaseFilter* pAVISplitter = NULL;
        if (FAILED(CoCreateInstance(CLSID_AviSplitter, NULL,
                                    CLSCTX_INPROC_SERVER,
                                    IID_IBaseFilter,
                                    (void**)&pAVISplitter)) ||
            !pAVISplitter) {
            ::MessageBox( NULL, "Unable to create AVI splitter",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        IPin* pAVIsIn= GetPin(pAVISplitter, PINDIR_INPUT);
        if (!pAVIsIn) {
            ::MessageBox( NULL, "Unable to obtain input splitter pin",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Connect the source and the splitter
        if (FAILED(pGraph->AddFilter( pAVISplitter, L"Splitter")) ||
            FAILED(pGraph->Connect(pSourceOut, pAVIsIn)) ) {
            ::MessageBox( NULL, "Unable to connect AVI splitter filter",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Create an AVI decoder filter
        IBaseFilter* pAVIDec = NULL;
        if (FAILED(CoCreateInstance(CLSID_AVIDec, NULL,
                                    CLSCTX_INPROC_SERVER,
                                    IID_IBaseFilter,
                                    (void**)&pAVIDec)) ||
            !pAVIDec) {
            ::MessageBox( NULL, "Unable to create AVI decoder",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        IPin* pAVIsOut= GetPin(pAVISplitter, PINDIR_OUTPUT);
        if (!pAVIsOut) {
            ::MessageBox( NULL, "Unable to obtain output splitter pin",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        IPin* pAVIDecIn= GetPin(pAVIDec, PINDIR_INPUT);
        if (!pAVIDecIn) {
            ::MessageBox( NULL, "Unable to obtain decoder input pin",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Connect the splitter and the decoder
        if (FAILED(pGraph->AddFilter( pAVIDec, L"Decoder")) ||
            FAILED(pGraph->Connect(pAVIsOut, pAVIDecIn)) ) {
            ::MessageBox( NULL, "Unable to connect AVI decoder filter",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        IPin* pAVIDecOut= GetPin(pAVIDec, PINDIR_OUTPUT);
        if (!pAVIDecOut) {
            ::MessageBox( NULL, "Unable to obtain decoder output pin",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Render from the decoder
        if (FAILED(pGraph->Render( pAVIDecOut ))) {
            ::MessageBox( NULL, "Unable to connect to renderer",
                          "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        SAFE_RELEASE(pAVIDecIn);
        SAFE_RELEASE(pAVIDecOut);
        SAFE_RELEASE(pAVIDec);
        SAFE_RELEASE(pAVIsOut);
        SAFE_RELEASE(pAVIsIn);
        SAFE_RELEASE(pAVISplitter);
        SAFE_RELEASE(pSourceOut);
        SAFE_RELEASE(pSource);

        return true;
    }

When this manually built filter graph is executed, the result is the same as previously.

Check point #3: source code of the above example.

7. Processing an image sequence

It is now time to process an image sequence. What we want to do is to process each frame of an AVI sequence in turn. To do so, the OpenCV library offers a special filter called ProxyTrans. It should be located in C:\Program Files\Intel\opencv\bin. Before it can be used, it must be registered. This can be done from the MS-DOS window using the regsvr32 application (just type regsvr32 ProxyTrans.ax; you might have to include C:\Program Files\Intel\opencv\bin in your path environment variable).

To check that the ProxyTrans filter is ready to be used, we use the GraphEdit application again. Build a rendering filter graph and then delete the connection between the Decompression filter and the Video Renderer (just click on the arrow and press Delete). Now select Graph|Insert Filters…; ProxyTrans should be in the list of DirectShow filters. Insert it and connect its input pin to the decompressor and its output pin to the renderer. The sequence should appear again when you play the filter graph.


We will see later how the ProxyTrans filter can be used to process the sequence. But since we want to transform the original sequence through some process, it is useful to be able to save the processed sequence. Let's run a test, again using the GraphEdit application. Delete the Video Renderer filter; we will replace it with a chain that compresses the sequence again and saves it to a file. We therefore need a Video Compressor, an AVI multiplexer and a File Writer. You can easily find all these filters in the list of available filters when you click on the Insert Filter button. Note that when you select the File Writer filter, you will be asked to specify a name for the output file. The resulting graph should be as follows:


Obviously, if you play this graph, the resulting sequence will be the same as the original because our ProxyTrans filter, which is supposed to do the processing, does not do anything for now. However, the size of the output file might differ from the size of the original one, because the compressor used in the graph might use different compression parameters. You probably also noticed that when you play the graph, no sequence is displayed, simply because we removed the renderer. It is quite easy to add an extra path to the filter graph in order to display and save the sequence simultaneously. The Smart Tee is the filter you need. Add it and create the following graph:

Note that the Smart Tee filter has two output pins. The capture pin controls the sequence flow; the preview pin receives frames only if extra computational resources are available. When processing a sequence, you could also use two Smart Tee filters, one to display the original sequence and the other to display the processed one; that is what we will do now when building our filter graph manually.


As you can see in the figure above, the creation of a video processing filter graph requires connecting several filters together. Many lines have to be added to our createFilterGraph method, and the probability of making an error then becomes quite high. However, a closer look at this method reveals that the same sequence is repeated several times, suggesting that some generic function could be introduced to help the programmer. Following this idea, we can write an addFilter utility function. It will be called each time a new filter needs to be created and connected to some filter of a graph. This function has the following signature:

    bool addFilter(REFCLSID filterCLSID, WCHAR* filtername,
                   IGraphBuilder *pGraph, IPin **outputPin,
                   int numberOfOutput);

The first parameter is the CLSID identifier that specifies which filter will be created. The second parameter is the name that will be given to this filter in the current graph. The third parameter is a pointer to the filter graph. The outputPin parameter is both an input and an output parameter. As an input, it contains a pointer to the output pin to which the filter to be created must be connected. When the function returns, this parameter contains the pointer(s) to the output pin(s) of the newly created filter; the number of output pins to be obtained is given by the last parameter of this function. Several calls below omit this last argument, which presupposes a default value of 1 in the function declaration. The function returns true if the filter has been successfully created and connected to the filter graph.

This function can be written in a straightforward manner. First the filter is created using CoCreateInstance, then the input pin is obtained and connected to the specified output pin. Once this is done, the last step consists of obtaining the required number of output pins. The function is then as follows:

    bool addFilter(REFCLSID filterCLSID, WCHAR* filtername,
                   IGraphBuilder *pGraph, IPin **outputPin,
                   int numberOfOutput) {

        // Create the filter.
        IBaseFilter* baseFilter = NULL;
        char tmp[100];
        if (FAILED(CoCreateInstance(filterCLSID, NULL,
                                    CLSCTX_INPROC_SERVER, IID_IBaseFilter,
                                    (void**)&baseFilter))
            || !baseFilter) {
            sprintf(tmp, "Unable to create %ls filter", filtername);
            ::MessageBox( NULL, tmp, "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Obtain the input pin.
        IPin* inputPin= GetPin(baseFilter, PINDIR_INPUT);
        if (!inputPin) {
            sprintf(tmp, "Unable to obtain %ls input pin", filtername);
            ::MessageBox( NULL, tmp, "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        // Connect the filter to the output pin.
        if (FAILED(pGraph->AddFilter( baseFilter, filtername))
            || FAILED(pGraph->Connect(*outputPin, inputPin)) ) {
            sprintf(tmp, "Unable to connect %ls filter", filtername);
            ::MessageBox( NULL, tmp, "Error", MB_OK | MB_ICONINFORMATION );
            return false;
        }

        SAFE_RELEASE(inputPin);
        SAFE_RELEASE(*outputPin);

        // Obtain the output pin(s).
        for (int i=0; i<numberOfOutput; i++) {
            outputPin[i]= 0;
            outputPin[i]= GetPin(baseFilter, PINDIR_OUTPUT, i+1);
            if (!outputPin[i]) {
                // filtername is a wide string, hence %ls
                sprintf(tmp, "Unable to obtain %ls output pin (%d)",
                        filtername, i);
                ::MessageBox( NULL, tmp, "Error", MB_OK | MB_ICONINFORMATION );
                return false;
            }
        }

        SAFE_RELEASE(baseFilter);
        return true;
    }

Using this function, it becomes easy to create a complex filter graph. The one we will build now includes the ProxyTrans filter (note that the header file initguid.h must be included to be able to use this filter). To be useful, this filter must do something. Its purpose is to give the programmer access to each frame of the sequence so that it can be processed. This is done through a callback function that is automatically called for each frame of the sequence. The callback receives a pointer to the current image as its argument; the user is then free to analyze and modify this image. Here is an example of a valid callback function that can be used with the ProxyTrans filter.

    void process(void* img) {
        IplImage* image = reinterpret_cast<IplImage*>(img);
        cvErode( image, image, 0, 2 );
    }
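The per-frame callback idea can be illustrated without DirectShow. The following is a minimal, portable sketch and not the ProxyTrans implementation: Frame, FrameCallback, MockTransform, invert and runSequence are hypothetical names introduced only for illustration. Only the void(void*) shape of the callback is taken from the text above.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an image buffer (the real filter passes an IplImage*).
struct Frame {
    int width;
    int height;
    std::vector<unsigned char> pixels; // one byte per pixel, row-major
};

// Same shape as the callback in the text: an untyped pointer to the current image.
typedef void (*FrameCallback)(void* img);

// Hypothetical mock of a transform filter: it stores a callback
// and calls it once for every frame it is given.
class MockTransform {
public:
    MockTransform() : callback_(0) {}
    void setTransform(FrameCallback cb) { callback_ = cb; }
    void deliver(Frame& frame) const {
        if (callback_) callback_(&frame); // the frame is modified in place
    }
private:
    FrameCallback callback_;
};

// Example callback: invert every pixel of the frame in place.
void invert(void* img) {
    Frame* frame = static_cast<Frame*>(img);
    for (std::size_t i = 0; i < frame->pixels.size(); ++i)
        frame->pixels[i] = static_cast<unsigned char>(255 - frame->pixels[i]);
}

// Runs a short sequence through the mock filter and returns the processed frames.
std::vector<Frame> runSequence(std::vector<Frame> frames, FrameCallback cb) {
    MockTransform filter;
    filter.setTransform(cb);
    for (std::size_t i = 0; i < frames.size(); ++i)
        filter.deliver(frames[i]);
    return frames;
}
```

As in the real filter, the callback works directly on the frame it receives, so whatever it writes is what the downstream filters will see.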


For this function to be called, it must be registered with the ProxyTrans filter. This is simply done by calling the following method of the IProxyTransform interface:

    pProxyTrans->set_transform(process, 0);

Here is the function that creates the filter graph that processes an input sequence and saves the result in a file. Two preview windows are displayed, one for the original sequence and the other for the output sequence.

    bool createFilterGraph() {

        IPin* pSourceOut[2];
        pSourceOut[0]= pSourceOut[1]= NULL;

        // Video source
        addSource(ifilename, pGraph, pSourceOut);

        // Add the decoding filters
        addFilter(CLSID_AviSplitter, L"Splitter", pGraph, pSourceOut);
        addFilter(CLSID_AVIDec, L"Decoder", pGraph, pSourceOut);

        // Insert the first Smart Tee
        addFilter(CLSID_SmartTee, L"SmartTee(1)", pGraph, pSourceOut, 2);

        // Add the ProxyTrans filter
        addFilter(CLSID_ProxyTransform, L"ProxyTrans", pGraph, pSourceOut);

        // Set the ProxyTrans callback
        IBaseFilter* pProxyFilter = NULL;
        IProxyTransform* pProxyTrans = NULL;
        pGraph->FindFilterByName(L"ProxyTrans", &pProxyFilter);
        pProxyFilter->QueryInterface(IID_IProxyTransform,
                                     (void**)&pProxyTrans);
        pProxyTrans->set_transform(process, 0);
        SAFE_RELEASE(pProxyTrans);
        SAFE_RELEASE(pProxyFilter);

        // Render the original (decoded) sequence
        // using 2nd SmartTee(1) output pin
        addRenderer(L"Renderer(1)", pGraph, pSourceOut+1);

        // Insert the second Smart Tee
        addFilter(CLSID_SmartTee, L"SmartTee(2)", pGraph, pSourceOut, 2);

        // Encode the processed sequence
        addFilter(CLSID_AviDest, L"AVImux", pGraph, pSourceOut);
        addFileWriter(ofilename, pGraph, pSourceOut);

        // Render the transformed sequence
        // using 2nd SmartTee(2) output pin
        addRenderer(L"Renderer(2)", pGraph, pSourceOut+1);

        return true;
    }
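Because the same two-element pin array is reused at every step, it can be hard to see which pin each call connects to. The sketch below replays the order of calls with a portable mock in which pins are plain strings. PinSlots, addFilterMock, connectSinkMock and traceGraph are hypothetical names, and the pin labels (out1, out2) are assumptions made for illustration; the mock only follows the contract described for addFilter, namely connect to slot 0 and then overwrite the first numberOfOutput slots.

```cpp
#include <string>
#include <vector>

// Hypothetical mock: pins are represented by their names only.
struct PinSlots {
    std::string slot[2];            // plays the role of IPin* pSourceOut[2]
    std::vector<std::string> links; // "upstream -> downstream" connections made so far
};

// Mimics the documented contract of addFilter: connect the new filter to
// slot[0], then overwrite the first numberOfOutput slots with its output pins.
void addFilterMock(PinSlots& p, const std::string& name, int numberOfOutput = 1) {
    p.links.push_back(p.slot[0] + " -> " + name);
    for (int i = 0; i < numberOfOutput; ++i)
        p.slot[i] = name + ".out" + std::to_string(i + 1);
}

// Mimics a terminal filter (renderer or file writer) connected to slot[index];
// it produces no new output pin, so the slots are left unchanged.
void connectSinkMock(PinSlots& p, const std::string& name, int index) {
    p.links.push_back(p.slot[index] + " -> " + name);
}

// Replays the order of calls used in createFilterGraph, starting after the decoder.
PinSlots traceGraph() {
    PinSlots p;
    p.slot[0] = "Decoder.out1";
    addFilterMock(p, "SmartTee(1)", 2);
    addFilterMock(p, "ProxyTrans");        // replaces slot[0] only
    connectSinkMock(p, "Renderer(1)", 1);  // slot[1] still belongs to SmartTee(1)
    addFilterMock(p, "SmartTee(2)", 2);
    addFilterMock(p, "AVImux");
    connectSinkMock(p, "FileWriter", 0);
    connectSinkMock(p, "Renderer(2)", 1);  // slot[1] now belongs to SmartTee(2)
    return p;
}
```

The trace shows why the first renderer displays the unprocessed sequence: adding ProxyTrans replaces only the first slot, so the second slot still refers to the second output of the first Smart Tee when Renderer(1) is connected.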


Check point #4: source code of the above example.

You will note that the output file produced by this program is quite big. This is simply because no compressor is used when the sequence is saved, and the reason is that compression filters can only be obtained through enumeration. This is discussed in the next section.

8. Enumerating filters and devices

Filters are registered COM objects made available by the operating system to your applications. Depending on the software applications installed on your machine, different sets of filters might be available. These filters are classified by category. There is, for example, a category identified by CLSID_VideoCompressorCategory that includes all the compression filters available. When you wish to use a filter of a given category, you must enumerate the filters available and select one of them. To enumerate the filters, the first step consists of creating a system device enumerator:

    ICreateDevEnum *pSysDevEnum;
    CoCreateInstance(CLSID_SystemDeviceEnum, NULL,
                     CLSCTX_INPROC_SERVER, IID_ICreateDevEnum,
                     (void **)&pSysDevEnum);

Then an enumerator for a given category is obtained as follows:

    IEnumMoniker *pEnumCat = NULL;
    pSysDevEnum->CreateClassEnumerator(CLSIDcategory, &pEnumCat, 0);

A simple loop is then required to obtain each filter of the category:

    IMoniker *pMoniker;
    ULONG cFetched;
    while (pEnumCat->Next(1,          // number of elements requested
                          &pMoniker,  // pointer to the moniker
                          &cFetched)  // number of elements returned
           == S_OK)

The filters are identified through the IMoniker interface, an interface used to uniquely identify a COM object. A moniker is similar to a path in a file system and it can be used to obtain information about a given filter:

    IPropertyBag *pPropBag;
    pMoniker->BindToStorage(0, 0, IID_IPropertyBag, (void **)&pPropBag);

Properties of a filter are obtained using the IPropertyBag interface. This generic interface is used to read and write properties using text. A moniker can also be used to create a filter:

    IBaseFilter* baseFilter;
    pMoniker->BindToObject(NULL, NULL, IID_IBaseFilter,
                           (void**)&baseFilter);

This latter approach must be used to create an enumerated filter, instead of the CoCreateInstance function. The function presented below can be used to obtain the available filters of a category. It returns the friendly name and the CLSID identifier of each filter. Either can be used afterwards to identify the filter to create.

    void enumFilters(REFCLSID CLSIDcategory,
                     std::vector<CString>& names,
                     std::vector<CLSID>& clsidFilters) {

        // Create the System Device Enumerator.
        ICreateDevEnum *pSysDevEnum = NULL;
        HRESULT hr = CoCreateInstance(CLSID_SystemDeviceEnum, NULL,
                                      CLSCTX_INPROC_SERVER,
                                      IID_ICreateDevEnum,
                                      (void **)&pSysDevEnum);
        if (FAILED(hr) || !pSysDevEnum)
            return;

        // Obtain a class enumerator for the specified category.
        IEnumMoniker *pEnumCat = NULL;
        hr = pSysDevEnum->CreateClassEnumerator(CLSIDcategory, &pEnumCat, 0);

        if (hr == S_OK) {

            // Enumerate the monikers.
            IMoniker *pMoniker = NULL;
            ULONG cFetched;
            while (pEnumCat->Next(1, &pMoniker, &cFetched) == S_OK) {

                IPropertyBag *pPropBag = NULL;
                hr = pMoniker->BindToStorage(0, 0, IID_IPropertyBag,
                                             (void **)&pPropBag);
                if (SUCCEEDED(hr) && pPropBag) {

                    // Retrieve the friendly name of the filter.
                    VARIANT varName;
                    VariantInit(&varName);
                    hr = pPropBag->Read(L"FriendlyName", &varName, 0);
                    if (SUCCEEDED(hr)) {
                        CString str(varName.bstrVal);
                        names.push_back(str);
                    }
                    // VariantClear frees the string; calling SysFreeString
                    // as well would release it twice.
                    VariantClear(&varName);

                    // Read the CLSID string from the property bag.
                    VARIANT varFilterClsid;
                    varFilterClsid.vt = VT_BSTR;
                    hr = pPropBag->Read(L"CLSID", &varFilterClsid, 0);
                    if (SUCCEEDED(hr)) {
                        CLSID clsidFilter;
                        // Save the filter CLSID.
                        if (CLSIDFromString(varFilterClsid.bstrVal,
                                            &clsidFilter) == S_OK) {
                            clsidFilters.push_back(clsidFilter);
                        }
                        SysFreeString(varFilterClsid.bstrVal);
                    }

                    pPropBag->Release();
                }

                // Clean up.
                pMoniker->Release();
            }
            pEnumCat->Release();
        }
        pSysDevEnum->Release();
    }

This function is used to select a compression filter for our sequence processing application. The list of compression filters is displayed in a list box (filled once the output sequence has been selected):

    void CCvisionDlg::OnSave() {

        // Select output file
        …

        // Obtain and display compressors
        std::vector<CString> fname;
        std::vector<CLSID> fclsid;
        enumFilters(CLSID_VideoCompressorCategory, fname, fclsid);

        m_list.ResetContent();
        for (size_t i=0; i<fname.size(); i++)
            m_list.AddString(fname[i]);
    }

The compression filter is selected by clicking on the corresponding item before pushing the process button. The sequence will then be saved, compressed according to the default control parameters of the chosen compressor. What if you are not satisfied with the resulting compression rate? You can obviously try another compression filter; however, it is also possible to use different control parameter values for the chosen filter. This can be done through a special interface called IAMVideoCompression. This interface is normally supported by the output pin of a compression filter. You can obtain the interface by calling the QueryInterface method of the pin:

    IAMVideoCompression *pCompress;
    pPin->QueryInterface(IID_IAMVideoCompression, (void**)&pCompress);


Once obtained, the interface can be used to set the compression properties, namely: the key frame rate (a long integer), the number of predicted frames per key frame (also a long integer), and the relative compression quality (a double between 0.0 and 1.0). It is then easy to set these values using the appropriate methods.

    long keyFrames, pFrames;   // to be set to the desired values
    double quality;            // to be set between 0.0 and 1.0
    hr = pCompress->put_KeyFrameRate(keyFrames);
    hr = pCompress->put_PFramesPerKeyFrame(pFrames);
    hr = pCompress->put_Quality(quality);
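Since the quality is expressed as a fraction rather than as a percentage, a user interface that works with percentages needs a small conversion before the value is passed on. The helper below is a minimal sketch of that arithmetic; qualityFromPercent is a hypothetical name and is not part of DirectShow.

```cpp
#include <algorithm>

// Hypothetical helper: maps a user-facing percentage (0..100) to the
// 0.0..1.0 range described above, clamping out-of-range input.
double qualityFromPercent(int percent) {
    int clamped = std::max(0, std::min(100, percent));
    return clamped / 100.0;
}
```

The returned value could then be passed as the quality argument in the calls shown above.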

Check point #5: source code of the above example.

The same strategy can be used to select a video capture device (e.g. a USB camera). The only difference is that these devices obviously do not have input pins. However, they normally have two output pins (one for capture and one for preview). The basic steps to build a camera-based video processing filter graph are as follows. First, add the video capture device through enumeration:

    CString cameraName= ?;
    IPin* pSourceOut[2];
    pSourceOut[0]= pSourceOut[1]= NULL;
    addFilterByEnum(CLSID_VideoInputDeviceCategory,
                    cameraName, pGraph, pSourceOut, 2);

Second, add the ProxyTrans filter:

    addFilter(CLSID_ProxyTransform, L"ProxyTrans", pGraph, pSourceOut);

Then add a renderer to the preview pin:

    addRenderer(L"Renderer(1)", pGraph, pSourceOut+1);

And finally, add the required filters to save the resulting sequence to a file:

    addFilter(CLSID_AviDest, L"AVImux", pGraph, pSourceOut);
    addFileWriter(ofilename, pGraph, pSourceOut);

The complete application that includes the camera selection is given here.


To change the camera settings (such as resolution or frame rate), you must access the facilities offered by the driver of the camera. The easiest way to do so is to use the old Video for Windows technology (an ancestor of DirectShow). If the camera you use has a driver compatible with this technology, then it is possible to obtain dialog boxes to control the camera settings. This is done through the IAMVfwCaptureDialogs interface of the camera filter. The first thing to do is to check whether the camera supports this interface and, if so, which dialogs are available. The three standard dialogs are designated by an enumerated type: VfwCaptureDialog_Source, VfwCaptureDialog_Format, VfwCaptureDialog_Display. The procedure to obtain one of these dialogs is quite straightforward:

    IAMVfwCaptureDialogs *pVfw = 0;

    // pCap is a pointer to the camera base filter
    if (SUCCEEDED(pCap->QueryInterface(IID_IAMVfwCaptureDialogs,
                                       (void**)&pVfw))) {

        // Check if the device supports this dialog box.
        if (S_OK == pVfw->HasDialog(VfwCaptureDialog_Format)) {

            // Show the dialog box.
            pVfw->ShowDialog(VfwCaptureDialog_Format,
                             hwndParent); // parent window
        }
        pVfw->Release();   // release the interface obtained above
    }

A dialog like the following should then appear:

Check point #6: source code of the above example.
