
Video Based Tracking & Recognition via (Wireless) Network Transmission

    A Proposal Prepared for the

Final Project of the Graduate Course EE635: Digital Image Processing

    March 20, 2007

    Yingxue Feng* [email protected]

    Jianjun Yang [email protected]

    Ping Yi [email protected]

    Ping Guo [email protected]

    Department of Computer Science

    University of Kentucky

    Lexington, KY 40506-0046 USA

    * Team Leader


    Research Problem

As inexpensive computing power and image sensors have become widely available, automated object tracking and recognition in live video has become an important research topic, as well as an application area, in the image processing and computer vision communities in recent years. The results of tracking and recognition can be applied further in surveillance, vision-based human-computer interaction, and semantic indexing of video.

Many approaches to video tracking have been proposed over the last few decades. Among these methods, the Kalman filter, as an optimal recursive Bayesian filter for linear dynamics and Gaussian noise, is useful for object tracking. It is powerful enough to handle overlapping tracked objects, but it is comparatively complicated. The Sum of Squared Differences (SSD) is efficient but not as accurate. For object detection, the typical problem is: given a training set (database) of images of N objects and a new image F containing one of these N objects, can we determine which of the N objects F shows? There are many papers on this issue. Network transmission involves topics such as improving the compression ratio, increasing the transmission speed, and reliable video coding/decoding and transfer.

    Research Goals and Objectives

The main goal of our research is to carry out recognition and tracking of several moving objects in real time against a simple background, and to provide reliable and fast video transfer over the network. We are going to study how to reduce the computational complexity, increase the efficiency, and improve the accuracy of the proposed algorithms and methods. We will study how to achieve high accuracy, aiming for 100% object and gesture recognition, with every object in the scene detected and tracked as well. For our wireless network model, we will try to achieve a compression ratio of 200:1, a resolution of 352×288, and a transfer rate of 10 Kbps to 1.0 Mbps. In addition, we will use the TCP protocol to transfer video over the LAN/Internet to ensure reliable delivery. Finally, we will design a friendly user interface. Basically, the interface has the following menus: load (load a video file), play (play the video file), track, and transmit. These menus invoke the corresponding


    operations.

    Research Design and Methods

For simplicity, we will make the following assumptions about the objects in the video: a stationary background with three people moving in the same direction at all times; the images of the three people will not overlap each other at any time; and for each person, only two gestures are to be recognized.

We divide the whole system design into the following four parts:

1) Background extraction and image preprocessing

This part is the fundamental step for our subsequent work. It provides an effective means of segmenting objects moving in front of a static background. Other preprocessing tasks such as binary image conversion and noise removal are also necessary. We will use a standard graph-cut algorithm to produce a foreground-background segmentation and to reduce the error around segmented foreground objects. The code we will utilize and modify is from http://maven.smith.edu/~nhowe/research/code/. This Matlab code includes the following steps:

(1) Load the video from an AVI file;
(2) Convert to HSV color space;
(3) Generate a Gaussian background model in HSV space for each pixel;
(4) Do frame-by-frame differencing:
    (a) Find the scaled deviation of this frame from the background;
    (b) Compare with a threshold to generate a labeling;
    (c) Apply graph cuts;
    (d) Find the largest connected component;


(5) End of foreground extraction (a minimal Matlab sketch of these steps is given below).
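As a rough illustration of steps (1)-(5), the following is a minimal sketch of our own, not the reference code itself: it builds a per-pixel Gaussian background model in HSV from the first frames, thresholds the scaled deviation of each later frame, and keeps the largest connected component. The graph-cut refinement of step (4c) is replaced by a plain threshold here, and the file name, number of model frames, and 3-sigma threshold are placeholder choices.

% Minimal sketch of steps (1)-(5): per-pixel Gaussian background model in HSV,
% frame differencing, thresholding, and largest-connected-component extraction.
% The graph-cut step is omitted; a simple threshold stands in for it.
v   = VideoReader('walk.avi');                 % (1) load the video (placeholder name)
nBg = 30;                                      % frames used to build the model
bg  = zeros(v.Height, v.Width, 3, nBg);
for k = 1:nBg                                  % (2)-(3) HSV conversion, background stack
    bg(:,:,:,k) = rgb2hsv(im2double(readFrame(v)));
end
mu    = mean(bg, 4);                           % per-pixel mean of the background model
sigma = std(bg, 0, 4) + 1e-3;                  % per-pixel std; small constant avoids /0

while hasFrame(v)                              % (4) frame-by-frame differencing
    f   = rgb2hsv(im2double(readFrame(v)));
    dev = abs(f - mu) ./ sigma;                % (4a) scaled deviation from the background
    fg  = max(dev, [], 3) > 3;                 % (4b) 3-sigma threshold labels foreground
    [lbl, n] = bwlabel(fg);                    % (4d) keep the largest connected component
    if n > 0
        stats = regionprops(lbl, 'Area');
        [~, biggest] = max([stats.Area]);
        fg = (lbl == biggest);
    end
    imshow(fg); drawnow;                       % (5) show the extracted foreground mask
end

VideoReader is the current Matlab interface for reading AVI files; older releases used aviread, but the rest of the procedure is unchanged.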

2) Object detection, matching and tracking

Instead of a database-based algorithm, we will adopt a relatively simple template-matching approach. We choose Normalized Correlation as the main method in our project because it is robust and accurate and is one of the most commonly used template-matching criteria. Detection (recognition) and tracking are therefore combined and treated as a whole in our project.

(1) Our recognition and matching algorithm and other related content are described as follows:

    1. Definition:

1.1 Template: a small image or sub-image (e.g., a cropped region containing the object to be located).

    1.2 Definition of C': For two images A and B, we want to find object matching

    between the two. Define Correlation:

C' = \sum_{x,y} A(x,y)\, B(x,y)

    1.3 Corrected definition of C: In our algorithm, we use the formulation

    (Normalized Correlation):


C = \frac{\sum_{x,y} [A(x,y) - \bar{A}]\,[B(x,y) - \bar{B}]}{\left( \sum_{x,y} [A(x,y) - \bar{A}]^2 \; \sum_{x,y} [B(x,y) - \bar{B}]^2 \right)^{1/2}}

where \bar{A} and \bar{B} denote the mean values of A and B over the region being compared.

    2. Goal:

The goal is to find occurrences of the template in a larger image, i.e., to locate matches of the template within each video frame.

    3. Algorithm Description:

    3.1 Basic idea:

For each image coordinate (i, j), given a template of size s, compute a pixel-wise similarity metric between the template and the image patch at (i, j), and record the similarity. A match is declared at the coordinates (i, j) with the best similarity measurement.

    3.2 Steps:

(a) Calculate C at each candidate location; the value returned by Normalized Correlation lies between -1 and 1.

(b) Compare C with a preset threshold; if C is greater than or equal to the threshold, we consider it a match. A minimal Matlab sketch of these two steps is given below.
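To make steps (a)-(b) concrete, the following is a minimal sketch (our own, not code from the references) that evaluates C at every placement of a grayscale template T over a grayscale image A and returns the locations whose score meets the threshold. It implements the Normalized Correlation formula above directly; the Image Processing Toolbox function normxcorr2 computes an equivalent score more efficiently. The function name, variable names, and the 0.8 threshold in the usage comment are placeholders.

% Normalized Correlation template matching (direct implementation of C).
% A is a grayscale image and T a grayscale template, both of class double.
function [rows, cols, score] = nc_match(A, T, thresh)
    [m, n] = size(T);
    [H, W] = size(A);
    Tm    = T - mean(T(:));                    % zero-mean template
    denT  = sqrt(sum(Tm(:).^2));
    score = -inf(H - m + 1, W - n + 1);
    for i = 1:H - m + 1
        for j = 1:W - n + 1
            P   = A(i:i+m-1, j:j+n-1);         % image patch under the template
            Pm  = P - mean(P(:));              % zero-mean patch
            den = denT * sqrt(sum(Pm(:).^2));
            if den > 0
                score(i, j) = sum(sum(Pm .* Tm)) / den;   % C at (i, j)
            end
        end
    end
    [rows, cols] = find(score >= thresh);      % locations above the threshold
end

% Example usage (placeholder file names):
% A = im2double(rgb2gray(imread('frame.png')));
% T = im2double(rgb2gray(imread('template.png')));
% [r, c] = nc_match(A, T, 0.8);

In practice the threshold trades missed detections against false matches and would be tuned on our test video.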

(2) Reference code 1: from an earlier tracking-related project.

This code covers the following aspects:

(a) Background extraction;


(b) Object detection: locate objects against the background and form a template for each object. First, the difference between the incoming frame and the background is computed. The difference images are then converted to binary images. Finally, for each object, its center and size are calculated to form a template;

(c) Template matching: in this sample program, the matching algorithm is SAD (sum of absolute differences).

This code is written in C. We borrow from it the idea of dividing tracking into distinct phases, and we will also study how recognition and matching are implemented so that we can implement the corresponding functionality in Matlab (a rough sketch of the detection phase is given below). Rather than the SAD approach used in that code, we will use Normalized Correlation in our project.
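As a rough idea of how the detection phase described in (b) might be re-expressed in Matlab (this is our own sketch, not a translation of the C reference code), the function below differences a frame against the background, binarizes the result, and records each object's center, bounding box, and image patch as a template. The function name, the 50-pixel noise-removal size, and the threshold argument are placeholder choices.

% Form object templates from one frame: difference with the background,
% binarize, and record each object's center, size, and image patch.
function templates = form_templates(frame, background, thresh)
    diffImg = abs(im2double(frame) - im2double(background));   % frame differencing
    bw = max(diffImg, [], 3) > thresh;                          % binary foreground mask
    bw = bwareaopen(bw, 50);                                    % remove small noise blobs
    stats = regionprops(bw, 'Centroid', 'BoundingBox');         % one entry per object
    templates = struct('center', {}, 'box', {}, 'patch', {});
    for k = 1:numel(stats)
        box = round(stats(k).BoundingBox);                      % [x y width height]
        templates(k).center = stats(k).Centroid;
        templates(k).box    = box;
        templates(k).patch  = imcrop(frame, box);               % template image patch
    end
end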

(3) Reference code 2: from http://www.deas.harvard.edu/projects/weitzlab/matlab/

This code is written in Matlab and uses general-purpose tracking functions. The program includes the following steps:

1) Generate a set of files that contain the x, y positions;
2) Put the position list in the required format;
3) Invoke track() with the right parameters.

From this code, we learn how to use the tracking-related functions in Matlab, such as track(), track1(), and track2(). A hedged sketch of steps 2) and 3) is given below.
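The sketch below illustrates steps 2) and 3) under an assumption about the interface: we assume, as in the particle-tracking Matlab code distributed on that page, that track() accepts a position list whose rows are [x y frame], sorted by frame, together with a maximum per-frame displacement, and returns the same rows with an appended track-ID column. The detected centers shown are placeholder data, and the exact signature must be verified against the downloaded code.

% Assumed usage of track() from the weitzlab Matlab package (to be verified):
% rows of poslist are [x y frame], sorted by frame number.
centers = {[10 12; 50 48; 90 91], ...          % object centers found in frame 1
           [11 13; 52 47; 92 90]};             % object centers found in frame 2
poslist = [];
for f = 1:numel(centers)
    c = centers{f};
    poslist = [poslist; c, repmat(f, size(c, 1), 1)];   % append the frame number column
end
maxdisp = 5;                                   % maximum displacement between frames (pixels)
tracks  = track(poslist, maxdisp);             % assumed to return [x y frame id] rows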

The two reference code bases serve different purposes. The first provides us with the basic idea and the algorithm for recognition and matching. The second


    shows how to use the Matlab functions to implement object detection and

    tracking.

3) Video compression and transmission

For this part, the necessity of video coding will be discussed, and the MPEG-4 series will be reviewed. We will also study in detail the compression methods used in the MPEG-4 standard, such as the DCT transform, motion estimation and compensation, and variable-length coding. In addition, how MPEG-4 coded video data is transferred over a LAN or the Internet (TCP/IP) will be introduced. We will mainly use the C language to program the transmission part. A toy illustration of the block-DCT step is sketched below.
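As a toy illustration of the transform step shared by MPEG-style coders (not the MPEG-4 codec itself, and separate from the C transmission code), the Matlab sketch below applies an 8×8 block DCT to a grayscale test image, coarsely quantizes the coefficients, and reconstructs the image; the quantization step size of 0.05 is a placeholder.

% Toy 8x8 block-DCT compression: forward transform, uniform quantization,
% inverse transform. dct2/idct2 come from the Image Processing Toolbox.
img = im2double(imread('cameraman.tif'));      % grayscale test image shipped with the toolbox
q   = 0.05;                                    % quantization step (placeholder)
rec = zeros(size(img));
for r = 1:8:size(img, 1) - 7
    for c = 1:8:size(img, 2) - 7
        blk   = img(r:r+7, c:c+7);
        coeff = dct2(blk);                     % forward 8x8 DCT of the block
        coeff = round(coeff / q) * q;          % coarse uniform quantization
        rec(r:r+7, c:c+7) = idct2(coeff);      % inverse DCT reconstructs the block
    end
end
imshowpair(img, rec, 'montage');               % compare original and reconstruction

A larger step size q discards more coefficient detail and compresses harder, which mirrors the rate/quality trade-off we will study in MPEG-4.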

4) Intelligent human-machine interface design

We will develop a friendly user interface. The major development tool is Matlab. A minimal sketch of the menu layout is given below.
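As a bare sketch of the menu structure listed under Research Goals and Objectives (load, play, track, transmit), the following creates a Matlab figure with one menu per operation; the window title and the callbacks, which only print messages here, are placeholders that would dispatch to the corresponding modules.

% Skeleton of the user interface: one menu per operation. The callbacks are
% placeholders standing in for the real load/play/track/transmit modules.
fig = figure('Name', 'Video Tracking Demo', 'MenuBar', 'none', 'NumberTitle', 'off');
uimenu(fig, 'Label', 'Load',     'Callback', @(s, e) disp('load a video file'));
uimenu(fig, 'Label', 'Play',     'Callback', @(s, e) disp('play the video file'));
uimenu(fig, 'Label', 'Track',    'Callback', @(s, e) disp('run detection and tracking'));
uimenu(fig, 'Label', 'Transmit', 'Callback', @(s, e) disp('send the video over TCP'));
ax = axes('Parent', fig);                      % axes where video frames will be shown
title(ax, 'Load a video to begin');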

    Staffing Plan

Ping Guo: Graduate student of Computer Science. Major contributor to background extraction and noise removal.

Yingxue Feng: Graduate student of Computer Science. Major contributor to object detection, matching and tracking.

Jianjun Yang: Graduate student of Computer Science. Major contributor to object detection, matching and tracking, and to user interface design.

Ping Yi: Graduate student of Computer Science. Major contributor to video coding/decoding and network transfer, and to user interface design.

    Timeline

1. March 12-19: Prepare the proposal for the final project. (1 week)

2. March 21-24: Background extraction and noise removal. (4 days)

3. March 25-April 1: Object detection, matching and tracking. (2 weeks)

4. April 13-19: Video compression and transmission. (1 week)

5. April 20-26: User interface design and code debugging. (1 week)

In addition, we will strictly follow the project schedule proposed by the lecturer.

References


    [1] N. Howe & A. Deschamps, Better Foreground Segmentation through Graph Cuts

[2] Y. Boykov, O. Veksler, and R. Zabih, Fast approximate energy minimization via graph cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11): 1222-1239, November 2001

    [3] B.V. Cherkassky and A.V. Goldberg, On implementing the push-relabel method

    for the maximum flow problem. Algorithmica, 19(4): 390-410, 1997

[4] Mircea Nicolescu, Gérard Medioni, Mi-Suen Lee, Segmentation, tracking and interpretation using panoramic video, IEEE Workshop on Omnidirectional Vision, Hilton Head, South Carolina, U.S.A., June 2000

    [5] http://www.mathworks.com/

    [6] http://arxiv.org/abs/cs.CV/0401017

    [7] http://maven.smith.edu/~nhowe/research/code/

[8] Emanuele Trucco, Alessandro Verri, Introductory Techniques for 3-D Computer Vision, Prentice Hall

    [9] David A. Forsyth, Jean Ponce, Computer Vision: A Modern Approach, Publishing

    House of Electronics Industry, China

    [10] Yongsheng Gao, Maylor K.H. Leung, Face Recognition Using Line Edge Map

    [11] http://www.math.ntnu.no/~holden/FrontBook/matlabcode.html

    [12] http://www.advancedsourcecode.com/faceverification.asp

    [13] http://www.iii.org.tw/special/article/VideoTracking.htm

    [14] http://www.deas.harvard.edu/projects/weitzlab/matlab/

[15] Project from Prof. Ruigang Yang's class, Computer Science Department, University of Kentucky

    [16] References provided by the lecturer
