8/12/2019 Project Proposal Networking
Video Based Tracking & Recognition via (Wireless) Network
Transmission
A Proposal Prepared for the
Final Project of the Graduate Course EE635: Digital Image Processing
March 20, 2007
Yingxue Feng* [email protected]
Jianjun Yang [email protected]
Ping Yi [email protected]
Ping Guo [email protected]
Department of Computer Science
University of Kentucky
Lexington, KY 40506-0046 USA
* Team Leader
Research Problem
As inexpensive computing power and image sensors have become available, automated
object tracking and recognition in live video has become an important research topic,
as well as an application area, in the image processing and computer vision community.
The results of tracking and recognition can be applied further to surveillance,
vision-based human-computer interaction, and semantic indexing of video.
Many approaches to video tracking have been proposed over the last few decades.
Among these methods, the Kalman filter, an optimal recursive Bayesian filter for linear
dynamics and Gaussian noise, is widely used for object tracking. It remains effective
even when tracked objects overlap, but it is computationally involved. The Sum of
Squared Differences is efficient but less accurate. For object detection, the typical
problem is: given a training set of images of N objects (a database) and a new image F
containing one of these N objects, can we determine which object F contains? There are
many papers on this problem. Network transmission involves topics such as improving
the compression ratio, increasing the transmission speed, and reliable video encoding,
decoding, and transfer.
Research Goals and Objectives
The main goal of our research is to recognize and track several moving objects in real
time against a simple background, and to provide reliable, fast video transfer over
the network. We will study how to reduce the computational complexity, increase the
efficiency, and improve the accuracy of the proposed algorithms and methods. We aim
for high accuracy: 100% object and gesture recognition, with every object in the scene
detected and tracked. For our wireless network model, we will try to achieve a
compression ratio of 200, a resolution of 352x288, and a transfer speed of
10 Kbps-1.0 Mbps. In addition, we will use the TCP protocol to transfer video over the
LAN/Internet to ensure reliable delivery. Finally, we will design a friendly user
interface. The interface has the following menus: load (load a video file), play
(play the video file), track, and transmit. These menus trigger the corresponding
operations.
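As a rough sanity check on these targets, the following sketch estimates the compressed bitrate. The YUV 4:2:0 sampling and 25 fps frame rate are our own assumptions (the proposal does not state them); only the 352x288 resolution and the 200:1 ratio come from the text above.

```python
# Back-of-the-envelope check that a 200:1 compression ratio keeps a
# 352x288 stream inside the 10 Kbps - 1.0 Mbps target.
# Assumed (not from the proposal): YUV 4:2:0 sampling, 25 fps.

WIDTH, HEIGHT = 352, 288        # CIF resolution from the proposal
BYTES_PER_PIXEL = 1.5           # assumed YUV 4:2:0 sampling
FPS = 25                        # assumed frame rate
RATIO = 200                     # target compression ratio from the proposal

raw_bps = WIDTH * HEIGHT * BYTES_PER_PIXEL * 8 * FPS   # raw bitrate, bits/s
compressed_bps = raw_bps / RATIO

print(f"raw:        {raw_bps / 1e6:.1f} Mbps")          # 30.4 Mbps
print(f"compressed: {compressed_bps / 1e3:.0f} Kbps")   # 152 Kbps
```

Under these assumptions the compressed stream lands at roughly 150 Kbps, comfortably within the stated 10 Kbps-1.0 Mbps range.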
Research Design and Methods
For simplicity, we make the following assumptions about the objects in the video: the
background is stationary, with three people moving in the same direction at all times;
the images of the three people never overlap; and for each person, only two gestures
are to be recognized.
We divide the whole system design into the following four parts:
1) Background extraction and image preprocessing
This part is the foundation for the work that follows. It provides an effective
means of segmenting objects moving in front of a static background. Other
preprocessing tasks, such as binary image conversion and noise removal, are also
necessary. We will use a standard graph-cut algorithm to produce a
foreground-background segmentation and reduce the error around segmented
foreground objects. The code we will use and modify comes from
http://maven.smith.edu/~nhowe/research/code/. This Matlab code includes the
following steps:
(1) Load the video from an AVI file;
(2) Convert to HSV color space;
(3) Generate a Gaussian background model in HSV space for each pixel;
(4) Do frame-by-frame differencing:
    (a) Find the scaled deviation of this frame from the background;
    (b) Compare with a threshold to generate a labeling;
    (c) Use graph cuts;
    (d) Find the largest connected component;
(5) End of foreground extraction.
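The per-pixel Gaussian model and scaled-deviation threshold (steps 3 and 4a-4b) can be sketched as follows. This is a minimal stand-in, not the referenced Matlab code: it works on single-channel frames, and the graph-cut and connected-component steps are omitted.

```python
import numpy as np

def build_background_model(frames):
    """Per-pixel Gaussian background model: mean and standard deviation
    over a stack of background-only frames (step 3 of the outline)."""
    stack = np.stack(frames).astype(np.float64)
    return stack.mean(axis=0), stack.std(axis=0) + 1e-6  # avoid divide-by-zero

def foreground_mask(frame, mean, std, k=3.0):
    """Steps 4a-4b: scaled deviation of the frame from the background,
    thresholded at k standard deviations. Graph-cut smoothing (step 4c)
    and the largest-connected-component step (4d) are omitted here."""
    dev = np.abs(frame.astype(np.float64) - mean) / std
    if dev.ndim == 3:                # multi-channel: max deviation per pixel
        dev = dev.max(axis=2)
    return dev > k

# Toy usage: a flat noisy background with one bright "object".
rng = np.random.default_rng(0)
bg_frames = [100 + rng.normal(0, 2, (48, 64)) for _ in range(20)]
mean, std = build_background_model(bg_frames)

frame = 100 + rng.normal(0, 2, (48, 64))
frame[10:20, 30:40] += 80            # plant the object
mask = foreground_mask(frame, mean, std)
print(f"object coverage: {mask[10:20, 30:40].mean():.2f}")
print(f"mask fraction:   {mask.mean():.3f}")
```

Thresholding alone leaves speckle noise in the mask, which is exactly why the referenced code follows it with graph cuts and a largest-connected-component pass.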
2) Object detection, matching and tracking
Instead of a database-based algorithm, we will use relatively simple template
matching. We choose Normalized Correlation as the main method in our project
because it is robust, accurate, and one of the most commonly used template
matching criteria. Detection (recognition) and tracking are therefore combined
and treated as a whole in our project.
(1) Our recognition and matching algorithm and related definitions are described
as follows:
1. Definition:
1.1 Template: a small image or sub-image.
1.2 Definition of C': For two images A and B, we want to find object matching
between the two. Define Correlation:
C' = Σ_{x,y} A(x,y) B(x,y)
1.3 Definition of C: In our algorithm, we use the corrected formulation
(Normalized Correlation):

C = Σ_{x,y} [A(x,y) − Ā][B(x,y) − B̄] / ( Σ_{x,y} [A(x,y) − Ā]² · Σ_{x,y} [B(x,y) − B̄]² )^(1/2)

where Ā and B̄ denote the mean values of A and B.
2. Goal:
The goal is to find occurrences of the template in a larger image; that is, to
find all locations in the image where the template matches.
3. Algorithm Description:
3.1 Basic idea:
For each image coordinate (i, j), compare the template against the image window
of the template's size s anchored at (i, j), computing a pixel-wise similarity
metric, and record the score. Matches are the locations with the closest
similarity.
3.2 Steps:
    (a) Calculate C at each (i, j). The value returned by Normalized Correlation
always lies in [-1, 1].
    (b) If C is greater than or equal to the threshold, we consider it a match.
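The steps above can be sketched directly. This is a minimal illustration of the algorithm (assuming grayscale images), not the project's eventual Matlab implementation:

```python
import numpy as np

def normalized_correlation(patch, template):
    """Normalized Correlation C between an image patch A and a template B,
    as defined above; the result always lies in [-1, 1]."""
    a = patch.astype(np.float64) - patch.mean()
    b = template.astype(np.float64) - template.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return 0.0 if denom == 0 else (a * b).sum() / denom

def match_template(image, template, threshold=0.95):
    """Steps 3.1-3.2: slide the template over the image, record C at every
    (i, j), and report the locations where C >= threshold."""
    h, w = template.shape
    H, W = image.shape
    scores = np.full((H - h + 1, W - w + 1), -1.0)
    for i in range(H - h + 1):
        for j in range(W - w + 1):
            scores[i, j] = normalized_correlation(image[i:i+h, j:j+w], template)
    return scores, np.argwhere(scores >= threshold)

# Toy usage: plant the template in a noisy image and recover its position.
rng = np.random.default_rng(1)
template = rng.integers(0, 255, (8, 8)).astype(np.float64)
image = rng.normal(128, 10, (32, 32))
image[5:13, 20:28] = template
scores, hits = match_template(image, template)
print(hits)  # expected to contain [5, 20], the planted location
```

The exhaustive double loop is the naive O(HWhw) form; in practice the same computation is usually done with FFT-based correlation.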
(2) Reference Code 1: from a previous tracking-related project.
This code covers the following aspects:
(a) Background extraction;
(b) Object detection: locate objects against the background and form a template
for each object. First, the difference between the incoming frame and the
background is computed. The images are then converted to binary. Finally, the
center and size of each object are calculated to form its template;
(c) Template matching: in this sample program, the matching algorithm is SAD
(Sum of Absolute Differences).
This code is written in C. We borrow its idea of the different phases of
tracking, and we will study how its recognition and matching are implemented so
that we can implement the corresponding functionality in Matlab. Instead of the
SAD approach used in this code, we will use Normalized Correlation in our
project.
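The trade-off motivating that choice can be shown in a few lines. This hypothetical comparison is ours (the reference C code is not reproduced here); it contrasts SAD against Normalized Correlation under a uniform lighting change:

```python
import numpy as np

def sad(patch, template):
    """Sum of Absolute Differences: cheap, but it compares raw intensities,
    so a global lighting change at the true match inflates the score."""
    return float(np.abs(patch - template).sum())

def nc(patch, template):
    """Normalized Correlation: subtracting the means cancels a uniform
    intensity offset, one reason it is preferred over SAD here."""
    a = patch - patch.mean()
    b = template - template.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

# Brighten the scene uniformly: SAD at the true location degrades,
# while Normalized Correlation is unaffected.
rng = np.random.default_rng(2)
template = rng.integers(0, 200, (8, 8)).astype(np.float64)
brightened = template + 50.0          # global illumination change

print(sad(template, template))        # 0.0  -- perfect under identical lighting
print(sad(brightened, template))      # 3200.0 (50 * 64 pixels) -- "worse" match
print(round(nc(brightened, template), 6))  # 1.0 -- still a perfect match
```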
(3) Reference Code 2: from http://www.deas.harvard.edu/projects/weitzlab/matlab/
This code is written in Matlab and uses general-purpose tracking functions. The
program includes the following steps:
1) Generate a set of files containing the x, y positions;
2) Put the position list in the required format;
3) Invoke track with the right parameters.
From this code, we learn how to use the tracking-related Matlab functions, such
as track(), track1(), and track2().
The two code bases mentioned above serve different purposes. The first provides
the basic idea and the algorithm for recognition and matching; the second shows
how to use Matlab functions to implement object detection and tracking.
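The core of what such a tracking routine does, linking per-frame detections into trajectories, can be sketched as below. This greedy nearest-neighbour version is a simplified stand-in for the Matlab track() routine referenced above, which also handles gaps and particle memory:

```python
import numpy as np

def link_positions(frames):
    """Greedy nearest-neighbour linking of per-frame (x, y) detections into
    trajectories. Assumes the same number of detections in every frame and
    small inter-frame motion; real trackers relax both assumptions."""
    tracks = [[p] for p in frames[0]]          # one track per first-frame detection
    for detections in frames[1:]:
        remaining = list(detections)
        for track in tracks:
            if not remaining:
                break
            last = np.array(track[-1])
            dists = [np.linalg.norm(last - np.array(p)) for p in remaining]
            track.append(remaining.pop(int(np.argmin(dists))))
    return tracks

# Toy usage: two points moving right; linking keeps their identities separate.
frames = [[(0, 0), (10, 10)], [(1, 0), (11, 10)], [(2, 0), (12, 10)]]
tracks = link_positions(frames)
print(tracks[0])  # [(0, 0), (1, 0), (2, 0)]
print(tracks[1])  # [(10, 10), (11, 10), (12, 10)]
```

Because our assumptions guarantee the three people never overlap, even this greedy association should keep identities stable.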
3) Video compression and transmission
For this part, the necessity of video coding will be discussed, and the MPEG-4
series will be reviewed. We will also study in detail the compression methods
used in the MPEG-4 standard, such as the DCT transform, motion estimation and
compensation, and variable-length coding. In addition, we will describe how the
MPEG-4 encoded video data is transferred over the LAN or Internet (TCP/IP). We
will mainly use the C language to program the transmission part.
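One detail worth noting: TCP guarantees ordered, reliable delivery but not message boundaries, so encoded frames need framing of their own. The sketch below (in Python for brevity; the actual transmission code will be in C, and the byte strings stand in for MPEG-4 data) uses a simple length prefix per frame:

```python
import socket
import struct
import threading

def recv_exact(conn, n):
    """Read exactly n bytes; TCP may deliver fewer bytes per recv() call."""
    buf = b""
    while len(buf) < n:
        chunk = conn.recv(n - len(buf))
        if not chunk:
            raise ConnectionError("peer closed mid-frame")
        buf += chunk
    return buf

def send_frames(conn, frames):
    """Length-prefix each encoded frame; a zero length marks end of stream."""
    for data in frames:
        conn.sendall(struct.pack("!I", len(data)) + data)
    conn.sendall(struct.pack("!I", 0))

def recv_frames(conn):
    """Recover frame boundaries from the byte stream using the prefixes."""
    frames = []
    while True:
        (n,) = struct.unpack("!I", recv_exact(conn, 4))
        if n == 0:
            return frames
        frames.append(recv_exact(conn, n))

# Loopback demo: placeholder byte strings stand in for encoded frames.
server = socket.socket()
server.bind(("127.0.0.1", 0))
server.listen(1)
port = server.getsockname()[1]
sent = [b"frame-0", b"frame-1" * 100, b"frame-2"]

def serve():
    conn, _ = server.accept()
    send_frames(conn, sent)
    conn.close()

t = threading.Thread(target=serve)
t.start()
client = socket.create_connection(("127.0.0.1", port))
received = recv_frames(client)
t.join()
client.close()
server.close()
print(received == sent)  # True
```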
4) Intelligent human-machine interface design
We will develop a friendly user interface. The major development tool is Matlab.
Staffing Plan
Ping Guo: Graduate student of Computer Science. Major contributor to background
extraction and noise removal.
Yingxue Feng: Graduate student of Computer Science. Major contributor to object
detection, matching, and tracking.
Jianjun Yang: Graduate student of Computer Science. Major contributor to object
detection, matching, and tracking, and to user interface design.
Ping Yi: Graduate student of Computer Science. Major contributor to video
coding/decoding and network transfer, and to user interface design.
Timeline
1. March 12-19: Prepare the proposal for the final project. (1 week)
2. March 21-24: Background extraction and noise removal. (4 days)
3. March 25-April 1: Object detection, matching, and tracking. (2 weeks)
4. April 13-19: Video compression and transmission. (1 week)
5. April 20-26: User interface design and code debugging. (1 week)
In addition, we will strictly follow the project schedule the lecturer has
proposed.
References
[1] N. Howe and A. Deschamps, Better Foreground Segmentation through Graph Cuts
[2] Y. Boykov, O. Veksler, and R. Zabih, Fast approximate energy minimization
via graph cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence,
23(11): 1222-1239, November 2001
[3] B.V. Cherkassky and A.V. Goldberg, On implementing the push-relabel method
for the maximum flow problem, Algorithmica, 19(4): 390-410, 1997
[4] Mircea Nicolescu, Gérard Medioni, Mi-Suen Lee, Segmentation, tracking and
interpretation using panoramic video, IEEE Workshop on Omnidirectional Vision,
Hilton Head, South Carolina, U.S.A., June 2000
[5] http://www.mathworks.com/
[6] http://arxiv.org/abs/cs.CV/0401017
[7] http://maven.smith.edu/~nhowe/research/code/
[8] Emanuele Trucco, Alessandro Verri, Introductory Techniques for 3-D Computer
Vision, Prentice Hall
[9] David A. Forsyth, Jean Ponce, Computer Vision: A Modern Approach, Publishing
House of Electronics Industry, China
[10] Yongsheng Gao, Maylor K.H. Leung, Face Recognition Using Line Edge Map
[11] http://www.math.ntnu.no/~holden/FrontBook/matlabcode.html
[12] http://www.advancedsourcecode.com/faceverification.asp
[13] http://www.iii.org.tw/special/article/VideoTracking.htm
[14] http://www.deas.harvard.edu/projects/weitzlab/matlab/
[15] Project from Prof. Ruigang Yang's class, Computer Science Department at the
University of Kentucky
[16] References provided by the lecturer