
Video Based Tracking & Recognition via (Wireless) Network Transmission

    A Proposal Prepared for the

Final Project of the Graduate Course EE635: Digital Image Processing

    March 20, 2007

    Yingxue Feng* [email protected]

    Jianjun Yang [email protected]

    Ping Yi [email protected]

    Ping Guo [email protected]

    Department of Computer Science

    University of Kentucky

    Lexington, KY 40506-0046 USA

    * Team Leader


    Research Problem

As inexpensive computing power and image sensors have become widely available, automated object tracking and recognition in live video has become an important research topic, as well as an application area, in the image processing and computer vision communities in recent years. The results of tracking and recognition can be applied further in surveillance, vision-based human-computer interaction, and semantic indexing of video.

Many approaches to video tracking have been proposed over the last few decades. Among these methods, the Kalman filter, as an optimal recursive Bayesian filter for linear dynamics and Gaussian noise, is useful for object tracking. It is powerful enough to handle overlapping tracked objects, but it is comparatively complicated. The Sum of Squared Differences (SSD) is efficient but not as accurate. For object detection, the typical problem is: given a training set (database) of images of N objects and a new image F containing one of these N objects, can we determine which of the N objects F shows? There are many papers on this issue. Network transmission involves topics such as improving the compression ratio, increasing the transmission speed, and reliable video coding/decoding and transfer.

    Research Goals and Objectives

The main goal of our research is to carry out recognition and tracking of several moving objects in real time against a simple background, and to provide reliable and fast video transfer over the network. We are going to study how to reduce the computational complexity, increase the efficiency, and improve the accuracy of the proposed algorithms and methods. We will study how to achieve high accuracy, aiming for 100% object and gesture recognition, with every object in the scene detected and tracked as well. For our wireless network model, we will try to achieve a compression ratio of 200:1, a resolution of 352×288, and a transfer rate of 10 Kbps to 1.0 Mbps. In addition, we will use the TCP protocol to transfer video over the LAN/Internet to ensure reliable delivery. Finally, we will design a friendly user interface. Basically, the interface has the following menus: load (load a video file), play (play the video file), track, and transmit. These menus invoke the corresponding


    operations.

    Research Design and Methods

For simplicity, we will make the following assumptions about the objects in the video: a stationary background with three people moving in the same direction at all times; the images of the three people will not overlap each other at any time; and for each person, only two gestures are to be recognized.

We divide the whole system design into the following four parts:

1) Background extraction and image preprocessing

This part is the fundamental step for our subsequent work. It provides an effective means of segmenting objects moving in front of a static background. Other preprocessing tasks such as binary image conversion and noise removal are also necessary. We will use a standard graph-cut algorithm to produce a foreground-background segmentation and to reduce the error around segmented foreground objects. The code we will utilize and modify is from http://maven.smith.edu/~nhowe/research/code/. This Matlab code includes the following steps:

(1) Load the video from an AVI file;
(2) Convert to HSV color space;
(3) Generate a Gaussian background model in HSV space for each pixel;
(4) Do frame-by-frame differencing:
    (a) Find the scaled deviation of this frame from the background;
    (b) Compare with a threshold to generate a labeling;
    (c) Apply graph cuts;
    (d) Find the largest connected component;


(5) End of foreground extraction (a minimal Matlab sketch of these steps is given below).
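As a rough illustration of steps (1)-(5), the following is a minimal sketch of our own, not the reference code itself: it builds a per-pixel Gaussian background model in HSV from the first frames, thresholds the scaled deviation of each later frame, and keeps the largest connected component. The graph-cut refinement of step (4c) is replaced by a plain threshold here, and the file name, number of model frames, and 3-sigma threshold are placeholder choices.

% Minimal sketch of steps (1)-(5): per-pixel Gaussian background model in HSV,
% frame differencing, thresholding, and largest-connected-component extraction.
% The graph-cut step is omitted; a simple threshold stands in for it.
v   = VideoReader('walk.avi');                 % (1) load the video (placeholder name)
nBg = 30;                                      % frames used to build the model
bg  = zeros(v.Height, v.Width, 3, nBg);
for k = 1:nBg                                  % (2)-(3) HSV conversion, background stack
    bg(:,:,:,k) = rgb2hsv(im2double(readFrame(v)));
end
mu    = mean(bg, 4);                           % per-pixel mean of the background model
sigma = std(bg, 0, 4) + 1e-3;                  % per-pixel std; small constant avoids /0

while hasFrame(v)                              % (4) frame-by-frame differencing
    f   = rgb2hsv(im2double(readFrame(v)));
    dev = abs(f - mu) ./ sigma;                % (4a) scaled deviation from the background
    fg  = max(dev, [], 3) > 3;                 % (4b) 3-sigma threshold labels foreground
    [lbl, n] = bwlabel(fg);                    % (4d) keep the largest connected component
    if n > 0
        stats = regionprops(lbl, 'Area');
        [~, biggest] = max([stats.Area]);
        fg = (lbl == biggest);
    end
    imshow(fg); drawnow;                       % (5) show the extracted foreground mask
end

VideoReader is the current Matlab interface for reading AVI files; older releases used aviread, but the rest of the procedure is unchanged.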

2) Object detection, matching and tracking

Instead of a database-based algorithm, we will adopt a relatively simple template-matching approach. We choose Normalized Correlation as the main method in our project because it is robust and accurate and is one of the most commonly used template-matching criteria. Detection (recognition) and tracking are therefore combined and treated as a whole in our project.

(1) Our recognition and matching algorithm and other related content are described as follows:

    1. Definition:

1.1 Template: a small image or sub-image (e.g., a cropped region containing the object to be located).

    1.2 Definition of C': For two images A and B, we want to find object matching

    between the two. Define Correlation:

C' = \sum_{x,y} A(x,y)\, B(x,y)

    1.3 Corrected definition of C: In our algorithm, we use the formulation

    (Normalized Correlation):


C = \frac{\sum_{x,y} [A(x,y) - \bar{A}]\,[B(x,y) - \bar{B}]}{\left( \sum_{x,y} [A(x,y) - \bar{A}]^2 \; \sum_{x,y} [B(x,y) - \bar{B}]^2 \right)^{1/2}}

where \bar{A} and \bar{B} denote the mean values of A and B over the region being compared.

    2. Goal:

The goal is to find occurrences of the template in a larger image, i.e., to locate matches of the template within each video frame.

    3. Algorithm Description:

    3.1 Basic idea:

For each image coordinate (i, j), given a template of size s, compute a pixel-wise similarity metric between the template and the image patch at (i, j), and record the similarity. A match is declared at the coordinates (i, j) with the best similarity measurement.

    3.2 Steps:

(a) Calculate C at each candidate location; the value returned by Normalized Correlation lies between -1 and 1.

(b) Compare C with a preset threshold; if C is greater than or equal to the threshold, we consider it a match. A minimal Matlab sketch of these two steps is given below.
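To make steps (a)-(b) concrete, the following is a minimal sketch (our own, not code from the references) that evaluates C at every placement of a grayscale template T over a grayscale image A and returns the locations whose score meets the threshold. It implements the Normalized Correlation formula above directly; the Image Processing Toolbox function normxcorr2 computes an equivalent score more efficiently. The function name, variable names, and the 0.8 threshold in the usage comment are placeholders.

% Normalized Correlation template matching (direct implementation of C).
% A is a grayscale image and T a grayscale template, both of class double.
function [rows, cols, score] = nc_match(A, T, thresh)
    [m, n] = size(T);
    [H, W] = size(A);
    Tm    = T - mean(T(:));                    % zero-mean template
    denT  = sqrt(sum(Tm(:).^2));
    score = -inf(H - m + 1, W - n + 1);
    for i = 1:H - m + 1
        for j = 1:W - n + 1
            P   = A(i:i+m-1, j:j+n-1);         % image patch under the template
            Pm  = P - mean(P(:));              % zero-mean patch
            den = denT * sqrt(sum(Pm(:).^2));
            if den > 0
                score(i, j) = sum(sum(Pm .* Tm)) / den;   % C at (i, j)
            end
        end
    end
    [rows, cols] = find(score >= thresh);      % locations above the threshold
end

% Example usage (placeholder file names):
% A = im2double(rgb2gray(imread('frame.png')));
% T = im2double(rgb2gray(imread('template.png')));
% [r, c] = nc_match(A, T, 0.8);

In practice the threshold trades missed detections against false matches and would be tuned on our test video.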

(2) Reference code 1: from an earlier tracking-related project.

This code covers the following aspects:

(a) Background extraction;


(b) Object detection: locate objects against the background and form a template for each object. First, the difference between the incoming frame and the background is computed. The difference images are then converted to binary images. Finally, for each object, its center and size are calculated to form a template;

(c) Template matching: in this sample program, the matching algorithm is SAD (sum of absolute differences).

This code is written in C. We borrow from it the idea of dividing tracking into distinct phases, and we will also study how recognition and matching are implemented so that we can implement the corresponding functionality in Matlab (a rough sketch of the detection phase is given below). Rather than the SAD approach used in that code, we will use Normalized Correlation in our project.
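As a rough idea of how the detection phase described in (b) might be re-expressed in Matlab (this is our own sketch, not a translation of the C reference code), the function below differences a frame against the background, binarizes the result, and records each object's center, bounding box, and image patch as a template. The function name, the 50-pixel noise-removal size, and the threshold argument are placeholder choices.

% Form object templates from one frame: difference with the background,
% binarize, and record each object's center, size, and image patch.
function templates = form_templates(frame, background, thresh)
    diffImg = abs(im2double(frame) - im2double(background));   % frame differencing
    bw = max(diffImg, [], 3) > thresh;                          % binary foreground mask
    bw = bwareaopen(bw, 50);                                    % remove small noise blobs
    stats = regionprops(bw, 'Centroid', 'BoundingBox');         % one entry per object
    templates = struct('center', {}, 'box', {}, 'patch', {});
    for k = 1:numel(stats)
        box = round(stats(k).BoundingBox);                      % [x y width height]
        templates(k).center = stats(k).Centroid;
        templates(k).box    = box;
        templates(k).patch  = imcrop(frame, box);               % template image patch
    end
end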

(3) Reference code 2: from http://www.deas.harvard.edu/projects/weitzlab/matlab/

This code is written in Matlab and uses general-purpose tracking functions. The program includes the following steps:

1) Generate a set of files that contain the x, y positions;
2) Put the position list in the required format;
3) Invoke track() with the right parameters.

From this code, we learn how to use the tracking-related functions in Matlab, such as track(), track1(), and track2(). A hedged sketch of steps 2) and 3) is given below.
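The sketch below illustrates steps 2) and 3) under an assumption about the interface: we assume, as in the particle-tracking Matlab code distributed on that page, that track() accepts a position list whose rows are [x y frame], sorted by frame, together with a maximum per-frame displacement, and returns the same rows with an appended track-ID column. The detected centers shown are placeholder data, and the exact signature must be verified against the downloaded code.

% Assumed usage of track() from the weitzlab Matlab package (to be verified):
% rows of poslist are [x y frame], sorted by frame number.
centers = {[10 12; 50 48; 90 91], ...          % object centers found in frame 1
           [11 13; 52 47; 92 90]};             % object centers found in frame 2
poslist = [];
for f = 1:numel(centers)
    c = centers{f};
    poslist = [poslist; c, repmat(f, size(c, 1), 1)];   % append the frame number column
end
maxdisp = 5;                                   % maximum displacement between frames (pixels)
tracks  = track(poslist, maxdisp);             % assumed to return [x y frame id] rows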

The two reference code bases serve different purposes. The first provides us with the basic idea and the algorithm for recognition and matching. The second


    shows how to use the Matlab functions to implement object detection and

    tracking.

3) Video compression and transmission

For this part, the necessity of video coding will be discussed, and the MPEG-4 series will be reviewed. We will also study in detail the compression methods used in the MPEG-4 standard, such as the DCT transform, motion estimation and compensation, and variable-length coding. In addition, how MPEG-4 coded video data is transferred over a LAN or the Internet (TCP/IP) will be introduced. We will mainly use the C language to program the transmission part. A toy illustration of the block-DCT step is sketched below.
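As a toy illustration of the transform step shared by MPEG-style coders (not the MPEG-4 codec itself, and separate from the C transmission code), the Matlab sketch below applies an 8×8 block DCT to a grayscale test image, coarsely quantizes the coefficients, and reconstructs the image; the quantization step size of 0.05 is a placeholder.

% Toy 8x8 block-DCT compression: forward transform, uniform quantization,
% inverse transform. dct2/idct2 come from the Image Processing Toolbox.
img = im2double(imread('cameraman.tif'));      % grayscale test image shipped with the toolbox
q   = 0.05;                                    % quantization step (placeholder)
rec = zeros(size(img));
for r = 1:8:size(img, 1) - 7
    for c = 1:8:size(img, 2) - 7
        blk   = img(r:r+7, c:c+7);
        coeff = dct2(blk);                     % forward 8x8 DCT of the block
        coeff = round(coeff / q) * q;          % coarse uniform quantization
        rec(r:r+7, c:c+7) = idct2(coeff);      % inverse DCT reconstructs the block
    end
end
imshowpair(img, rec, 'montage');               % compare original and reconstruction

A larger step size q discards more coefficient detail and compresses harder, which mirrors the rate/quality trade-off we will study in MPEG-4.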

4) Intelligent human-machine interface design

We will develop a friendly user interface. The major development tool is Matlab. A minimal sketch of the menu layout is given below.
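As a bare sketch of the menu structure listed under Research Goals and Objectives (load, play, track, transmit), the following creates a Matlab figure with one menu per operation; the window title and the callbacks, which only print messages here, are placeholders that would dispatch to the corresponding modules.

% Skeleton of the user interface: one menu per operation. The callbacks are
% placeholders standing in for the real load/play/track/transmit modules.
fig = figure('Name', 'Video Tracking Demo', 'MenuBar', 'none', 'NumberTitle', 'off');
uimenu(fig, 'Label', 'Load',     'Callback', @(s, e) disp('load a video file'));
uimenu(fig, 'Label', 'Play',     'Callback', @(s, e) disp('play the video file'));
uimenu(fig, 'Label', 'Track',    'Callback', @(s, e) disp('run detection and tracking'));
uimenu(fig, 'Label', 'Transmit', 'Callback', @(s, e) disp('send the video over TCP'));
ax = axes('Parent', fig);                      % axes where video frames will be shown
title(ax, 'Load a video to begin');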

    Staffing Plan

Ping Guo: Graduate student of Computer Science. Major contributor to background extraction and noise removal.

Yingxue Feng: Graduate student of Computer Science. Major contributor to object detection, matching and tracking.

Jianjun Yang: Graduate student of Computer Science. Major contributor to object detection, matching and tracking, and to user interface design.

Ping Yi: Graduate student of Computer Science. Major contributor to video coding/decoding and network transfer, and to user interface design.

    Timeline

1. March 12-19: Prepare the proposal for the final project. (1 week)

2. March 21-24: Background extraction and noise removal. (4 days)

3. March 25-April 1: Object detection, matching and tracking. (2 weeks)

4. April 13-19: Video compression and transmission. (1 week)

5. April 20-26: User interface design and code debugging. (1 week)

In addition, we will strictly follow the project schedule proposed by the lecturer.

References


    [1] N. Howe & A. Deschamps, Better Foreground Segmentation through Graph Cuts

[2] Y. Boykov, O. Veksler, and R. Zabih, Fast approximate energy minimization via graph cuts, IEEE Transactions on Pattern Analysis and Machine Intelligence, 23(11): 1222-1239, November 2001

    [3] B.V. Cherkassky and A.V. Goldberg, On implementing the push-relabel method

    for the maximum flow problem. Algorithmica, 19(4): 390-410, 1997

[4] Mircea Nicolescu, Gérard Medioni, Mi-Suen Lee, Segmentation, tracking and interpretation using panoramic video, IEEE Workshop on Omnidirectional Vision, Hilton Head, South Carolina, U.S.A., June 2000

    [5] http://www.mathworks.com/

    [6] http://arxiv.org/abs/cs.CV/0401017

    [7] http://maven.smith.edu/~nhowe/research/code/

[8] Emanuele Trucco, Alessandro Verri, Introductory Techniques for 3-D Computer Vision, Prentice Hall

    [9] David A. Forsyth, Jean Ponce, Computer Vision: A Modern Approach, Publishing

    House of Electronics Industry, China

    [10] Yongsheng Gao, Maylor K.H. Leung, Face Recognition Using Line Edge Map

    [11] http://www.math.ntnu.no/~holden/FrontBook/matlabcode.html

    [12] http://www.advancedsourcecode.com/faceverification.asp

    [13] http://www.iii.org.tw/special/article/VideoTracking.htm

    [14] http://www.deas.harvard.edu/projects/weitzlab/matlab/

[15] Project from Prof. Ruigang Yang's class, Computer Science Department, University of Kentucky

    [16] References provided by the lecturer
