1 Formation et Analyse d’Images Session 12 Daniela Hall 16 January 2006.

1

Formation et Analyse d’Images: Session 12

Daniela Hall

16 January 2006

2

Course Overview

• Session 1 (19/09/05)
  – Overview
  – Human vision
  – Homogeneous coordinates
  – Camera models

• Session 2 (26/09/05)
  – Tensor notation
  – Image transformations
  – Homography computation

• Session 3 (3/10/05)
  – Camera calibration
  – Reflection models
  – Color spaces

• Session 4 (10/10/05)
  – Pixel based image analysis

• 17/10/05 course is replaced by Modelisation surfacique

3

Course overview

• Session 5 + 6 (24/10/05), 9:45 to 12:45
  – Contrast description
  – Hough transform

• Session 7 (7/11/05)
  – Kalman filter

• Session 8 (14/11/05)
  – Tracking of regions, pixels, and lines

• Session 9 (21/11/05)
  – Gaussian filter operators

• Session 10 (5/12/05)
  – Scale space

• Session 11 (12/12/05)
  – Stereo vision
  – Epipolar geometry

• Session 12 (16/01/06): exercises and questions

4

Exam

• Date: to be defined

• Duration: to be defined (last year it was 3h)

• Documents needed for the exam
  – Class notes
  – Pocket calculator
  – Kalman tutorial
  – Isard, Blake: Active Contours, chap. 12.1, 12.2

5

Exercises

6

Exercises

7

Exercises

8

Exercises

9

Exercises

• You have a camera that observes a corridor.

• People can enter at the left or the right of the image.

• Your task is to count the number of people that walk by.

• What approach do you propose?

10

Exercise

• How can you count the number of flowers in the image and determine their scale?

11

Rectifying images

• You need to display the image on the paper display. You have a steerable video projector and a camera. How do you proceed?

12

Exercise

How can you automatically count the number of objects in the image?

13

Robust tracking of objects

[Figure: block diagram of the tracking loop. Box labels: Trigger regions, Detection, New targets, List of targets, Predict, List of predictions, Detection, Measurements, Correct.]

14

Detection methods

• Background differencing
  – Used to detect targets whose color differs from the background.
  – The detection image is the difference between the current image and the background image.
  – Measure the energy of the detection image.
  – If the energy is above a threshold, a target is detected.
  – The position of the new target is described by the first and second moments of the thresholded detection image.

• Image differencing
  – Used to detect moving targets. The detection image contains only the borders of the target.
  – The detection image is the difference between the current image and the previous image.
  – Measure the energy of the detection image.
  – If the energy is above a threshold, a target is detected.
  – The position of the new target is more difficult to describe, because the detection image contains only the borders of the object.

• Color histogram for detection
  – Make a color histogram Htot of the empty detection region.
  – At each frame, make a color histogram Hobj of the detection region.
  – If sum_i Hobj(i)/Htot(i) > threshold, a target is detected.
  – Make a detection image where each pixel is marked by Hobj(i)/Htot(i), with i the histogram cell of the pixel's color.
  – The position of the new target is described by the first and second moments of the thresholded detection image.
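The background differencing steps listed above can be sketched as follows. This is a minimal illustration, assuming grayscale images stored as NumPy arrays; the function name `detect_by_background_difference` and the two threshold values are hypothetical and would need tuning on real data.

```python
import numpy as np

def detect_by_background_difference(image, background,
                                    pixel_thresh=30.0, energy_thresh=500.0):
    """Return (detected, mean, covariance) for one grayscale frame.

    pixel_thresh and energy_thresh are illustrative assumptions.
    """
    # Detection image: absolute difference to the background image.
    diff = np.abs(image.astype(np.float64) - background.astype(np.float64))
    # Energy of the detection image (sum of squared differences).
    energy = float(np.sum(diff ** 2))
    if energy <= energy_thresh:
        return False, None, None
    # Thresholded detection image: True where the pixel differs strongly.
    mask = diff > pixel_thresh
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        return False, None, None
    coords = np.stack([cols, rows], axis=1).astype(np.float64)  # (x, y) pairs
    mean = coords.mean(axis=0)                 # first moment: target position
    centered = coords - mean
    cov = centered.T @ centered / len(coords)  # second moment: target extent
    return True, mean, cov
```

The mean gives the position of the new target and the covariance its spatial extent, which is one way to initialise an entry in the list of targets.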

15

[Figure: example detection images obtained with background differencing and with image differencing.]

16

Session overview

1. Tracking of point neighborhoods
   1. using SSD
   2. using CC and NCC
   3. using Gaussian receptive fields

17

Tracking of point neighborhoods

• When we have additive noise, the Euclidean norm is the optimal method for neighborhood matching, because it minimises the error probability.

• Goal: find the position (i,j) of the image I(i,j) that is most similar to the pattern X(m,n).

• Hypotheses:
  – Additive Gaussian noise
  – No image rotation (2D)
  – No rotation in space (3D)
  – No scale changes

• The Euclidean norm is known as SSD ("sum of squared distances").

• The method is efficient and precise, but sensitive to violations of these hypotheses (rotation, scale changes, noise).

18

Sum of squared distances (SSD)

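• Definition:
  – Let X(m,n) be the pattern, with 0 ≤ m ≤ M-1, 0 ≤ n ≤ N-1.
  – Let I(i,j) be the image, with 0 ≤ i ≤ I-1, 0 ≤ j ≤ J-1 (M << I, N << J).

• The position (i,j) of the image I(i,j) that is most similar to the pattern X(m,n) is computed as follows, where P denotes the pixel values of the image in the neighborhood of (i,j):

$$
SSD(i,j) = \left\| P\left(i+m-\tfrac{M}{2},\, j+n-\tfrac{N}{2}\right) - X(m,n) \right\|^2
= \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left( P\left(i+m-\tfrac{M}{2},\, j+n-\tfrac{N}{2}\right) - X(m,n) \right)^2
$$

$$
\arg\min_{i,j} \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} \left( P\left(i+m-\tfrac{M}{2},\, j+n-\tfrac{N}{2}\right) - X(m,n) \right)^2
$$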

19

Sum of squared distances

• Searching a pattern X(m,n) within an image I(i,j) corresponds to placing the pattern at all possible positions (i,j) and computing the SSD(i,j).

• Depending on the size of the pattern and the image, this can be costly.

• SSD is sensitive to rotations, scale and noise.
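The exhaustive search described above can be sketched in a few lines. This is a minimal sketch, assuming grayscale NumPy arrays; the function name `ssd_search` is hypothetical, and positions refer to the top-left corner of the pattern rather than its centre, which only shifts the result by half the pattern size.

```python
import numpy as np

def ssd_search(image, pattern):
    """Return the SSD map and the top-left (row, col) of the best match."""
    img = image.astype(np.float64)
    pat = pattern.astype(np.float64)
    H, W = img.shape
    h, w = pat.shape
    # One SSD value per position where the pattern fits inside the image.
    ssd = np.empty((H - h + 1, W - w + 1))
    for r in range(H - h + 1):
        for c in range(W - w + 1):
            patch = img[r:r + h, c:c + w]
            ssd[r, c] = np.sum((patch - pat) ** 2)
    # The best match minimises the SSD.
    best = np.unravel_index(np.argmin(ssd), ssd.shape)
    return ssd, (int(best[0]), int(best[1]))
```

The two nested loops make the cost proportional to the number of tested positions times the pattern size, which is why the search is costly for large images.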

20

Pattern as a feature vector

• Any image patch can be seen as a vector.

• To transform a 2D image patch into a vector, concatenate its lines one after another. For an image of size MxN, you obtain a vector with M*N dimensions.

$$
X(m,n) \rightarrow \vec{X}_k \quad \text{with } k = nM + m
$$

21

SSD using feature vectors

• Transform the pattern X(m,n) and the neighborhood of size MxN at the position (i,j) of the image I to vectors.

• SSD is the squared norm of the difference of these two vectors.

$$
SSD(i,j) = \left\| \vec{X} - \vec{P} \right\|^2 = \sum_{m=0}^{MN-1} \left( P(m) - X(m) \right)^2
$$
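As a small illustration of the vector view, the sketch below flattens two patches row by row and computes the SSD as the squared norm of their difference. The helper names `patch_to_vector` and `ssd_vectors` are hypothetical.

```python
import numpy as np

def patch_to_vector(patch):
    """Concatenate the rows of a 2D patch into one vector (row-major order)."""
    return np.asarray(patch, dtype=np.float64).reshape(-1)

def ssd_vectors(x_vec, p_vec):
    """Squared Euclidean norm of the difference of two feature vectors."""
    diff = p_vec - x_vec
    return float(diff @ diff)

# Example: a 2x2 pattern X and a 2x2 image neighborhood P.
X = patch_to_vector([[1, 2], [3, 4]])
P = patch_to_vector([[1, 0], [3, 7]])
result = ssd_vectors(X, P)  # (0)^2 + (-2)^2 + (0)^2 + (3)^2 = 13
```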

22

Cross Correlation (CC)

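• Another method for pattern matching is cross correlation (scalar product). The best match maximises the product.

$$
CC(i,j) = \langle \vec{X}, \vec{P} \rangle = \sum_{m=0}^{MN-1} P(m)\, X(m)
$$

• For normalised vectors, the scalar product is the cosine of the angle between the vectors. This is the definition of the normalised cross correlation (NCC), with -1 ≤ NCC ≤ 1.

$$
NCC(i,j) = \langle \vec{X}_u, \vec{P}_u \rangle = \sum_{m=0}^{MN-1} \frac{P(m)}{\|\vec{P}\|} \, \frac{X(m)}{\|\vec{X}\|},
\qquad X_u(k) = \frac{X(k)}{\|\vec{X}\|},
\qquad \|\vec{X}\| = \left( \sum_{m=0}^{MN-1} X(m)^2 \right)^{1/2}
$$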
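A minimal sketch of both measures on feature vectors follows. The function names are hypothetical, and returning 0 for a zero-norm vector is an assumption made here to avoid a division by zero; the slides do not say how to treat that case.

```python
import numpy as np

def cross_correlation(x_vec, p_vec):
    """Scalar product of the pattern vector and the neighborhood vector."""
    return float(np.dot(x_vec, p_vec))

def normalised_cross_correlation(x_vec, p_vec, eps=1e-12):
    """Cosine of the angle between the two vectors, in [-1, 1]."""
    norm_x = np.linalg.norm(x_vec)
    norm_p = np.linalg.norm(p_vec)
    if norm_x < eps or norm_p < eps:
        return 0.0  # assumption: NCC is undefined for a zero vector
    return float(np.dot(x_vec, p_vec) / (norm_x * norm_p))
```

Unlike CC, the NCC does not change when the neighborhood is multiplied by a positive constant, for example under a global brightness gain.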

23

Relation of SSD and NCC

• The best match minimises SSD and maximises NCC.

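• For the normalised vectors $\vec{X}_u = \vec{X}/\|\vec{X}\|$ and $\vec{P}_u = \vec{P}/\|\vec{P}\|$, we note:

$$
\left\| \vec{X}_u - \vec{P}_u \right\|^2
= \|\vec{X}_u\|^2 - 2 \langle \vec{X}_u, \vec{P}_u \rangle + \|\vec{P}_u\|^2
= 1 - 2 \langle \vec{X}_u, \vec{P}_u \rangle + 1
= 2 - 2 \langle \vec{X}_u, \vec{P}_u \rangle
$$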
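The relation between the SSD of normalised vectors and the NCC can be checked numerically. The sketch below uses two arbitrary example vectors; the helper names are hypothetical.

```python
import numpy as np

def normalise(v):
    """Scale a vector to unit Euclidean norm."""
    v = np.asarray(v, dtype=np.float64)
    return v / np.linalg.norm(v)

def ssd_and_ncc_of_normalised(x, p):
    """Return (SSD, NCC) computed on the unit-norm versions of x and p."""
    xu, pu = normalise(x), normalise(p)
    ssd = float(np.sum((xu - pu) ** 2))
    ncc = float(np.dot(xu, pu))
    return ssd, ncc

ssd, ncc = ssd_and_ncc_of_normalised([1, 2, 3, 4], [2, 1, 0, 5])
# For unit vectors, SSD and 2 - 2 * NCC agree up to rounding error.
gap = abs(ssd - (2.0 - 2.0 * ncc))
```

Since SSD = 2 - 2 NCC for unit vectors, the position that minimises one is the position that maximises the other.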

24

Tracking by correlation

• Computation time of tracking by correlation depends on the size of the pattern (target) and the size of the image.

• Testing all possible positions in the image is slow.

• How can we optimise tracking by correlation (reduce the computation time)?
  – Reduce the number of tests by testing only one position out of two in each direction. This increases speed by a factor of 4 but reduces the precision of the result. Problem: if too few positions are tested, the target might be missed.
  – Reduce the number of tests by restricting the search to a small search region (region of interest, ROI).
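Both optimisations can be sketched as options of one search routine. This is an illustrative sketch, not a reference implementation: the function name `ssd_search_roi`, the `roi` tuple layout, and the `step` parameter are assumptions, and SSD is used as the matching score.

```python
import numpy as np

def ssd_search_roi(image, pattern, roi=None, step=1):
    """Search the best SSD match, optionally in a ROI and with subsampling.

    roi = (r0, r1, c0, c1): inclusive range of top-left positions to test.
    step = 2 tests one position out of two in each direction.
    Returns (best_position, best_ssd, number_of_tests).
    """
    img = image.astype(np.float64)
    pat = pattern.astype(np.float64)
    H, W = img.shape
    h, w = pat.shape
    r0, r1, c0, c1 = (0, H - h, 0, W - w) if roi is None else roi
    # Clip the ROI so that the pattern always fits inside the image.
    r0, r1 = max(r0, 0), min(r1, H - h)
    c0, c1 = max(c0, 0), min(c1, W - w)
    best_pos, best_ssd, n_tests = None, np.inf, 0
    for r in range(r0, r1 + 1, step):
        for c in range(c0, c1 + 1, step):
            n_tests += 1
            s = float(np.sum((img[r:r + h, c:c + w] - pat) ** 2))
            if s < best_ssd:
                best_pos, best_ssd = (r, c), s
    return best_pos, best_ssd, n_tests
```

The returned test count makes the trade-off visible: `step=2` divides the number of tests by about 4, while a small ROI can reduce it by orders of magnitude.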

25

Speed up of tracking

• The search region can be determined from the position of the target at time t-1 and its maximum speed. This is measured in pixels/delta t.

• If we can reduce the search region, we can process more images per second (reduce delta t), which in turn allows us to reduce the search region further, and so on.

• Problem: speed depends on the distance of the object to the camera. Close objects have higher speeds than objects far away.
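The search region can be derived from the previous target position and the maximum speed, as sketched below. The function names and the (x_min, y_min, x_max, y_max) layout are assumptions for illustration; the maximum speed is assumed to be given in pixels per second.

```python
def max_displacement_per_frame(max_speed_px_per_s, frame_rate_hz):
    """Largest distance in pixels the target can move between two frames."""
    return max_speed_px_per_s / frame_rate_hz

def search_region(prev_center, target_size, max_speed_px_per_s, frame_rate_hz):
    """Bounding box (x_min, y_min, x_max, y_max) of the search region.

    The region is the target box at time t-1, enlarged on every side by
    the maximum displacement per frame.
    """
    margin = max_displacement_per_frame(max_speed_px_per_s, frame_rate_hz)
    cx, cy = prev_center
    w, h = target_size
    return (cx - w / 2 - margin, cy - h / 2 - margin,
            cx + w / 2 + margin, cy + h / 2 + margin)
```

Doubling the frame rate halves the margin, which is the feedback effect described above. The function does not address the stated problem that the speed in pixels depends on the distance to the camera; a single maximum speed has to cover the closest objects.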

26

Example

27

Example

• Person traverses entry hall in 5.2 s (130 frames / 25 frames/s)

• Distance is 288 pixels, target size is 45x35 pixels

• Speed: 288 pixels / 5.2 s = 55.4 pixels/s

• Let maximum speed be twice the measured speed: 110.8 pixels/s

• Then we need a search region of size target size +/- the maximum displacement per frame:

$$
\frac{110.8\ \text{pixels/s}}{25\ \text{frames/s}} \approx 4.4\ \text{pixels/frame}
$$

• ROI = target size +/- 4.4 pixels = 54 x 44 pixels.

28

Example

• Number of tests for exhaustive search (searching the whole image of size 384x288 pixels):
  – (384-45)(288-35) = 85767 tests

• Number of tests using the search region (54x44 pixels):
  – (54-45)(44-35) = 81 tests

• Speed up factor: 85767/81 ≈ 1059
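The numbers of this example can be recomputed with a short script. This is a sketch that follows the counting convention used on the slide, (region size - target size) per axis; the function name `tracking_example` and the dictionary keys are hypothetical.

```python
def tracking_example(frames=130, frame_rate=25.0, distance_px=288.0,
                     target=(45, 35), image=(384, 288)):
    """Recompute the entry-hall example; returns the intermediate values."""
    duration_s = frames / frame_rate          # time to traverse the hall
    speed = distance_px / duration_s          # measured speed in pixels/s
    max_speed = 2.0 * speed                   # assumed maximum speed
    margin = max_speed / frame_rate           # max displacement per frame
    # ROI: target size enlarged by the margin on both sides, rounded.
    roi = (round(target[0] + 2 * margin), round(target[1] + 2 * margin))
    # Slide convention: (region - target) positions per axis.
    tests_full = (image[0] - target[0]) * (image[1] - target[1])
    tests_roi = (roi[0] - target[0]) * (roi[1] - target[1])
    return {
        "duration_s": duration_s,
        "speed": speed,
        "margin": margin,
        "roi": roi,
        "tests_full": tests_full,
        "tests_roi": tests_roi,
        "speed_up": tests_full / tests_roi,
    }

values = tracking_example()
```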
