Structural information


Page 1: Structural information

Structural information

Page 2: Structural information

Structural information

• Structural information deals with the geometry of objects.

We are able to deal with only very limited amounts of structural information.

How should structural information be interpreted? We showed earlier that this is a difficult problem.

We will introduce it through the SHAPE CONTEXT method.

Page 3: Structural information

We now take a very difficult case.

Handwriting is very difficult to recognize automatically, yet we recognize numbers easily even when they are very distorted. Which algorithms achieve this?

Page 4: Structural information

We think that the contour of the object is detected first, as illustrated below.

Page 5: Structural information

Next, we think that the locations of the points on the contour determine the geometry of the object.

• We thus need to measure the location of EACH contour point RELATIVE to all other points. In other words, we need the vectors from each point to all other points.

For example, for point Z we need all 6 red vectors. Having all vectors for all points describes the object, but this description is very complicated.

Page 6: Structural information

So we now reduce the description by using an APPROXIMATE polar coordinate net. The center of the net is placed at each point in turn, and we only count HOW MANY other points fall in each area of the net.

Page 7: Structural information

Shape histogram

• The shape histogram of a contour point $a_i$ is denoted by $H_i$. It is a vector obtained from the polar net by counting the number of points in each area (bin):

$H_i = \{H_i(k) = \#(\text{points in bin } k),\ k = 1, \dots, m\}$

For a contour with M points we obtain a list of M histograms, each with m bins.

Two contours are similar if the sum of differences between their histograms is small.
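The polar net and the shape histogram described above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the implementation used in the lecture: the function names, the number of rings and sectors (`n_r`, `n_theta`), the uniform ring spacing and the choice of the net radius are all illustrative.

```python
import math


def shape_histogram(points, i, n_r=3, n_theta=8, r_max=None):
    """Count the other contour points in each bin of a polar net centred on points[i]."""
    cx, cy = points[i]
    if r_max is None:
        # Assumption: the net reaches the contour point farthest from points[i].
        r_max = max(math.hypot(x - cx, y - cy) for x, y in points) or 1.0
    hist = [0] * (n_r * n_theta)
    for j, (x, y) in enumerate(points):
        if j == i:
            continue  # the centre point itself is not counted
        dx, dy = x - cx, y - cy  # vector from point i to point j
        r = math.hypot(dx, dy)
        theta = math.atan2(dy, dx) % (2 * math.pi)
        # Clamp so that points on the outer border fall into the last bin.
        r_bin = min(int(n_r * r / r_max), n_r - 1)
        t_bin = min(int(n_theta * theta / (2 * math.pi)), n_theta - 1)
        hist[r_bin * n_theta + t_bin] += 1
    return hist


def contour_histograms(points, **net):
    """One histogram per contour point: M points give M histograms."""
    return [shape_histogram(points, i, **net) for i in range(len(points))]
```

For the four corners of a unit square, each of the four histograms sums to 3, the number of other points on the contour.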

Page 8: Structural information

Histogram differences

$|H_i - H_j| = \sum_{k=1}^{m} |H_i(k) - H_j(k)|$

This is the difference for two points i and j. Taking the differences for all contour points gives the difference between contours. Two contours which are very similar will have a very small difference.
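A small sketch of the histogram difference follows. Two points are assumptions of this sketch, not statements from the slides: the bin differences are taken as absolute values, and each point of the first contour is compared with its best-matching point on the second contour, because the slides do not say how points are put into correspondence.

```python
def histogram_difference(h_i, h_j):
    """Sum over all bins of the absolute difference between two histograms."""
    return sum(abs(a - b) for a, b in zip(h_i, h_j))


def contour_difference(hists_a, hists_b):
    """Difference between two contours, each given as a list of point histograms.

    Assumption: every histogram of contour A is matched to the closest
    histogram of contour B, and the matched differences are added up.
    """
    return sum(
        min(histogram_difference(h, g) for g in hists_b)
        for h in hists_a
    )
```

With this matching rule the result does not depend on the order in which the contour points are listed, so a contour compared with a reordered copy of itself gives 0.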

Page 9: Structural information

Example: below we can see contours with marked points and examples of histograms for these points.

Page 10: Structural information

Example: here we see handwritten numbers and the histograms of contour points, shown as grey levels.

Page 11: Structural information

Here we can see contours with points, and the polar net with its areas marked in different colours.

What counts is the number of points in each area; these counts form the histogram.

Page 12: Structural information

Other methods - examples

• There are hundreds of other methods for object retrieval and recognition.

It is impossible to lecture about all of them, since they are based on different principles.

To illustrate this, we look at one of the best methods currently known: the method of eigenfaces, which uses a completely different principle.

Page 13: Structural information

EIGENFACES – global method

1. Construction of Face Space

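Suppose a face image consists of N pixels, so it can be represented by a vector $\Gamma$ of dimension N. Let $\{\Gamma_i \mid i = 1, \dots, M\}$ be the training set of face images. The average face of these M images is given by

$\Psi = \frac{1}{M} \sum_{i=1}^{M} \Gamma_i$

Then each face $\Gamma_i$ differs from the average face $\Psi$ by $\Phi_i$:

$\Phi_i = \Gamma_i - \Psi, \quad i = 1, \dots, M$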

Page 14: Structural information

EIGENFACES

Now the covariance matrix of the training images can be constructed:

$C = A A^T$

where $A = [\Phi_1, \Phi_2, \dots, \Phi_M]$ is the N x M matrix whose columns are the difference faces. The basis vectors of the face space, i.e., the eigenfaces, are then the orthogonal eigenvectors of the covariance matrix $C$.

Since the number of training images is usually smaller than the number of pixels in an image, there will be only M-1, instead of N, meaningful eigenvectors.

Page 15: Structural information

Eigenvalues, eigenvectors

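x is an eigenvector of matrix A, and $\lambda$ is the corresponding eigenvalue, if

$A x = \lambda x$

If S is a nonsingular n x n matrix and $B = S A S^{-1}$, then matrix B has the same eigenvalues as A.

An n x n matrix has n eigenvalues (counted with multiplicity).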
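The statement about $B = S A S^{-1}$ can be checked numerically on a small case. The sketch below uses hand-picked 2 x 2 matrices (illustrative values, not taken from the lecture) and computes the eigenvalues from the trace and the determinant.

```python
import math


def eig2(a):
    """Eigenvalues of a 2x2 matrix with real eigenvalues, via trace and determinant."""
    tr = a[0][0] + a[1][1]
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    d = math.sqrt(tr * tr / 4 - det)
    return (tr / 2 - d, tr / 2 + d)


def matmul2(x, y):
    """Product of two 2x2 matrices."""
    return [[sum(x[r][k] * y[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def inv2(s):
    """Inverse of a nonsingular 2x2 matrix."""
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    return [[s[1][1] / det, -s[0][1] / det],
            [-s[1][0] / det, s[0][0] / det]]


A = [[2, 1], [1, 3]]
S = [[1, 2], [0, 1]]                     # nonsingular: determinant is 1
B = matmul2(matmul2(S, A), inv2(S))      # B = S A S^-1

print(eig2(A))
print(eig2(B))  # same eigenvalues as A, up to rounding
```

Both matrices have trace 5 and determinant 5, which is why their eigenvalues coincide.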

Page 16: Structural information

EIGENFACES

Therefore, the eigenfaces are computed by first finding the eigenvectors, $v_l$ ($l = 1, \dots, M$), of the M by M matrix L:

$L = A^T A$

where the columns of A are the difference face images. The eigenvectors, $u_l$, of the covariance matrix $C$ are then expressed by a linear combination of the difference face images, $\Phi_k$, weighted by $v_{lk}$:

$u_l = \sum_{k=1}^{M} v_{lk} \Phi_k$

In practice, a smaller set of M' (M' < M) eigenfaces is sufficient for face identification. Hence, only the M' significant eigenvectors of L, corresponding to the M' largest eigenvalues, are selected for the eigenface computation.
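The reason the small M by M matrix can be used: if v is an eigenvector of $A^T A$ with eigenvalue $\mu$, then multiplying $A^T A v = \mu v$ by A gives $A A^T (A v) = \mu (A v)$, so $A v$ is an eigenvector of the large matrix $A A^T$. The sketch below illustrates this for the first eigenface only. It is a minimal pure-Python sketch: the function names are hypothetical, and power iteration is an assumed stand-in for whatever eigen-solver was actually used.

```python
import math


def mean_face(faces):
    """Average face: pixel-wise mean of the training vectors."""
    m = len(faces)
    return [sum(f[p] for f in faces) / m for p in range(len(faces[0]))]


def small_matrix(diffs):
    """M x M matrix of inner products between difference faces (A^T A)."""
    return [[sum(a * b for a, b in zip(dj, dk)) for dk in diffs] for dj in diffs]


def top_eigenpair(mat, iters=500):
    """Largest eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    n = len(mat)
    # Start vector must not be all ones: the difference faces sum to zero,
    # so the all-ones vector is mapped to zero by A^T A.
    v = [1.0 / (k + 1) for k in range(n)]
    for _ in range(iters):
        w = [sum(mat[r][c] * v[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break  # degenerate case, e.g. all training faces identical
        v = [x / norm for x in w]
    lam = sum(v[r] * sum(mat[r][c] * v[c] for c in range(n)) for r in range(n))
    return lam, v


def first_eigenface(faces):
    """Return (eigenvalue, unit eigenface, difference faces) for the top component."""
    psi = mean_face(faces)
    diffs = [[g - p for g, p in zip(face, psi)] for face in faces]
    lam, v = top_eigenpair(small_matrix(diffs))
    # Eigenface = linear combination of difference faces weighted by v.
    u = [sum(v[k] * diffs[k][p] for k in range(len(diffs))) for p in range(len(psi))]
    norm = math.sqrt(sum(x * x for x in u))
    return lam, [x / norm for x in u], diffs
```

For M training faces with N pixels each, only an M x M matrix has to be analysed instead of an N x N one, which is the saving the slides refer to.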

Page 17: Structural information

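Thus further data compression can be obtained. M' is determined by a threshold, $\theta_\lambda$, on the ratio of the eigenvalue sums:

$\frac{\sum_{i=1}^{M'} \lambda_i}{\sum_{i=1}^{M} \lambda_i} \ge \theta_\lambda$

In the training stage, the face of each known individual, $\Gamma_k$, is projected into the face space and an M'-dimensional vector, $\Omega_k$, is obtained:

$\Omega_k = U^T (\Gamma_k - \Psi), \quad k = 1, \dots, N_c$

where $U = [u_1, \dots, u_{M'}]$ is the matrix of selected eigenfaces, $\Psi$ is the average face, and $N_c$ is the number of face classes.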
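Choosing M' from the threshold on the eigenvalue ratio can be sketched as follows. The function name and the default threshold of 0.9 are assumptions for illustration; the slides do not state which threshold value was used.

```python
def choose_m_prime(eigenvalues, theta=0.9):
    """Smallest M' whose largest eigenvalues reach the fraction theta of the total."""
    lams = sorted(eigenvalues, reverse=True)
    total = sum(lams)
    running = 0.0
    for m_prime, lam in enumerate(lams, start=1):
        running += lam
        if running / total >= theta:
            return m_prime
    return len(lams)
```

For the eigenvalues 5, 3, 1, 1 and a threshold of 0.75, the two largest eigenvalues already carry 8/10 of the total, so M' = 2.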

Page 18: Structural information

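A distance threshold, $\theta_c$, that defines the maximum allowable distance from a face class as well as from the face space, is set up by computing half the largest distance between any two face classes:

$\theta_c = \frac{1}{2} \max_{j,k} \lVert \Omega_j - \Omega_k \rVert, \quad j, k = 1, \dots, N_c$    (8)

In the recognition stage, a new image, $\Gamma$, is projected into the face space to obtain a vector, $\Omega$:

$\Omega = U^T (\Gamma - \Psi)$

The distance of $\Omega$ to each face class is defined by

$\varepsilon_k = \lVert \Omega - \Omega_k \rVert, \quad k = 1, \dots, N_c$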

Page 19: Structural information

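For the purpose of discriminating between face images and non-face-like images, the distance, $\varepsilon$, between the original image, $\Gamma$, and its reconstructed image from the eigenface space, $\Gamma_f$, is also computed:

$\varepsilon = \lVert \Gamma - \Gamma_f \rVert$

where $\Gamma_f = U \Omega + \Psi$.

These distances are compared with the threshold given in equation (8) and the input image is classified by the following rules:

• IF $\varepsilon \ge \theta_c$ THEN the input image is not a face image;

• IF $\varepsilon < \theta_c$ AND $\varepsilon_k \ge \theta_c$ for all k THEN the input image contains an unknown face;

• IF $\varepsilon < \theta_c$ AND $\varepsilon_{k^*} = \min_k \varepsilon_k < \theta_c$ THEN the input image contains the face of individual $k^*$.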
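The three classification rules can be written down directly. In this sketch the function names, the dictionary of class vectors and the example labels are hypothetical; the distance from the face space is assumed to have been computed beforehand and is passed in as a number.

```python
import math


def distance(a, b):
    """Euclidean distance between two projection vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def distance_threshold(class_vectors):
    """Half the largest distance between any two face classes."""
    vs = list(class_vectors.values())
    return 0.5 * max(distance(a, b) for a in vs for b in vs)


def classify(omega, eps_face_space, class_vectors, theta):
    """Apply the three rules: not a face, unknown face, or a known individual."""
    if eps_face_space >= theta:
        return "not a face"
    best = min(class_vectors, key=lambda k: distance(omega, class_vectors[k]))
    if distance(omega, class_vectors[best]) >= theta:
        return "unknown face"
    return best


# Hypothetical two-class example in a 2-dimensional face space.
classes = {"anna": [0.0, 0.0], "ben": [4.0, 0.0]}
theta = distance_threshold(classes)  # half of 4.0
print(classify([0.5, 0.0], 0.1, classes, theta))
```

Because the closest class has the smallest distance, checking only that class against the threshold is equivalent to checking all classes.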

Page 20: Structural information

Experimental results

The eigenface-based face recognition method was tested on the ORL face database. 150 images of 15 individuals were selected for the experiments.

Page 21: Structural information

Experimental results

In the training stage, three images of each individual were used as training samples, forming a training set totalling 45 images.

[Figure: the average face of the training set]

Page 22: Structural information

Experimental results

[Figure: the first 15 eigenfaces, corresponding to the 15 largest eigenvalues]

Page 23: Structural information

Experimental results

The recognition rate depends on the training images: when only single-view images are used for training, recognition is much worse.

[Figure: recognition rate]

Page 24: Structural information

Experimental results

Faces with calm expressions were used in the training stage, and faces of the same individuals but with various expressions in the testing stage.

[Figure: training images and test images; the lower images are projections in the face space]

Page 25: Structural information

CONCLUSIONS

The eigenfaces method treats images globally; no local information is used. Compression is done at the global level. The method requires a lot of computation, but the results are good.

Explanation of the good results: images are represented as combinations of "simple" images, and the system is trained on them.

Page 26: Structural information

• THERE ARE MANY OTHER METHODS FOR OBJECT RECOGNITION AND REPRESENTATION. THEY CAN BE CLASSIFIED AS

- STRUCTURAL DESCRIPTIONS (WE ALREADY MENTIONED CHAIN CODES)

- TRANSFORM METHODS

- TRAINING/LEARNING METHODS

BUT THERE ARE ALSO METHODS BASED ON CLEVER TRICKS WHICH WORK VERY WELL… NEXT

Page 27: Structural information

• A TRANSFORM METHOD

HERE WE TRY TO TRANSFORM THE PICTURE (OR OBJECT INFORMATION) TO SOME OTHER DOMAIN TO GET THE INFORMATION IN A MORE CONVENIENT FORM.

Page 28: Structural information

• THE METHOD OF MOMENTS

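MOMENTS OF ORDER p, q ARE DEFINED AS

$m_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} x^p y^q f(x, y)\, dx\, dy, \quad p, q = 0, 1, 2, \dots$

THE MOMENTS OF ORDER 1, DIVIDED BY $m_{00}$, GIVE THE CENTER OF GRAVITY OF A PHYSICAL OBJECT. MOMENTS MEASURED RELATIVE TO THE CENTER OF GRAVITY DO NOT DEPEND ON WHERE THE OBJECT IS LOCATED; THEY ARE THUS INVARIANT TO LOCATION.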
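For a digital image the integrals become sums over the pixels. The following minimal sketch computes the moments and the center of gravity; the function names and the convention that `img[y][x]` holds f(x, y) are assumptions of this example.

```python
def raw_moment(img, p, q):
    """Moment of order (p, q): sum of x^p * y^q * f(x, y) over all pixels."""
    return sum(
        (x ** p) * (y ** q) * value
        for y, row in enumerate(img)
        for x, value in enumerate(row)
    )


def center_of_gravity(img):
    """First-order moments divided by the zeroth-order moment."""
    m00 = raw_moment(img, 0, 0)
    return raw_moment(img, 1, 0) / m00, raw_moment(img, 0, 1) / m00
```

For a 2 x 2 block of ones occupying columns 1 to 2 and rows 1 to 2 of an otherwise empty image, the zeroth moment is 4 and the center of gravity is (1.5, 1.5).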

Page 29: Structural information

• CENTRAL MOMENTS

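$\mu_{pq} = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} (x - \bar{x})^p (y - \bar{y})^q f(x, y)\, dx\, dy, \quad p, q = 0, 1, 2, \dots$

where

$\bar{x} = \frac{m_{10}}{m_{00}}, \quad \bar{y} = \frac{m_{01}}{m_{00}}$

For digital images:

$\mu_{pq} = \sum_x \sum_y (x - \bar{x})^p (y - \bar{y})^q f(x, y)$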

Page 30: Structural information

• HIGHER ORDER CENTRAL MOMENTS

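$\mu_{11} = m_{11} - \bar{y}\, m_{10}$

$\mu_{20} = m_{20} - \bar{x}\, m_{10}$

$\mu_{02} = m_{02} - \bar{y}\, m_{01}$

$\mu_{30} = m_{30} - 3\bar{x}\, m_{20} + 2\bar{x}^2 m_{10}$

$\mu_{12} = m_{12} - 2\bar{y}\, m_{11} - \bar{x}\, m_{02} + 2\bar{y}^2 m_{10}$

$\mu_{21} = m_{21} - 2\bar{x}\, m_{11} - \bar{y}\, m_{20} + 2\bar{x}^2 m_{01}$

... AND SO ON...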
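The expressions for the central moments in terms of the ordinary moments can be checked against the direct definition. The sketch below assumes the standard expansions (for example, the central moment of order (2, 0) equals the raw moment of order (2, 0) minus the x coordinate of the center of gravity times the raw moment of order (1, 0)); the function names and the test image are illustrative.

```python
def raw_moment(img, p, q):
    """Moment of order (p, q) of a digital image, with img[y][x] = f(x, y)."""
    return sum((x ** p) * (y ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))


def central_direct(img, p, q):
    """Central moment computed directly from its definition."""
    m00 = raw_moment(img, 0, 0)
    xb, yb = raw_moment(img, 1, 0) / m00, raw_moment(img, 0, 1) / m00
    return sum(((x - xb) ** p) * ((y - yb) ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))


def central_from_raw(img):
    """Central moments up to order 3, expressed through ordinary moments."""
    def m(p, q):
        return raw_moment(img, p, q)

    xb, yb = m(1, 0) / m(0, 0), m(0, 1) / m(0, 0)
    return {
        (1, 1): m(1, 1) - yb * m(1, 0),
        (2, 0): m(2, 0) - xb * m(1, 0),
        (0, 2): m(0, 2) - yb * m(0, 1),
        (3, 0): m(3, 0) - 3 * xb * m(2, 0) + 2 * xb ** 2 * m(1, 0),
        (1, 2): m(1, 2) - 2 * yb * m(1, 1) - xb * m(0, 2) + 2 * yb ** 2 * m(1, 0),
        (2, 1): m(2, 1) - 2 * xb * m(1, 1) - yb * m(2, 0) + 2 * xb ** 2 * m(0, 1),
    }
```

The second form is cheaper in practice, because the ordinary moments can be accumulated in a single pass before the center of gravity is known.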

Page 31: Structural information

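• NEXT, NORMALIZED CENTRAL MOMENTS ARE CREATED:

$\eta_{pq} = \frac{\mu_{pq}}{\mu_{00}^{\gamma}}, \quad \gamma = \frac{p + q}{2} + 1$

AND INVARIANT MOMENTS:

$\phi_1 = \eta_{20} + \eta_{02}$

$\phi_2 = (\eta_{20} - \eta_{02})^2 + 4\eta_{11}^2$

OTHER MOMENTS $\phi_3, \phi_4, \phi_5, \dots$ CAN BE DEFINED TOO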

Page 32: Structural information

• THESE MOMENTS ARE INVARIANT TO TRANSLATION, ROTATION, AND SCALE CHANGE.

THUS, ONCE THE MOMENTS ARE CALCULATED, THEY WILL NOT CHANGE WHEN THE OBJECT ROTATES OR CHANGES SIZE. THIS IS A VERY DESIRABLE FEATURE.

HOWEVER, MOMENTS ARE SENSITIVE TO NOISE AND ILLUMINATION CHANGE.
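The invariance can be checked on a small binary image. This sketch computes the first two invariant moments, assuming the usual normalization in which a central moment of order (p, q) is divided by the zeroth central moment raised to the power (p + q)/2 + 1. The helper names and the test shape are illustrative. Only translation and rotation by 90 degrees are tested, because these are exact on a pixel grid; for digital images, scale invariance holds only approximately.

```python
def central_moment(img, p, q):
    """Central moment of order (p, q) of a digital image, img[y][x] = f(x, y)."""
    m00 = sum(v for row in img for v in row)
    xb = sum(x * v for row in img for x, v in enumerate(row)) / m00
    yb = sum(y * v for y, row in enumerate(img) for v in row) / m00
    return sum(((x - xb) ** p) * ((y - yb) ** q) * v
               for y, row in enumerate(img) for x, v in enumerate(row))


def eta(img, p, q):
    """Normalized central moment."""
    gamma = (p + q) / 2 + 1
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** gamma


def invariants(img):
    """First two invariant moments."""
    n20, n02, n11 = eta(img, 2, 0), eta(img, 0, 2), eta(img, 1, 1)
    return n20 + n02, (n20 - n02) ** 2 + 4 * n11 ** 2


def rotate90(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def shift(img, dx, dy):
    """Translate the image by padding dx empty columns and dy empty rows."""
    width = len(img[0]) + dx
    return [[0] * width for _ in range(dy)] + [[0] * dx + list(row) for row in img]
```

Comparing `invariants(img)` with `invariants(rotate90(img))` and `invariants(shift(img, 3, 2))` gives the same pair of values up to rounding.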

Page 33: Structural information

• EXAMPLE: ROTATED AND SCALED OBJECT

HERE THE CALCULATION OF MOMENTS IS SHOWN. PLEASE NOTE THAT THE MOMENTS OF THE TRANSFORMED PICTURE REMAIN CONSTANT.

Page 34: Structural information

• PRACTICAL METHODS FOR DEALING WITH VISUAL OBJECTS:

- THEY ARE BASED ON TRICKS WHICH MAKE THEM WORK VERY WELL FOR A SPECIFIC PROBLEM, BUT THEY ARE NOT GENERAL.

WE ILLUSTRATE THIS WITH THE EXAMPLE OF A PRACTICAL FACE TRACKING SYSTEM.

Page 35: Structural information

• WHAT IS FACE TRACKING?

THERE IS A CAMERA IN FRONT OF THE PC AND SOFTWARE WHICH MARKS THE LOCATION AND POSITION OF THE FACE OF THE USER SITTING AT THE DISPLAY.

HERE WE DESCRIBE A METHOD AND SYSTEM FOR FACE TRACKING WHICH IS QUITE SIMPLE, ROBUST AND RUNS IN REAL TIME ON A PC!

THE METHOD IS BASED ON FACE COLOR HISTOGRAM STATISTICS AND MOMENTS.

Page 36: Structural information

HERE IS THE BLOCK DIAGRAM OF THE FACE TRACKING ALGORITHM.

FIRST, THE COLOR IMAGE IS CONVERTED TO HUE, SATURATION, INTENSITY. NEXT, THE SKIN COLOR HISTOGRAM IS CALCULATED. FINALLY, MOMENTS ARE CALCULATED AND THE WINDOW SIZE IS ADJUSTED ITERATIVELY.

Page 37: Structural information

• SKIN COLOR HISTOGRAM

COLOR = HUE IN THE HSI REPRESENTATION

PEOPLE HAVE THE SAME SKIN COLOR (HUE); ONLY THE SATURATION IS DIFFERENT.

SATURATION LEVELS CHANGE

HERE IS THE DISTRIBUTION OF PLACES CORRESPONDING TO FACE "COLOR"

Page 38: Structural information

COLOR IS A GOOD FEATURE IF WE HAVE A COLOR CAMERA. HAVING THE FACE COLOR DISTRIBUTION, WE CAN TREAT IT AS A TWO-DIMENSIONAL FUNCTION I(x,y).

FIRST WE SELECT A WINDOW OF A CERTAIN SIZE. NEXT WE CALCULATE THE ZEROTH AND FIRST MOMENTS IN THIS WINDOW:

$m_{00} = \sum_x \sum_y I(x, y)$

$m_{10} = \sum_x \sum_y x\, I(x, y), \quad m_{01} = \sum_x \sum_y y\, I(x, y)$

Page 39: Structural information

NEXT, THE NEW CENTER OF THE WINDOW IS CALCULATED:

$x_c = \frac{m_{10}}{m_{00}}, \quad y_c = \frac{m_{01}}{m_{00}}$

AFTER ITERATING THIS CALCULATION, THE ALGORITHM WILL CONVERGE TO A SPECIFIC POSITION.
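The iteration of the window center can be sketched as follows. This is a minimal sketch with a fixed window size; the function names, the square window given by a half-width, the rounding of the center to whole pixels and the stopping rule are assumptions of this example, and the adaptation of the window size is left out.

```python
def window_moments(prob, cx, cy, half):
    """Zeroth and first moments of the distribution inside a square window."""
    m00 = m10 = m01 = 0.0
    for y in range(max(0, cy - half), min(len(prob), cy + half + 1)):
        for x in range(max(0, cx - half), min(len(prob[0]), cx + half + 1)):
            value = prob[y][x]
            m00 += value
            m10 += x * value
            m01 += y * value
    return m00, m10, m01


def track_window(prob, cx, cy, half=2, max_iter=20):
    """Move the window center to the center of gravity until it stops moving."""
    for _ in range(max_iter):
        m00, m10, m01 = window_moments(prob, cx, cy, half)
        if m00 == 0:
            break  # nothing of the target color inside the window
        # New center, rounded to the nearest pixel.
        nx, ny = int(m10 / m00 + 0.5), int(m01 / m00 + 0.5)
        if (nx, ny) == (cx, cy):
            break  # converged
        cx, cy = nx, ny
    return cx, cy
```

In a tracking loop, the converged center from one video frame would serve as the starting point for the next frame.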

Page 40: Structural information

HOW IS THE WINDOW SIZE SELECTED?

IT DEPENDS ON THE SIZE OF THE FACE. THUS IT IS ADJUSTED ITERATIVELY, STARTING WITH SIZE 3.

WE THEN SELECT THE WINDOW SIZE TO BE

2m0/max pixel value

IN THIS WAY, THE WINDOW POSITION AND SIZE ARE CONTINUOUSLY ADAPTED UNTIL THEY STABILIZE.

THIS CAN THUS BE USED FOR FACE TRACKING.

Page 41: Structural information

• THIS PROCESS IS ILLUSTRATED HERE. WE START FROM A SMALL WINDOW SIZE; THE SIZE IS ADJUSTED AND THE CENTER OF THE WINDOW IS MOVED UNTIL IT STABILIZES.

HERE THE FACE HAS MOVED; IN THE NEXT PICTURE THE WINDOW WILL ALSO MOVE TO THE NEW POSITION.

Page 42: Structural information

• THIS ALGORITHM IS SURPRISINGLY ROBUST.

NOISE DOES NOT HARM IT.

AND AS WE CAN SEE, IT IS ROBUST AGAINST DISTRACTORS: ANOTHER FACE ON THE LEFT, A HAND ON THE RIGHT.

Page 43: Structural information

• THE METHOD CAN ALSO BE USED FOR EVALUATION OF HEAD ROLL, WIDTH AND LENGTH.

[FIGURE: HEAD ROLL]

Page 44: Structural information

• PARAMETERS FOR HEAD POSITION CAN BE CALCULATED BASED ON THE SYMMETRY OF LENGTH L AND WIDTH W.

Page 45: Structural information

THIS SYSTEM CAN BE USED FOR FACE TRACKING, E.G. AS AN INTERFACE TO COMPUTER GAMES.

Page 46: Structural information

ANOTHER EXAMPLE: AMBULATORY VIDEO

A COMPUTER WITH A CAMERA, WEARABLE BY THE USER

Page 47: Structural information

THE GOAL IS TO BUILD A COMPUTER WHICH WILL KNOW WHERE THE USER IS.

The user is wearing a small camera attached, e.g., to the head. The camera produces circular pictures which are not very good, but good enough.

Page 48: Structural information

HOW TO RECOGNIZE WHERE THE USER IS? (E.G. ROOM, STREET)

FIRST, SPLIT THE VIDEO INTO LIGHT INTENSITY I AND CHROMINANCES IN A VERY APPROXIMATE WAY:

I = R+G+B, Cr = R/I, Cg = G/I

SECOND, SEGMENT THE PICTURE INTO REGIONS AND CALCULATE PARAMETERS FOR EACH: MEAN AND COVARIANCE.
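Both steps can be sketched in a few lines. The function names are hypothetical, returning zero chrominance for black pixels is a choice made here to avoid division by zero, and the covariance is normalized by the number of samples; none of these details is stated on the slides.

```python
def intensity_chrominance(r, g, b):
    """Approximate split of an RGB pixel into intensity I and chrominances Cr, Cg."""
    i = r + g + b
    if i == 0:
        return 0, 0.0, 0.0  # black pixel: chrominance is undefined, use zero
    return i, r / i, g / i


def region_statistics(features):
    """Mean vector and covariance matrix of the feature vectors of one region."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    cov = [
        [sum((f[a] - mean[a]) * (f[b] - mean[b]) for f in features) / n
         for b in range(dim)]
        for a in range(dim)
    ]
    return mean, cov
```

For the pixel (R, G, B) = (100, 50, 50) this gives I = 200, Cr = 0.5 and Cg = 0.25.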

Page 49: Structural information

• FOR EACH ENVIRONMENT THERE WILL BE DIFFERENT STATISTICAL DISTRIBUTIONS OF THE SIGNALS. WE CAN USE THEM TO FIND THE CLASS TO WHICH A RECORDED VIDEO BELONGS.

Page 50: Structural information

FOR 2 HOURS OF RECORDING THE RESULTS ARE VERY GOOD

Label      Correlation Coeff.
Office     0.9124
Lobby      0.7914
Bedroom    0.8620
Cashier    0.8325

Page 51: Structural information

OVERALL CONCLUSION

• WE ARE LACKING A GENERAL SOLUTION TO OBJECT REPRESENTATION AND RECOGNITION PROBLEMS WHICH WOULD BE AS EFFECTIVE AS BIOLOGICAL SYSTEMS

• THERE ARE MANY APPROACHES TO A SOLUTION; WE PRESENTED AN APPROACH BASED ON STATISTICS OF QUANTIZED BLOCK TRANSFORM FEATURES

• THERE ARE APPROACHES BASED ON CLEVER TRICKS WHICH WORK WELL FOR SPECIFIC PROBLEMS