On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy...

43
On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology National Research Council Canada www.cv.iit.nrc.ca/~dmitry www.perceptual-vision.com

Transcript of On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy...

Page 1: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

On Facial Recognition in Video

VIVA Seminar, U. of Ottawa, August 28, 2003

Dr. Dmitry GorodnichyComputational Video Group

Institute for Information Technology National Research Council Canada

www.cv.iit.nrc.ca/~dmitry

www.perceptual-vision.com

Page 2: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

2. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Internet, tendencias & tecnología

La nariz utilizada como mouse

En el Instituto de Tecnología de la Información, en Canadá, se desarrolló un sistema llamado Nouse que permite manejar softwares con movimientos del rostro. El creador de este programa, Dmitry Gorodnichy, explicó vía e-mail a LA NACION LINE cómo funciona y cuáles son sus utilidades

Si desea acceder a más información, contenidos relacionados, material audiovisual y opiniones de nuestros lectores ingrese en : http://www.lanacion.com.ar/03/05/21/dg_497588.asp Copyright S. A. LA NACION 2003. Todos los derechos reservados.

Page 3: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

3. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Talk overview

• On vision: – computer and biological ones

• On variety & hierarchy of facial recognition tasks in video:– FD, FT, FL, FR, and other F*

• On face detection and tracking (FD, FT) [AVBPA’03] - demo

– Good colour spaces and skin models– Power and deficiency of wavelets

• On face localization (FL) - demo

– Power of convex-shape tracking [FGR’02] . NouseTM

• On motion / change detection:– Non-linear change detection– Second-order change detection [VIIP’03]. Eye blink detection

• Eye-centered canonical face representation [AVBPA’03]

• On memorizing and recognizing faces using associative memory- demo

• Perceptual Vision System: applications, licenses and further development

Page 4: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

4. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

What makes video special ?

Constraints:

- Real-time processing is required.

- Low resolution: 160x120 images or mpeg-decoded.

- Low-quality: week exposure, blurriness, cheap lenses.

Importance:

- Video is becoming ubiquitous. Cameras are everywhere.

- For security, computer–human interaction, video-conferencing, entertainment …

Essence:

- It is inherently dynamic!

- It has parallels with biological vision!

NB: Living organisms are very successful in tracking, detection and

recognition.

Page 5: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

5. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

The way nature does it

• Try to recognize this face

Page 6: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

6. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

The way nature does it (cntd)

• What about this one?

Page 7: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

7. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Localization first. Then recognition

What did you do? – - First you detected face-looking regions.

- Then, if they were too small or badly orientated, you did nothing.

- Otherwise, you turned your face – right?

- …to align your eyes with the eyes in the picture.

- …since this was the coordinate system in which you stored the face.

• This is what biological vision does.

– Localization (and tracking) of the object precedes its recognition

– These tasks are performed by two different parts of visual cortex.

So, why computer vision should not do the same?

Page 8: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

8. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

These mesmerizing eyes

Did you notice that you’ve started examining this

Slide by looking at the eyes (or circles) at left?- These pictured are sold commercially to capture infants attention.

Now imagine that the eyes blinked …

- For sure you’ll be looking at them!

No wonder, animals and humans look at each other’s eyes.

- This is apart from psychological reasons.

Besides, there two of them, which creates a

hypnotic effect

(which is due to the fact that the saliency of a pixel just attended is

inhibited to avoid attending it again soon.)

Page 9: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

9. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Eyes are the reference

• Eyes are the most salient features on a face.

• Besides, there two of them, which (besides creating a hypnotic affect )

makes the excellent reference frame out of them

• Finally, they also the best (and the only) stable landmarks on a face which

can be used a reference.

Intra-ocular distance (IOD) makes

a very convenient unit of measurement!

Page 10: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

10. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Lessons from biological vision

• Images are of very low resolution except in the fixation point.

• Brain process the sequences of images rather than one image. - Bad quality

of images is compensated by the abundance of images.

• The eyes look at points which attract visual attention.

• Saliency is: in a) motion, b) colour, c) disparitydisparity, d) intensity.

• These channels are processed independently in brain

(Think of a frog catching a fly or a bull running on a torero)

• Bottom-up (image driven) visual attention is very fast and precedes top-

down (goal-driven) attention: 25ms vs 1sec.

Page 11: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

11. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Lessons from biological vision (cntd)

• Intensity means: frequencies, orientation, gradient .

• Animals & humans perceive colour non-linearly.

• Colour & motion are used for segmentation.

• Intensity is used for recognition.

Page 12: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

12. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Hierarchy of Face Processing Tasks

““Something yellow moves”Something yellow moves”

Face Segmentation

Facial Event Recognition

Face Memorization

Face Detection

Face Tracking(crude)

Face Classification

Face Localization(precise)

Face Identification

“It’s a face”

“It’s at (x,y,z,”

“Lets follow it!”

“It’s face of a child”“S/he smiles, blinks”

“Face unknown. Store it!” “It’s Mila!”

“I look and see…”

Page 13: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

13. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Applicability of 160x120 video

• According to face anthropometrics

• Tested with

Perceptual User interfaces

Face size

½ image ¼ image 1/8 image 1/16 image

In pixels 80x80 40x40 20x20 10x10

Between eyes-IOD 40 20 10 5

Eye size 20 10 5 2

Nose size 10 5 - -

FS b

FD b -

FT b -

FL b - -

FER b -

FC b -

FM / FI - -

– goodb – barely applicable

- – not good

Page 14: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

14. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking

Page 15: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

15. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking (lights off)

Page 16: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

16. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Face Detection and Tracking (lights on)

Page 17: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

17. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Face Localization

• Global facial cues: skin colour, head shape, head motion– good for FD & FT– not good for FL

• Local features have to be used for FL.

• Features are conventionally thought of as visually distinctive (ie with large DI(f) ).– Hence, the commonly used features are edge-based:

corners of brows, eyes, lips, nostrils etc.– They however are

• not robust • not always visible

Solution: Convex-shape features

Page 18: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

18. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

NouseTM (Nose as Mouse) Technology

Just think of your nose as a chalk or a joystick handle!

"It is a convincing demonstration of the potential uses of cameras as natural interfaces." - The Industrial Physicist, Feb. 2003

• Based on tracking the convex-shape nose feature.

• Head motion- and scale- invariant & sub-pixel precision

• Enables precise hand-free 2D control

• Allows aiming and drawing with the nose.

Page 19: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

19. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

NouseTM : range of tracking & precision

‘No’ motion

‘Yes’ motion

Robustness to rotation Robustness to scale

Test: The user rotates his head only! (the shoulders do not move)

Page 20: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

20. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

NouseTM : range and speed of tracking

Page 21: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

21. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

NouseTM for Stereotracking

Page 22: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

22. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Second-order change detection

• Detecting change in a change [Gorodnichy’03]

• Non-linear change detection deals with changes due illumination changes [ Durucan’02]

Page 23: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

23. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Eye Blink Detection

• Previously very difficult in moving heads

• With second-order change detection became possible

• Is currently used to enable people with brain injury face-to-face communication [AAATE’03]

Page 24: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

24. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Building eye-centered face models

Surprised to see how nicely eyes divide a face in equal blocks?

But it’s true - Tested with 1500 faces from BioID face database and multiple

experiments with perceptual user interfaces [Nouse’02, BlinkDet’03].

2. IOD

.IOD

24

2.IODAnthropometrics of face:

Page 25: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

25. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Which part of the face is the most informative?What is the minimal size of a recognizable face?

1. By studying previous work: [CMU, MIT, UIUC, Fraunhofer, MERL, …]

2. By examining averaged faces:

3. By computing statistical relationship between face pixels in 1500 faces from the BioID Face Database:

9x9 12x12 16x16 24x24

Size 24 x 24 is sufficient for face memorization & recognition and is optimal for low-quality video and for fast processing.

Page 26: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

26. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Eye-centered face models

Canonical face model suitable for on-line Face Memorization and

Recognition in video

d

24

2. .IOD

Procedure: after the eyes are located, the face is extracted from video and resized to the canonical 24x24 form, in which it is memorized or recognized.

Canonical face model suitable for Face Recognition in documents [Identix’02]

Page 27: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

27. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Canonical eye-centered face model

Page 28: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

28. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Recognition with Associative Memory

• When the eyes are detected, and a face is converted to a

canonical representation, it is easy to memorize to recognize

• We use Pseudo-Inverse Associative Memory for on-line

memorization and storing of faces in video.

• Converting 24x24 face to binary feature vector:

A) Vi =Ii - Iave ,

B ) Vi,j =sign(Ii - Ij ),

C ) Vi,j =Viola(i,j,k,l ),

D ) Vi,j =Haar(i,j,k,l )

Page 29: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

29. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

What is Memory?

The one that

stores data:

and

retrieves it:

Page 30: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

30. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Neural Network - a tool to do that

• A fully connected network of N neurons Yi, which evolves in time according to the update rule:

until it reaches a stable state (attractor).

• Patterns can be stored as attractors

-> Non-iterative learning - fast learning

-> Process only active neurons - fast retrieval

How to find the best weight matrix C ?How to find the best weight matrix C ?

Page 31: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

31. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Pseudo-Inverse Learning Rule

• Obtained from stability condition CV=eV

C = VV+ or

(Widrow-Hoff’s rule is its approximation)

• • Its dynamics can be studied theoretically.

C=VV+, Cii=D*Cii, D=0.15 Complete retrieval from 8% noise of M=0.5N patterns from 2% noise of M=0.7N patterns

Page 32: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

32. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Pseudo-Inversse Associative MemoryWhat's in name or what it is

These neural networks are referred to as

•pseudo-inverse networks - for using Moore-Penrose pseudoinverse V+ in computing the synapces

•projection networks - for synaptic (weight) matrix C=VV+ being the projection

•Hopfield-like networks - for being binary and fully-connected in the stage of learning

•recurrent networks - for evolving in time, based on external input and internal memory

•attractor-based networks - for storing patterns as attractors (i.e. stable states of the network)

•dynamic systems - for allowing the dynamic systems theory to be applied

•associative memory - for being able to memorize, recall and  forget patterns, just as much as humans do.

Page 33: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

33. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Pseudo-Inversse Associative MemoryEight reasons to like PINN memory

• It evolved from the network of formal neurons as defined by Hebb in 1949 and has many parallels with biological memory mechanisms.

• It provides an analytical (close-form) solution, the approximation of which is a well-known Widrow-Hoff delta learning rule.

• The performance of the network can be very nicely analytically examined and improved.

• It can update the memory on fly in real time, storing patterns with a non-iterative learning rule.

• It can store continuous stream of data.

• All this makes the network very suitable for on-line memorization and recognition, and also for preprocessing binary patterns, as noise removing filter.

• Finally, there's a free code which you can compile and try yourself!

At PINN website: www.cv.iit.nrc.ca/~dmitry/pinn

Page 34: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

34. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Perceptual Vision System

Goal: To detect, track and recognize

face and facial movements of the

user.

x y , z PUI

monitor

binary eventON

OFFrecognition /

memorizationUnknown User!

Page 35: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

35. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Multi-channel video processing framework

colour calibration

face identificationface identification

nose tracking (precise)

blink detection

face tracking (crude)

face detection

“click” event x y ( z )

d ep th in te n s ity co lo ur m otion

V ide o d a ta

face classificationface classification face memorizationface memorization

Page 36: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

36. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Procedure

The face is • detected: using motion at far range (non-linear

change detection), using colour at close range (non-linear colour mapping to perceptual uniform space)

• than tracked until convenient forrecognition: using blink detection and nose tracking

• than localized and transformed to the canonical 24x24 representation,

• than recognized using the PINN associative memory trained pixel differences.

After each blink, eyes and nose positions are retrieved. If they form an equilateral triangle (i.e.face is parallel to image plane), then face is extracted and recognized / memorized.

Page 37: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

37. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Experiments

E.g. images retrieved from blink (at left) are recognized as the right image

In experiments:

With 63 faces from BioID database and 9 faces of our lab users (all of which are

shown) stored, the system recognizes our users after a single (or several) blinks.

It also updates the face memory on-fly.

Page 38: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

38. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Demos

Page 39: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

39. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Memorization

Page 40: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

40. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Memorization (cntd)

Page 41: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

41. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Recognition (from video)

Page 42: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

42. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Further developments

• Research -wise. Very many– Combining with audio– With panaramic cameras– With multiple cameras– Perceptual face gesture learner for handicaped– …

• License/application –wise. Very many– Talk to me…

Page 43: On Facial Recognition in Video VIVA Seminar, U. of Ottawa, August 28, 2003 Dr. Dmitry Gorodnichy Computational Video Group Institute for Information Technology.

43. Facial Recognition in Video (Dr. Dmitry Gorodnichy)

Intelligent Vision: Goal & Examples

• Goal: To Look AND See

• Case study: Face in Video