mmsp.1

MULTIMEDIA SIGNAL PROCESSING MMSP SGN-5016 Irek Defée Tietotalo TF 316 [email protected]

Page 1: mmsp.1

MULTIMEDIA

SIGNAL PROCESSING

MMSP

SGN-5016

Irek Defée

Tietotalo TF 316

[email protected]

Page 2: mmsp.1

Course info

• Lectures: Room TB 219

Tue and Wed 10.15-12

• Exercises mandatory

• Exam written

Page 3: mmsp.1

Course info

• Course Web page

http://www.cs.tut.fi/~defee/mulsp.html

• Course material is regularly updated,

please use only the updated material

Page 4: mmsp.1

Petri Hirvonen

[email protected]

http://www.cs.tut.fi/~hirvone2/5016_exercises.htm

Exercises for SGN-5016

Multimedia Signal Processing

Page 5: mmsp.1

Exercises

• TC303

• Group1: 10:15-12:00, TC 303 25.02

• Group2: 10:15-12:00, TC 303 26.02

• You can participate in one or both of the exercise groups

if there is space; if not, attend only one group

• A written report is returned by e-mail after each exercise.

• The details about the report are included in the exercise material.

Page 6: mmsp.1

WHAT IS THIS COURSE ABOUT???

1. WHAT IS MULTIMEDIA (MM) ?

2. WHAT IS THE TOPIC OF MULTIMEDIA

SIGNAL PROCESSING?

(THIS AREA IS NOT WELL DEFINED YET)

MULTIMEDIA SIGNAL PROCESSING

Page 7: mmsp.1

WHAT IS MULTIMEDIA?

• COMPOSED OF MULTI+MEDIA

MEDIA = MEDIUM OF COMMUNICATION

WE COMMUNICATE NATURALLY:

VISUALLY, BY SPEECH, BY TOUCH…

WE COMMUNICATE BY TECHNOLOGY:

RADIO (MOBILE PHONES), TV, PRESS,

CINEMA, BOOKS

Page 8: mmsp.1

• PEOPLE USE VARIOUS COMMUNICATION

MEDIA: SPEECH, VISION, TOUCH….

IN THE PAST WHEN PEOPLE

COMMUNICATED THEY HAD TO USE

THOSE MEDIA DIRECTLY.

IN PRESENT CIVILISATION THERE ARE

MANY TECHNOLOGIES WHICH

EXTEND HUMAN COMMUNICATION

Page 9: mmsp.1

PRODUCER OF INFORMATION (HUMAN)

RECEIVER OF INFORMATION (HUMAN)

NATURAL COMMUNICATION MEDIUM

(E.G. VOICE, TOUCH): WE USE A SPECIFIC

PHYSICAL MEDIUM, E.G. AIR, PLUS PRODUCTION OF

SPECIALLY ENCODED SIGNALS FOR CONVEYING

INFORMATION

INDIRECT COMMUNICATION MEDIUM VIA

TECHNOLOGY (E.G. CINEMA, RADIO, PRESS, TV)

GENERAL MODEL OF HUMAN COMMUNICATION

Page 10: mmsp.1

• MORE RECENT IS A MODEL OF

HUMAN – MACHINE

COMMUNICATION, OR EVEN

MACHINE-MACHINE COMMUNICATION

WHEN WE USE COMPUTERS, WE

COMMUNICATE WITH MACHINE,

THE COMMUNICATION MEDIA ARE:

TOUCH/GESTURE <-> KEYBOARD, MOUSE

VISION <-> DISPLAY

HEARING <-> SOUND

Page 11: mmsp.1

• HUMANS CAN USE SEVERAL DIFFERENT

MEDIA FOR COMMUNICATION

E.G. SPEECH, TOUCH, VISUAL SYSTEM

HUMANS OFTEN USE SEVERAL

MEDIA SIMULTANEOUSLY OR, IN OTHER

WORDS, MULTIPLE MEDIA = MULTIMEDIA

FOR EXAMPLE: WHEN WE TALK WITH

SOMEBODY WE USE GESTURES AND FACIAL

EXPRESSIONS

Page 12: mmsp.1

• IN FACT PEOPLE PREFER TO USE

MULTIPLE MEDIA = MULTIMEDIA

- WE CAN USE SINGLE MEDIA, E.G.

SPEECH WHEN TALKING ON THE PHONE

BUT SEEING EACH OTHER WHEN

TALKING ”ENHANCES” THE CONTACT

- WE CAN LISTEN TO THE RADIO, E.G.

NEWS, BUT TV IS PREFERRED EVEN IF

WE JUST SEE A PERSON READING THE

NEWS

- MULTIMEDIA IS MORE NATURAL FOR

PEOPLE

Page 13: mmsp.1

• THERE IS ANOTHER USE OF THE WORD

”MEDIA”, IN THE SENSE OF

MEDIA INDUSTRY

THE MEDIA INDUSTRY DEALS WITH

PRODUCING, DISTRIBUTING AND

SELLING INFORMATION ADDRESSING

THE HUMAN MEDIA SYSTEM

MULTIMEDIA INFORMATION IS VERY

IMPORTANT FOR THE INDUSTRY

THERE ARE MANY ENGINEERING

PROBLEMS IN DEALING WITH

MULTIMEDIA INFORMATION

Page 14: mmsp.1

• WHAT IS MULTIMEDIA SIGNAL

PROCESSING (MMSP) ?

IT IS ABOUT PROCESSING,

COMMUNICATION AND UTILIZATION

OF INFORMATION USED BY HUMANS

ONE CAN CONSIDER THREE

SCENARIOS OF USAGE:

1. HUMAN-HUMAN

2. HUMAN – MACHINE

3. MACHINE - MACHINE

Page 15: mmsp.1

WHY IS MULTIMEDIA SIGNAL PROCESSING

POSSIBLE? THIS IS BECAUSE WE HAVE

MEANS FOR DIGITAL REPRESENTATION

AND PROCESSING OF ANY TYPE OF

INFORMATION.

IF WE TALK ON THE PHONE, LISTEN TO

MUSIC FROM AN MP3 PLAYER, WATCH

A MOVIE FROM A DVD, TAKE A PICTURE

WITH A CAMERA, WE KNOW THAT

INFORMATION IS REPRESENTED BY BITS

AND PROCESSED DIGITALLY
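As a rough illustration of "represented by bits", the sketch below samples a pure tone and quantizes each sample to an 8-bit code. This is only a minimal example under stated assumptions, not the encoding used by any particular phone, MP3 player, or DVD: the function names `sample_and_quantize` and `to_bits`, the 1 kHz tone, the 8 kHz sampling rate, and the 8-bit word length are all chosen here for illustration.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits):
    """Sample a sine tone and quantize each sample to a signed integer code."""
    # Largest positive code for the given word length (127 for 8 bits).
    max_code = 2 ** (bits - 1) - 1
    codes = []
    for n in range(n_samples):
        t = n / sample_rate_hz                             # sampling instant in seconds
        amplitude = math.sin(2 * math.pi * freq_hz * t)    # analog value in [-1, 1]
        codes.append(round(amplitude * max_code))          # uniform quantization
    return codes


def to_bits(code, bits):
    """Return the two's-complement bit string of one quantized sample."""
    return format(code & (2 ** bits - 1), f"0{bits}b")


# One period of a 1 kHz tone sampled at 8 kHz (illustrative values).
codes = sample_and_quantize(freq_hz=1000, sample_rate_hz=8000, n_samples=8, bits=8)
bitstream = "".join(to_bits(c, 8) for c in codes)
print(codes)
print(bitstream)
```

Once the signal is a list of integers, any further step (filtering, compression, recognition) is an algorithm operating on numbers, which is the sense in which the slides speak of digital processing.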

Page 16: mmsp.1

WHAT WE NEED ARE ALGORITHMS

FOR PROCESSING THE SIGNALS

DIGITALLY

MULTIMEDIA SIGNAL PROCESSING

IS ABOUT ALGORITHMS FOR THE

PROCESSING OF SIGNALS WHICH ARE

USED BY HUMANS FOR COMMUNICATION

WITH OTHER PEOPLE OR MACHINES OR

FOR DEALING WITH THE WORLD AROUND THEM

Page 17: mmsp.1

• WHAT ARE THE MEDIA SIGNALS?

MEDIA SIGNALS ARE THOSE SIGNALS

WHICH ARE ACCESSIBLE TO THE HUMAN

INFORMATION PROCESSING SYSTEM

ONE OF THE ISSUES IN MULTIMEDIA

SIGNAL PROCESSING IS WHAT TYPE OF

SIGNALS AND WHAT KIND OF

COMBINATIONS OF SIGNALS CAN BE

USED. FOR EXAMPLE: ACOUSTICAL

SIGNALS: SOUNDS, SPEECH-LANGUAGE,

MUSIC

WE CONVERT THOSE SIGNALS TO

DIGITAL FORMAT AND USE

Page 18: mmsp.1

• EXAMPLE: DIGITAL MUSIC (CD, MP3, DVD, INTERNET RADIO)

• EXAMPLE: DIGITAL VIDEO (DVD, BLU-RAY, INTERNET TV)

THESE ARE SYSTEMS FOR TRANSFERRING

CONTENT PRODUCED BY ARTISTS TO

PEOPLE. THESE SYSTEMS USE SPECIFIC

DIGITAL ENCODING AND COMPRESSION

OF INFORMATION TO RECORD THE

CONTENT.

THE QUESTION IS HOW TO DO THIS.

Page 19: mmsp.1

BUT WITH SUCH SYSTEMS A NEW

PROBLEM EMERGES:

HOW TO PROTECT MEDIA INFORMATION

AGAINST UNAUTHORIZED USE?

(FOR EXAMPLE, ILLEGAL COPYING)

How to represent media information in

the most pleasing way?

Examples are High Definition technologies:

- Flat Displays

- HD DVD, Blu-ray discs, HDTV

Page 20: mmsp.1

• THE SECOND MAIN ASPECT OF MMSP

2. HUMAN-MACHINE COMMUNICATION

HOW TO MAKE INTERACTION WITH

COMPUTERS (AND OTHER MACHINES)

MORE NATURAL? NATURAL MEANS E.G. MORE

SIMILAR TO HUMAN-HUMAN INTERACTION,

MORE INTUITIVE, MORE PLEASING,

ATTRACTIVE….

Page 21: mmsp.1

THAT ALSO INCLUDES HOW TO MAKE

MACHINES MORE INTELLIGENT:

• FOR EXAMPLE, INSTEAD OF TYPING

WE COULD TALK TO COMPUTERS AND

INSTEAD OF COMPUTERS PRINTING

ANSWERS ON SCREEN THEY WOULD TALK

TO US.

OR, IF COMPUTERS COULD SEE US

USING CAMERAS, THEY POSSIBLY

COULD REACT MORE LIKE PEOPLE.

BUT TODAY WE STILL USE KEYBOARD

AND MOUSE, WHY?

Page 22: mmsp.1

• WE USE KEYBOARD AND MOUSE

BECAUSE WE DO NOT HAVE BETTER

TECHNOLOGY: WE DO NOT KNOW HOW

TO PROCESS SPEECH AND VISUAL

INFORMATION AS EFFECTIVELY AS

PEOPLE ARE ABLE TO DO

• BUT WE MAY THINK OF COMPUTERS

WITH CAMERAS AND MICROPHONES

WHICH WILL BE ABLE TO DO SO

• THIS MAY BECOME POSSIBLE BECAUSE

OF FAST PROGRESS IN DEVELOPMENT OF

ALGORITHMS AND PROCESSORS

Page 23: mmsp.1

• THIS PROGRESS CAN BE ILLUSTRATED WITH

MANY EXAMPLES

- COMPARE PC TODAY AND 10 years AGO

(TODAY WE HAVE MULTICORE

PROCESSORS AND THE NUMBER OF CORES

IS GROWING FAST)

- COMPARE MOBILE DEVICE TODAY AND

MOBILE PHONE 10 years AGO

(TODAY THE TELEPHONE FUNCTION IS

JUST ONE ADDITION TO MULTIPLE MEDIA

PROCESSING: MUSIC, VIDEO, CAMERA,

TOUCH, ORIENTATION)

EXTRAPOLATE THIS TO THE NEXT 10 years!

Page 24: mmsp.1

WE CAN EXPECT IN THE FUTURE:

• COMPUTERS, MOBILES, AND ALL KINDS

OF OTHER DEVICES WILL BE MORE AND

MORE CLEVER (=INTELLIGENT?)

• THESE SYSTEMS WILL BE RELYING

ON INCREASINGLY SOPHISTICATED

MULTIMEDIA SIGNAL PROCESSING

CAPABILITIES

Page 25: mmsp.1

• WE HAVE THUS TWO MAIN AREAS TO

COVER IN MMSP:

1. MEDIA INFORMATION PROCESSING

IN MULTIMEDIA SYSTEMS

2. MEDIA COMPUTER INTERFACE FOR

HUMAN-COMPUTER INTERACTION

THESE ARE THE TOPICS OF

THE MMSP COURSE

Page 26: mmsp.1

• Please note however that our Multimedia Signal

Processing course is matched to the study program

at TUT, especially to the Multimedia Major

• We have many courses specialized in single media

processing: Digital Audio, Image Processing, Video

Processing, Video Compression, Pattern

Recognition

• We avoid overlapping with those courses. We also

do not go into algorithms which have been proposed

by researchers but are not in wider use yet;

these are covered in other courses and seminars

• Other universities may not have so many

specialized courses, so their course content is different

Page 27: mmsp.1

• There is one absolutely basic observation:

• MANY MULTIMEDIA SIGNAL PROCESSING

TASKS ARE ALREADY IMPLEMENTED IN

BIOLOGICAL SYSTEMS, ESPECIALLY IN

THE HUMAN INFORMATION PROCESSING

SYSTEM

• FOR EXAMPLE: VISUAL AND ACOUSTICAL

COMMUNICATION BETWEEN PEOPLE,

USING VISUAL INFORMATION IN

RECOGNIZING OBJECTS. BIOLOGICAL

SYSTEMS DO IT PERFECTLY BUT WE DO

NOT KNOW HOW, THAT IS, WE DO NOT KNOW THE ALGORITHMS

Page 28: mmsp.1

IN THE FIRST PART OF THIS COURSE

WE SHALL COVER BASIC KNOWLEDGE

RELATED TO

HUMAN INFORMATION PROCESSING

THIS SYSTEM PROCESSES MEDIA

INFORMATION AND IT DOES IT IN

A FANTASTIC WAY. IF WE KNEW HOW

IT DOES IT, IT COULD HELP US TO

MAKE BETTER MEDIA INFORMATION

PROCESSING (BETTER MMSP ALGORITHMS)

Page 29: mmsp.1

BUT BEFORE WE GO FURTHER LET US MAKE

A SHORT MEDIA TECHNOLOGY OVERVIEW TO SEE

WHERE MULTIMEDIA SIGNAL PROCESSING

WILL BE USEFUL IN THE FUTURE

Page 30: mmsp.1

MULTIMEDIA SIGNAL PROCESSING

ALLOWS FOR NEW CLASSES OF DEVICES

AND SYSTEMS:

MORE SOPHISTICATED COMMUNICATION,

MORE ADVANCED INTERFACES

THEY ARE ILLUSTRATED NEXT

Page 31: mmsp.1

Mobile Multimedia Devices Examples

Page 32: mmsp.1

WHAT DO THESE MOBILE DEVICE EXAMPLES

SHOW US?

- DEVICES HAVE MULTIPLE SENSORS AND

MULTIPLE MEDIA PROCESSING CAPABILITIES

- TAKE ONE EXAMPLE - TOUCH

The device is controlled by fingers, e.g. changing picture size

or even playing guitar

Page 33: mmsp.1

What is still missing?

Maybe makeup, but this is a joke

Page 34: mmsp.1

ANOTHER EXAMPLE: DIGITAL CAMERAS

Digital cameras perform a lot of processing

for best picture quality. But recent cameras

have new features related to analysis of

visual information.

Face Detection automatically detects a face in the frame and

adjusts focus, exposure, contrast, and skin complexion so it

turns out perfectly.

Face Recognition – a feature that “remembers” faces from

previous shots. When a familiar face is recorded several times,

the camera will prompt the user to register the face. Once

registered, if the face appears in the frame again, the camera

will display the name specified for that person and prioritize

focus and exposure for the face.

To make such a feature, an algorithm for

face detection and recognition is needed

that works fast and reliably

Page 35: mmsp.1

COMPLETELY NEW TYPES OF DEVICES ARE

POSSIBLE: EXAMPLE Wii

Wii by Nintendo

Controllers have

motion sensors

Game & fitness accessories

Dancing pad Balance board

Sports game Music performance

Page 36: mmsp.1

AIBO DOG – PERSONAL ROBOT WITH SENSES

Completely New Types of Devices

Page 37: mmsp.1

IT HAS SENSES:

MICROPHONE,

CAMERA, TEMPERATURE,

DISTANCE, ACCELERATION,

BALANCE, TOUCH

IT HAS INSTINCTS

AND BEHAVIORS

Page 38: mmsp.1

"Is this a real cat?" A robot cat you can bond with like a real pet --

NeCoRo is born

Completely New Types of Devices

Page 39: mmsp.1

Omron ready to test demand for robo-cat

Page 40: mmsp.1

Equipped with Omron's proprietary MaC (Mind and Consciousness) technology, feelings are generated according to recognition feedback, which is dependent on configurations based on psychological concepts, leading to cognitive decisions and actions determined by these feelings (applicable patent acquired)

Feelings of satisfaction, anger, and uneasiness generated based on recognition feedback

Desires to sleep or be cuddled generated according to physiological rhythms

Via a learning function, personality traits such as selfishness and the need for attention will change in response to the owner

Page 41: mmsp.1

PERSONAL ROBOTS

ARE STARTING TO APPEAR ...

Page 42: mmsp.1

Fujitsu has developed a new miniature

humanoid robot, named HOAP-1,

designed for wide application in research

and development of robotic technologies.

Fujitsu Automation will begin domestic

sales of the robot from today and hopes to

sell 100 units within three years.

Weighing 6kg and standing 48cm tall, the

light and compact HOAP-1 and

accompanying simulation software can be

used for developing motion control

algorithms in such areas as two-legged

walking, as well as in research on human-

to-robot communication interfaces.

The basic simulation software and user-

developed programs are designed to run

on RT-Linux on an operating command

PC, which communicates with the

robot through a USB interface. The robot's

internal sensors and actuators (motors)

also use a USB interface and can be easily

expanded according to needs.

Page 43: mmsp.1

The two-legged walking

technology developed by

Honda represents a unique

approach to the challenge of

autonomous locomotion. Using

the know-how gained from

these prototypes, research and

development began on new

technology for actual use.

ASIMO represents the fruition

of this pursuit.

Page 45: mmsp.1

• Progress of technology is fast: even old

television is changing. In 2010

three-dimensional television, 3D TV, will start

3D TV set

Glasses

And also a first TV controlled

by hand gestures will be

available (but very expensive)

Page 46: mmsp.1

What do we see from these examples?

• We can see that devices are developing to

have

- More complexity

- More intelligence

- More natural interaction with people

To add even more such features one needs

algorithms for multimedia signal processing.

Many of these algorithms should have

capabilities similar to biological systems.