ICS 542 Multimedia Computing

ICS 542 Multimedia Computing Spring Semester 2007 - 2008 (072) King Fahd University of Petroleum & Minerals Information & Computer Science Department


Page 1: ICS 542 Multimedia Computing

ICS 542 Multimedia Computing

Spring Semester 2007 - 2008 (072)

King Fahd University of Petroleum & Minerals
Information & Computer Science Department

Page 2: ICS 542 Multimedia Computing

Important Preliminaries

• Instructor: Dr. Wasfi Al-Khatib (وصفي الخطيب)
• Office: (22) 133-1
• Office hours:
– TBA
– By Appointment.
• Phone: 1715
• email: [email protected]

Page 3: ICS 542 Multimedia Computing

General Description

• This course gives an overview of the different issues in the representation and management of multimedia data in the context of content-based retrieval. The emphasis will be on textual data, video data, audio data, multimedia documents and multimedia data mining. However, students interested in other related topics will have the opportunity to explore them through their projects and possibly research paper presentations.

Page 4: ICS 542 Multimedia Computing

Feedback Slip

Page 5: ICS 542 Multimedia Computing

Course Material

• Prerequisites: Instructor Consent.
• Text Book
– Ze-Nian Li & Mark S. Drew, Fundamentals of Multimedia, ISBN 0-13-061872-1, Prentice-Hall, 2004.
• References
– Guojun Lu, Multimedia Database Management Systems, Artech House.
– Michael Berry & Murray Browne, Understanding Search Engines: Mathematical Modeling and Text Retrieval, ISBN 0-89871-437-0, SIAM, 1999.
• Select research papers and articles

Page 6: ICS 542 Multimedia Computing

Grading Policy (1)

Four Quizzes                                   16%
Homeworks and Research Paper Presentations     20%
One Major Exam                                 20%
Project                                        44%

Page 7: ICS 542 Multimedia Computing

Grading Policy (2)

• The project's 44% is distributed as follows:
– Project Proposal: 1%
– Project Progress Report 1: 3%
– Project Progress Report 2: 5%
– Final Project Report and Prototype: 30%
– In Class Project Presentation: 5%

Page 8: ICS 542 Multimedia Computing

Important Dates

Task                    Date [and Time]               Location      Weight
Quiz #1                 Monday, February 25, 2008     In class      4%
Quiz #2                 Monday, March 10, 2008        In class      4%
Project Proposal        Saturday, March 15, 2008      Blackboard    1%
Major Exam              Thursday, March 27, 2008      TBA           20%
Quiz #3                 Monday, April 7, 2008         In class      4%
Progress Report #1      Wednesday, April 9, 2008      Blackboard    3%
Quiz #4                 Monday, April 28, 2008        In class      4%
Progress Report #2      Wednesday, May 7, 2008        Blackboard    5%
Project Submission      Monday, June 2, 2008          Blackboard    30%
Project Presentation    Wednesday, June 4, 2008       TBA           5%

Page 9: ICS 542 Multimedia Computing

Remarks (1)

• Homeworks are due by the time announced on Blackboard. Late homeworks are NOT accepted.
• Quizzes: 25-40 minutes each. Each quiz covers the material covered since the last quiz or major exam.
• The major exam will cover the material from the beginning of the semester until the last lecture before the day of the exam.

Page 10: ICS 542 Multimedia Computing

Remarks (2)

• Late submission policy for the project proposal, progress report 1, progress report 2, and final report:
– First day costs 10%.
– Multiply the previous day's cost by 2 to get the percentage cost of the current day.
– You will receive zero points if not submitted before the fifth day.
• Format of each of the above deliverables will be given later.
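The late-submission rule above is a doubling schedule, which a few lines of Python can make concrete. This is only an illustrative sketch: the function name is hypothetical, and it assumes the cost is a percentage of the deliverable's own points and that "not submitted before the fifth day" means a submission on day five or later earns nothing.

```python
def late_penalty_percent(days_late: int) -> int:
    """Percentage deducted from a deliverable submitted `days_late` days late.

    Assumption (not stated on the slide): the penalty applies to the
    deliverable's own points, and day 5 or later counts as zero points.
    """
    if days_late <= 0:
        return 0                      # submitted on time
    if days_late >= 5:
        return 100                    # zero points from the fifth day on
    # First day costs 10%; each later day doubles the previous day's cost.
    return 10 * 2 ** (days_late - 1)


if __name__ == "__main__":
    for day in range(0, 6):
        print(day, late_penalty_percent(day))
```

Under these assumptions the schedule for days one through four is 10%, 20%, 40% and 80%.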

Page 11: ICS 542 Multimedia Computing

Research Paper Presentations Policy (1)

• Each student will present one research paper that will be pre-assigned two weeks before the presentation time.

• Each presentation will take between 20 and 30 minutes in total (including questions and answers).

• Handouts for the presentations must be available by 4:00pm on Blackboard the same day of presentation.

• I expect you to “master” the paper.

Page 12: ICS 542 Multimedia Computing

Research Paper Presentations Policy (2)

• The presentation must include the following:
– Title and author of the paper.
– Brief summary: describe the main points of the paper.
– Strengths: List 2-3 positive contributions of the paper.
– Weaknesses: List 2-3 weaknesses of the paper.
– Expected Research Impact: Possible utilization of the work in different and/or other areas.
– Overall Assessment: Was it worth reading? (Excellent, Very Good, Good, Fair, Poor)

Page 13: ICS 542 Multimedia Computing

Your 24-Hour Right

• One has 24 hours to object to the grade of a quiz or a major exam, starting from the end of the class time in which the graded exam papers have been distributed.
– i.e. if you were not present in class during the distribution of exam papers, you lose this right.
• If for some reason you cannot see me in person within this period, send me an email requesting an appointment. The email, though, should be sent within the 24-hour time period.

Page 14: ICS 542 Multimedia Computing

Code of Ethics

• KFUPM regulations and standards will be enforced

Page 15: ICS 542 Multimedia Computing

The Term Multimedia

• Multi
– Prefix
– From Latin “Multus” meaning numerous
• Media
– Root
– Plural form of the Latin word “Medium”.
– “Medium” is a noun meaning middle or center.

Page 16: ICS 542 Multimedia Computing

The Term Media

• Media: A means to distribute and present information

Page 17: ICS 542 Multimedia Computing

Text Data vs. Multimedia Data

Text Multimedia

Page 18: ICS 542 Multimedia Computing

Attributes of Media

• Perception Media
• Representation Media
• Presentation Media
• Storage Media
• Transmission Media
• Information Exchange Media
• Presentation Spaces and Presentation Values
• Presentation Dimensions

Page 19: ICS 542 Multimedia Computing

Brief History of Multimedia Systems

Year                                  Events
Prior to the Industrial Revolution    Written letters, books, poetry, bulletin boards
Late 1890s                            Radio was introduced
Early 1900s                           Movies were introduced
1940s                                 Television was introduced
1960s                                 Concept of hypertext systems was developed
Early 1980s                           Personal computer was introduced
1980-present                          Several digital audio, image, and video coding standards have been developed.
1983                                  Internet is born, TCP/IP protocol was established. Audio CD was introduced.
1990                                  Tim Berners-Lee proposed the WWW. HTML was developed.
1993-present                          Several Web browsers and hypertext languages were developed.
Mid 1990s                             High Definition Television standard was established.

Page 20: ICS 542 Multimedia Computing

Multimedia is still in its infancy

• Cannot avoid fuzziness in scope, multiplicity of definitions and non-stabilized terminology.

Page 21: ICS 542 Multimedia Computing

Great Impact of Multimedia

• Integrating all media in the computer allows using the existing computer power to represent information interactively.

• This can be stored in mass storage devices.
• This can also be transmitted over computer networks.

Page 22: ICS 542 Multimedia Computing

Terminology

• Monomedia object: Object containing data of a single type

• Multimedia object: Object containing data of multiple media

• Hypertext document: Nonlinear text document (with links)

• Hypermedia document: Nonlinear multimedia document

Page 23: ICS 542 Multimedia Computing

Issues in Multimedia Database Management Systems (1)

• Development of formal semantic modeling techniques for multimedia information.

• Design of powerful indexing, searching, and organization methods.

• Development of models for specifying the media synchronization/integration requirements.

Page 24: ICS 542 Multimedia Computing

Issues in Multimedia Database Management Systems (2)

• Designing formal multimedia query languages.
• Development of efficient data-placement schemas for physical storage management.
• Design and development of suitable architecture and operating system support.
• Management of distributed multimedia databases.

Page 25: ICS 542 Multimedia Computing

Reference Architecture

[Figure: three-layer reference architecture]
• Layer (3), User Interface Interactive Layer: Multimedia Query Interface, Navigation Tool, Media Editing, Interactive Icons
• Layer (2), Multimedia Object Management and Query Processing: Multimedia Information/Media Composition & Integration Meta-Model/Query Processing
• Layer (1), Monomedia Database Management: Text DBMS, Image DBMS, Audio DBMS, Video DBMS; data: Records, Text, Images, Audio, Video

Page 26: ICS 542 Multimedia Computing

Our Focus

• Information Extraction from Multimedia Data, aka Content-Based Retrieval (Projects)
– Which information to extract?
– How to extract it?
• Multimedia Data Representation and Indexing Techniques (Lectures and Projects)

Page 27: ICS 542 Multimedia Computing

Outline of the Course

• Multimedia data types, their representations and properties: text, audio, image, and video.
• Mono-media indexing and retrieval (text, image, audio and video).
• Multimedia document indexing and retrieval.
• Multimedia compression.
• Multimedia data mining.
… + select techniques from research papers.

Page 28: ICS 542 Multimedia Computing

Multimedia Data Types & Formats

How do computers “see” or store the data?
– Raw data.
– Structured data.
– Compressed.

Page 29: ICS 542 Multimedia Computing

Text

• Plain Text
• Structured Text
– LaTeX
– XML
• Compressed Text
– Huffman Coding
– Run-Length Coding
– LZW Coding
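As a small illustration of one of the compression schemes listed above, run-length coding replaces each run of identical symbols with the symbol and a repeat count. The sketch below is a minimal version under my own assumptions (the function names and the list-of-pairs output format are hypothetical, not taken from the course material).

```python
from itertools import groupby


def rle_encode(text: str) -> list[tuple[str, int]]:
    """Run-length encode `text` as (symbol, run_length) pairs."""
    return [(symbol, len(list(run))) for symbol, run in groupby(text)]


def rle_decode(pairs: list[tuple[str, int]]) -> str:
    """Rebuild the original text from (symbol, run_length) pairs."""
    return "".join(symbol * count for symbol, count in pairs)


if __name__ == "__main__":
    encoded = rle_encode("aaabccdddd")
    print(encoded)
    print(rle_decode(encoded))
```

Run-length coding only pays off when the input contains long runs; on ordinary prose the pair list is usually longer than the text itself.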

Page 30: ICS 542 Multimedia Computing

Example LaTeX Document

\documentclass[ece,plain]{puthesis}
\usepackage{rotating,graphicx,tabularx,epsfig,psfig,subfigure,alg,amssymb,hhline}
\title{Data Modeling and Querying in Video Databases}
\author{Wasfi Al-Khatib}{Al-Khatib, Wasfi}
\degree{Doctor of Philosophy}{Ph.D.}{May}{2001}
\majorprof{Arif Ghafoor}
\begin{document}
\include{front}
\include{ch1}
\include{ch2}
\include{ch3}
\include{ch4}
\include{ch5}
\include{ch6}
\bibliography{../refs/refs}
\include{vita}
\end{document}

Page 31: ICS 542 Multimedia Computing

Example XML Document 1

<?xml version="1.0"?>
<Schema xmlns="urn:schemas-microsoft-com:xml-data"
        xmlns:dt="urn:schemas-microsoft-com:datatypes">
  <ElementType name="first-name" content="textOnly"/>
  <ElementType name="last-name" content="textOnly"/>
  <ElementType name="name" content="textOnly"/>
  <ElementType name="price" content="textOnly" dt:type="fixed.14.4"/>
  <ElementType name="author" content="eltOnly" order="one">
    <group order="seq">
      <element type="name"/>
    </group>
    <group order="seq">
      <element type="first-name"/>
      <element type="last-name"/>
    </group>
  </ElementType>
  <ElementType name="title" content="textOnly"/>

Page 32: ICS 542 Multimedia Computing

Example XML Document 1 (Cont.)

  <AttributeType name="genre" dt:type="string"/>
  <ElementType name="book" content="eltOnly">
    <attribute type="genre" required="yes"/>
    <element type="title"/>
    <element type="author"/>
    <element type="price"/>
  </ElementType>
  <ElementType name="bookstore" content="eltOnly">
    <element type="book"/>
  </ElementType>
</Schema>

Page 33: ICS 542 Multimedia Computing

Example XML Document 2

<?xml version='1.0'?>
<!-- This file represents a fragment of a book store inventory database -->
<bookstore>
  <book genre="autobiography">
    <title>The Autobiography of Benjamin Franklin</title>
    <author>
      <first-name>Benjamin</first-name>
      <last-name>Franklin</last-name>
    </author>
    <price>8.99</price>
  </book>
  …
</bookstore>

Page 34: ICS 542 Multimedia Computing

Computer Graphics

• Vector Graphics
• Bitmap Images
• Combining vectors and bitmaps

Page 35: ICS 542 Multimedia Computing

Models of Bitmapped Graphics

• Modeled by an array of small picture elements (pixels).
– The intensity or color of each pixel is stored in a pixel-based graphics file.
• Bitmapped graphics do not use bitmaps (except for pure monochrome images). They use pixel maps.

Page 36: ICS 542 Multimedia Computing

Models of Vector Graphics

• Stored as a mathematical description of a collection of individual lines, curves and shapes making up the image
– e.g. line = two end points
– For example, xfig on Unix, Adobe Illustrator on Windows.
• Displaying a vector image
– requires some computation to be performed in order to interpret the model and generate an array of pixels to be displayed
– The process of interpreting the vector description is known as rasterizing
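To make the rasterizing step concrete, the sketch below turns a line stored as two end points into an array of pixels. It uses simple interpolation along the longer axis (a DDA-style approach chosen for brevity, not necessarily what any particular viewer does); the function name and the 0/1 grid representation are assumptions made for illustration.

```python
def rasterize_line(x0, y0, x1, y1, width, height):
    """Rasterize the segment (x0, y0)-(x1, y1) into a height x width grid.

    Returns a list of rows; a cell is 1 where the line passes, else 0.
    Illustrative DDA-style interpolation, not a production rasterizer.
    """
    grid = [[0] * width for _ in range(height)]
    # One step per pixel along the longer axis so the line has no gaps.
    steps = max(abs(x1 - x0), abs(y1 - y0))
    for i in range(steps + 1):
        t = i / steps if steps else 0.0
        x = round(x0 + t * (x1 - x0))
        y = round(y0 + t * (y1 - y0))
        if 0 <= x < width and 0 <= y < height:   # clip to the pixel array
            grid[y][x] = 1
    return grid


if __name__ == "__main__":
    for row in rasterize_line(0, 0, 7, 3, 8, 4):
        print("".join("#" if cell else "." for cell in row))
```

The vector description (four numbers) stays the same at any output size; only `width`, `height` and the scaled end points change, which is why vector images are resolution independent.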

Page 37: ICS 542 Multimedia Computing

Vector Graphics

• Scalable
• Resolution independent
• No background
• Cartoon-like
• Inappropriate for photo-realistic images
• Metafiles contain both raster and vector data

Page 38: ICS 542 Multimedia Computing

Xfig Example

Page 39: ICS 542 Multimedia Computing

#FIG 3.2
Landscape
Center
Inches
Letter
100.00
Single
-2
1200 2
6 4200 4425 6825 4575
4 0 0 50 0 0 12 0.0000 4 135 510 4200 4567 include\001
4 0 0 50 0 0 12 0.0000 4 90 210 5100 4545 src\001
4 0 0 50 0 0 12 0.0000 4 135 225 5775 4567 bin\001
4 0 0 50 0 0 12 0.0000 4 135 285 6525 4567 data\001
-6
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     4575 1950 3825 2625
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     4650 1950 5325 2625
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3000 5400 3525
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3825 4500 4350
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3825 5250 4350
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3825 5850 4350
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3825 6600 4350
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     3825 3000 3825 3525
2 1 0 1 0 7 50 0 -1 0.000 0 0 -1 0 0 2
     5400 3825 7275 4275
4 0 0 50 0 0 12 0.0000 4 165 855 4200 1875 CS400_600\001
4 0 0 50 0 0 12 0.0000 4 180 570 5100 2850 Projects\001
4 0 0 50 0 0 12 0.0000 4 135 885 3450 2850 Homeworks\001
4 0 0 50 0 0 12 0.0000 4 180 375 5250 3750 proj1\001
4 0 0 50 0 0 12 0.0000 4 135 315 3675 3750 hw1\001
4 0 0 50 0 0 12 0.0000 4 135 165 7275 4575 all\001

Page 40: ICS 542 Multimedia Computing

Computer Animations

• Animation is produced by sequential rendering of frames of graphics.

Page 41: ICS 542 Multimedia Computing

Audio

• Audio is a wave resulting from an air pressure disturbance that reaches our eardrum, generating the sound we hear.
– Humans can hear frequencies in the range 20-20,000 Hz.
• ‘Acoustics’ is the branch of physics that studies sound.

Page 42: ICS 542 Multimedia Computing

Characteristics of Audio

• Audio has normal wave properties:
– Reflection
– Refraction
– Diffraction
• A sound wave has several different properties:
– Amplitude (loudness/intensity)
– Frequency (pitch)
– Envelope (waveform)

Page 43: ICS 542 Multimedia Computing

Audio Amplitude

• Audio amplitude is often expressed in decibels (dB).
• Sound pressure levels (loudness or volume) are measured on a logarithmic scale, the decibel (dB), which describes a ratio.
– Suppose we have two loudspeakers, the first playing a sound with power $P_1$, and the second playing a louder version of the same sound with power $P_2$, with everything else (distance, frequency) kept the same.
– The difference in decibels between the two is defined to be

$10 \log_{10}(P_2 / P_1)$ dB
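The decibel definition above is easy to check numerically. The following is a minimal sketch (the function name is my own) that evaluates 10 log10(P2/P1) for a pair of powers given in the same unit.

```python
import math


def power_ratio_db(p1: float, p2: float) -> float:
    """Difference in decibels between powers p1 and p2 (same unit, both > 0)."""
    if p1 <= 0 or p2 <= 0:
        raise ValueError("powers must be positive")
    return 10 * math.log10(p2 / p1)


if __name__ == "__main__":
    print(power_ratio_db(1.0, 2.0))    # doubling the power
    print(power_ratio_db(1.0, 10.0))   # ten times the power
```

Because the scale is logarithmic, multiplying the power by ten always adds the same number of decibels, whatever the starting level.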

Page 44: ICS 542 Multimedia Computing

Audio Amplitude

• In microphones, audio is captured as analog signals (continuous in amplitude and time) that respond proportionally to the sound pressure, $p$.
• The power in a sound wave, all else equal, is proportional to the square of the pressure.
– Pressure is expressed in dynes/cm².
• The difference in sound pressure level between two sounds with pressures $p_1$ and $p_2$ is therefore $20 \log_{10}(p_2 / p_1)$ dB.
• The “acoustic amplitude” of sound is measured in reference to $p_1 = p_{ref} = 0.0002$ dynes/cm².
– The human ear is insensitive to sound pressure levels below $p_{ref}$.
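Since power goes as the square of pressure, the factor 10 for powers becomes 20 for pressures. The sketch below (hypothetical function and constant names) computes a sound pressure level relative to the reference pressure of 0.0002 dynes/cm² quoted above.

```python
import math

# Reference pressure from the slide, in dynes/cm^2 (threshold of hearing).
P_REF = 0.0002


def spl_db(pressure: float, reference: float = P_REF) -> float:
    """Sound pressure level in dB of `pressure` relative to `reference`."""
    if pressure <= 0 or reference <= 0:
        raise ValueError("pressures must be positive")
    # 20 log10(p/p_ref) equals 10 log10((p/p_ref)^2), the power ratio in dB.
    return 20 * math.log10(pressure / reference)


if __name__ == "__main__":
    for p in (0.0002, 0.002, 0.2):
        print(p, spl_db(p))
```

A pressure equal to the reference sits at 0 dB, and every tenfold increase in pressure adds 20 dB.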

Page 45: ICS 542 Multimedia Computing

Audio Amplitude

Intensity       Typical Examples
0 dB            Threshold of hearing
20 dB           Rustling of paper
25 dB           Recording studio (ambient level)
40 dB           Residence (ambient level)
50 dB           Office (ambient level)
60 - 70 dB      Typical conversation
80 dB           Heavy road traffic
90 dB           Home audio listening level
120 - 130 dB    Threshold of pain
140 dB          Rock singer screaming into microphone

Page 46: ICS 542 Multimedia Computing

Audio Frequency

• Audio frequency is the number of high-to-low pressure cycles that occur per second.
– In music, frequency is referred to as pitch.
• Different living organisms have different abilities to hear high-frequency sounds:
– Dogs: up to 50 kHz
– Cats: up to 60 kHz
– Bats: up to 120 kHz
– Dolphins: up to 160 kHz
– Humans: 20 Hz to 20 kHz
  • Called the audible band.
  • The exact audible band differs from one person to another and deteriorates with age.

Page 47: ICS 542 Multimedia Computing

Audio Frequency

• The frequency range of sounds can be divided into:
– Infrasound: 0 Hz – 20 Hz
– Audible sound: 20 Hz – 20 kHz
– Ultrasound: 20 kHz – 1 GHz
– Hypersound: 1 GHz – 10 GHz
• Sound waves propagate at a speed of around 344 m/s in humid air at room temperature (20 °C).
– Hence, audio wavelengths typically vary from 17 m (corresponding to 20 Hz) to 1.7 cm (corresponding to 20 kHz).
• Sound can be divided into periodic (e.g. whistling wind, bird songs, sound from music) and nonperiodic (e.g. speech, sneezes and rushing water).
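The wavelength figures quoted above follow from wavelength = speed / frequency. The sketch below reproduces them and classifies a frequency into the bands listed on the slide; the function names are hypothetical, and treating each band as including its lower bound but not its upper bound is my own assumption, since the slide does not say which band a boundary value belongs to.

```python
SPEED_OF_SOUND_M_S = 344.0   # value quoted on the slide for air at 20 °C


def wavelength_m(freq_hz: float, speed_m_s: float = SPEED_OF_SOUND_M_S) -> float:
    """Wavelength in metres of a sound wave with the given frequency."""
    if freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return speed_m_s / freq_hz


def sound_band(freq_hz: float):
    """Name of the band containing freq_hz, or None outside 0 Hz - 10 GHz.

    Assumption: each band includes its lower bound and excludes its upper one.
    """
    bands = [
        (0.0, 20.0, "infrasound"),
        (20.0, 20e3, "audible"),
        (20e3, 1e9, "ultrasound"),
        (1e9, 10e9, "hypersound"),
    ]
    for low, high, name in bands:
        if low <= freq_hz < high:
            return name
    return None


if __name__ == "__main__":
    for f in (20.0, 440.0, 20e3):
        print(f, wavelength_m(f), sound_band(f))
```

At 20 Hz the result is 17.2 m and at 20 kHz it is 1.72 cm, matching the rounded 17 m and 1.7 cm above.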

Page 48: ICS 542 Multimedia Computing

Audio Frequency

• Most sounds are combinations of different frequencies and wave shapes. Hence, the spectrum of a typical audio signal contains one or more fundamental frequencies, their harmonics, and possibly a few cross-modulation products.
– Fundamental frequency
– Harmonics

• The harmonics and their amplitude determine the tone quality or timbre.

Page 49: ICS 542 Multimedia Computing

Audio Envelope

• When sound is generated, it does not last forever. The rise and fall of the intensity of the sound is known as the envelope.

• A typical envelope consists of four sections: attack, decay, sustain and release.

Page 50: ICS 542 Multimedia Computing

Audio Envelope

• Attack: The intensity of a note increases from silence to a high level.
• Decay: The intensity decreases to a middle level.
• Sustain: The middle level is sustained for a short period of time.
• Release: The intensity drops from the sustain level to zero.
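The four envelope sections can be modelled as a piecewise-linear function of time. This is only a sketch under stated assumptions: real envelopes are rarely exactly linear, and the function name, the peak level of 1.0 and the default sustain level of 0.6 are illustrative choices, not values from the course.

```python
def adsr_level(t, attack, decay, sustain_time, release, sustain_level=0.6):
    """Envelope level (0.0 to 1.0) at time t for a linear ADSR envelope.

    attack, decay, sustain_time and release are section durations (> 0)
    in the same time unit as t. All values here are illustrative.
    """
    if t < 0:
        return 0.0
    if t < attack:                       # attack: silence up to the peak
        return t / attack
    t -= attack
    if t < decay:                        # decay: peak down to sustain level
        return 1.0 - (1.0 - sustain_level) * t / decay
    t -= decay
    if t < sustain_time:                 # sustain: hold the middle level
        return sustain_level
    t -= sustain_time
    if t < release:                      # release: sustain level down to zero
        return sustain_level * (1.0 - t / release)
    return 0.0


if __name__ == "__main__":
    for k in range(0, 11):
        t = k * 0.05
        print(round(t, 2), round(adsr_level(t, 0.1, 0.1, 0.2, 0.1), 3))
```

Changing the four durations gives different instrument characters: a long attack with a long sustain, or a short attack with a long release.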

Page 51: ICS 542 Multimedia Computing

Audio Envelope

• Different instruments have different envelope shapes
– Violin notes have slower attacks but a longer sustain period.
– Guitar notes have quick attacks and a slower release.

Page 52: ICS 542 Multimedia Computing

Audio Signal Representation

• Waveform representation
– Focuses on the exact representation of the produced audio signal.
• Parametric form representation
– Focuses on the modeling of the signal generation process.
– Two major forms
  • Music synthesis (MIDI Standard)
  • Speech synthesis

Page 53: ICS 542 Multimedia Computing

Waveform Representation

[Figure: Audio Generation and Playback. Audio Source → Audio Capture → Sampling & Digitization → Storage or Transmission → Receiver → Digital to Analog → Playback (speaker) → Human Ear]

Page 54: ICS 542 Multimedia Computing

Digitization

• To get audio (or video for that matter) into a computer, we must digitize it (convert it into a stream of numbers).

• This is achieved through sampling, quantization, and coding.

Page 55: ICS 542 Multimedia Computing

Example Signal

[Figure: example signal; axis label: Amplitude]

Page 56: ICS 542 Multimedia Computing

Sampling

• Sampling: The process of converting continuous time into discrete values.

Page 57: ICS 542 Multimedia Computing

Sampling Process

1. The time axis is divided into fixed intervals.
2. A reading of the instantaneous value of the analog signal is taken at the beginning of each time interval (the interval is determined by a clock pulse).
3. The frequency of the clock is called the sampling rate or sampling frequency.
• The sampled value is held constant for the next time interval (sample-and-hold circuit).
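The sampling steps above can be sketched in a few lines: read the signal at the start of each fixed interval, then hold that value until the next reading. The function names are hypothetical and the "analog" signal is just a Python function of time, which is of course a simplification of a real sample-and-hold circuit.

```python
import math


def sample(signal, sampling_rate_hz, duration_s):
    """Read `signal` (a function of time in seconds) at the start of each interval."""
    count = int(round(sampling_rate_hz * duration_s))
    return [signal(k / sampling_rate_hz) for k in range(count)]


def sample_and_hold(samples, sampling_rate_hz, t):
    """Value of the held signal at time t: the most recent sample at or before t."""
    k = int(t * sampling_rate_hz)            # index of the interval containing t
    k = max(0, min(k, len(samples) - 1))     # stay inside the recorded range
    return samples[k]


if __name__ == "__main__":
    def tone(t):
        return math.sin(2 * math.pi * 1.0 * t)   # a 1 Hz sine wave

    samples = sample(tone, sampling_rate_hz=8, duration_s=1.0)
    print(samples)
    print(sample_and_hold(samples, 8, 0.30))     # held value during the third interval
```

With a sampling rate of 8 Hz each interval lasts 0.125 s, so any time between 0.25 s and 0.375 s returns the reading taken at 0.25 s.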

Page 58: ICS 542 Multimedia Computing

Sampling Example

[Figure: sampling example; axis label: Amplitude]

Page 59: ICS 542 Multimedia Computing

Quantization

• The process of converting continuous sample values into discrete values.
– The size of the quantization interval is called the quantization step.
– How many values can a 4-bit quantization represent? 8-bit? 16-bit?
• The higher the quantization, the resulting sound quality .............
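A uniform quantizer makes the relation between bit depth, number of levels and quantization step concrete. The sketch below is one possible formulation, assuming an input range of -1.0 to 1.0 and reconstruction at the midpoint of each interval; both choices, and the function names, are mine rather than the course's.

```python
def num_levels(bits: int) -> int:
    """Number of distinct values an n-bit quantizer can represent."""
    return 2 ** bits


def quantize(x: float, bits: int, low: float = -1.0, high: float = 1.0):
    """Map x to (level_index, reconstructed_value) with a uniform quantizer.

    The range [low, high] is split into 2**bits intervals of equal size
    (the quantization step); the reconstructed value is the interval midpoint.
    """
    levels = num_levels(bits)
    step = (high - low) / levels
    index = int((x - low) / step)
    index = max(0, min(index, levels - 1))   # clip values at or beyond the range
    return index, low + (index + 0.5) * step


if __name__ == "__main__":
    for bits in (4, 8, 16):
        print(bits, num_levels(bits))
    print(quantize(0.3, 2))
```

Each extra bit doubles the number of levels and halves the step, so the reconstructed value can sit closer to the original sample.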

Page 60: ICS 542 Multimedia Computing

Quantization Example

[Figure: quantization example; axis label: Amplitude]

Page 61: ICS 542 Multimedia Computing

Coding

• The process of representing quantized values digitally

Page 62: ICS 542 Multimedia Computing

Analog to Digital Conversion

[Figure: analog to digital conversion; axis label: Amplitude]

Page 63: ICS 542 Multimedia Computing

Question

• What determines the quality of the digitization process?

Page 64: ICS 542 Multimedia Computing

Determining the Sampling Rate

• Suppose we are sampling a sine wave. How often do we need to sample it to figure out its frequency?

Page 65: ICS 542 Multimedia Computing

Sampling Theorem

• If the highest frequency contained in an analog signal is B and the signal is sampled at a rate F > 2B, then the signal can be exactly recovered from its sample values.

• F=2B is called the Nyquist Rate.
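What goes wrong below the Nyquist rate can be shown with a small experiment: a 3 Hz sine sampled at only 4 Hz produces exactly the same sample values as an inverted 1 Hz sine, so the two cannot be told apart afterwards. The helper names below are hypothetical and the alias formula is the standard folding of a frequency to the nearest multiple of the sampling rate.

```python
import math


def satisfies_sampling_theorem(highest_freq_hz, sampling_rate_hz):
    """True when the sampling rate F exceeds twice the highest frequency B."""
    return sampling_rate_hz > 2 * highest_freq_hz


def alias_frequency(freq_hz, sampling_rate_hz):
    """Apparent frequency of a sine of freq_hz after sampling at sampling_rate_hz."""
    nearest_multiple = sampling_rate_hz * round(freq_hz / sampling_rate_hz)
    return abs(freq_hz - nearest_multiple)


def sample_sine(freq_hz, sampling_rate_hz, count):
    """First `count` samples of sin(2*pi*f*t) taken at the given sampling rate."""
    return [math.sin(2 * math.pi * freq_hz * k / sampling_rate_hz)
            for k in range(count)]


if __name__ == "__main__":
    print(satisfies_sampling_theorem(3, 4), alias_frequency(3, 4))
    print(satisfies_sampling_theorem(3, 8), alias_frequency(3, 8))
    print(sample_sine(3, 4, 4))
    print([-s for s in sample_sine(1, 4, 4)])
```

Sampling the same 3 Hz sine at 8 Hz satisfies F > 2B, and the apparent frequency stays at 3 Hz.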

Page 66: ICS 542 Multimedia Computing

Quantization Levels

• Determines amplitude fidelity of the signal relative to the original analog signal.

• Quantization error (noise) is the maximum difference between the quantized sample values and the analog signal values.

• The digital signal quality relative to the original signal is measured by the signal to noise ratio (SNR).
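The quantities named above can be computed directly from a list of original samples and their quantized versions. The sketch below assumes a simple round-to-nearest quantizer with no clipping and measures SNR as the ratio of signal power to noise power in decibels; the function names and the test sine wave are illustrative assumptions.

```python
import math


def quantize_uniform(x, bits, full_scale=1.0):
    """Round x to the nearest multiple of the quantization step (no clipping)."""
    step = 2 * full_scale / (2 ** bits)
    return round(x / step) * step


def max_quantization_error(original, quantized):
    """Largest absolute difference between original and quantized samples."""
    return max(abs(x - q) for x, q in zip(original, quantized))


def snr_db(original, quantized):
    """Signal-to-noise ratio in dB, treating the quantization error as noise."""
    signal_power = sum(x * x for x in original)
    noise_power = sum((x - q) ** 2 for x, q in zip(original, quantized))
    if noise_power == 0:
        return math.inf                 # quantization introduced no error
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    sine = [math.sin(2 * math.pi * k / 100) for k in range(100)]
    for bits in (4, 8, 16):
        quantized = [quantize_uniform(x, bits) for x in sine]
        print(bits, max_quantization_error(sine, quantized), snr_db(sine, quantized))
```

Running it shows the error shrinking and the SNR rising as the number of bits grows, which is the amplitude-fidelity effect described above.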

Page 67: ICS 542 Multimedia Computing

Basic Types of a Digital Signal

• Unit Impulse Function δ[n]

• Unit Step Function u[n]
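Both basic signals are simple to write down for integer n, using the standard definitions: the unit impulse is 1 at n = 0 and 0 elsewhere, and the unit step is 1 for n ≥ 0 and 0 otherwise. The sketch below (function names are my own) also checks the usual relation between them, namely that the impulse is the first difference of the step.

```python
def unit_impulse(n: int) -> int:
    """Unit impulse: 1 at n == 0, 0 everywhere else."""
    return 1 if n == 0 else 0


def unit_step(n: int) -> int:
    """Unit step u[n]: 1 for n >= 0, 0 for n < 0."""
    return 1 if n >= 0 else 0


if __name__ == "__main__":
    for n in range(-3, 4):
        # The impulse equals the first difference of the step: u[n] - u[n-1].
        print(n, unit_impulse(n), unit_step(n), unit_step(n) - unit_step(n - 1))
```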

Page 68: ICS 542 Multimedia Computing

Sinc Function

Page 69: ICS 542 Multimedia Computing

Sinc Function

• To plot the sinc function in Matlab:
x = linspace(-5,5);
y = sinc(x);
plot(x,y);

Page 70: ICS 542 Multimedia Computing

MIDI Interface

• Musical sound can be generated, unlike other types of sounds.
• Therefore, the Musical Instrument Digital Interface (MIDI) standard has been developed.
– The standard emerged in its final form in August 1983.
– A music description language in binary form.
• A given piece of music is represented by a sequence of numbers that specify how the musical instruments are to be played at different time instances.
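As a rough illustration of "a sequence of numbers", the sketch below builds MIDI Note On and Note Off channel messages, each three bytes: a status byte (0x90 or 0x80 combined with the channel number), a note number and a velocity. The helper names are hypothetical, and pairing each message with a time in seconds is a simplification of mine; actual MIDI files store timing as delta times in ticks.

```python
def note_on(channel: int, note: int, velocity: int) -> bytes:
    """MIDI Note On message: status 0x90 | channel, note number, velocity."""
    _check(channel, note, velocity)
    return bytes([0x90 | channel, note, velocity])


def note_off(channel: int, note: int, velocity: int = 0) -> bytes:
    """MIDI Note Off message: status 0x80 | channel, note number, velocity."""
    _check(channel, note, velocity)
    return bytes([0x80 | channel, note, velocity])


def _check(channel: int, note: int, velocity: int) -> None:
    # Channels are 0-15; note numbers and velocities are 7-bit values.
    if not 0 <= channel <= 15:
        raise ValueError("channel must be in 0..15")
    if not (0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("note and velocity must be in 0..127")


if __name__ == "__main__":
    # (time in seconds, message): play middle C (note 60), then E (note 64).
    piece = [
        (0.0, note_on(0, 60, 100)),
        (0.5, note_off(0, 60)),
        (0.5, note_on(0, 64, 100)),
        (1.0, note_off(0, 64)),
    ]
    for time_s, message in piece:
        print(time_s, list(message))
```

The piece is described by which key is pressed, how hard and when, rather than by the audio waveform itself, which is what makes it a parametric representation.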