ISP 433/633 Week 5 Multimedia IR
  • date post

    20-Dec-2015

ISP 433/633 Week 5

Multimedia IR

Multimedia IR

• Goals
  – Increase access to media content
  – Decrease effort in media handling and reuse
  – Improve usefulness of media content

• Technology
  – Feature Extraction
  – Metadata

Types of Multimedia

• 1D
  – Audio (speech, music, sound effects, etc.)
  – MIDI

• 2D
  – Photographs
  – Graphics

• 3D
  – Video (2D + Time)
  – Animation (2D + Time)
  – Computer graphic models

• 4D
  – Computer graphic model animation (3D + Time)

Typical Queries

• Audio
  – Search for songs by humming

• Graphics
  – Search for diagrams by sketching

• Image
  – Check if company logo appears in the program as contracted

• Video
  – Detect unusual movement in a surveillance video

Challenge of Multimedia IR

• Often unstructured

• Content is difficult to analyze and compare
  – Computers don’t understand the content of multimedia
  – Need to specify content and semantics

• Large storage requirement

Types of Queries (features)

• Attribute
  – Speaker of an audio clip, size of a video, color distribution of an image, etc.
  – Have an exact match

• Structural
  – “all the objects containing one image and one video clip”, “image displays accompanied by a jingle sound”

• Semantic
  – “images with the logo of Ford company”
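
The attribute and structural query types above can be sketched over a small catalog. This is only an illustration: the record layout, the field names (`speaker`, `parts`) and both helper functions are made up for this example, and "one image and one video clip" is read here as exactly one of each. Semantic queries are left out because they need content understanding, not simple lookups.

```python
from collections import Counter

# Hypothetical catalog of multimedia objects (not from the slides).
documents = [
    {"id": "doc1", "speaker": "Alice", "parts": ["image", "video"]},
    {"id": "doc2", "speaker": "Bob", "parts": ["image", "audio"]},
    {"id": "doc3", "speaker": "Alice", "parts": ["image", "image", "video"]},
]

def attribute_query(docs, field, value):
    """Attribute query: exact match on a stored attribute."""
    return [d["id"] for d in docs if d.get(field) == value]

def structural_query(docs, required):
    """Structural query: each media type must occur exactly the required number of times."""
    matches = []
    for d in docs:
        counts = Counter(d["parts"])
        if all(counts[media] == n for media, n in required.items()):
            matches.append(d["id"])
    return matches

by_speaker = attribute_query(documents, "speaker", "Alice")
one_image_one_video = structural_query(documents, {"image": 1, "video": 1})
```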

Spatial Query

• Queries about the spatial relationships (intersection, containment, boundary, adjacency, proximity) of entities geometrically defined and located in space

• Used in GIS
  – Georeferenced data

Types of Spatial Queries

• Point-in-polygon
  – What do we have in the region at (x, y)?

• Distance and Buffer Zone Queries
  – What cities lie within 40 miles of the border of Northern and Southern Ireland?

• Path Queries
  – What is the shortest route from San Francisco to Los Angeles?

[Figure: map with X and Y axes]
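
A minimal sketch of the three query types, under simplifying assumptions: planar coordinates instead of real georeferenced data, a buffer around a single point where the slide's example uses a border line, and a tiny made-up road graph. The function names and sample data are hypothetical.

```python
import heapq
import math

def point_in_polygon(x, y, polygon):
    """Ray casting: count crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def within_distance(points, center, radius):
    """Buffer-zone query around a point: names of points within radius."""
    cx, cy = center
    return [name for name, (px, py) in points.items()
            if math.hypot(px - cx, py - cy) <= radius]

def shortest_path_length(graph, start, goal):
    """Path query: Dijkstra's algorithm on a dict-of-dicts weighted graph."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        dist, node = heapq.heappop(queue)
        if node == goal:
            return dist
        if dist > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, {}).items():
            new_dist = dist + weight
            if new_dist < best.get(neighbor, float("inf")):
                best[neighbor] = new_dist
                heapq.heappush(queue, (new_dist, neighbor))
    return None  # goal not reachable

# Made-up sample data for illustration only.
square = [(0, 0), (4, 0), (4, 4), (0, 4)]
cities = {"A": (1, 1), "B": (5, 5), "C": (3, 0)}
roads = {"A": {"B": 2, "C": 5}, "B": {"C": 1}, "C": {}}
```

A production GIS would use a spatial index and geodesic distances; the linear scans here only show what each query asks.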

More Spatial Queries

• Multimedia Queries: use non-map georeferenced information
  – What are the names of farmers affected by flooding in Monterey and Santa Cruz Counties?

[Figure: map with parcels labeled p123 and p127]

Spatial Indexing and Access

• F-dimensional space
  – Reduce the problem to searching points in a multi-dimensional feature space

• Feature functions
  – Map an object into a point in feature space

• Distance

[Figure: Object A and Object B plotted as points in a feature space with axes Feature 1 and Feature 2]
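
The idea of mapping objects to points and comparing them by distance can be sketched as follows. The feature function here (width and height, so F = 2) and the Euclidean distance are assumptions chosen for illustration; the slides do not fix a particular feature set or metric.

```python
import math

def feature_vector(obj):
    """Hypothetical feature function: maps an object to a point with F = 2."""
    return (obj["width"], obj["height"])

def euclidean(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

# Made-up objects; only their positions in feature space matter.
object_a = {"width": 3.0, "height": 4.0}
object_b = {"width": 0.0, "height": 0.0}

d = euclidean(feature_vector(object_a), feature_vector(object_b))
```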

Matches

• Whole matches
  – All the objects within a certain distance from query

• Sub-pattern match
  – Parts of objects within a certain distance from query

• Nearest neighbors match

• All Pairs match
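
Three of these match types can be sketched as linear scans over points in feature space. This is a simplified illustration with made-up points and function names; sub-pattern matching is omitted because it needs a way to enumerate parts of objects, and a real system would use a spatial index instead of scanning everything.

```python
import math
from itertools import combinations

def whole_match(points, query, eps):
    """All objects within distance eps of the query point."""
    return [name for name, p in points.items() if math.dist(p, query) <= eps]

def nearest_neighbors(points, query, k):
    """The k objects closest to the query point."""
    return sorted(points, key=lambda name: math.dist(points[name], query))[:k]

def all_pairs(points, eps):
    """All pairs of objects within distance eps of each other."""
    return [(a, b) for a, b in combinations(points, 2)
            if math.dist(points[a], points[b]) <= eps]

# Hypothetical feature-space points.
points = {"p1": (0.0, 0.0), "p2": (1.0, 0.0), "p3": (5.0, 5.0)}
```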

R-tree

• Minimum bounding rectangle (MBR)
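
A minimal sketch of the MBR operations an R-tree relies on, assuming 2D points and rectangles stored as `(min_x, min_y, max_x, max_y)` tuples. The function names are made up, and this is not a full R-tree (no node splitting or insertion).

```python
def mbr(points):
    """Smallest axis-aligned rectangle containing all the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))

def intersects(a, b):
    """True if two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def enlarge(a, b):
    """Rectangle covering both a and b, as a parent node would store."""
    return (min(a[0], b[0]), min(a[1], b[1]), max(a[2], b[2]), max(a[3], b[3]))

box = mbr([(1, 2), (3, 0), (2, 5)])
```

During a search, a subtree can be skipped whenever the query rectangle does not intersect that subtree's MBR.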

Feature Extraction

• How to represent an object with numerical feature values?

• Feature function
  – Preserve the distance between objects
  – Capture the characteristics of objects

• Don’t want too many dimensions

• Much is still in research
  – MDS, DSP, machine vision

Color Image

• Color Histogram – 256 dimensions

• RGB average – 3 dimensions
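
Both features can be sketched for an image given as a list of `(r, g, b)` pixels with 8-bit channels. The slide does not say how the 256 histogram bins are defined; the quantization below (3 bits red, 3 bits green, 2 bits blue) is one assumed way to arrive at 256 bins, and the sample pixels are made up.

```python
def rgb_average(pixels):
    """3-dimensional feature: mean of each color channel."""
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))

def color_histogram(pixels):
    """256-dimensional feature: fraction of pixels per quantized color.

    Assumed quantization: top 3 bits of red, top 3 bits of green,
    top 2 bits of blue, packed into one 8-bit bin index.
    """
    hist = [0] * 256
    for r, g, b in pixels:
        index = ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)
        hist[index] += 1
    total = len(pixels)
    return [count / total for count in hist]

# Tiny made-up image: two red pixels, one blue, one green.
pixels = [(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)]
avg = rgb_average(pixels)
hist = color_histogram(pixels)
```

The histogram keeps far more detail than the average, at the cost of many more dimensions to index.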

Example Feature Extraction Product

• Smart Fire Alert (Fastcom technologies)

Problem of Automatic Feature Extraction

• Mismatch between percepts and concepts
  – Similar Percepts / Dissimilar Concepts

[Figure: a clown nose and a red sun]

Problem of Automatic Feature Extraction

• Dissimilar Percepts / Similar Concepts

[Figure: a car and another car]

Metadata

• Content representation of the media

• Creation (annotation)
  – During capture
  – After capture

• Use metadata to manipulate media
  – Storage
  – Indexing
  – Search
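
As one illustration of using metadata for indexing and search, the sketch below builds an inverted index over free-text annotations and answers keyword queries with AND semantics. The file names, annotation texts and function names are hypothetical, and whitespace tokenization is a deliberate simplification.

```python
from collections import defaultdict

def build_index(annotations):
    """Map each annotation term to the set of media items that carry it."""
    index = defaultdict(set)
    for media_id, text in annotations.items():
        for term in text.lower().split():
            index[term].add(media_id)
    return index

def search(index, query):
    """Media items whose annotations contain every query term."""
    terms = query.lower().split()
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results

# Made-up annotations, as might be entered during or after capture.
annotations = {
    "clip1.mpg": "sunset over beach",
    "photo2.jpg": "red car on beach",
    "clip3.mpg": "red sun at sunset",
}
index = build_index(annotations)
```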

Multimedia Content Description Interface (MPEG-7)

• Create standardized multimedia description framework

• Support range of abstraction levels from low-level signal characteristics to high-level semantic information

[Figure: Boundaries of the MPEG-7 standard, with blocks labeled Description production, Standard description, and Description consumption]

MPEG

• Moving Picture Experts Group (MPEG)

• Working group of ISO/IEC in charge of the development of standards for coded representation of digital audio and video

• Established in 1988, the group has produced
  – MPEG-1: standard on which such products as Video CD and MP3 are based
  – MPEG-2: standard on which such products as digital television set-top boxes and DVD are based
  – MPEG-4: standard for multimedia for the fixed and mobile web
  – MPEG-7: standard for description and search of audio and visual content
  – MPEG-21: work on the "Multimedia Framework" standard started in June 2000

Application of MPEG-7

[Figure: block diagram with Description Generation, MPEG-7 Description, Encoder, MPEG-7 Coded Description, Decoder, and Search / Query Engine]

MPEG-7 Structure

[Figure: MPEG-7 structure. Labeled elements: Descriptors (syntax and semantics of feature representation; D1 to D10), Description Schemes (DS1 to DS4, structuring Descriptors D1 to D6), Description Definition Language (definition, extension), Instantiation into tags such as <scene id=1> <time> .... <camera> .. <annotation> </scene>, and Encoding & Delivery (101011 0)]

MPEG-7 Top Level Hierarchy

MPEG-7 Conceptual Description

MPEG-7 Still Image Description

MPEG-7 Video Segments Example

MPEG-7 Segment Relationship Graph

Some MPEG-7 Application Types

[Figure: two application types, Extraction from Media and Search / Retrieval, with blocks labeled Media DB, Extract., Coding Scheme, Descr. DB, Decoder, Descr., Query, Search Tool, and Match List]

• Others
  – Transcoding
  – Description Filtering

Example Application

• IBM VideoAnn - assists authors in the task of annotating video sequences with MPEG-7 metadata

More Example Applications

• 3D Murale - 3D Measurement & Virtual Reconstruction of Ancient Lost Worlds of Europe (EU: IST Project)

• Real-time video identification - monitors broadcast TV programs and identifies their contents (NEC)

• Virage - a digital asset management system for processing, indexing, storing and publishing video

• Content providers adopting MPEG-7: emusic.com

Challenges

• Creating metadata
  – Represent action sequences and higher level narrative structures
  – Integrate legacy metadata (keywords, natural language)
  – Gather more and better metadata at the point of capture (develop metadata cameras)
  – Develop “human-in-the-loop” indexing algorithms and interfaces

• Using metadata
  – Integrate linguistic and other query interfaces