ISP 433/633 Week 5: Multimedia IR
Date posted: 20-Dec-2015
Multimedia IR
• Goals
  – Increase access to media content
  – Decrease effort in media handling and reuse
  – Improve usefulness of media content
• Technology
  – Feature Extraction
  – Metadata
Types of Multimedia
• 1D
  – Audio (speech, music, sound effects, etc.)
  – MIDI
• 2D
  – Photographs
  – Graphics
• 3D
  – Video (2D + Time)
  – Animation (2D + Time)
  – Computer graphic models
• 4D
  – Computer graphic model animation (3D + Time)
Typical Queries
• Audio
  – Search for songs by humming
• Graphics
  – Search for diagrams by sketching
• Image
  – Check if company logo appears in the program as contracted
• Video
  – Detect unusual movement in a surveillance video
Challenges of Multimedia IR
• Often unstructured
• Content is difficult to analyze and compare
  – Computers don’t understand the content of multimedia
  – Need to specify content and semantics
• Large storage requirement
Types of Queries (features)
• Attribute
  – Speaker of an audio, size of a video, color distribution of an image, etc.
  – Have an exact match
• Structural
  – “all the objects containing one image and one video clip”, “image displays accompanied by a jingle sound”
• Semantic
  – “images with the logo of Ford company”
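The three query types can be sketched against a toy in-memory collection. The record layout (`MediaObject` and its fields) and the sample data are assumptions made for illustration. The semantic query only works here because someone has already attached labels to the objects.

```python
from dataclasses import dataclass, field


@dataclass
class MediaObject:
    """Hypothetical record for one multimedia document."""
    name: str
    speaker: str = ""                            # attribute metadata
    parts: list = field(default_factory=list)    # media types it contains
    labels: set = field(default_factory=set)     # semantic annotations


collection = [
    MediaObject("lecture", speaker="Smith", parts=["audio"]),
    MediaObject("ad", parts=["image", "video"], labels={"Ford logo"}),
    MediaObject("slide", parts=["image", "audio"], labels={"chart"}),
]


def attribute_query(objs, speaker):
    """Attribute query: exact match on a stored value."""
    return [o.name for o in objs if o.speaker == speaker]


def structural_query(objs, required_parts):
    """Structural query: objects composed of exactly the given parts."""
    return [o.name for o in objs if sorted(o.parts) == sorted(required_parts)]


def semantic_query(objs, label):
    """Semantic query: objects annotated with a given meaning."""
    return [o.name for o in objs if label in o.labels]


print(attribute_query(collection, "Smith"))               # ['lecture']
print(structural_query(collection, ["image", "video"]))   # ['ad']
print(semantic_query(collection, "Ford logo"))            # ['ad']
```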
Spatial Query
• Queries about the spatial relationships (intersection, containment, boundary, adjacency, proximity) of entities geometrically defined and located in space
• Used in GIS
  – Georeferenced data
Types of Spatial Queries
• Point-in-polygon
  – What do we have in the region at (x, y)?
• Distance and Buffer Zone Queries
  – What cities lie within 40 miles of the border of Northern and Southern Ireland?
• Path Queries
  – What is the shortest route from San Francisco to Los Angeles?
[Figure: X and Y coordinate axes]
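A minimal sketch of the first two query types, assuming flat (planar) coordinates and made-up data. Real GIS software works with geodesic distances and spatial indexes. The function names are illustrative, not from any GIS library.

```python
import math


def point_in_polygon(x, y, polygon):
    """Ray casting: count how many edges a ray going right from (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def distance_to_segment(p, a, b):
    """Shortest planar distance from point p to segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def within_buffer(points, border, radius):
    """Buffer zone query: names of points within `radius` of a polyline."""
    hits = []
    for name, p in points.items():
        d = min(distance_to_segment(p, border[i], border[i + 1])
                for i in range(len(border) - 1))
        if d <= radius:
            hits.append(name)
    return hits


square = [(0, 0), (10, 0), (10, 10), (0, 10)]
print(point_in_polygon(5, 5, square))     # True
print(point_in_polygon(15, 5, square))    # False

cities = {"A": (3, 30), "B": (5, 90), "C": (20, -10)}   # hypothetical cities
border = [(0, 0), (10, 0), (20, 0)]                     # hypothetical border line
print(within_buffer(cities, border, 40))  # ['A', 'C']
```

A path query would typically run a shortest-path algorithm such as Dijkstra's over a road graph, which is omitted here.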
More Spatial Queries
• Multimedia queries: use non-map georeferenced information
  – What are the names of farmers affected by flooding in Monterey and Santa Cruz Counties?
[Figure: items labeled p123 and p127]
Spatial Indexing and Access
• F-dimensional space
  – Reduce the problem into searching points in a multi-dimensional feature space
• Feature functions
  – Map an object into a point in feature space
• Distance
[Figure: Object A and Object B shown as points in a feature space with axes Feature 1 and Feature 2]
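As a rough illustration of this idea, the sketch below maps two objects to points in a 2-dimensional feature space and measures the distance between them. The feature names `f1` and `f2`, the sample values, and the choice of Euclidean distance are assumptions. The slides do not say which features or which distance measure to use.

```python
import math


def feature_function(obj):
    """Hypothetical feature function: map an object to a point (feature 1, feature 2)."""
    return (obj["f1"], obj["f2"])


def distance(p, q):
    """Euclidean distance between two points in feature space."""
    return math.dist(p, q)


object_a = {"f1": 1, "f2": 2}
object_b = {"f1": 4, "f2": 6}

point_a = feature_function(object_a)
point_b = feature_function(object_b)
print(distance(point_a, point_b))  # 5.0
```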
Matches
• Whole matches
  – All the objects within a certain distance from query
• Sub-pattern match
  – Parts of objects within a certain distance from query
• Nearest neighbors match
• All Pairs match
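Three of these match types can be sketched as linear scans over points in feature space. The point names, coordinates, and thresholds are made up for illustration. A real system would use a spatial index instead of scanning every point. Sub-pattern matching is omitted because it depends on how objects are split into parts.

```python
import math
from itertools import combinations

# Hypothetical objects already mapped to points in a 2-D feature space
points = {"a": (0, 0), "b": (3, 4), "c": (6, 8), "d": (0, 1)}


def whole_match(points, query, eps):
    """All objects within distance eps of the query point."""
    return sorted(n for n, p in points.items() if math.dist(p, query) <= eps)


def nearest_neighbors(points, query, k):
    """The k objects closest to the query point."""
    return sorted(points, key=lambda n: math.dist(points[n], query))[:k]


def all_pairs(points, eps):
    """All pairs of objects within distance eps of each other."""
    return [(m, n) for m, n in combinations(sorted(points), 2)
            if math.dist(points[m], points[n]) <= eps]


print(whole_match(points, (0, 0), 5))        # ['a', 'b', 'd']
print(nearest_neighbors(points, (0, 0), 2))  # ['a', 'd']
print(all_pairs(points, 1))                  # [('a', 'd')]
```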
Feature Extraction
• How to represent an object with numerical feature values?
• Feature function
  – Preserve the distance between objects
  – Capture the characteristics of objects
• Don’t want too many dimensions
• Much of this is still in research
  – MDS (multidimensional scaling), DSP (digital signal processing), machine vision
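One common low-dimensional feature is a coarse color histogram. The sketch below is an assumption about how such a feature function could look, not a method given in the slides. Images are represented as non-empty lists of (r, g, b) tuples with values from 0 to 255, and two bins per channel give 8 dimensions.

```python
def color_histogram(pixels, bins_per_channel=2):
    """Map an image to a normalized color histogram (a point in feature space)."""
    hist = [0.0] * (bins_per_channel ** 3)
    for r, g, b in pixels:
        # Quantize each channel into bins_per_channel buckets
        rb = r * bins_per_channel // 256
        gb = g * bins_per_channel // 256
        bb = b * bins_per_channel // 256
        hist[(rb * bins_per_channel + gb) * bins_per_channel + bb] += 1
    return [count / len(pixels) for count in hist]


def l1_distance(h1, h2):
    """Sum of absolute differences between two histograms."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


# Hypothetical 4-pixel images
red = [(255, 0, 0)] * 4
mostly_red = [(250, 10, 10)] * 3 + [(0, 0, 255)]
blue = [(0, 0, 255)] * 4

print(l1_distance(color_histogram(red), color_histogram(mostly_red)))  # 0.5
print(l1_distance(color_histogram(red), color_histogram(blue)))        # 2.0
```

Images with similar colors end up close together, but the feature says nothing about what the image depicts.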
Problem of Automatic Feature Extraction
• Mismatch between percepts and concepts
  – Similar Percepts / Dissimilar Concepts
[Figure: Clown Nose and Red Sun]
Metadata
• Content representation of the media
• Creation (annotation)
  – During capture
  – After capture
• Use metadata to manipulate media
  – Storage
  – Indexing
  – Search
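Indexing and search over metadata can be sketched with a small inverted index built from text annotations. The file names and annotation texts are invented for illustration, and the whitespace tokenizer is deliberately naive.

```python
from collections import defaultdict

# Hypothetical annotations created during or after capture
annotations = {
    "clip1.mpg": "clown with red nose at circus",
    "photo7.jpg": "red sun over the sea",
    "clip2.mpg": "circus parade",
}


def build_index(annotations):
    """Inverted index: word -> set of media items annotated with it."""
    index = defaultdict(set)
    for media_id, text in annotations.items():
        for word in text.lower().split():
            index[word].add(media_id)
    return index


def search(index, query):
    """Return media whose annotations contain every query word."""
    words = query.lower().split()
    if not words:
        return []
    return sorted(set.intersection(*(index.get(w, set()) for w in words)))


index = build_index(annotations)
print(search(index, "red"))         # ['clip1.mpg', 'photo7.jpg']
print(search(index, "red circus"))  # ['clip1.mpg']
```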
Multimedia Content Description Interface (MPEG-7)
• Create standardized multimedia description framework
• Support range of abstraction levels from low-level signal characteristics to high-level semantic information
[Figure: Boundaries of the MPEG-7 standard. Description production, Standard description, Description consumption]
MPEG
• Moving Picture Experts Group (MPEG)
• Working group of ISO/IEC in charge of the development of standards for coded representation of digital audio and video
• Established in 1988, the group has produced
  – MPEG-1: standard on which such products as Video CD and MP3 are based
  – MPEG-2: standard on which such products as Digital Television set top boxes and DVD are based
  – MPEG-4: standard for multimedia for the fixed and mobile web
  – MPEG-7: standard for description and search of audio and visual content
  – MPEG-21: "Multimedia Framework" standard, started in June 2000
Application of MPEG-7
[Figure: Description Generation, MPEG-7 Description, Encoder, MPEG-7 Coded Description, Decoder, Search / Query Engine]
MPEG-7 Structure
[Figure: MPEG-7 structure]
• Descriptors (D1 to D10): syntax and semantics of feature representation
• Description Definition Language: definition and extension
• Description Schemes (DS1 to DS4): structuring
• Instantiation as tags: <scene id=1><time> ....<camera>..<annotation</scene>
• Encoding and delivery (101011 0)
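The tag instantiation shown in the figure can be imitated with Python's standard XML library. The element names (`scene`, `time`, `camera`, `annotation`) come from the slide, but the values are invented and this is not the actual MPEG-7 schema. It is only a sketch of what a tagged description might look like.

```python
import xml.etree.ElementTree as ET


def describe_scene(scene_id, time, camera, annotation):
    """Build a scene description using the slide's illustrative tag names."""
    scene = ET.Element("scene", id=str(scene_id))
    ET.SubElement(scene, "time").text = time
    ET.SubElement(scene, "camera").text = camera
    ET.SubElement(scene, "annotation").text = annotation
    return scene


# Hypothetical values for one scene
scene = describe_scene(1, "00:00:05", "pan left", "opening shot")
xml_text = ET.tostring(scene, encoding="unicode")
print(xml_text)

# A consumer of the description can parse it back and read the fields
parsed = ET.fromstring(xml_text)
print(parsed.get("id"), parsed.find("annotation").text)  # 1 opening shot
```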
Some MPEG-7 Application Types
[Figure: two application types, Extraction from Media and Search / Retrieval. Components shown: Media DB, Extract., Coding Scheme, Descr., Descr. DB, Query, Search Tool, Decoder, Match List]
• Others
  – Transcoding
  – Description Filtering
Example Application
• IBM VideoAnn - assists authors in the task of annotating video sequences with MPEG-7 metadata
More Example Applications
• 3D Murale - 3D Measurement & Virtual Reconstruction of Ancient Lost Worlds of Europe (EU: IST Project)
• Real-time video identification - monitors broadcast TV programs and identifies their contents (NEC)
• Virage - a digital asset management system for processing, indexing, storing and publishing video
• Content providers adopting MPEG-7: emusic.com
Challenges
• Creating metadata
  – Represent action sequences and higher level narrative structures
  – Integrate legacy metadata (keywords, natural language)
  – Gather more and better metadata at the point of capture (develop metadata cameras)
  – Develop “human-in-the-loop” indexing algorithms and interfaces
• Using metadata
  – Integrate linguistic and other query interfaces
Multimedia IR demos
• QBIC
  – http://www.hermitagemuseum.org/fcgi-bin/db2www/qbicSearch.mac/qbic?selLang=English