MPEG-7 Audio Overview

Ichiro Fujinaga, MUMT 611, McGill University


Transcript of MPEG-7 Audio Overview

Page 1: MPEG-7 Audio Overview

MPEG-7 Audio Overview

Ichiro Fujinaga

MUMT 611

McGill University

Page 2: MPEG-7 Audio Overview

MUMT611 Fujinaga 2 / 18

Content

- MPEG-7 overview
  - Objectives and scope
  - Main elements and organization
- MPEG-7 audio
  - Low-level features
  - High-level features and tools

Page 3: MPEG-7 Audio Overview

Introduction

- (formally) Multimedia Content Description Interface
- MPEG-1, 2, 4: Content coding and representation
- MPEG-7: Metadata (1998-2001)
  - standardized descriptions and description schemes of structures and content of multimedia
  - a language to specify such descriptions and description schemes
- Interoperable interface that defines syntax and semantics
- Modalities: audio, visual, or multimedia
- Aspects: media, meta, structural, or semantic
- Applications: searching, filtering, navigation

Page 4: MPEG-7 Audio Overview

Scope

- The goal is to provide interoperability among multimedia applications in
  - Generation
  - Management
  - Distribution
  - Consumption

Page 5: MPEG-7 Audio Overview

Application domains

- Broadcast media selection (radio channel, TV channel)
- Digital libraries (film, video, audio and radio archives)
- E-Commerce (personalized advertising)
- Education (repositories of multimedia courses, multimedia search for support material)
- Home Entertainment (management of personal multimedia collections, including manipulation of content, e.g. karaoke)
- Journalism (searching speeches of a certain politician using his name, his voice or his face)
- Multimedia directory services (yellow pages)
- Surveillance and remote sensing

Page 6: MPEG-7 Audio Overview

Components (XML)

- MPEG-7 Systems
- MPEG-7 Description Definition Language
- MPEG-7 Visual
- MPEG-7 Audio
- MPEG-7 Multimedia Description Schemes
- Reference Software: the eXperimentation Model (test)
- MPEG-7 Conformance (syntax checking)
- MPEG-7 Extraction and use of descriptions (technical report)

Page 7: MPEG-7 Audio Overview

Other Standards

- SMPTE
- EBU
- TV-Anytime
- DIG-35
- Dublin Core
- OCLC/RLG

Page 8: MPEG-7 Audio Overview

MPEG-7 Objectives

- Information about the content
  - Form: e.g. the coding format used
  - Conditions for accessing the material: intellectual property rights / price
  - Classification: e.g. parental rating
  - Links to other relevant materials
  - Context: e.g. "Olympic Games 1996, final of 200 meter hurdles, men"
- Information present in the content
  - Combination of low-level and high-level descriptors

Page 9: MPEG-7 Audio Overview

Where do the descriptions come from?

- Preservation of existing descriptive data through the production/delivery
- Generated automatically by capture devices (e.g. time or GPS location in a camera)
- Extracted automatically & semi-automatically
- Manually produced (e.g. for legacy material such as existing film archives)

Page 10: MPEG-7 Audio Overview

Main Elements of MPEG-7

- Description Tools: (textual / binary)
  - Descriptors (D): define the syntax and the semantics of each feature (metadata element)
  - Description Schemes (DS): relationships between components
- Description Definition Language (DDL):
  - Define the syntax of the MPEG-7 Description Tools
  - Creation, extension, and modification of DSs
- System tools:
  - Storage and transmission, synchronization of descriptions with content, multiplexing of descriptions, etc.
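
As a rough illustration of how descriptors and description schemes relate, the following Python sketch models a descriptor as a named feature value and a description scheme as a container that groups descriptors and nested schemes, then serializes the result to XML as a stand-in for the textual form. The class names and element names (`Descriptor`, `DescriptionScheme`, `AudioSegment`, `Title`, `Power`) are assumptions made for illustration; they are not the element names defined by the MPEG-7 DDL.

```python
from dataclasses import dataclass, field
from typing import List, Union
import xml.etree.ElementTree as ET


@dataclass
class Descriptor:
    """One feature (metadata element): a name plus its value. Hypothetical."""
    name: str
    value: Union[float, str]


@dataclass
class DescriptionScheme:
    """Groups descriptors and nested schemes, expressing how they relate."""
    name: str
    descriptors: List[Descriptor] = field(default_factory=list)
    children: List["DescriptionScheme"] = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        # Each descriptor becomes a child element holding its value as text.
        node = ET.Element(self.name)
        for d in self.descriptors:
            ET.SubElement(node, d.name).text = str(d.value)
        # Nested schemes are serialized recursively.
        for child in self.children:
            node.append(child.to_xml())
        return node


# Illustrative names only; not taken from the MPEG-7 schema.
segment = DescriptionScheme(
    "AudioSegment",
    descriptors=[Descriptor("Title", "Interview"), Descriptor("Power", 0.25)],
)
description = DescriptionScheme("Description", children=[segment])
xml_text = ET.tostring(description.to_xml(), encoding="unicode")
print(xml_text)
```

In the standard itself the permitted structure is fixed by schemas written in the DDL; here the Python classes play that role only informally.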

Page 11: MPEG-7 Audio Overview

Main Elements of MPEG-7

Salembier and Avaro (2001)

Page 12: MPEG-7 Audio Overview

Description Tools

- Creation and production processes: (director, title)
- Usage: (broadcast schedule)
- Storage features
- Structural information: (spatial-temporal components)
- Segmentations
- Low-level features: (sound timbres, melody description)
- Conceptual information: (objects and events, interactions)
- Navigation and access: (summaries, variations)
- Collections of objects
- User-content interactions: (user preferences, usage history)

Page 13: MPEG-7 Audio Overview

MPEG-7 Audio

- Audio provides structures, building upon some basic structures from the MDS, for describing audio content.
- Low-level features
  - audio features that cut across many applications
- High-level features and tools
  - more specific to a set of applications

Page 14: MPEG-7 Audio Overview

Low-level Features

- Two low-level descriptor types (for sample and segment)
  - Scalar: (e.g. power or fundamental frequency)
  - Vector: (e.g. spectra)
- Hierarchical, consistent interface
  - Any descriptor inheriting from these types can be instantiated, describing a segment with a single summary value or a series of sampled values, as the application requires.
- Scalable series (hierarchical re-sampling)
  - Progressively down-sample the data contained in a series (application-oriented)
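
The scalar series and scalable series ideas on this slide can be sketched in Python under some stated assumptions: a scalar descriptor is sampled once per frame (here, mean power), and a scalable series is built by repeatedly halving the series, replacing each pair of values by its mean until one summary value remains. The function names, the frame length, and the choice of the mean as the summary operation are illustrative assumptions, not the normative MPEG-7 definitions.

```python
from typing import List


def frame_power(samples: List[float], frame_len: int) -> List[float]:
    """Scalar series: mean squared amplitude of each full frame."""
    n_frames = len(samples) // frame_len
    return [
        sum(x * x for x in samples[i * frame_len:(i + 1) * frame_len]) / frame_len
        for i in range(n_frames)
    ]


def downsample_by_two(series: List[float]) -> List[float]:
    """Replace each pair of values by its mean; a trailing odd value is kept."""
    out = [(series[i] + series[i + 1]) / 2 for i in range(0, len(series) - 1, 2)]
    if len(series) % 2:
        out.append(series[-1])
    return out


def scalable_series(series: List[float]) -> List[List[float]]:
    """Progressively coarser copies of the series, from full detail to one value."""
    levels = [series]
    while len(levels[-1]) > 1:
        levels.append(downsample_by_two(levels[-1]))
    return levels


# Toy signal (an assumption for the example): four frames of two samples each.
signal = [1.0, -1.0, 2.0, -2.0, 0.0, 0.0, 3.0, -3.0]
power = frame_power(signal, frame_len=2)
levels = scalable_series(power)
for level in levels:
    print(level)
```

An application that only needs a coarse summary of a segment can read the last level, while one that needs the detail reads the first; this is the sense in which the series is scalable.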

Page 15: MPEG-7 Audio Overview

Low-level Features

Salembier and Avaro (2001)

Page 16: MPEG-7 Audio Overview

High-level Features

- Exchange some generality for descriptive richness:
  - a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
- Audio Signature (DS)
- Musical Instrument Timbre
- Melody
- General Sound Recognition and Indexing
- Spoken Content

Page 17: MPEG-7 Audio Overview

Recent Development

- New audio description tools specified (MPEG-7 version 2):
  - Audio signal quality
  - Audio tempo
  - Chord pattern
  - Rhythm pattern
  - Multi-channel

Page 18: MPEG-7 Audio Overview

References

Chang, S., T. Sikora, and A. Puri. 2001. Overview of MPEG-7 Standard. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 688-95.

Martinez, J. 2004. MPEG-7 Overview. http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm

Quackenbush, S. and A. Lindsay. 2001. Overview of MPEG-7 audio. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 725-9.

Salembier, P., and O. Avaro. 2000. MPEG-7: Multimedia Content Description Interface. http://gps-tsc.upc.es/imatge/_Philippe/demo/MPEG21_MPEG7.pdf