MPEG-7 Audio Overview
Ichiro Fujinaga
MUMT 611
McGill University
MUMT611 Fujinaga 2 / 18

Content
- MPEG-7 overview
  - Objectives and scope
  - Main elements and organization
- MPEG-7 audio
  - Low-level features
  - High-level features and tools

Introduction
- (formally) Multimedia Content Description Interface
- MPEG-1, 2, 4: Content coding and representation
- MPEG-7: Metadata (1998-2001)
  - standardized descriptions and description schemes of structures and content of multimedia
  - a language to specify such descriptions and description schemes
- Interoperable interface that defines syntax and semantics
- Modalities: audio, visual, or multimedia
- Aspects: media, meta, structural, or semantic
- Applications: searching, filtering, navigation

Scope
- The goal is to provide interoperability among multimedia applications in
  - Generation
  - Management
  - Distribution
  - Consumption

Application domains
- Broadcast media selection (radio channel, TV channel)
- Digital libraries (film, video, audio and radio archives)
- E-Commerce (personalized advertising)
- Education (repositories of multimedia courses, multimedia search for support material)
- Home Entertainment (management of personal multimedia collections, including manipulation of content, e.g. karaoke)
- Journalism (searching speeches of a certain politician using his name, his voice or his face)
- Multimedia directory services (yellow pages)
- Surveillance and remote sensing

Components (XML)
- MPEG-7 Systems
- MPEG-7 Description Definition Language
- MPEG-7 Visual
- MPEG-7 Audio
- MPEG-7 Multimedia Description Schemes
- Reference Software: the eXperimentation Model (test)
- MPEG-7 Conformance (syntax checking)
- MPEG-7 Extraction and use of descriptions (technical report)

Other Standards
- SMPTE
- EBU
- TV-Anytime
- DIG-35
- Dublin Core
- OCLC/RLG

MPEG-7 Objectives
- Information about the content
  - Form: e.g. the coding format used
  - Conditions for accessing the material: intellectual property rights / price
  - Classification: e.g. parental rating
  - Links to other relevant materials
  - Context: e.g. “Olympic Games 1996, final of 200 meter hurdles, men”
- Information present in the content
  - Combination of low-level and high-level descriptors

Where do the descriptions come from?
- Preservation of existing descriptive data through the production/delivery
- Generated automatically by capture devices (e.g. time or GPS location in a camera)
- Extracted automatically and semi-automatically
- Manually produced (e.g. for legacy material such as existing film archives)

Main Elements of MPEG-7
- Description Tools: (textual / binary)
  - Descriptors (D): define the syntax and the semantics of each feature (metadata element)
  - Description Schemes (DS): relationships between components
- Description Definition Language (DDL):
  - Define the syntax of the MPEG-7 Description Tools
  - Creation, extension, and modification of DSs
- System tools:
  - Storage and transmission, synchronization of descriptions with content, multiplexing of descriptions, etc.
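
As a rough illustration of how Descriptors and Description Schemes relate, the Python sketch below models a Descriptor as a single named feature and a Description Scheme as a container that groups Descriptors and other schemes, then serializes the result to XML (the textual form). The class names, element names, and example values are hypothetical and are not taken from the MPEG-7 schema; a real description would follow the types defined through the DDL.

```python
from dataclasses import dataclass, field
from typing import List, Union
import xml.etree.ElementTree as ET


@dataclass
class Descriptor:
    """One feature (metadata element): a name plus its value."""
    name: str
    value: str

    def to_xml(self) -> ET.Element:
        elem = ET.Element(self.name)
        elem.text = self.value
        return elem


@dataclass
class DescriptionScheme:
    """Groups Descriptors and nested schemes, expressing how they relate."""
    name: str
    children: List[Union["Descriptor", "DescriptionScheme"]] = field(default_factory=list)

    def to_xml(self) -> ET.Element:
        elem = ET.Element(self.name)
        for child in self.children:
            elem.append(child.to_xml())
        return elem


def build_example() -> str:
    # Hypothetical element names and values, chosen only for illustration.
    creation = DescriptionScheme("Creation", [
        Descriptor("Title", "Example recording"),
        Descriptor("Director", "A. N. Other"),
    ])
    usage = DescriptionScheme("Usage", [Descriptor("BroadcastSchedule", "unknown")])
    root = DescriptionScheme("Description", [creation, usage])
    return ET.tostring(root.to_xml(), encoding="unicode")


if __name__ == "__main__":
    print(build_example())
```

The nesting is the point of the sketch: a scheme says which features belong together, while each Descriptor carries one value.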

Main Elements of MPEG-7
[Figure: Salembier and Avaro (2001)]

Description Tools
- Creation and production processes: (director, title)
- Usage: (broadcast schedule)
- Storage features
- Structural information: (spatial-temporal components)
- Segmentations
- Low-level features: (sound timbres, melody description)
- Conceptual information: (objects and events, interactions)
- Navigation and access: (summaries, variations)
- Collections of objects
- User-content interactions: (user preferences, usage history)

MPEG-7 Audio
- Audio provides structures, building upon some basic structures from the MDS (Multimedia Description Schemes), for describing audio content.
- Low-level features
  - audio features that cut across many applications
- High-level features and tools
  - more specific to a set of applications

Low-level Features
- Two low-level descriptor types (for sample and segment)
  - Scalar: (e.g. power or fundamental frequency)
  - Vector: (e.g. spectra)
- Hierarchical, consistent interface
  - Any descriptor inheriting from these types can be instantiated, describing a segment with a single summary value or a series of sampled values, as the application requires.
- Scalable series (hierarchical re-sampling)
  - Progressively down-sample the data contained in a series (application-oriented)
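
The idea of a series that can be progressively down-sampled to a single summary value can be sketched in Python as follows. This is a minimal illustration, assuming that each step replaces groups of `ratio` consecutive samples with their mean; the function names, the sample values, and the choice of the mean as the summary are assumptions of this sketch, not definitions from the standard.

```python
from statistics import fmean
from typing import List, Sequence


def downsample(series: Sequence[float], ratio: int) -> List[float]:
    """Replace each group of `ratio` consecutive samples with its mean.

    A final group with fewer than `ratio` samples is averaged as well.
    """
    if ratio < 1:
        raise ValueError("ratio must be at least 1")
    return [fmean(series[i:i + ratio]) for i in range(0, len(series), ratio)]


def scalable_series(series: Sequence[float], ratio: int = 2) -> List[List[float]]:
    """Return successively coarser versions of `series`.

    The first level is the original series; for a non-empty input the
    last level holds a single summary value.
    """
    if ratio < 2:
        raise ValueError("ratio must be at least 2 to make progress")
    levels = [list(series)]
    while len(levels[-1]) > 1:
        levels.append(downsample(levels[-1], ratio))
    return levels


if __name__ == "__main__":
    # Hypothetical per-frame power values for one audio segment.
    power = [1.0, 3.0, 2.0, 6.0, 4.0, 8.0, 5.0, 7.0]
    for level in scalable_series(power):
        print(level)
```

Each level reduces the resolution by the chosen ratio, so an application can keep only as much detail as it needs. The same approach would apply to a vector descriptor by averaging each component separately.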

Low-level Features
[Figure: Salembier and Avaro (2001)]

High-level Features
- Exchange some generality for descriptive richness:
  - a smaller set of audio features (as compared to visual features) that may canonically represent a sound without domain-specific knowledge.
- Audio Signature (DS)
- Musical Instrument Timbre
- Melody
- General Sound Recognition and Indexing
- Spoken Content

Recent Development
- New audio description tools specified (MPEG-7 version 2):
  - Audio signal quality
  - Audio tempo
  - Chord pattern
  - Rhythm pattern
  - Multi-channel

References
Chang, S., T. Sikora, and A. Puri. 2001. Overview of MPEG-7 Standard. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 688-95.
Martinez, J. 2004. MPEG-7 Overview. http://www.chiariglione.org/mpeg/standards/mpeg-7/mpeg-7.htm
Quackenbush, S. and A. Lindsay. 2001. Overview of MPEG-7 audio. IEEE Transactions on Circuits and Systems for Video Technology 11 (6): 725-9.
Salembier, P., and O. Avaro. 2000. MPEG-7: Multimedia Content Description interface. http://gps-tsc.upc.es/imatge/_Philippe/demo/MPEG21_MPEG7.pdf