Internet Streaming Media Metadata Interchange
with MPEG-7
Eric Rehm
CTO, singingfish.com
Thomson multimedia
4 May 2001, Hong Kong
Overview
• Brief look at Singingfish
• Indexing Internet streaming media
• Automating metadata delivery and processing
• Case Study: Using XSL to transform MSNBC schema to MPEG-7
singingfish.com
• Wholly-owned subsidiary of Thomson Multimedia
• B2B Streaming Media Search Service
• Pay per query business model
• Over 15 M streams indexed
• Live with customers since Jan 2000
– InfoSpace: Metacrawler, Dogpile
– Inside Internet AG: Swiss-Search, Austria-Search
• Involved with MPEG-7 standards development since Sept 1999
Service Model
Indexing Streaming Media
• High quality metadata improves relevancy of multimedia search results
• Crawl, or work directly with multimedia “Content Producers” to acquire quality metadata
• Solution: Implement FTP push/pull of metadata
– Automated processing upon FTP close
– Support bulk or incremental operations: add, update, delete, reset
– Future: SOAP or other W3C XML protocol
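The bulk and incremental operations listed above can be sketched as a small dispatcher over an in-memory index. This is only an illustration: the record shape (stream URL mapped to a metadata dictionary), the function name `apply_operations`, and the exact semantics of each operation (in particular that `reset` clears everything previously delivered) are assumptions, not details taken from the Singingfish system.

```python
from typing import Dict, Iterable, Tuple

# Hypothetical record shape: stream URL -> metadata dictionary.
Index = Dict[str, dict]


def apply_operations(index: Index, operations: Iterable[Tuple[str, str, dict]]) -> Index:
    """Apply (operation, url, metadata) tuples to the index, in order."""
    for op, url, metadata in operations:
        if op == "add":
            # Assumed semantics: an existing entry is left untouched.
            index.setdefault(url, dict(metadata))
        elif op == "update":
            # Assumed semantics: only known entries are updated.
            if url in index:
                index[url].update(metadata)
        elif op == "delete":
            index.pop(url, None)
        elif op == "reset":
            # Assumed semantics: drop everything previously delivered.
            index.clear()
        else:
            raise ValueError(f"unknown operation: {op}")
    return index


if __name__ == "__main__":
    feed = [
        ("add", "u1", {"title": "a"}),
        ("update", "u1", {"title": "b"}),
        ("add", "u2", {"title": "c"}),
        ("delete", "u2", {}),
    ]
    print(apply_operations({}, feed))
```

Processing the operations strictly in delivery order keeps an incremental feed and a bulk reload interchangeable: a bulk load is a `reset` followed by a run of `add` operations.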
Design
[Diagram: Content Producer Program Metadata Engine. Components shown: Content Producer XML Feed, Java FTP Servlet, Workflow RDBMS, Scheduler, XSL Engine, Importer, Search Index, Singingfish Search Engine (Query / Response). Connector labels: Promotion, Promotion FTP, Promotion JDBC.]
Development Goals
• Single metadata schema interface to a database
– Control development costs
– Partition engineering and content development
• Adapt to any “content partner” metadata
– XML, CSV, Excel, Virage VDF, ….
– Transform “content partner” metadata to MPEG-7 via:
• Custom applications (CSV, Excel) → MPEG-7
• Proprietary XML schemas → XSL → MPEG-7
Case Study
Create XSL transformation
• From: MSNBC "Partner XML Format"
• To: MPEG-7 Description
Experimental Results
File                         Lines   Chars   Elements   Attrs
MSNBC Partner XML Example       73    1199         58      16
MPEG-7 Result                  263    4471        151      74
• XSL Stylesheet: 370 lines of lightly commented code
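Size metrics of this kind (lines, characters, elements, attributes) can be gathered with a few lines of code. The sketch below uses Python's standard `xml.etree.ElementTree`; the function name `xml_stats` and the short sample document are illustrative, and the counting rules used for the slide's figures are not stated, so results from this sketch need not match them exactly.

```python
import xml.etree.ElementTree as ET


def xml_stats(text: str) -> dict:
    """Count lines, characters, elements, and attributes in an XML string."""
    root = ET.fromstring(text)
    elements = list(root.iter())  # includes the root element itself
    return {
        "lines": len(text.splitlines()),
        "chars": len(text),
        "elements": len(elements),
        "attrs": sum(len(element.attrib) for element in elements),
    }


# Abbreviated sample, loosely modeled on the MSNBC <article> entry.
sample = (
    '<article storyorder="12" topnews="12">\n'
    "  <headline>Peace hopes slip farther</headline>\n"
    "</article>"
)

if __name__ == "__main__":
    print(xml_stats(sample))
```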
Discussion
• Basic MPEG-7 Tools
• Semantic Encoding of MSNBC Keywords into MPEG-7 Structured Annotation DS (Who, What, Where, When, Why, How)
• Encoding Controlled Terms using namespaces
• Encoding Streaming Media Validity with the Availability DS
• Extending an MPEG-7 DS
MSNBC Video Distribution Entry
tdy_fletcher_mideast_001023
Keywords: Israel, palestinian, Yasser Arafat
Top News Order: 12
Peace hopes slip farther
The slim hopes for peace in the Mideast are rapidly fading, NBC’s Martin Fletcher reports Monday from the outskirts of Jerusalem.
MSNBC <article>
<article storyorder="12" pubdate="10/23/2000 8:02:00 AM" source="Today show" topnews="12">
  <filename>tdy_fletcher_mideast_001023</filename>
  <duration>00:01:09</duration>
  <headline>Peace hopes slip farther</headline>
  <description>The slim hopes for peace in the Mideast are rapidly fading, NBC&#146;s Martin Fletcher reports Monday from the outskirts of Jerusalem.</description>
  <keywords>Israel, palestinian, Yasser Arafat</keywords>
  ...
</article>
MPEG-7 link to stream
<MediaInformation>
  <MediaProfile>
    <MediaInstance>
      <MediaLocator>
        <MediaUri>http://www.msnbc.com/news/asx/video/28/tdy_fletcher_mideast_001023.asx</MediaUri>
      </MediaLocator>
    </MediaInstance>
  </MediaProfile>
</MediaInformation>
<headline>
<headline>Peace hopes slip farther</headline>
<CreationInformation>
  <Creation>
    <Title><xsl:value-of select="headline"/></Title>
  </Creation>
</CreationInformation>
<description>, <keywords>
<description>The slim hopes for peace in the Mideast ...</description>
<keywords>Israel, palestinian, Yasser Arafat</keywords>
<Abstract>
  <FreeTextAnnotation><xsl:value-of select="description"/></FreeTextAnnotation>
  <KeywordAnnotation>
    <Keyword>Israel</Keyword>
    <Keyword>Yasser Arafat</Keyword>
  </KeywordAnnotation>
</Abstract>
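Splitting the comma-separated <keywords> string into individual <Keyword> elements is the one step here that is not a direct copy. The sketch below shows that step in Python with the standard `xml.etree.ElementTree`, as a stand-in for whatever the stylesheet does; the function name `keyword_annotation` is illustrative, namespaces are omitted for brevity, and the sketch emits one <Keyword> per comma-separated entry.

```python
import xml.etree.ElementTree as ET


def keyword_annotation(keywords_text: str) -> ET.Element:
    """Build a KeywordAnnotation element from a comma-separated keyword string."""
    annotation = ET.Element("KeywordAnnotation")
    for raw in keywords_text.split(","):
        keyword = raw.strip()
        if keyword:  # skip empty entries such as a trailing comma
            ET.SubElement(annotation, "Keyword").text = keyword
    return annotation


if __name__ == "__main__":
    result = keyword_annotation("Israel, palestinian, Yasser Arafat")
    print(ET.tostring(result, encoding="unicode"))
```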
Enhanced <keywords>
<keywords>Israel, palestinian, Yasser Arafat</keywords>
<Abstract>
  <Who>
    <Name>Yasser Arafat</Name>
  </Who>
  <WhatObject>
    <Name>palestinian</Name>
  </WhatObject>
  <Where>
    <Name>Israel</Name>
  </Where>
</Abstract>
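The slides do not say how each keyword is assigned to Who, WhatObject, or Where. One simple possibility, sketched below purely as an assumption, is a lookup table from known keywords to roles; the table `ROLE_BY_KEYWORD`, the emission order `ROLE_ORDER`, and the function name `structured_abstract` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical lookup table: keyword -> structured-annotation role.
ROLE_BY_KEYWORD = {
    "Yasser Arafat": "Who",
    "palestinian": "WhatObject",
    "Israel": "Where",
}

# Assumed emission order, matching the order shown on the slide.
ROLE_ORDER = ("Who", "WhatObject", "Where")


def structured_abstract(keywords_text: str, roles=ROLE_BY_KEYWORD) -> ET.Element:
    """Group comma-separated keywords under role elements inside <Abstract>."""
    keywords = [k.strip() for k in keywords_text.split(",") if k.strip()]
    abstract = ET.Element("Abstract")
    for role in ROLE_ORDER:
        for keyword in keywords:
            if roles.get(keyword) == role:
                role_element = ET.SubElement(abstract, role)
                ET.SubElement(role_element, "Name").text = keyword
    return abstract  # keywords without a known role are skipped in this sketch


if __name__ == "__main__":
    result = structured_abstract("Israel, palestinian, Yasser Arafat")
    print(ET.tostring(result, encoding="unicode"))
```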
Encoding Controlled Terms
1. Singingfish.com Genres are described in one namespace (urn:sf:genre).
2. MSNBC Genres are described in another namespace (urn:msnbc:category).
Encoding Controlled Terms
<categories>
  <category id="News">
    <topics>
      <topic>International</topic>
    </topics>
  </category>
</categories>
<Genre href="urn:msnbc:category:{category[1]/@id}">
  <Term type="NT" termId="{category[1]/topics/topic[1]}"/>
</Genre>
<xsl:variable name="sfCategory"
    select="singingfish:mapper.map(string(category[1]/@id))"/>
<Genre href="urn:sf:{$sfCategory}"/>
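The mapping step behind `singingfish:mapper.map` is not shown on the slides. The sketch below illustrates the idea with a plain dictionary: the partner's category is kept under its own URN prefix, and a second term is added under the Singingfish prefix when a mapping exists. The table contents (including the value `genre:news`) and the function name `genre_hrefs` are hypothetical.

```python
# Hypothetical mapping from MSNBC category ids to Singingfish genre terms.
SF_GENRE_BY_MSNBC_CATEGORY = {
    "News": "genre:news",
}


def genre_hrefs(category_id: str, mapping=SF_GENRE_BY_MSNBC_CATEGORY) -> list:
    """Return the Genre href values to emit for one partner category."""
    # The partner's own term is always preserved in the partner namespace.
    hrefs = [f"urn:msnbc:category:{category_id}"]
    mapped = mapping.get(category_id)
    if mapped is not None:
        # Only emit a Singingfish term when the category is known.
        hrefs.append(f"urn:sf:{mapped}")
    return hrefs


if __name__ == "__main__":
    print(genre_hrefs("News"))
    print(genre_hrefs("Sports"))
```

Keeping both terms means a consumer that understands only one of the two vocabularies can still classify the clip.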
Extending an MPEG-7 DS
[UML diagram]
mpeg7:UsageInformationType
  +Rights : mpeg7:RightsType
  +FinancialResults : mpeg7:FinancialType
  +Availability : mpeg7:AvailabilityType
  +UsageRecord : UsageRecordType
sf:UsageInformationType (extends mpeg7:UsageInformationType; holds 0..1 sf:PublicationType)
sf:PublicationType
  +Publisher : mpeg7:AgentType
  +PublicationLocation : mpeg7:LocationType
  +Publication : mpeg7:TimeType
  +Rights : mpeg7:RightsType
<complexType name="PublicationType">
  <complexContent>
    <extension base="mpeg7:DSType">
      <sequence>
        <element name="Publisher" type="mpeg7:AgentType" minOccurs="0"/>
        <element name="PublicationLocation" type="mpeg7:PlaceType" minOccurs="0"/>
        <element name="PublicationDate" type="mpeg7:TimeType" minOccurs="0"/>
        <element name="Rights" type="mpeg7:RightsType" minOccurs="0"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>
Extending an MPEG-7 DS
<complexType name="UsageInformationType">
  <complexContent>
    <extension base="mpeg7:UsageInformationType">
      <sequence>
        <element name="Publication" type="sf:PublicationType" minOccurs="0"/>
      </sequence>
    </extension>
  </complexContent>
</complexType>
Extending an MPEG-7 DS
<UsageInformation xsi:type="sf:UsageInformationType">
  ...
  <Publication>
    <Publisher xsi:type="mpeg7:OrganizationType">
      <NameTerm href="urn:sf:publisher:MSNBC"/>
    </Publisher>
    <PublicationLocation>
      <Country>us</Country>
      <Region>wa</Region>
    </PublicationLocation>
    <PublicationDate>
      <TimePoint>2000-10-23T14:20:00</TimePoint>
    </PublicationDate>
  </Publication>
</UsageInformation>
Summary
• Quality search depends on quality metadata
– MPEG-7 standards ease development costs
– Controlled vocabularies
• MPEG-7 MDS can be used to interoperate
• XML Schema allows controlled extensions
Thank you
singingfish.com
Optional MPEG-7 Background Slides
MPEG-7 Basics
• ISO/IEC 15938 Multimedia Content Description Interface
• Comprehensive set of audiovisual description tools
• Enabled by key Internet standards:
– W3C: XML, XML Schema
– IETF standards: URI, URN, URL for resource naming and location
• Harmonized with other emerging metadata standards:
– Dublin Core, MPEG-21, NewsML, SMPTE Metadata Dictionary, TV-Anytime, and more.
• Text and compressed binary encodings
– Both encodings have streaming add, delete, update features for delivery over real-time transports: MPEG-2, MPEG-4, IP, etc.
• International Standard in October 2001
– Ballot period begins 14 March 2001
Basic elements
[Diagram: the Basic elements layer of the MPEG-7 description tools]
• Boxes: Schema tools; Datatype & structures; Link & media localization; Basic DSs
• Callout: Time, Duration, Media locators
• Callout: Textual Annotation (free text, structured annotation, syntactic dependency, etc.), Controlled vocabularies, Agent, Place, Graph, etc.
Content Management & Description
[Diagram: Content management and Content description layers on top of the Basic elements]
• Content management: Creation & production; Media; Usage
• Content description: Structural aspects; Conceptual aspects
• Creation & production: Title, Creator, Creation location & date, Purpose, Classification, Genre, etc. (Author generated)
• Media: Format, Coding, Instances, Identification, Transcoding Hint, etc. (Several instances)
• Usage: Rights holder, Access rights, Usage Record, Financial aspects, etc. (Evolution)
• Structural aspects: viewpoint of the structure (segments); spatial / temporal structure; audio, video low-level Ds; elementary semantic information
Content Management & Description (Conceptual aspects)
[Diagram: same layers as the previous slide, highlighting Conceptual aspects]
• Conceptual aspects: viewpoint of conceptual notions; events, objects, abstract concepts, and their relation
Navigation and Access
[Diagram: Navigation & Access (Summary, Variation) added alongside Content description and Content management]
• Summary: efficient support of discovery, browsing, navigation, visualization / sonification
Navigation and Access
[Diagram: same layers as the previous slide, highlighting Variation]
• Variation: substitution of the original content; adaptation to terminal, network, or user preferences
Content Organization
[Diagram: Content organization (Collection & Classification, Model) added above the earlier layers]
• Collection & Classification: description and organization of collection of documents
• Probability Model: statistical functions and structures to describe sample of AV content and classes of descriptors
• Analytic model: definition of cluster, classes and models to associate a semantic label to a set of data
User Interaction
[Diagram: User Interaction (User preferences, Usage History) added beside the earlier layers]
• User preferences: user identification and preferences; filtering, search and browsing
MPEG-7 DDL
• XML Schema
• Data type extensions
– MIME type, ISO country, region, currency codes
– ISO Character set codes
– Revised time data types to support arbitrary fractional seconds denominator for per-frame positioning
• 2001-05-01T15:23:46N11F30 (11th frame @ 30 FPS)
• Type-centric approach using root abstract types
– Control available global elements
– Allow extension via namespaces and <extension> mechanism
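To make the fractional-seconds notation concrete, the sketch below parses the example string exactly as written on this slide (`N` followed by a count, `F` followed by the number of fractions per second). The pattern is inferred from that single example, so it is an assumption; the syntax in the published standard may differ, and the function name `parse_time_point` is illustrative.

```python
import re
from datetime import datetime
from fractions import Fraction

# Pattern inferred from the slide's single example; an assumption, not the normative syntax.
_TIME_POINT = re.compile(r"^(\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2})N(\d+)F(\d+)$")


def parse_time_point(value: str):
    """Split a time point into whole seconds and an exact fraction of a second."""
    match = _TIME_POINT.match(value)
    if match is None:
        raise ValueError(f"not a recognized time point: {value!r}")
    whole = datetime.strptime(match.group(1), "%Y-%m-%dT%H:%M:%S")
    count = int(match.group(2))
    per_second = int(match.group(3))
    if per_second == 0:
        raise ValueError("fractions per second must be non-zero")
    # Fraction keeps 11/30 exact instead of rounding to a float.
    return whole, Fraction(count, per_second)


if __name__ == "__main__":
    print(parse_time_point("2001-05-01T15:23:46N11F30"))
```

Using an exact rational offset rather than a decimal is the point of the arbitrary denominator: a frame position at 30 frames per second has no finite decimal representation.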
Basic Derivation of MPEG-7 Types
<complexType name="Mpeg7RootType" abstract="true">
  <complexContent>
    <restriction base="anyType"/>
  </complexContent>
</complexType>
<complexType name="DSType" abstract="true">
  <complexContent>
    <extension base="mpeg7:Mpeg7RootType">
      <sequence>
        <element name="Header" type="mpeg7:HeaderType" minOccurs="0" maxOccurs="unbounded"/>
      </sequence>
      <attribute name="id" type="ID" use="optional"/>
    </extension>
  </complexContent>
</complexType>
Creation Description Scheme
<complexType name="CreationType">
  <complexContent>
    <extension base="mpeg7:DSType">
      <sequence>
        <element name="Title" type="mpeg7:TitleType" maxOccurs="unbounded"/>
        …
        <element name="Creator" type="mpeg7:CreatorType" minOccurs="0" maxOccurs="unbounded"/>
        …
      </sequence>
    </extension>
  </complexContent>
</complexType>