A Proposal for a Video Modeling for Composing Multimedia Document
Cécile ROISIN - Tien TRAN_THUONG - Lionel VILLARD
Presented by: Tien TRAN THUONG
Project OPERA - INRIA
Grenoble - France
Work Context
Need: composition of semantic video fragments with other basic media elements (image, text, sound, ...)
Theme: Multimedia Document (Madeus)
  Authoring system for structured multimedia documents
  Basic media: sound, video, text, image, etc.
  Document composed by relations
Temporal Synchronization Example
INRIA’s positions document
Pictures & Titles synchronized with video parts
Video Presentation
Video Frames
Logical organization of document
[Figure: logical tree of the InriaIntroduction document. Node labels: Introduction, Video Presentation, Buildings Overview, Locations of INRIA’s units, Conclusion; images Rocq. Picture and Rhône-Alpes Picture; texts Rocq. Title and Rhône-Alpes Title; video events Rocq. appears, Rennes appears, S.A. appears, Lorraine appears, R.A. appears.]
Time line view of the document
[Figure: timeline. The raw video is cut into video fragments along the Time axis; a Title & Picture pair for each of Rocquencourt, Rennes, Sophia-Antipolis, Lorraine, and Rhône-Alpes is aligned with its fragment; the texts grow in size while displayed.]
Spatial layout of the "text follows video object" document
The video object region is a moving region located inside the video region.
[Figure: Document Region (Left, Top, Width, Height) containing the Video Region (Left, Top, Width, Height) and the Text Region (Width, Height). The Video Object Region {x(t), y(t)} moves inside the Video Region, and the Text Region is attached to it by a Right-Top-Align relation.]
Objective and plan of that work
Research and development on video modeling for describing video content relevant to multimedia applications:
  Video modeling: video description for multimedia composition,
  Multimedia application: our VideoMadeus is an editing and presentation system.
Video Description
Dublin Core: a semantic indexing schema for video content description.
MPEG-7: the future standard, whose tools will make it possible to define semantic schemas for describing audiovisual information.
Scheme of audiovisual applications: Video => Analysis -> Description -> Applications
Our video model targets the composition of multimedia documents.
Methodology
Specification of a model for describing video content: multi-level structuring, temporal and spatial relations, interactive actions on the video elements.
Specification in XML
Experimentation in Madeus (VideoMadeus)
Video Content Description
Multi-level Structuration
[Diagram: a Video is described at three levels: Structure (structural description), Semantic (semantic description), and Thesaurus.]
<!--XML schema for the description of VideoContent-->
<!ELEMENT VideoContent (MetaInfo, MediaInfo, Summary, Structure,
Semantic, Thesaurus)>
<!ELEMENT Structure (Sequence+, Relation?)>
<!ELEMENT Semantic (VideoObject*, EventSemantics*)>
<!ELEMENT Thesaurus (ReferenceDictionary*, UserDictionary*)>
[Figure: a raw video annotated at the three levels. Structure: Occ.1, Occ.2, Occ.3, Occ.4. Semantic: Nabil, Irene. Thesaurus: Researcher.]
Video Structure Description
Motivation: composition relies first of all on the Structure description level.
The Semantic and Thesaurus levels matter more for retrieval applications, or as support for the Structure level.
The first step is therefore the Structure description.
High Level Description
[Diagram: Video -> Video Structure -> Sequences -> Scenes -> Shots]
<!-- XML schema for the description of the high-level structure -->
<!ELEMENT Structure (Sequence+, Relation?)>
<!ELEMENT Sequence (Scene+,Relation?)>
<!ELEMENT Scene (Shot+,Event*, Relation?)>
<!ELEMENT Shot (Transition?,Event*,Occurrence*,
Background?, Relation?)>
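The content models above can be checked mechanically. The following is a minimal sketch, not part of Madeus: it assumes each DTD content model is hand-translated into a regular expression over the sequence of child tag names, and the `check` helper and the two sample trees are hypothetical.

```python
import re
import xml.etree.ElementTree as ET

# Content models copied from the DTD above, rewritten by hand as regular
# expressions over a space-terminated list of child tag names.
CONTENT_MODELS = {
    "Structure": r"(Sequence )+(Relation )?",
    "Sequence": r"(Scene )+(Relation )?",
    "Scene": r"(Shot )+(Event )*(Relation )?",
    "Shot": r"(Transition )?(Event )*(Occurrence )*(Background )?(Relation )?",
}

def check(element):
    """Return the tags of all elements whose children violate their model."""
    errors = []
    model = CONTENT_MODELS.get(element.tag)
    if model is not None:
        children = "".join(child.tag + " " for child in element)
        if re.fullmatch(model, children) is None:
            errors.append(element.tag)
    for child in element:
        errors.extend(check(child))
    return errors

# Hypothetical sample trees: the first follows the DTD, the second has a
# Scene without any Shot, which "Shot+" forbids.
good = ET.fromstring(
    "<Structure><Sequence><Scene><Shot/><Shot><Event/></Shot></Scene>"
    "</Sequence></Structure>"
)
bad = ET.fromstring("<Structure><Sequence><Scene/></Sequence></Structure>")

print(check(good))
print(check(bad))
```

A validating XML parser would do the same job directly from the DTD; the regular-expression form is only meant to make the meaning of `+`, `*`, and `?` in the content models concrete.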
Shot Content Description
[Diagram: Shot content. Elements: Transition, SpatialLayout, Event (reference to the Semantic index), Background, Occurrence, CameraWork.]
<!-- XML Shot Description -->
<!ELEMENT Shot (Transition?,Event*,Occurrence*,
Background?, Relation?) >
<!ELEMENT Transition EMPTY >
<!ELEMENT Event EMPTY>
<!ELEMENT Background (Region+)>
<!ELEMENT Occurrence (Region+, Trajectory?,
Occurrence*) >
<!ELEMENT SpatialLayout (2DBStringDS+) >
<!ELEMENT CameraWork (CameraMotion?) >
Occurrence Content Description
[Diagram: Occurrence content. An Occurrence contains a Trajectory, Regions, and nested Occurrences; a Region contains Contour, Color, Texture, Centroid, and nested Regions.]
<!-- XML Occurrence description -->
<!ELEMENT Occurrence (Trajectory*, Region+, Occurrence*)>
<!ELEMENT Trajectory …>
<!ELEMENT Region (Contour+, Color*, Texture*, Centroid, Region*)>
<!ELEMENT Contour … >
<!ELEMENT Color … >
<!ELEMENT Texture … >
<!ELEMENT Centroid … >
<!ELEMENT Region … >
Model summary
The model focuses on the description of video elements useful for composing a multimedia document (shot, scene, occurrence, event, relation, etc.)
It has an XML specification, which makes it independent and easy to apply to multimedia applications (e.g. our VideoMadeus).
Madeus Architecture
[Architecture diagram. Platform: Java, Xerces, JMF. Tools (editor/presentation): Execution view, Timeline view, Hierarchical view, Structured video view, ... Model management: logical structuring, temporal structuring, spatial structuring, event management. Parsers load the Madeus document into the internal document; Save writes it back.]
To extend Madeus to VideoMadeus, video content description is handled in both the composition and the presentation parts.
Madeus Document Model
A structured document organized along three dimensions: logical, temporal, and spatial.
[Diagram: the logical structure of a Madeus Document has four parts: Content, Actor, Temporal, Spatial.]
<Madeus>
<Content> … </Content>
<Actor> . . . </Actor>
<Temporal> . . . </Temporal>
<Spatial> . . . </Spatial>
</Madeus>
Madeus Document
Content: describes the content information of the document
Actor: defines how the basic information in the Content part is used in the document (style information, links, etc.)
Temporal: synchronization between document parts
Spatial: layout specification
Relations
Temporal relations (Allen extension): meets, starts, equals, during, overlaps, parmin, etc.
Spatial relations: left_align, right_align, center_v, center_h, top_align, bottom_align, etc.
<Temporal> …
<Relations>
<start Interval1="a" Interval2="b" />
<meet Interval1="b" Interval2="d" /> …
</Relations>
</Temporal>
<Spatial> …
<Relations>
<left-align Region1="b" Region2="d" />
…
</Relations>
</Spatial>
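To make the temporal relations concrete, here is a minimal sketch of five of them as predicates over intervals. It uses the classical definitions from Allen's interval algebra; the Madeus extension (including parmin) may define them differently, so the `Interval` type and the function bodies are assumptions. The sample times are the shot and scene boundaries from the InriaInfoco description shown later in the deck.

```python
from typing import NamedTuple

class Interval(NamedTuple):
    start: float  # seconds
    end: float    # seconds

def meets(a, b):
    """a ends exactly when b begins."""
    return a.end == b.start

def starts(a, b):
    """a and b begin together and a ends first."""
    return a.start == b.start and a.end < b.end

def equals(a, b):
    """a and b cover exactly the same span."""
    return a.start == b.start and a.end == b.end

def during(a, b):
    """a lies strictly inside b."""
    return b.start < a.start and a.end < b.end

def overlaps(a, b):
    """a begins first and ends inside b."""
    return a.start < b.start < a.end < b.end

shot_i = Interval(4.91, 8.71)
shot_ii = Interval(8.71, 11.09)
scene2 = Interval(4.91, 11.09)

print(meets(shot_i, shot_ii))   # the two shots are consecutive
print(starts(shot_i, scene2))   # the first shot opens the scene
print(during(shot_ii, scene2))  # False: the shot ends with the scene
```

In an authoring system the relations are constraints that a scheduler must satisfy rather than tests on known times; the predicates only show what each constraint requires of the final schedule.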
Overview of VideoMadeus
[Architecture diagram. Video editing view: Structure view, Semantic view, Thesaurus view; element management (edit, play, search). Execution view: temporal and spatial synchronization management; behavior management (hyperlink, follow-up, erase, display, etc.). Data management: the internal structure (MODEL) is built by a parser from the XML description of the video content and from the index on the video; the editing and presentation tools request descriptions from it and send back modified descriptions.]
VideoMadeus document
<Madeus>
<Content> . . .
<VideoContentDS> . . .
<Scene ID="MyScene" ... > . . . </Scene>
</VideoContentDS>
</Content>
<Actor> . . .
<VideoElement ID="SceneVideo" Content="MyScene" . . . > . . .</VideoElement>
</Actor>
<Temporal>
<Interval ID="ScenceInt" Actor="SceneVideo" Duration="..." … />
<Relations> . . . </Relations>
</Temporal>
<Spatial>
<Region ID="ScenceReg" Actor="SceneVideo" Height="288" Width="352" … />
<Relations> . . . </Relations>
</Spatial>
</Madeus>
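The four parts are tied together by ID references: an Interval or Region names an Actor, and the Actor names a described video element in Content. The sketch below follows that chain with Python's standard `xml.etree.ElementTree`. The document string is a cut-down copy of the example above with the elided attributes dropped (IDs are kept exactly as they appear there), and the `resolve` helper is hypothetical, not a Madeus API.

```python
import xml.etree.ElementTree as ET

# Cut-down copy of the VideoMadeus example; elided parts are simply omitted.
DOC = """
<Madeus>
  <Content>
    <VideoContentDS>
      <Scene ID="MyScene"/>
    </VideoContentDS>
  </Content>
  <Actor>
    <VideoElement ID="SceneVideo" Content="MyScene"/>
  </Actor>
  <Temporal>
    <Interval ID="ScenceInt" Actor="SceneVideo"/>
  </Temporal>
  <Spatial>
    <Region ID="ScenceReg" Actor="SceneVideo" Height="288" Width="352"/>
  </Spatial>
</Madeus>
"""

def resolve(root, part, item_id):
    """Follow item -> Actor -> Content and return the IDs along the chain."""
    item = root.find(f"./{part}/*[@ID='{item_id}']")
    actor = root.find(f"./Actor/*[@ID='{item.get('Actor')}']")
    content = root.find(f"./Content//*[@ID='{actor.get('Content')}']")
    return item.get("ID"), actor.get("ID"), content.tag, content.get("ID")

root = ET.fromstring(DOC)
print(resolve(root, "Temporal", "ScenceInt"))
print(resolve(root, "Spatial", "ScenceReg"))
```

Both the temporal interval and the spatial region resolve to the same `Scene`, which is what lets one described video fragment be scheduled and laid out independently.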
Editing features
Editing of the video description
  shot detection (automatic or manual)
  manual extraction of video objects, events, spatialLayout, etc.
Creation of semantic groups (manual)
  group shots into a scene, group scenes into a sequence
  detect the occurrences of a character (group occurrences into objects)
  create other semantic indexes
  classify the video elements (thesaurus)
Scenario editing (composing)
  set temporal and spatial relations between video elements and other media
  set actions on the video elements
Conclusion
Provide support for deeper access into video data in the multimedia authoring system:
  temporal/spatial synchronization with the other media elements (image, text, sound, etc.),
  actions on the video elements (hyperlink, follow-up, erasing, etc.)
Experimental development of the video editing view, which helps the user create and modify descriptions of video data in accordance with our video model.
Perspectives
More experimentation with spatial synchronization,
Extension and experimentation of the semantic parts (Semantic and Thesaurus) -> semantic queries,
Use the MPEG-7 tools to specify our video model,
Develop the video content description editing tool:
  integration and adaptation of video analysis algorithms to generate the video elements as automatically as possible,
  timeline editing view for video structure, etc.
Semantic queries for playing a part of a video over the network.
Video content description in Madeus document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
  <Content>
    <VideoContent ID="InriaInfoco" … >
      <Structure ID="InriaInfocoStruc" … >
        <Sequence ID="Seq" Start_Time="0" Stop_Time="76.69" … >
          <Scene ID="Scene1" Start_Time="0" Stop_Time="4.91" … > </Scene>
          <Scene ID="Scene2" Start_Time="4.91" Stop_Time="11.09" … >
            <Shot ID="Shoti" Start_Time="4.91" Stop_Time="8.71" … />
            <Shot ID="Shotii" Start_Time="8.71" Stop_Time="11.09" … />
          </Scene>
          <Scene ID="Scene3" Start_Time="11.09" Stop_Time="29.07" … > … </Scene>
          …
        </Sequence>
      </Structure>
      …
    </VideoContent>
    <VideoContent ID="InriaGen" … > … </VideoContent>
    …
  </Content>
  …
</Madeus>
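In this description every scene and shot carries Start_Time and Stop_Time, and each child fits inside its parent. A minimal sketch of that consistency check follows; it assumes times are in seconds, keeps only the elements and attributes visible in the slide (elided parts are dropped), and `find_problems` is a hypothetical helper, not part of VideoMadeus.

```python
import xml.etree.ElementTree as ET

# Only the elements and attributes visible in the slide; "…" parts are dropped.
STRUCTURE = """
<Sequence ID="Seq" Start_Time="0" Stop_Time="76.69">
  <Scene ID="Scene1" Start_Time="0" Stop_Time="4.91"/>
  <Scene ID="Scene2" Start_Time="4.91" Stop_Time="11.09">
    <Shot ID="Shoti" Start_Time="4.91" Stop_Time="8.71"/>
    <Shot ID="Shotii" Start_Time="8.71" Stop_Time="11.09"/>
  </Scene>
  <Scene ID="Scene3" Start_Time="11.09" Stop_Time="29.07"/>
</Sequence>
"""

def span(element):
    """Return (start, stop) of an element as floats."""
    return float(element.get("Start_Time")), float(element.get("Stop_Time"))

def find_problems(parent):
    """List children that leave their parent's span or overlap a sibling."""
    problems = []
    parent_start, parent_stop = span(parent)
    previous_stop = None
    for child in parent:
        child_start, child_stop = span(child)
        if child_start < parent_start or child_stop > parent_stop:
            problems.append(f"{child.get('ID')} is outside {parent.get('ID')}")
        if previous_stop is not None and child_start < previous_stop:
            problems.append(f"{child.get('ID')} overlaps its predecessor")
        previous_stop = child_stop
        problems.extend(find_problems(child))
    return problems

sequence = ET.fromstring(STRUCTURE)
print(find_problems(sequence))

# Hypothetical broken input: the shot runs past the end of its scene.
broken = ET.fromstring(
    '<Scene ID="S" Start_Time="0" Stop_Time="5">'
    '<Shot ID="A" Start_Time="0" Stop_Time="6"/></Scene>'
)
print(find_problems(broken))
```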
Video element definition
Operations can be defined on an instance of the described video: hyperlink, tracking, erasing, jumping, etc.
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
<Content> . . . </Content>
<Actor>
<VideoElement ID="WesternScene" Content="WesternDS.Seq.Scene1"
TypeRenderer="LightWeight". . . >
<VideoObject ID="VO1" Object="Shot2.ActorOcc1" Actions="Follow-up;Hyperlink;..."
HRef="file:///C:/Users/ttran/Multimedia/Madeus/opera.html" />
. . .
</VideoElement>. . .
</Actor>
<Temporal> . . . </Temporal>
<Spatial> . . . </Spatial>
</Madeus>
Temporal part of Inria introduction document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600"> ...
<Temporal>
<T-Group ID="Temporal" Duration="pref:20s min:15s max:22s">
<!-- Interval of three hypertexts -->
<Interval ID="ControlOperaInterval" Actor="ControlOperaInfo" Duration="pref:20s min:15s max:22s"/>
…
<!-- Interval of the video element -->
<Interval ID="MovieInriaURScene4" Actor ="InriaURScene4" Fill="freeze" Duration="pref:20s min:15s max:22s"/>
<!-- Interval of the texts -->
<Interval ID="txtInriaURScene4Shotii" Actor ="TxtInriaURScene4" Duration="pref:20s min:15s max:22s" />
...
<Interval ID="txtInriaURScene4Shotvi" Actor ="TxtInriaURScene4Shotvi" Duration="pref:20s min:15s max:22s" />
<!-- Interval of the images -->
<Interval ID="ImgInriaURScene4Shotii" Actor ="ImgRoc" Duration="pref:20s min:15s max:22s" />
...
<Interval ID="ImgInriaURScene4Shotvi" Actor ="ImgRA" Duration="pref:20s min:15s max:22s"/>
<Relations>
<!-- Equals relations of the texts with the video elements -->
<Equals Interval1="MovieInriaURScene4.Shotii" Interval2="txtInriaURScene4Shotii.SizeAnimation" />
...
<Equals Interval1="MovieInriaURScene4.Shotvii" Interval2="txtInriaURScene4Shotvii.SizeAnimation" />
<!-- Start relations of the images with the video elements -->
<Starts Interval1="MovieInriaURScene4.Shotii" Interval2="ImgInriaURScene4Shotii" />
...
<Starts Interval1="MovieInriaURScene4.Shotvi" Interval2="ImgInriaURScene4Shotvi" />
</Relations>
</T-Group>
</Temporal>
<Spatial> … </Spatial>
</Madeus>
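Each Interval above carries a Duration of the form `pref:20s min:15s max:22s`. The sketch below parses that string and clamps a candidate duration to the allowed range. The reading of the three fields (a preferred value inside a [min, max] range that a scheduler may stretch or shrink) is an assumption drawn from the attribute names, and both helper functions are hypothetical.

```python
import re

def parse_duration(text):
    """Parse 'pref:20s min:15s max:22s' into a dict of seconds."""
    pairs = re.findall(r"(pref|min|max):(\d+(?:\.\d+)?)s", text)
    return {key: float(value) for key, value in pairs}

def clamp_to_bounds(wanted, bounds):
    """Return the duration closest to 'wanted' that respects min and max."""
    return max(bounds["min"], min(bounds["max"], wanted))

bounds = parse_duration("pref:20s min:15s max:22s")
print(bounds)
print(clamp_to_bounds(25.0, bounds))  # too long, pulled down to max
print(clamp_to_bounds(10.0, bounds))  # too short, pushed up to min
```

Flexible durations of this kind are what allow relations such as Equals and Starts to be satisfied: the scheduler can pick any value in the range that makes the related intervals line up.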
Spatial part of Spatio-Temporal Relation Demo document
<?xml version="1.0"?>
<Madeus Name="DocMadeus" Version="2.0" Width="800" Height="600">
...
<Spatial>
<S-Group ID="TOTOSpatial">
<!-- Video region -->
<Region ID= "WesternVideoRegion" Actor="WesternVideo" Left="206" Top="140" Height="288" Width="352" Depth="1"/>
<!-- Three hypertext regions-->
<Region ID="LinkOperaInfoRegion" Actor ="ControlOperaInfo" Left="236.0" Top="492.0" Width="210.0" Depth="2.0"/>
<Region ID="LinkAutoCitroenRegion" Actor ="ControlAutoCitroen" Left="36.0" Top="492.0" Width="210.0" Depth="2.0"/>
<Region ID="LinkSTRST" Actor ="ControlSpatioTemp" Left="472.0" Top="492.0" Width="236.0" Depth="2.0"/>
<Region ID="TxtOperaIntroRegion" Actor ="TxtOperaIntro" Left="168" Top="46" Height="42" Width="429" Depth="2.0"/>
<!-- Regions of the text following the video object-->
<Region ID="TxtMotionRegion" Actor ="TxtMotion" Height="16.0" Width="69" Depth="2.0"/>
<Relations>
<Top_align Region1="WesternVideoRegion.Shot1.Obj" Region2="TxtMotionRegion" />
</Relations>
</S-Group >
</Spatial>
</Madeus>
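The Top_align relation above ties the text region to a moving video object, so the text's vertical position has to be recomputed as the object moves. The following is a minimal sketch under stated assumptions: the object's position is given relative to the top-left corner of the video region, it is known only at a few key times and linearly interpolated in between, and Top_align means the two top edges coincide. The trajectory values are invented for illustration; only the video region's Top comes from the document.

```python
from bisect import bisect_right

VIDEO_REGION_TOP = 140  # Top of WesternVideoRegion in the document above

# Hypothetical trajectory of the object's top edge: (time in s, y in pixels
# relative to the video region). Not taken from the source.
TRAJECTORY = [(0.0, 40.0), (1.0, 60.0), (2.0, 100.0)]

def object_top(t):
    """Linearly interpolate the object's y position at time t."""
    times = [point[0] for point in TRAJECTORY]
    if t <= times[0]:
        return TRAJECTORY[0][1]
    if t >= times[-1]:
        return TRAJECTORY[-1][1]
    i = bisect_right(times, t)
    (t0, y0), (t1, y1) = TRAJECTORY[i - 1], TRAJECTORY[i]
    return y0 + (y1 - y0) * (t - t0) / (t1 - t0)

def text_top(t):
    """Absolute Top of the text region so that it stays top-aligned."""
    return VIDEO_REGION_TOP + object_top(t)

for t in (0.0, 0.5, 1.0, 5.0):
    print(t, text_top(t))
```

Top_align constrains only the vertical coordinate, which is why TxtMotionRegion is declared with a Height and Width but no Left or Top: its position is left to the relation.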