CS257 Modelling Multimedia Information LECTURE 6



Introduction

• See beginning of Lecture 5…

Queries to Video Databases

• Users may want to query for a particular event involving particular people, e.g. “find me video with Bill hitting Tom” – why not use a list of keywords [hit, Bill, Tom] both for the query and to represent the film content?

Need more structured descriptions of what’s happening (both for queries and for video metadata), i.e. who is doing what to whom with what and why. [More on this in PART 1]

Queries to Video Databases

• User may want to specify a temporal sequence of events, e.g. “find me video where this happens then this happens while that happens”

[More on this in PART 2]

Queries to Video Databases

• How to express queries / How to describe content – can be considered two sides of the same coin; both require dealing with the same kinds of issues

Creating Metadata for Video Data

• Content-descriptive metadata for video often needs to be manually annotated

• However, in some cases the process can be (partially) automated by:
– Video segmentation
– Feature recognition, e.g. to detect faces, explosions, etc.
– Extracting keywords from time-aligned collateral texts, e.g. subtitles and audio description
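The last of these can be sketched in a few lines. The fragment below extracts keywords from SRT-style subtitles so that each keyword is attached to a video interval; the stopword list and the simplified block parsing are illustrative assumptions, not a production subtitle parser.

```python
# Sketch: time-aligned keyword extraction from an SRT-style subtitle
# fragment. Each keyword ends up indexed by its (start, end) timecodes.
import re

STOPWORDS = {"the", "a", "an", "to", "is", "in", "and", "of"}  # illustrative

def subtitle_keywords(srt_text):
    """Map (start, end) timecode pairs to lists of content keywords."""
    index = {}
    # SRT blocks are blank-line separated: number, timecodes, text lines.
    for block in srt_text.strip().split("\n\n"):
        lines = block.splitlines()
        if len(lines) < 3:
            continue
        start, end = lines[1].split(" --> ")
        words = re.findall(r"[a-z']+", " ".join(lines[2:]).lower())
        index[(start, end)] = [w for w in words if w not in STOPWORDS]
    return index

srt = """1
00:00:01,000 --> 00:00:04,000
Bill hits Tom in the warehouse

2
00:00:05,000 --> 00:00:08,000
Tom runs to the car"""

print(subtitle_keywords(srt))
```

Keywords recovered this way are only as good as the collateral text: subtitles describe dialogue, while audio description tends to describe on-screen action, so the two sources complement each other.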

Overview of LECTURE 6

• PART 1: Need to be able to formally describe video content in terms of objects and events in order to make a query to a video database, e.g. specify who is doing what – Subrahmanian’s Video SQL

• PART 2: May wish to specify temporal and/or causal relationships between events, e.g. X happens before Y, A causes B to happen – Allen’s temporal logic; Roth’s system for video browsing by causal links

• LAB – bring coursework questions

PART 1: Querying Video Content

Four kinds of retrieval according to Subrahmanian (1998)

Segment Retrieval: “find all video segments where an exchange of a briefcase took place at John’s house”

Object Retrieval: “find all the people in the video sequence (v,s,e)”

Activity Retrieval: “what was happening in the video sequence (v,s,e)”

Property-based Retrieval: “find all segments where somebody is wearing a blue shirt”

Querying Video Content

• Subrahmanian (1998) proposes an extension to SQL in order to express a user’s information need when querying a video database
– Based on video functions

• Recall that SQL is a database query language for relational databases; queries are expressed in terms of:
SELECT (which attributes)
FROM (which table)
WHERE (these conditions hold)

Subrahmanian’s Video Functions

FindVideoWithObject(o)

FindVideoWithActivity(a)

FindVideoWithActivityandProp(a,p,z)

FindVideoWithObjectandProp(o,p,z)

Subrahmanian’s Video Functions (continued)

FindObjectsInVideo(v,s,e)

FindActivitiesInVideo(v,s,e)

FindActivitiesAndPropsInVideo(v,s,e)

FindObjectsAndPropsInVideo(v,s,e)
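A minimal sketch of how some of these video functions might be evaluated over a table of manual annotations. The annotation schema and the ANNOTATIONS table are illustrative assumptions for this lecture’s examples, not Subrahmanian’s actual implementation.

```python
# Sketch: Subrahmanian-style video functions over a toy annotation table.
# Each entry: (video_id, start, end, objects, activity, properties).
ANNOTATIONS = [
    ("v1", 0, 100, {"Jane Shady", "Denis Dopeman"},
     "ExchangeObject", {"Item": "Briefcase", "Giver": "Jane Shady"}),
    ("v1", 101, 200, {"Denis Dopeman"}, "Drive", {"Vehicle": "Car"}),
]

def FindVideoWithObject(o):
    """All (video, start, end) triples whose segment contains object o."""
    return {(v, s, e) for v, s, e, objs, _, _ in ANNOTATIONS if o in objs}

def FindVideoWithActivityandProp(a, p, z):
    """Segments showing activity a in which property p has value z."""
    return {(v, s, e) for v, s, e, _, act, props in ANNOTATIONS
            if act == a and props.get(p) == z}

def FindObjectsInVideo(v, s, e):
    """All objects annotated in the exact segment (v, s, e)."""
    return set().union(*(objs for vv, ss, ee, objs, _, _ in ANNOTATIONS
                         if (vv, ss, ee) == (v, s, e)))
```

Note the symmetry in the slide’s two function groups: the first four map content descriptions to segments, while the last four map segments back to content descriptions.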

A Query Language for Video

SELECT may contain
Vid_Id : [s,e]

FROM may contain
video : <source>

WHERE condition allows statements like
term IN func_call

(term can be a variable, object, activity or property value;
func_call is a video function)

EXAMPLE 1

“Find all video sequences from the library CrimeVidLib1 that contain Denis Dopeman”

SELECT vid : [s,e]
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman)

EXAMPLE 2

“Find all video sequences from the library CrimeVidLib1 that show Jane Shady giving Denis Dopeman a suitcase”


SELECT vid : [s,e]
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
(vid,s,e) IN FindVideoWithObject(Jane Shady) AND
(vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Item, Briefcase) AND
(vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Giver, Jane Shady) AND
(vid,s,e) IN FindVideoWithActivityandProp(ExchangeObject, Receiver, Denis Dopeman)

EXAMPLE 3

“Which people have been seen with Denis Dopeman in CrimeVidLib1?”


SELECT vid : [s,e], Object
FROM video : CrimeVidLib1
WHERE (vid,s,e) IN FindVideoWithObject(Denis Dopeman) AND
Object IN FindObjectsInVideo(vid,s,e) AND
Object ≠ Denis Dopeman AND
typeof(Object, Person)
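The conjunctive WHERE clause of this query amounts to set operations over the annotations. A self-contained sketch (the segment table, the people set and the function name seen_with are illustrative assumptions):

```python
# Sketch: evaluating the "seen with Denis Dopeman" query over a toy
# annotation table: (video_id, start, end) -> set of annotated objects.
SEGMENTS = {
    ("v1", 0, 100): {"Denis Dopeman", "Jane Shady", "Briefcase"},
    ("v1", 101, 200): {"Denis Dopeman", "Officer Krupke"},
    ("v1", 201, 300): {"Jane Shady"},
}
# Objects o for which typeof(o, Person) holds.
PEOPLE = {"Denis Dopeman", "Jane Shady", "Officer Krupke"}

def seen_with(target):
    """People other than `target` appearing in segments containing `target`."""
    results = set()
    for (v, s, e), objects in SEGMENTS.items():
        if target in objects:                    # FindVideoWithObject
            for obj in objects:                  # FindObjectsInVideo
                if obj != target and obj in PEOPLE:
                    results.add(((v, s, e), obj))
    return results

print(seen_with("Denis Dopeman"))
```

The inequality test is what makes the query return Denis Dopeman’s companions rather than Denis Dopeman himself, and the type condition filters out non-person objects such as the briefcase.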

Exercise 6-1

Given a video database of old sports broadcasts, called SportsVidLib, express the following users’ information needs using the extended SQL as well as possible. You should comment on how well the extended SQL is able to capture each user’s information need and discuss alternative ways of expressing the information need more fully.

• Bob wants to see all the video sequences with Michael Owen kicking a ball

• Tom wants to see all the video sequences in which Vinnie Jones is tackling Paul Gascoigne

• Mary wants to see all the video sequences in which Roy Keane is arguing with the referee, because Jose Reyes punched Gary Neville, while Thierry Henry scores a goal, and then Roy Keane is sent off.


Think about…

What metadata would be required in order to execute these kinds of video query?

How could this be stored and searched most efficiently?

Part 2: Enriching Video Data Models and Queries

• More sophisticated queries to video databases can be supported by considering:
– Temporal relationships between video intervals
– Causal relationships between events

Need to be able to describe temporal relationships between intervals formally and make inferences about temporal sequences…

Temporal Relationships between Intervals

• Allen’s (1983) work on temporal logic is often discussed in the video database literature (and in other computing disciplines)

• Allen defines 13 temporal relationships that can hold between temporal intervals (e.g. intervals or events in video); these can be used to formulate video queries

• A transitivity table allows a system to infer which relationships can hold between A and C, if A r B and B r C are known (where r stands for a temporal relationship, and A, B, C are intervals)

SEE MODULE WEB-PAGE FOR EXTRA NOTES ON THIS

X equal Y       =       XXXXX
                        YYYYY

X before Y      < >     XXXX    YYYY

X meets Y       m mi    XXXXYYYY

X overlaps Y    o oi    XXXXX
                          YYYYY

X during Y      d di      XXX
                        YYYYYYYYY

X starts Y      s si    XXXX
                        YYYYYYYY

X finishes Y    f fi         XXXXX
                        YYYYYYYYYY

(Each row except “equal” stands for a relationship and its inverse, giving 13 relationships in total.)
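The relationship between two concrete intervals can be computed directly from their endpoints. A sketch, assuming intervals given as (start, end) pairs with start < end; the relation names are spelled out rather than abbreviated:

```python
# Sketch: classify which of Allen's 13 relations holds between
# intervals x = (xs, xe) and y = (ys, ye), assuming xs < xe and ys < ye.
def allen_relation(x, y):
    xs, xe = x
    ys, ye = y
    if xe < ys:
        return "before"                                     # <
    if ye < xs:
        return "after"                                      # >
    if xe == ys:
        return "meets"                                      # m
    if ye == xs:
        return "met-by"                                     # mi
    if xs == ys and xe == ye:
        return "equal"                                      # =
    if xs == ys:
        return "starts" if xe < ye else "started-by"        # s / si
    if xe == ye:
        return "finishes" if xs > ys else "finished-by"     # f / fi
    if ys < xs and xe < ye:
        return "during"                                     # d
    if xs < ys and ye < xe:
        return "contains"                                   # di
    return "overlaps" if xs < ys else "overlapped-by"       # o / oi
```

Because the 13 relations are mutually exclusive and jointly exhaustive for such intervals, exactly one branch fires for any valid pair of inputs.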

Temporal Relationships between Intervals

• Crucial aspect of Allen’s work is the transitivity table that enables inferences to be made about temporal sequences

• Inferences take the form:

If A r B, and B r C, then r1, r2, r3… may hold between A and C

For example:

If A < B and B < C, then A < C

Another Example

• If A “contains” B, and B < C, then what relationships can hold between A and C?

AAAAAAAAAAAAA
  BBBBB
        ?CC?             A “contains” C
       ?CCCC?            A “is finished by” C
       ?CCCCCCC?         A “overlaps” C
             ?CCCCC?     A “meets” C
               ?CCCCC?   A < C

Possibilities: A < C; A “overlaps” C; A “meets” C; A “contains” C; A “is finished by” C
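In a system, such inferences are a lookup in the transitivity table. The fragment below encodes just the two entries needed for the examples on these slides; the full table (Allen 1983, Figure 4) has an entry for every pair of the 13 relations.

```python
# Sketch: a two-entry fragment of Allen's transitivity table.
# (r1, r2) -> set of relations that may hold between A and C,
# given A r1 B and B r2 C.
TRANSITIVITY = {
    ("before", "before"): {"before"},
    ("contains", "before"): {"before", "overlaps", "meets",
                             "contains", "finished-by"},
}

def infer(r1, r2):
    """Possible relations A r C given A r1 B and B r2 C (None if the
    pair is not covered by this fragment of the table)."""
    return TRANSITIVITY.get((r1, r2))

print(infer("before", "before"))     # A < B and B < C force A < C
print(infer("contains", "before"))   # five possibilities, as above
```

The first entry shows why the table is useful even when it is deterministic (A < B and B < C force A < C), while the second shows the more typical case where several relations remain possible.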

Modelling the Relationships between Entities and Events in Film

• Some temporal relationships might be interpreted as causal relationships

• Roth (1999) proposed the use of a semantic network to represent the relationships between entities and events in a movie – including causal relations

• The user can then browse between scenes in a movie, e.g. if they are watching the scene of an explosion, they may browse to the scene in which the bomb was planted, via the semantic network (an extra note on semantic networks will be on the module website).
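This style of browsing can be sketched as following labelled edges in a graph. The scene names, relation labels and the EDGES structure below are illustrative assumptions in the spirit of Roth (1999), not his actual schema.

```python
# Sketch: a toy semantic network over movie scenes, with labelled
# (including causal) links. edges[scene] -> list of (relation, scene).
EDGES = {
    "explosion": [("caused-by", "bomb-planted"), ("followed-by", "escape")],
    "bomb-planted": [("causes", "explosion")],
}

def browse(scene, relation):
    """Scenes reachable from `scene` via one hop of `relation`."""
    return [other for rel, other in EDGES.get(scene, []) if rel == relation]

print(browse("explosion", "caused-by"))
```

A viewer watching the explosion scene would call browse("explosion", "caused-by") to jump back to the scene in which the bomb was planted; the same mechanism supports purely temporal links such as "followed-by".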

Organising and Querying Video Content

• Should consider…
– Which aspects of the video are likely to be of interest to the users who access the video archive?
– How to store relevant information about the video efficiently?
– How to express and process queries?
– What is the scope for automatic content extraction?

EXERCISE 6-2

• For a video database application domain of your choosing, write five video queries that use some of Allen’s 13 temporal relationships

• If event A is ‘before (<)’ event B, and event B is ‘during’ event C, then what relationships could hold between A and C?

• How do you think such reasoning about temporal relationships could be used in a video database?

LECTURE 6: LEARNING OUTCOMES

After the lecture, you should be able to:

• Express a user’s query to a video database using Subrahmanian’s VideoSQL and discuss the limitations of this formalism

• Explain how and why temporal and causal relationships between events are represented in metadata for video databases

OPTIONAL READING

Dunckley (2003), pages 38-39; 393-395.

For details of the extended video SQL, see:
Subrahmanian (1998). Principles of Multimedia Databases, pages 191-195. [In Library article collection]

For temporal relationships:
Allen, J. F. (1983). ‘Maintaining Knowledge About Temporal Intervals.’ Communications of the ACM 26 (11), pp. 832-843. Especially Figure 2 for the 13 relationships and Figure 4 for the full transitivity table. [In Library – on shelf]

For causal relationships:
Roth, Volker (1999). ‘Content-based retrieval from digital video.’ Image and Vision Computing 17, pp. 531-540. [Available online through library eJournals]