Automatic Video Tagging using Content Redundancy

Stefan Siersdorfer (1), Jose San Pedro (2), Mark Sanderson (2)

(1) L3S Research Center, Germany; (2) University of Sheffield, UK

SIGIR 2009

2009. 11. 06.

Summarized and Presented by Hwang Inbeom, IDS Lab., Seoul National University

Copyright 2009 by CEBT

Large Amount of Data on YouTube

Traffic to/from YouTube accounts for over 20% of total web traffic

YouTube accounts for 60% of videos watched online

Growing beyond human perception

Necessity to provide effective knowledge mining and retrieval tools


Knowledge Mining and Retrieval

Making use of human annotation: Folksonomy

Provides relevant results at a relatively low cost

Applications

– Topic detection and tracking

– Information filtering

– Document ranking

– Etc.

However, content-based retrieval techniques are not mature enough

Folksonomy-based techniques outperform content-based techniques


Problem: Poorly Annotated YouTube Videos

Hard to annotate videos

Intellectually expensive process

Time-consuming job

Low-quality tags

Often very sparse

Lack consistency

Present numerous irregularities

Difficult to provide retrieval and knowledge extraction relying on textual features


Motivation

Significant amount of near-duplicate videos

Over 25% near-duplicate videos detected in search results

Has been considered a problem for online videos

Authors see this redundancy as a feature

Linkage between two different videos

Exploit redundancies to obtain richer video annotations


PageRank-like Graph of Videos

Overlap graph G_O = (V_O, E_O)


Edge in Graph

An edge means that videos i and j contain redundant visual information

Three types of links

Duplicate videos

Part-of relationship

Overlapping

[Figure: Video i and Video j]


Related Work: VisualRank (WWW 2008)

Builds a graph of images using visual similarity between two images

Runs PageRank algorithm to re-rank images


Automatic Tagging

Different approach from that of VisualRank

Aims to enrich annotations

Not to improve search results

Three methods

Simple neighbor-based tagging

Overlap redundancy aware tagging

TagRank: Context-based tag propagation in video graphs


Simple Neighbor-based Tagging

Transforms G_O into the directed graph G'_O = (V'_O, E'_O) of overlapping videos

The weight w(v_i, v_j) of edge (v_i, v_j) describes to what degree video j is covered by video i

[Figure: Video i and Video j connected by directed edges with weights w(v_i, v_j) and w(v_j, v_i)]
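The slide only says that the weight expresses how much of video j is covered by video i. A minimal sketch of one way to compute such weights, assuming (this is not stated on the slide) that coverage is measured as shared duration divided by the duration of the covered video; `coverage_weight`, `directed_edges`, and the sample durations are hypothetical names and values for illustration:

```python
def coverage_weight(overlap_seconds, covered_duration):
    """Fraction of the covered video that also appears in the other video.

    Assumption: coverage = shared duration / duration of the covered video.
    """
    if covered_duration <= 0:
        raise ValueError("duration must be positive")
    return min(overlap_seconds / covered_duration, 1.0)


def directed_edges(i, j, overlap_seconds, durations):
    """Turn one undirected overlap {i, j} into two weighted directed edges."""
    return {
        # w(v_i, v_j): to what degree video j is covered by video i
        (i, j): coverage_weight(overlap_seconds, durations[j]),
        # w(v_j, v_i): to what degree video i is covered by video j
        (j, i): coverage_weight(overlap_seconds, durations[i]),
    }


# Video "b" (40 s) is entirely contained in video "a" (100 s).
durations = {"a": 100.0, "b": 40.0}
edges = directed_edges("a", "b", 40.0, durations)
```

Under this assumption a part-of relationship shows up as one weight equal to 1 and the other below 1, while a duplicate pair would have both weights close to 1.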


Simple Neighbor-based Tagging (contd.)

Computes the relevance score of tag t for a video from its adjacent videos

Weighted sum of the influences of overlapping videos tagged with t

Counts only adjacent videos' tags

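\[
\mathrm{rel}(t, v_i) = \sum_{(v_j, v_i) \in E'_O} I(t, v_j)\, w(v_j, v_i)
\]

\[
I(t, v_j) =
\begin{cases}
1 & \text{if } v_j \text{ is tagged with } t \\
0 & \text{otherwise}
\end{cases}
\]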
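The weighted sum over adjacent videos can be sketched in a few lines. This is an illustrative sketch, not the authors' implementation; the function name `neighbor_tag_relevance`, the dictionary layout, and the sample weights are assumptions made for the example:

```python
from collections import defaultdict


def neighbor_tag_relevance(edges, tags):
    """For every video v_i, sum w(v_j, v_i) over incoming edges from
    neighbors v_j that carry tag t (the indicator I(t, v_j) is 1)."""
    rel = defaultdict(lambda: defaultdict(float))
    for (vj, vi), weight in edges.items():
        for t in tags.get(vj, set()):
            rel[vi][t] += weight
    return {video: dict(scores) for video, scores in rel.items()}


# Directed edges (v_j, v_i) -> w(v_j, v_i); values are made up.
edges = {("a", "c"): 0.5, ("b", "c"): 0.25, ("c", "a"): 1.0}
tags = {"a": {"goal", "soccer"}, "b": {"goal"}, "c": {"match"}}
scores = neighbor_tag_relevance(edges, tags)
```

Here video "c" receives "goal" from both neighbors (0.5 + 0.25) and "soccer" from "a" only, while its own tag "match" flows to "a".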


An Example

[Figure: example overlap graph in which neighboring videos tagged with t contribute to t's relevance score]


Overlap Redundancy Aware Tagging

Relevance score can increase sharply if a video has multiple redundant overlaps

Contribution of the same tag is reduced by a relaxation parameter α

\[
w_1 I(t, v_1) + w_2 I(t, v_2) + w_3 I(t, v_3)
\;\longrightarrow\;
w_1 I(t, v_1) + \alpha\, w_2 I(t, v_2) + \alpha^2 w_3 I(t, v_3)
\]
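A minimal sketch of the damping idea, under assumptions the slide does not confirm: the relaxation parameter is called `alpha` here, only neighbors that carry the tag are ranked, and they are ranked by decreasing weight so that the k-th contribution is scaled by `alpha ** (k - 1)`. The function name and sample data are hypothetical:

```python
def redundancy_aware_relevance(tag, target, edges, tags, alpha=0.5):
    """Damped weighted sum: repeated occurrences of the same tag among
    the neighbors of `target` contribute less and less.

    Assumption: tagged neighbors are sorted by decreasing edge weight.
    """
    weights = sorted(
        (
            weight
            for (vj, vi), weight in edges.items()
            if vi == target and tag in tags.get(vj, set())
        ),
        reverse=True,
    )
    return sum(weight * alpha ** rank for rank, weight in enumerate(weights))


edges = {("a", "d"): 1.0, ("b", "d"): 0.5, ("c", "d"): 0.5}
tags = {"a": {"goal"}, "b": {"goal"}, "c": {"goal"}}

damped = redundancy_aware_relevance("goal", "d", edges, tags, alpha=0.5)
undamped = redundancy_aware_relevance("goal", "d", edges, tags, alpha=1.0)
```

With `alpha=1.0` the function falls back to the plain neighbor-based sum (2.0 in this example), while `alpha=0.5` gives 1.0 + 0.25 + 0.125 = 1.375.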


TagRank

Tag weight propagates through the overlap graph

Relevance scores are computed in matrix form

TR converges to a fixed value: solved with the power iteration method

Power iteration starts from the original tagging information and runs for a limited number of iterations

– To keep original tag relevance

– To prevent TR(t) from converging to a uniform value
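The slide does not give the TagRank update rule, so the following is only a sketch of the general idea: start from the original tagging of one tag and propagate its weight along the directed overlap edges for a small, fixed number of steps. The update (weighted sum over incoming edges followed by normalisation to sum 1), the function name `tag_rank`, and the sample graph are all assumptions:

```python
def tag_rank(videos, edges, tagged, iterations=2):
    """Propagate one tag's weight over the overlap graph for a few steps.

    Assumed update: score(target) = sum of weight * score(source) over
    incoming edges, then normalise so the scores sum to 1.
    """
    # Start from the original tagging: 1.0 if the video carries the tag.
    scores = {v: (1.0 if v in tagged else 0.0) for v in videos}
    for _ in range(iterations):
        propagated = {v: 0.0 for v in videos}
        for (source, target), weight in edges.items():
            propagated[target] += weight * scores[source]
        total = sum(propagated.values())
        if total == 0.0:
            break  # nothing left to propagate
        scores = {v: s / total for v, s in propagated.items()}
    return scores


# Chain a <-> b <-> c; only "a" carries the tag initially.
videos = ["a", "b", "c"]
edges = {("a", "b"): 1.0, ("b", "a"): 1.0, ("b", "c"): 0.5, ("c", "b"): 0.5}
one_step = tag_rank(videos, edges, tagged={"a"}, iterations=1)
two_steps = tag_rank(videos, edges, tagged={"a"}, iterations=2)
```

After one step the tag has reached only the direct neighbor "b"; after two steps video "c", which does not overlap "a" directly, also receives weight. Keeping the iteration count small is what keeps the result tied to the original tagging.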


Evaluation

Two kinds of evaluation: machine-oriented and human-oriented views

Data organization with automatically generated tags

– Classification

– Clustering

User-based evaluation


Data Collection

38,283 videos: initial set C

Videos returned for the top 500 general queries

Together with the related videos listed with the results

Redundancy analysis

Over 35% of videos (V_O) overlap with one or more other videos


Data Organization

Classification with 7 YouTube categories

Each category contains over 900 videos in V_O

Binary classification with SVM

– Feature vectors constructed with original tags/automatically generated tags

Four strategies

– BaseOrig: Only considering user-provided tags

– NTag: Simple Neighbor-based tagging

– RedNTag: Overlap redundancy aware tagging

– TagRankΓ: TagRank with Γ iterations
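How the feature vectors combine original and generated tags is not spelled out on this slide. As a rough sketch, assuming a fixed tag vocabulary, weight 1.0 for user-provided tags, and the relevance score as the weight of an automatically generated tag (all three are assumptions, as is the name `tag_feature_vector`):

```python
def tag_feature_vector(vocabulary, original_tags, auto_tag_scores=None):
    """Build one feature vector over a fixed tag vocabulary.

    Assumption: user-provided tags get weight 1.0; generated tags get
    their relevance score; absent tags get 0.0.
    """
    auto_tag_scores = auto_tag_scores or {}
    vector = []
    for tag in vocabulary:
        if tag in original_tags:
            vector.append(1.0)
        else:
            vector.append(auto_tag_scores.get(tag, 0.0))
    return vector


vocabulary = ["goal", "match", "soccer"]
original = {"match"}
generated = {"goal": 0.75, "soccer": 0.5}

base_orig = tag_feature_vector(vocabulary, original)
enriched = tag_feature_vector(vocabulary, original, generated)
```

The `base_orig` vector corresponds to the BaseOrig strategy (user tags only); the `enriched` vector is what a classifier would see under one of the automatic tagging strategies.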


Data Organization (contd.)

Clustering

k-Means clustering

Partition videos into k categories

Neighbor-based tagging and overlap redundancy aware tagging outperform baseline and TagRank methods in both experiments


User-based Evaluation

Assessors rate new tags with a web interface

Average rating is higher for tags with a higher autotag relevance score


Conclusions

Content redundancy in social sharing systems can be used to obtain richer annotations

Additional information obtained by automatic tagging can largely improve automatic organization of content

There is also information gain for users

Future work

Authors plan to generalize this work to consider different domains

– Photos in Flickr

– Text in Delicious

Analysis and generation of deep tags

– Tags linked to a small part of larger media source


Discussion

Good idea and good formalization

Would be better if TagRank performed well

Considering only neighbors is too naïve a method

How can we deal with overhead of visual processing?

Would it be scalable enough to apply it to all videos in YouTube?
