Video Search Engines and Content-Based Retrieval Steven C.H. Hoi CUHK, CSE 18-Sept, 2006.
Video Search Engines and
Content-Based Retrieval
Steven C.H. Hoi
CUHK, CSE
18-Sept, 2006
Outline
Video Search Engines
Content-Based Video Retrieval
Video Search Engines
A survey of the state of the art
Introduction
Who is building video search engines?
Top text search engines: 5.6 billion searches (07/2006)
Introduction: Google
Introduction: Yahoo
Introduction: MSN/Live Search
Introduction: YouTube
Business Models
Web Advertising: site volume or keyword customized; video ads with disabled controls (MSN)
Subscription: MLB, Real
Download to own: iTunes
Movie Rental: limited time, number of plays
Other: Desktop Media Search, Media Player (jukebox), Media Monitoring, Media Asset Management
Types of Video Sites
Content Originators: Major Broadcasters; Affiliates, Local News; Major League Baseball
Syndication, Aggregation, “Internet Broadcasters”: rental, purchase, advertising, subscription; MSN, Google, iTunes; ROO Media, FeedRoom
Movie and Video Download
Share portals: consumer content, blogs; YouTube, Putfile, Vsocial, Google, Akimbo
Traditional Search Engines (Crawl) / “RSS”: Yahoo, Blinkx
Other: Public (Internet Archive); Media Monitoring, asset management systems
Video Search Challenges
Current Video Search Engines
Metadata: file type and context; media file attributes (size, length); structured global metadata (RSS content description)
Content: content indexing (search within a video, full text of dialog, image or video content); automated content indexing
Current Video Search Engines
Content Search Engines
Keyword search with transcripts from speech recognition
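As a minimal sketch of keyword search over speech-recognition transcripts, the snippet below builds an inverted index from time-stamped transcript segments to the places where each word was spoken. The segment data, the function names, and the AND-style matching are all illustrative assumptions, not a description of any particular engine.

```python
from collections import defaultdict

def build_index(segments):
    """Map each lowercased word to the (video_id, start_seconds) hits where it was spoken."""
    index = defaultdict(list)
    for video_id, start_seconds, text in segments:
        for word in set(text.lower().split()):
            index[word].append((video_id, start_seconds))
    return index

def search(index, query):
    """Return hits whose segment contains every query word (simple AND semantics)."""
    words = query.lower().split()
    if not words:
        return []
    hits = set(index.get(words[0], []))
    for word in words[1:]:
        hits &= set(index.get(word, []))
    return sorted(hits)

# Hypothetical ASR output: (video id, segment start in seconds, recognized text).
segments = [
    ("news_01", 0.0, "the president arrived in beijing today"),
    ("news_01", 12.5, "markets closed higher on friday"),
    ("news_02", 3.0, "the president spoke about trade"),
]
index = build_index(segments)
results = search(index, "president trade")
```

Because each hit carries a start time, a result can link directly to the moment in the video where the words occur.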
Content-Based Video Search Engine
Architecture
Content-Based Video Search Engine
Video Processing
Content-Based Video Search Engine
Research Challenges
Speech Recognition
Shot Boundary Detection
Video Story Segmentation
Concept Detection
Multi-modal Fusion for Ranking: Text/ASR, Audio/Speech, Visual, etc.
Content-Based Retrieval
Our Research Problem
Learning to rank video shots for automatic content-based search tasks
Challenges
Multi-Modal Information Fusion
Small Sample Learning (a few positive and no negative examples)
Learning on large-scale datasets
Multi-modal and Multi-scale Ranking Framework
Main Ideas
Representing video structures by graphs
Using semi-supervised learning to address the small labeled sample learning problem
Fusing multi-modal information by harmonic learning over graphs
Multi-scale ranking for achieving efficient performance on large-scale datasets
Multi-modal and Multi-scale Ranking Framework
Graph-based Modeling
[Figure: graph model linking Story, Text, and Shot nodes]
Multi-modal and Multi-scale Ranking Framework
Semi-Supervised Learning on Graph
Find an optimal real-valued function g: V → R on the graph G that minimizes a quadratic energy function.
Using the Gaussian field and harmonic property from spectral graph theory (J. Zhu’s ICML’03), a harmonic function g can be found.
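For reference, the quadratic energy and the harmonic property in the cited ICML'03 formulation (Gaussian fields and harmonic functions) take the form below. The symbols $w_{ij}$ (edge weight between nodes $i$ and $j$) and $d_i$ (degree of node $i$) follow that paper's notation and are assumed here to match the slide; $g$ is held fixed at the given labels on labeled nodes.

```latex
E(g) = \frac{1}{2} \sum_{i,j} w_{ij} \bigl( g(i) - g(j) \bigr)^{2},
\qquad
g(i) = \frac{1}{d_i} \sum_{j} w_{ij}\, g(j) \quad \text{for every unlabeled node } i,
\qquad
d_i = \sum_{j} w_{ij}.
```

In words, the minimizer assigns each unlabeled node the weighted average of its neighbors' values.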
Multi-modal and Multi-scale Ranking Framework
Semi-Supervised Learning on Graph
The solution of the harmonic function g can be expressed in matrix operations.
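In the cited ICML'03 formulation, the closed form is $g_u = (D_{uu} - W_{uu})^{-1} W_{ul}\, g_l$, where the subscripts $u$ and $l$ select unlabeled and labeled nodes; this notation is assumed to match the slide. The sketch below reaches the same solution iteratively (Gauss-Seidel sweeps) without a matrix library. The function name, the toy graph, and the assumption of a zero diagonal in `weights` are illustrative.

```python
def harmonic_scores(weights, labels, iterations=200):
    """Iteratively solve for the harmonic function on a weighted graph.

    weights: symmetric list-of-lists of non-negative edge weights (zero diagonal).
    labels:  dict mapping a labeled node index to its fixed score.
    """
    n = len(weights)
    scores = [labels.get(i, 0.0) for i in range(n)]
    unlabeled = [i for i in range(n) if i not in labels]
    for _ in range(iterations):
        for i in unlabeled:
            degree = sum(weights[i])
            if degree > 0:
                # Harmonic property: weighted average of the neighbors' scores.
                scores[i] = sum(w * s for w, s in zip(weights[i], scores)) / degree
    return scores

# Toy chain 0 - 1 - 2 - 3 with unit weights; node 0 is relevant, node 3 is not.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
scores = harmonic_scores(chain, {0: 1.0, 3: 0.0})
```

On this chain the unlabeled scores interpolate between the two labeled ends, which is the behavior the harmonic property predicts.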
Multi-modal and Multi-scale Ranking Framework
Multi-Modal Fusion over Graph
To combine text information into SSL on the visual modality, we consider the text inputs as attached nodes on the visual graph:
Visual - g
Text - f
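One way to realize attached text nodes is the external-classifier construction from the cited ICML'03 paper: each unlabeled visual node gets an extra neighbor carrying its text score f, reached with weight eta. The slide does not give the exact formula, so the update rule, the parameter `eta`, and the toy data below are assumptions for illustration only.

```python
def fused_harmonic_scores(weights, labels, text_scores, eta=0.5, iterations=200):
    """Harmonic scores g on the visual graph with a text node attached to each unlabeled node.

    weights:     symmetric visual affinity matrix (list of lists, zero diagonal).
    labels:      dict mapping a labeled node index to its fixed score.
    text_scores: per-node text relevance f, used as the attached node's value.
    eta:         hypothetical mixing weight given to the attached text node.
    """
    n = len(weights)
    scores = [labels.get(i, text_scores[i]) for i in range(n)]
    unlabeled = [i for i in range(n) if i not in labels]
    for _ in range(iterations):
        for i in unlabeled:
            degree = sum(weights[i])
            if degree > 0:
                visual_avg = sum(w * s for w, s in zip(weights[i], scores)) / degree
                scores[i] = (1.0 - eta) * visual_avg + eta * text_scores[i]
            else:
                # An isolated visual node can only rely on its text score.
                scores[i] = text_scores[i]
    return scores

# Toy chain 0 - 1 - 2: node 0 is a labeled positive; text favors node 2 over node 1.
visual = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
fused = fused_harmonic_scores(visual, {0: 1.0}, text_scores=[1.0, 0.0, 1.0], eta=0.5)
```

Setting `eta` to 0 falls back to purely visual propagation, while `eta` of 1 ranks unlabeled shots by text alone.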
Multi-modal and Multi-scale Ranking Framework
Challenges
Number of examples in database: N is large
For example:
TRECVID 2005: Rep. Key-Frames N = 45,765
TRECVID 2006: Rep. Key-Frames N = 79,487
How to do Semi-Supervised Learning?!
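A quick back-of-the-envelope calculation shows why N matters: graph-based learning works with an N-by-N affinity matrix. The sketch assumes dense storage at 8 bytes per entry (double precision), which is an assumption; the slides do not say how the graph is stored.

```python
def dense_affinity_bytes(n, bytes_per_entry=8):
    """Bytes needed to store a dense n-by-n affinity matrix."""
    return n * n * bytes_per_entry

for name, n in [("TRECVID 2005", 45_765), ("TRECVID 2006", 79_487)]:
    gigabytes = dense_affinity_bytes(n) / 1e9
    print(f"{name}: N = {n:,}, dense matrix is about {gigabytes:.1f} GB")
```

Under that assumption the matrix alone needs tens of gigabytes, before any inversion, which motivates running the graph step only on a small candidate set.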
Multi-modal and Multi-scale Ranking Framework
Multi-Scale Ranking
Learning ranking through multi-scale reranking
Each stage is associated with different computational costs
In our solution, the four ranking stages are:
Ranking by Text Retrieval using Language Models
Re-ranking by NN fusing Text and Visual
Re-ranking by SVM fusing Text and Visual
Re-ranking by multi-modal Semi-supervised Learning
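The staged idea can be sketched as a cascade in which each stage re-scores only the survivors of the previous one, so the most expensive model sees the fewest candidates. The precomputed scores, the cut-off sizes, and the function name below are hypothetical stand-ins for the four models; this is not the authors' implementation.

```python
def cascade_rank(candidates, stages, top_k):
    """Re-rank a shrinking candidate list through successive (scorer, keep) stages."""
    ranked = list(candidates)
    for scorer, keep in stages:
        # Score only the current survivors, then keep the best `keep` of them.
        ranked = sorted(ranked, key=scorer, reverse=True)[:keep]
    return ranked[:top_k]

# Hypothetical shots with precomputed scores standing in for the four models.
shots = [
    {"id": "a", "text": 0.9, "nn": 0.2, "svm": 0.1, "ssl": 0.3},
    {"id": "b", "text": 0.8, "nn": 0.7, "svm": 0.6, "ssl": 0.9},
    {"id": "c", "text": 0.7, "nn": 0.9, "svm": 0.8, "ssl": 0.5},
    {"id": "d", "text": 0.6, "nn": 0.8, "svm": 0.9, "ssl": 0.7},
    {"id": "e", "text": 0.1, "nn": 1.0, "svm": 1.0, "ssl": 1.0},
]
stages = [
    (lambda s: s["text"], 4),  # cheap text retrieval over everything
    (lambda s: s["nn"], 3),    # nearest-neighbor re-ranking
    (lambda s: s["svm"], 3),   # SVM re-ranking
    (lambda s: s["ssl"], 2),   # semi-supervised re-ranking on the smallest set
]
top_shots = [s["id"] for s in cascade_rank(shots, stages, top_k=2)]
```

Note the trade-off visible in the toy data: shot "e" scores highest in every later stage but is dropped by the text stage, so recall of the first stage bounds the whole cascade.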
[Figure: Multi-modal fusion and multi-scale ranking pipeline. Raw video clips/streams go through video, image, and text processing to produce video stories and video shots. For a user's query, text retrieval selects the top M related stories and top N1 related shots; text + visual NN re-ranking gives the top N2 shots; supervised ranking (SVM/KLR) gives the top N3 shots; semi-supervised ranking (SSR) gives the top N4 shots; the top K shots are returned.]
Benchmark Evaluations
Dataset: TRECVID 2005
Test: 140 video clips, 45,765 rep. key frames
24 queries
A query example:
<videoTopic num="0152">
<textDescription text="Find shots of Hu Jintao, president of the People's Republic of China" />
</videoTopic>
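A topic in this format can be read with Python's standard XML parser. The sketch below assumes only the two elements and attributes visible in the example; real topic files may carry more fields, and the function name is illustrative.

```python
import xml.etree.ElementTree as ET

topic_xml = """
<videoTopic num="0152">
  <textDescription text="Find shots of Hu Jintao, president of the People's Republic of China" />
</videoTopic>
"""

def parse_topic(xml_text):
    """Return (topic number, query text) from a single videoTopic element."""
    root = ET.fromstring(xml_text.strip())
    description = root.find("textDescription")
    return root.get("num"), description.get("text")

topic_id, query_text = parse_topic(topic_xml)
```

The extracted text is what the first (text retrieval) stage would take as its query.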
Benchmark Evaluations
Text-only Retrieval
No Pseudo-Relevance Feedback (No-PRF)
With Pseudo-Relevance Feedback (PRF)
[Figure: Evaluation of Language Models. MAP (axis 0 to 0.1) for TF-IDF, Okapi, KL-JM, KL-DIR, and KL-ABS, with No-PRF and PRF series.]
[Figure: Text-only Results. MAP (axis 0 to 0.1) for IBM, Columbia, TRECVID-Max, and CUHK.]
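As an illustration of the language-model family compared here, the sketch below scores documents by query likelihood with Jelinek-Mercer smoothing, reading "KL-JM" as a language model smoothed that way (an assumption; the slides do not expand the abbreviation). The smoothing weight `lam`, the toy transcripts, and the function name are hypothetical.

```python
import math
from collections import Counter

def jm_log_likelihood(query, doc, collection, lam=0.5):
    """log P(query | doc) with Jelinek-Mercer smoothing against the collection model."""
    doc_counts, coll_counts = Counter(doc), Counter(collection)
    score = 0.0
    for term in query:
        p_doc = doc_counts[term] / len(doc) if doc else 0.0
        p_coll = coll_counts[term] / len(collection)
        p = (1.0 - lam) * p_doc + lam * p_coll
        if p == 0.0:
            return float("-inf")  # term never occurs anywhere in the collection
        score += math.log(p)
    return score

# Hypothetical story transcripts, already tokenized.
docs = {
    "s1": "president visits china".split(),
    "s2": "stock markets rally".split(),
}
collection = [token for tokens in docs.values() for token in tokens]
query = ["president"]
scores = {name: jm_log_likelihood(query, tokens, collection) for name, tokens in docs.items()}
```

Smoothing is what lets "s2" receive a finite score even though it never mentions the query term, which matters for noisy speech transcripts.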
Benchmark Evaluations
Visual Features
Color: Grid Color Moment, 3*3 grid, 81 dimensions
Edge: Edge Direction Histogram, 36 bins + 1, 37 dimensions
Texture: Gabor Moments, 5*8 = 40, 3 moments, 120 dimensions
238 dimensions in total
[Figure: Normalized Comparison on COREL Benchmark Photos. Curves for GCM, EDH, Gabor, and GCM+Gabor+EDH; x-axis 0 to 120, y-axis 0 to 0.6.]
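The 81-dimension figure for the grid color moment follows from 3*3 cells times 3 channels times 3 moments. The sketch below is one plausible reading: the choice of mean, standard deviation, and cube root of the third central moment, and the use of raw (r, g, b) tuples, are assumptions, since the slides state only the grid size and dimensionality.

```python
import math

def color_moments(pixels):
    """Mean, standard deviation, and skewness per channel for a list of (r, g, b) pixels."""
    features = []
    for channel in range(3):
        values = [p[channel] for p in pixels]
        mean = sum(values) / len(values)
        variance = sum((v - mean) ** 2 for v in values) / len(values)
        third = sum((v - mean) ** 3 for v in values) / len(values)
        # Signed cube root keeps the skewness in the same units as the pixel values.
        skew = math.copysign(abs(third) ** (1.0 / 3.0), third)
        features.extend([mean, math.sqrt(variance), skew])
    return features

def grid_color_moment(image, grid=3):
    """Concatenate color moments over a grid-by-grid partition of the image.

    image: list of rows, each a list of (r, g, b) tuples, at least grid x grid in size.
    """
    height, width = len(image), len(image[0])
    features = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * height // grid, (gy + 1) * height // grid)
            cols = range(gx * width // grid, (gx + 1) * width // grid)
            cell = [image[y][x] for y in rows for x in cols]
            features.extend(color_moments(cell))
    return features

# A flat 6x6 test image: every cell has the same mean and zero spread.
image = [[(10, 20, 30)] * 6 for _ in range(6)]
gcm = grid_color_moment(image)
```

Concatenating this vector with the 37 edge and 120 texture dimensions gives the 238-dimension descriptor quoted on the slide.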
Benchmark Evaluations
Multi-modal Retrieval (Text + Visual)
Text-only retrieval
Text + NN (Text + Visual)
Text + SVM (Text + Visual)
MMMS (Text + Visual)
Benchmark Evaluations
Evaluation Results: Average Performance on TRECVID 2005 Dataset

| Method | MAP | Num_Ret | Improvement |
| --- | --- | --- | --- |
| Text | 0.0903 | 1669 | 0% |
| Text+NN | 0.1034 | 1705 | +14.51% |
| Text+SVM | 0.1083 | 1764 | +19.93% |
| MMMS | 0.1157 | 1764 | +28.13% |
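For readers unfamiliar with the metric, the sketch below computes non-interpolated average precision for one ranked list, averages it over queries to get MAP, and reproduces the relative-improvement column from the reported MAP values. The toy ranking is hypothetical, and the official evaluation may differ in details such as the depth of the ranked list.

```python
def average_precision(ranked_ids, relevant_ids):
    """Non-interpolated average precision of one ranked list."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits, precision_sum = 0, 0.0
    for rank, item in enumerate(ranked_ids, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant)

def mean_average_precision(runs):
    """Mean of average precision over (ranked_ids, relevant_ids) pairs, one per query."""
    return sum(average_precision(ranked, rel) for ranked, rel in runs) / len(runs)

def relative_improvement(score, baseline):
    """Percentage gain of score over baseline."""
    return 100.0 * (score - baseline) / baseline

# Hypothetical ranking: relevant shots "a" and "c" appear at ranks 1 and 3.
toy_ap = average_precision(["a", "b", "c", "d"], {"a", "c"})

# Relative gains over the text-only MAP of 0.0903, using the reported MAP values.
gains = {
    name: round(relative_improvement(value, 0.0903), 2)
    for name, value in [("Text+NN", 0.1034), ("Text+SVM", 0.1083), ("MMMS", 0.1157)]
}
```

The computed gains agree with the Improvement column, which confirms that column is measured relative to the text-only baseline.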
Benchmark Evaluations
[Figure: Comparison with other approaches. Average performance (MAP, axis 0.095 to 0.12) over 24 queries for IBM (T+V+M), CUHK-MMMS, Columbia (V+T+M), and IBM (V+T).]
Related Work
IBM solution: SVM + NN + Multiple Instance Learning
Columbia solution: Information-Theoretical Clustering Approach
CMU solution: Query-Class Dependent Weighting Ranking
Conclusion
A tutorial of video search engines
Research contributions
A unified framework of Multi-Modal and Multi-Scale Ranking for video retrieval
Graph-based modeling of video structures
Semi-Supervised Learning for multimodal ranking
Making SSL practical for large-scale problems
Promising empirical results…
Future Work
Research is in progress, with tough challenges ahead…
Any suggestions or comments?