IR 670 Youtube Search Optimization
Spring 2010 - Prof. Caverlee
Spencer Huang, Jue Yi, Chia-Chun Lin,
Youtube Search Problem
• A Youtube search, whether or not the user is logged in, does not take the user’s preferences into consideration (this also applies to Google video search).
• Even if such a tool exists, there is no way to know what’s inside the box.
Notations
• UserEntity(u) - (t in u.fav) + (t in u.upload) + (t in u’.fav) + (t in u’.upload), where u’ is a user that u subscribes to (u.subscription); u could subscribe to multiple u’(s).
• VideoEntity(v) - (t in v.title) + (t in v.description) + (t in v.category) + (t in v.tag)
• The user inputs at least a Youtube Id and a query, which enables the generation of the above entities.
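A minimal sketch of how the two entities above could be assembled as bags of terms. The record layout (dicts with `title`, `description`, `category`, `tags`, `fav`, `upload`, `subscription` keys), the tokenizer, and the function names are all assumptions made for illustration; the slides do not show the actual crawler or data model. It also assumes that "t in u.fav" means the terms of the user's favorite videos.

```python
import re
from collections import Counter

def terms(text):
    """Lowercase a string and split it into alphanumeric terms (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def video_entity(video):
    """VideoEntity(v): terms from title, description, category, and tags."""
    bag = Counter()
    for field in ("title", "description", "category"):
        bag.update(terms(video.get(field, "")))
    for tag in video.get("tags", []):
        bag.update(terms(tag))
    return bag

def user_entity(user, users_by_id):
    """UserEntity(u): terms from u's favorites and uploads, plus the
    favorites and uploads of every user u' that u subscribes to."""
    bag = Counter()
    for video in user.get("fav", []) + user.get("upload", []):
        bag += video_entity(video)
    for sub_id in user.get("subscription", []):
        sub = users_by_id.get(sub_id, {})  # hypothetical lookup of u'
        for video in sub.get("fav", []) + sub.get("upload", []):
            bag += video_entity(video)
    return bag
```

Keeping counts rather than plain sets means a term that recurs across many favorites weighs more once term frequencies are computed.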
Approach and Feature
• Reorder results via the vector-space model, with stop-word removal and with all original results retained.
• Support features from original Youtube/Google video search.
• Add new features such as search by author and enabling/disabling restricted content.
• Crawl UserEntity from Youtube Id and VideoEntity of query
• Stop words removal, construct tf-idf for U and V, cache U for re-use
• Cosine-Similarity and reorder
• Display optimized result
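The pipeline steps (stop-word removal, tf-idf, cosine similarity, reorder) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's code: the stop-word list, the idf formula `log(N / df) + 1`, and every function name here are hypothetical, and the cache stores the user's raw term bag per Youtube Id rather than whatever form of U the original system cached.

```python
import math
from collections import Counter

STOP_WORDS = {"a", "an", "and", "the", "of", "in", "to", "for", "is"}  # illustrative only

def remove_stop_words(bag):
    """Drop stop words from a bag of term counts."""
    return Counter({t: c for t, c in bag.items() if t not in STOP_WORDS})

def tfidf_vectors(bags):
    """Weight each term by tf * idf, with idf = log(N / df) + 1 (assumed variant)."""
    n = len(bags)
    df = Counter()
    for bag in bags:
        df.update(bag.keys())
    return [{t: c * (math.log(n / df[t]) + 1.0) for t, c in bag.items()}
            for bag in bags]

def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def reorder(user_bag, video_bags):
    """Return result indices sorted by similarity to the user.
    Every result is retained; ties keep their original order."""
    bags = [remove_stop_words(user_bag)] + [remove_stop_words(v) for v in video_bags]
    vecs = tfidf_vectors(bags)
    scores = [cosine(vecs[0], v) for v in vecs[1:]]
    return sorted(range(len(scores)), key=lambda i: -scores[i])

_user_cache = {}

def cached_user_bag(youtube_id, build):
    """Build a user's term bag once per Youtube Id and reuse it afterwards."""
    if youtube_id not in _user_cache:
        _user_cache[youtube_id] = build(youtube_id)
    return _user_cache[youtube_id]
```

Because `sorted` is stable, results with no overlap with the user's terms stay in the order the search engine returned them, so the original ranking acts as a tie-breaker.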
Results
• Interface
• Ex: Say I have 2 favorite videos for Youtube Id….
Results (cont.)
Evaluation and Analysis
Evaluation and Analysis (cont.)
Problems encountered
• Youtube caps the request quota for the deployed GAE (Google App Engine) app, which sends too many requests to Youtube. It works on a local GAE instance.
• Needs stop words.
• Optimization is directly related to the activities of a Youtube Id.
Possible To-Do Features and Q&A
• Support for Asian languages
• Interface for feature descriptor: No need of Youtube account.
• Try it out: http://irytso.appspot.com/
End