Final Presentation V3

WEB RECOMMENDER FINAL REPORT

Wei Chen

Yue (Jenny) Cui

OUTLINE

Motivation
Problem Statement
Goals for this project (solution)
Requirements
Design
  Framework design
  Algorithm design
  Evaluation design
Results
SE Techniques used in this project
Lessons Learned

MOTIVATION

Background
  People use the web to browse information
  There is too much information on the web
  Goal: make this web-browsing process
    Fast
    Accurate

Existing solution: search engines
  The user types in a query
  The search engine returns relevant pages

PROBLEM STATEMENT

What is a Web Recommender?
  A web-browsing tool
  Recommends relevant web pages to the user while he or she is reading a page on the web

Why is it important?
  Provides a convenient way to browse the web
  Automatically recommends relevant information
  Requires less effort to compose and type in queries
  Preserves the benefits of a state-of-the-art search engine

PROBLEM STATEMENT (CONT.)

Why is it hard?
  Making queries from a web page: keyword summarization
  Search engines are not perfect: post-processing
  People have different reading goals: how to define relevance?

GOALS FOR 2009 SPRING SEMESTER

Provide a software framework for Web Recommendation
Provide basic recommendation algorithms
  Basic services
  Baselines for future research on Web Recommendation
A tutorial: how to develop your own algorithm based on our framework
Propose an evaluation prototype

REQUIREMENTS

Functional Requirements
  Given a web page as input, the system should be able to find a list of relevant web pages
  Provide three recommendation algorithms
    Baseline
    HTML-Structure-based
    Semantic-based
  A simple GUI for evaluation

REQUIREMENTS (CONT.)

Non-functional Requirements
  Results can be retrieved within 5 seconds

DESIGN: CLASS DIAGRAM

<<interface>> WebRecommender
  Implementations:
    BasicRecommender
    StructureFeatureRecommender
      recommend(Page p) : List<Page>
    SemanticFeatureRecommender
      recommend(Page p) : List<Page>

<<interface>> SearchEngine
  search(Query q) : List<Page>
  Implementations:
    YahooSearch
    GoogleSearch

<<interface>> QueryTermFilter
  filterQueryTerms(List<String> keyTerms) : List<String>
  Implementations:
    FrequencyFilter
      filterQueryTerms(List<String> keyTerms) : List<String>

<<interface>> QueryFormulator
  form(List<String> finalTerms) : Query
  Implementations:
    OrQueryFormulator
      form(List<String> finalTerms) : Query

Util (classes in the Util package are singletons)

  <<interface>> Stemmer
    stem(String s) : String
    Implementations:
      PorterStemmer
        stem(String s) : String

  <<interface>> HTMLStripper
    strip(Page p) : String
    Implementations:
      NaiveHTMLStripper
        strip(Page p) : String

  <<interface>> HTMLParser
    parse(Page p) : ParseTree
    Implementations:
      SmartParser
        parse(Page p) : ParseTree

  <<interface>> StopwordRemover
    remove(String s) : String
    Implementations:
      GenericStopwordRemover
        remove(String s) : String
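The query-side interfaces in the class diagram can be illustrated with a minimal Python sketch. This is not the project's code: the method names are Python-style renderings of `filterQueryTerms` and `form`, the query is simplified to a plain string instead of a `Query` object, and the `top_n` cutoff and the " OR " joining syntax are assumptions made for illustration.

```python
from collections import Counter
from typing import List, Protocol


class QueryTermFilter(Protocol):
    """Mirrors the QueryTermFilter interface from the class diagram."""

    def filter_query_terms(self, key_terms: List[str]) -> List[str]: ...


class QueryFormulator(Protocol):
    """Mirrors the QueryFormulator interface; the query is a plain string here."""

    def form(self, final_terms: List[str]) -> str: ...


class FrequencyFilter:
    """Keeps the most frequent terms. The cutoff top_n is an assumption."""

    def __init__(self, top_n: int = 10) -> None:
        self.top_n = top_n

    def filter_query_terms(self, key_terms: List[str]) -> List[str]:
        counts = Counter(key_terms)
        return [term for term, _ in counts.most_common(self.top_n)]


class OrQueryFormulator:
    """Joins terms with OR. The exact query syntax is an assumption."""

    def form(self, final_terms: List[str]) -> str:
        return " OR ".join(final_terms)


terms = ["entropy", "heat", "entropy", "law", "heat", "entropy"]
final_terms = FrequencyFilter(top_n=2).filter_query_terms(terms)
query = OrQueryFormulator().form(final_terms)
```

Because each step sits behind an interface, a different filter or formulator can be swapped in without touching the recommender that uses it.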

DESIGN: SEQUENCE DIAGRAM

Participants
  :EvaluationGUI
  :StructureFeatureRecommender
  :YahooSearch
  :HTMLParser
  :Stemmer
  :FrequencyFilter
  :QueryFormulator

Messages and return values
  create(YahooSearch ys)
  create() (three times, including on :FrequencyFilter)
  requestRecommends(Page p)
  recommend(p) returns List<Page> pages
  parse(p) returns ParsingResult pr
  stem(pr) returns StemmingResult sr
  Note: Extract features to form key terms.
  filterQueryTerms(List<String> keyTerms) returns List<String> finalTerms
  and(List<String> finalTerms) returns Query q
  search(q) returns List<Page> pages
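The flow in the sequence diagram (parse, stem, filter the key terms, formulate a query, search) can be sketched as a small pipeline. Everything below is a hypothetical stand-in: the `Recommender` class, its field names, the stub callables, and the `example.com` URL are assumptions for illustration, and no real search engine is called.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Recommender:
    """Wires the collaborators from the sequence diagram together."""

    parse: Callable[[str], List[str]]               # stands in for HTMLParser.parse
    stem: Callable[[str], str]                      # stands in for Stemmer.stem
    filter_terms: Callable[[List[str]], List[str]]  # stands in for FrequencyFilter
    formulate: Callable[[List[str]], str]           # stands in for QueryFormulator
    search: Callable[[str], List[str]]              # stands in for YahooSearch.search

    def recommend(self, page: str) -> List[str]:
        raw_terms = self.parse(page)
        key_terms = [self.stem(term) for term in raw_terms]
        final_terms = self.filter_terms(key_terms)
        query = self.formulate(final_terms)
        return self.search(query)


def fake_search(query: str) -> List[str]:
    """Stub search engine: returns one made-up URL that echoes the query."""
    return ["http://example.com/?q=" + query.replace(" ", "+")]


recommender = Recommender(
    parse=str.split,                                 # whitespace tokenizer as a toy parser
    stem=str.lower,                                  # lowercasing as a toy stemmer
    filter_terms=lambda terms: sorted(set(terms)),   # de-duplicate as a toy filter
    formulate=" ".join,
    search=fake_search,
)
pages = recommender.recommend("Entropy heat Heat")
```

Injecting the search engine at construction time corresponds to `create(YahooSearch ys)` in the diagram and makes the recommender testable without network access.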

ALGORITHM: BASELINE

Algorithm
  Strip off HTML tags (e.g. </html>)
  Remove non-word tokens (e.g. “/**/”)
  Remove stop words (e.g. “the”)

Example
  Input page: http://en.wikipedia.org/wiki/Entropy
  Output query: Entropy, free, encyclopedia, Jump, search, article
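The three baseline steps can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the project's implementation: the regular-expression tag stripper is deliberately naive (it ignores scripts, comments, and entities), the punctuation set is arbitrary, and `STOP_WORDS` is a tiny made-up list.

```python
import re
from typing import List

# Tiny illustrative stop-word list (an assumption; a real list is much longer).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "from"}


def baseline_terms(html: str) -> List[str]:
    """Strip tags, drop non-word tokens, then drop stop words."""
    text = re.sub(r"<[^>]+>", " ", html)  # step 1: naive tag stripping
    terms = []
    for token in text.split():
        word = token.strip(".,;:!?()[]\"'")  # trim surrounding punctuation
        if not word.isalpha():               # step 2: drop tokens such as /**/
            continue
        if word.lower() in STOP_WORDS:       # step 3: drop stop words
            continue
        terms.append(word)
    return terms


sample = "<html><body><p>The free encyclopedia, /**/ Entropy</p></body></html>"
result = baseline_terms(sample)
```

On the sample string, `result` keeps "free", "encyclopedia", and "Entropy" in document order.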

ALGORITHM: HTML STRUCTURE

Algorithm
  Parse HTML page
  Extract text content from <title> and <a> nodes
  Remove stop words (e.g. “the”)
  Select the 10 most frequent words

Example
  Input page: http://en.wikipedia.org/wiki/Entropy
  Output query: ISBN, edit, entropy, thermodynamics, Entropy, energy, system, law, heat, thermodynamic
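A minimal sketch of the structure-based steps, using Python's standard `html.parser` module. The class and function names and the small `STOP_WORDS` set are assumptions for illustration. Counting is case-sensitive here, which is consistent with the example output listing both "entropy" and "Entropy", but whether the original system counted that way is not stated.

```python
from collections import Counter
from html.parser import HTMLParser
from typing import List

# Tiny illustrative stop-word list (an assumption).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is"}


class TitleAndLinkText(HTMLParser):
    """Collects words that appear inside <title> or <a> elements."""

    def __init__(self) -> None:
        super().__init__()
        self._depth = 0            # how many <title>/<a> elements are open
        self.words: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("title", "a"):
            self._depth += 1

    def handle_endtag(self, tag):
        if tag in ("title", "a") and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        if self._depth > 0:
            self.words.extend(w for w in data.split() if w.isalpha())


def structure_terms(html: str, top_n: int = 10) -> List[str]:
    """Return the top_n most frequent non-stop words from <title> and <a> text."""
    parser = TitleAndLinkText()
    parser.feed(html)
    parser.close()
    counts = Counter(w for w in parser.words if w.lower() not in STOP_WORDS)
    return [word for word, _ in counts.most_common(top_n)]


sample = (
    "<html><head><title>Entropy</title></head><body>"
    "<p>heat heat heat</p>"
    '<a href="#">entropy</a> <a href="#">the law</a> <a href="#">entropy</a>'
    "</body></html>"
)
top_terms = structure_terms(sample)
```

In the sample, "heat" is the most frequent word on the page but never appears in `top_terms`, because it occurs only in body text outside `<title>` and `<a>`.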

ALGORITHM: SEMANTIC FEATURES

Algorithm
  Baseline + named entities with highest frequency (top 5)

Example
  Input page: http://en.wikipedia.org/wiki/Entropy
  Output query: ISBN, University, Press, Boltzmann, John
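The slides do not say how named entities were recognized. As a loudly labeled stand-in, the sketch below treats any capitalized token as a named-entity candidate and keeps the five most frequent. A real named-entity recognizer would replace this heuristic, and `top_entity_terms` is a hypothetical name.

```python
from collections import Counter
from typing import List


def top_entity_terms(tokens: List[str], top_n: int = 5) -> List[str]:
    """Pick the most frequent capitalized tokens.

    Capitalization is only a crude proxy for named entities (an assumption);
    it will also pick up sentence-initial words and acronyms.
    """
    candidates = [tok for tok in tokens if tok[:1].isupper()]
    counts = Counter(candidates)
    return [tok for tok, _ in counts.most_common(top_n)]


tokens = [
    "Boltzmann", "entropy", "Boltzmann", "University",
    "heat", "University", "Boltzmann", "ISBN",
]
entities = top_entity_terms(tokens, top_n=2)
```

The input tokens are assumed to be the output of the baseline steps (tags, non-word tokens, and stop words already removed).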

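EVALUATION

Evaluation form:

Input page: http://en.wikipedia.org/wiki/Natural_language_processing

Recommended page                                  Relevance
http://research.microsoft.com/jump/50176          0
http://nlp.stanford.edu/                          1
http://www.aaai.org/aitopics/html/natlang.html    1

Evaluation criteria: Modified Average Precision

\[ AveP = \frac{\sum_{r=1}^{N} P(r) \cdot rel(r)}{N} \]

where N is the number of recommended pages, P(r) is the precision over the top r pages, and rel(r) is the relevance (0 or 1) of the page at rank r.

Example: AveP = (0 + 1/2 + 2/3) / 3 = 0.389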
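The modified average precision can be computed directly from a ranked list of 0/1 relevance judgments. The sketch below follows the worked example, dividing by the number of recommended pages N, whereas standard average precision divides by the number of relevant pages. The function name is hypothetical.

```python
from typing import Sequence


def modified_average_precision(relevance: Sequence[int]) -> float:
    """Sum P(r) * rel(r) over ranks r = 1..N, then divide by N."""
    n = len(relevance)
    if n == 0:
        return 0.0
    hits = 0      # relevant pages seen so far
    total = 0.0
    for rank, rel in enumerate(relevance, start=1):
        hits += rel
        precision_at_rank = hits / rank
        total += precision_at_rank * rel
    return total / n


# Relevance column of the evaluation form: 0, 1, 1
score = modified_average_precision([0, 1, 1])
```

For the judgments 0, 1, 1 this gives (0 + 1/2 + 2/3) / 3, which rounds to 0.389 as in the example.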

TEST DATA SELECTION

Input pages from 5 topics:
  “Harry Potter”
  “Waterboarding”
  “Wei Chen@CMU homepage”
  “Entropy (thermodynamics)”
  “How to make Sushi”

Dimensions
  Popular vs. Unpopular (“Harry Potter”, “Wei Chen”)
  Ambiguous vs. Unambiguous (“Entropy”, “Sushi”)
  New vs. Old (“Waterboarding”, “Entropy”)
  Procedural vs. Conceptual (“How to”, “Entropy”)
  Technological vs. Mass media (“Entropy”, “Harry Potter”)

TEST DATA

We evaluate on 5 topics and 3 algorithms, for a total of 15 categories. Each category has 5 recommended web pages.

We have a total of 5 evaluators. Each of them scored 75 web pages.

                    Entropy   Harry Potter   Waterboarding   Wei Chen      Make Sushi   Average on Algorithm
Baseline            0.519     0.9032         0.507           0.7444        0.1274       0.5602
Semantic            0.6926    0.9686         0.1738          0.457         0            0.4584
Structure           1         0.982          0.8564          0.713         0.7444       0.85916
Average on Topic    0.7372    0.951266667    0.5124          0.638133333   0.2906

AVERAGE PRECISION

AVERAGE ON ALGORITHMS

AVERAGE ON TOPICS

KAPPA

           Anthony    Hideki     Jenny      Shilpa     Wei
Anthony    1          0.761905   0.839744   0.86688    0.784483
Hideki     0.76191    1          0.656085   0.731183   0.568528
Jenny      0.83974    0.656085   1          0.813632   0.676724
Shilpa     0.86688    0.731183   0.813632   1          0.709609
Wei        0.78448    0.568528   0.676724   0.709609   1

• We can achieve very good inter-coder agreement if we revise our scoring criteria.
• We all seem to agree with Anthony (maybe we should ask him to revise our scoring criteria).
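The slides do not state which kappa statistic was used. Assuming the table reports pairwise Cohen's kappa between two raters' relevance judgments, one cell could be computed as in this sketch. The function name and the toy ratings are illustrative, not the study's data.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa(rater_a: Sequence[int], rater_b: Sequence[int]) -> float:
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty set of pages")
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    labels = counts_a.keys() | counts_b.keys()
    # Chance agreement from each rater's own label frequencies.
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used a single identical label throughout
    return (observed - expected) / (1 - expected)


# Toy judgments for four pages (made up for illustration).
kappa = cohen_kappa([1, 1, 0, 0], [1, 1, 0, 1])
```

With the toy data the raters agree on 3 of 4 pages (observed 0.75) against a chance agreement of 0.5, so kappa is 0.5.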

CONCLUSION

Topics play an important role in the evaluation results. The more popular and resource-rich the topic is, the better the evaluation results are. The time-sensitive topic has the highest invalid-page rate.

At this point we cannot draw firm conclusions about our algorithms; only the Structure algorithm seems better.

We don’t know what causes the differences among the evaluation results of the three algorithms. To answer this question, we need to design a new experiment that analyzes the query terms each algorithm generates.

The new experiment should include human-generated query terms as the control condition.

SE TECHNIQUES

Iterative process at each stage

Design
  Iteration 1:
    Initial design of framework
    Composite-pattern-based evaluation design
  Iteration 2:
    Added query formulator and query filter
    Simplified evaluation design

Implementation
  Iteration 1:
    Initial implementation of framework
    Implemented evaluation component based on composite pattern
  Iteration 2:
    Implemented query formulator and query filter
    Implemented simplified version of evaluation GUI

SE TECHNIQUES (CONT.)

Evaluation
  Iteration 1:
    Pilot study
    Weighted average relevance score
  Iteration 2:
    5 raters, 5 input pages
    Modified average precision

Used a wiki to coordinate
Test-driven method for implementation

WHAT CHANGED OVER THE SEMESTER

Our evaluation GUI went through several rounds of changes: relational database, then composite pattern, then Excel.

We planned to use our evaluation GUI in the final evaluation but could not, because web pages loaded too slowly in the GUI. We switched to Excel files for the evaluation.

WHAT WOULD WE CHANGE IN OUR APPROACH IN THE FUTURE

We want to improve our risk analysis: the tricky thing about risks is that they are unexpected. We didn’t expect speed to be a problem for our GUI.

Evaluation took more time than we had expected. We want to allow more time for evaluation, because we need time for a pilot study before we conduct the experiment. Then we can analyze the algorithms in detail and systematically, and improve them based on that analysis.

Time management: we should start evaluation early so that we can improve our algorithms based on the evaluation results.

ACKNOWLEDGEMENTS

Thanks to Dr. Nyberg, Dr. Tomasic, Shilpa, and Hideki for valuable comments and suggestions on our project throughout the semester

Thanks to our raters for the evaluation task
Thanks to our classmates for helpful discussions
