Post on 13-Dec-2015

Mark Levene, An Introduction to Search Engines and Web Navigation © Pearson Education Limited 2005

Slide 3.1

Chapter 3: The Problem of Web Navigation

• Users often get “lost in hyperspace” when:
  – following links on web pages, or
  – jumping to and from search engine results.

• Machine learning can provide a sound basis for improving web interaction.

Slide 3.2

Getting lost in hyperspace

Figure 3.1: The navigation problem

Slide 3.3

Getting lost in hyperspace

Figure 3.2: Being lost in hyperspace

Slide 3.4

The Naïve Bayes Classifier

Automatic classification of web pages can widen the scope and size of web directories
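As a rough illustration of how such automatic classification might work, the sketch below trains a multinomial naïve Bayes classifier with add-one smoothing on a handful of page texts and assigns a new page to a directory category. The training texts, the category names ("Sports", "Computers") and the function names `train` and `classify` are hypothetical and chosen only for this example; they are not taken from the slides.

```python
import math
from collections import Counter, defaultdict

def train(pages):
    """Count pages per category and words per category.

    pages: list of (text, category) pairs.
    """
    class_docs = Counter()               # number of training pages per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocab = set()
    for text, category in pages:
        class_docs[category] += 1
        for word in text.lower().split():
            word_counts[category][word] += 1
            vocab.add(word)
    return class_docs, word_counts, vocab

def classify(text, model):
    """Return the category with the highest log posterior score."""
    class_docs, word_counts, vocab = model
    total_docs = sum(class_docs.values())
    best_category, best_score = None, -math.inf
    for category in class_docs:
        # Log prior: fraction of training pages in this category.
        score = math.log(class_docs[category] / total_docs)
        total_words = sum(word_counts[category].values())
        for word in text.lower().split():
            if word in vocab:  # words never seen in training are ignored
                # Add-one (Laplace) smoothing avoids zero probabilities.
                count = word_counts[category][word] + 1
                score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_category, best_score = category, score
    return best_category

# Hypothetical training pages for two directory categories.
training_pages = [
    ("football match score league", "Sports"),
    ("tennis player wins match", "Sports"),
    ("python code compiler software", "Computers"),
    ("software bug code release", "Computers"),
]
model = train(training_pages)
print(classify("league match tonight", model))
print(classify("new software release", model))
```

A real web directory would need far more training pages per category and better tokenisation, but the scoring step has the same shape: a log prior plus a sum of smoothed log word likelihoods.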

Slide 3.5

Trails should be First-Class Objects

Figure 3.3: Example web site

Slide 3.6

Trails should be First-Class Objects

Figure 3.4: Four trails within a web site

Slide 3.7

Trails should be First-Class Objects

Figure 3.5: Query results for “mark research”

Slide 3.8

Trails should be First-Class Objects

Figure 3.6: Relevant trail for “mark research”

Slide 3.9

Markov chains

• Markov chains have been extensively studied by statisticians and have been applied in a wide variety of areas.

Slide 3.10

The probabilities of following links

Figure 3.7: Markov chain for example web site

Slide 3.11

The probabilities of following links

Figure 3.8: Two trails in the Markov chain

Slide 3.12

The probabilities of following links

Figure 3.9: Probabilities of the four trails
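The figure itself is not reproduced in this transcript. As a minimal sketch of the underlying calculation, assuming a first-order Markov chain, the probability of a trail is the product of the transition probabilities of the links it follows. The page names and probabilities below are hypothetical and are not the values shown in the figures.

```python
from math import prod  # available from Python 3.8

# Hypothetical transition probabilities: transitions[a][b] is the
# probability of following the link from page a to page b.
transitions = {
    "Home": {"Research": 0.75, "Teaching": 0.25},
    "Research": {"Papers": 0.5, "Projects": 0.5},
    "Teaching": {"Courses": 1.0},
}

def trail_probability(trail, transitions):
    """Multiply the probabilities of each consecutive link in the trail.

    A link that does not exist in the chain contributes probability 0.
    """
    return prod(
        transitions.get(source, {}).get(target, 0.0)
        for source, target in zip(trail, trail[1:])
    )

print(trail_probability(["Home", "Research", "Papers"], transitions))
print(trail_probability(["Home", "Teaching", "Courses"], transitions))
```

This sketch conditions on the trail's starting page; multiplying by an initial probability for that page would give the unconditional probability of the trail.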

Slide 3.13

The relevance of links

Figure 3.10: Scoring web pages

Slide 3.14

The relevance of links

Figure 3.11: Constructing a chain from scores
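The construction in the figure is not visible in this transcript. One plausible reading, sketched here as an assumption rather than as the book's exact method, is to make the probability of following a link proportional to the relevance score of the page it leads to, normalised over all outgoing links of the current page. The link structure, the scores and the function name `chain_from_scores` are hypothetical.

```python
def chain_from_scores(links, scores):
    """Turn page relevance scores into link-following probabilities.

    links:  dict mapping each page to the list of pages it links to.
    scores: dict mapping each page to a non-negative relevance score.
    Each link's probability is the target's score divided by the total
    score of all targets linked from the same page.
    """
    chain = {}
    for page, targets in links.items():
        total = sum(scores.get(target, 0.0) for target in targets)
        if total > 0:
            chain[page] = {t: scores.get(t, 0.0) / total for t in targets}
        elif targets:
            # Assumption: with no relevant target, fall back to a uniform choice.
            chain[page] = {t: 1.0 / len(targets) for t in targets}
        else:
            chain[page] = {}  # page with no outgoing links
    return chain

# Hypothetical site structure and query-dependent scores.
links = {
    "Home": ["Research", "Teaching"],
    "Research": ["Papers", "Projects"],
    "Teaching": [],
}
scores = {"Research": 3.0, "Teaching": 1.0, "Papers": 0.0, "Projects": 0.0}

chain = chain_from_scores(links, scores)
print(chain["Home"])
print(chain["Research"])
```

Because the probabilities leaving each page sum to 1, the result can be used directly as the transition table of a Markov chain.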

Slide 3.15

Conflict Between Web Site Owner and Visitor

• The web site owner has objectives related to the business model of the site, e.g. selling products in an e-commerce site.

• The objectives of visitors are related to their information needs, e.g. gathering information in an e-commerce site.

• Web site owners would like to identify their visitors (e.g. via cookies), while visitors may prefer to remain anonymous.

Slide 3.16

Conflict Between Semantics of Web Site and Business Model

• E.g. the objective of an e-commerce site is to convert visitors into customers.

• But to keep visitors satisfied a web site must provide solutions to users’ information needs.

• There must be a balance between web site navigability and the business objectives of the site.