Advanced Topics in Search Theory 1 - Introduction.

Date posted: 21-Dec-2015


In Today’s Class

– Course procedures
– What is economic search?
– Characteristics of economic search
– Classical models in Search Theory: one-sided, two-sided, mediated search
– Reservation-value based search

Goal

Get familiar with the concept of “economic search”

Learn and master the main principles of economic search:
– One-sided
– Two-sided

Course Procedures

Course web-site can be found here: http://www.cs.biu.ac.il/~sarned/Courses/search/

Teacher: David Sarne ([email protected])

Office hours: Thu 15:00-16:00 (building 216, room 2)

Course exercises – 20%
Course final exam – 80%

Course Plan

Week  Topic                                                       Readings
1     Introduction to Search Theory
2     Pandora’s Problem
3     One-Sided Search – principles and optimal strategy
4     One-sided search with unknown distribution
5     Concurrent search
6     Cooperative Search
7     The Secretary Problem
8     Market throughput in one-sided search
9     Two-Sided Search with no search costs
10    Two-Sided Search with search costs, multi-type
11    Two-Sided Search with search costs, with one and two types
12    Throughput in two-sided search
13    Two-sided search with mediators

Disclaimer…

Search in AI: deals with finding nodes having certain properties in a graph (find an optimal path from the initial node to a goal node if one exists)
– Branch and bound
– A*
– Hill climbing
– …

This is not what we are interested in (at least in this course)

We deal with economic search

Have you searched for something lately?

Can you give examples of what you’ve searched for?

Searching What?

Everything!
– Searching for a partner
– Searching for a job
– Searching for a product
– Searching for a parking space
– Searching for a java class (reuse)
– Searching for a thesis advisor
– …

The goal here is to optimize the process rather than ending up with the optimal search object

How about the “secretary problem”?

(also known as the marriage problem, the sultan's dowry problem, the fussy suitor problem)

– There is a single secretarial position to fill.
– There are n applicants for the position, and the value of n is known.
– The applicants can be ranked from best to worst with no ties.
– The applicants are interviewed sequentially in a random order, with each order being equally likely.
– After each interview, the applicant is accepted or rejected.
– The decision to accept or reject an applicant can be based only on the relative ranks of the applicants interviewed so far.
– Rejected applicants cannot be recalled.
– The object is to select the best applicant. The payoff is 1 for the best applicant and zero otherwise.
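The rules above can be tried out in a small simulation. The sketch below is not from the slides: it assumes the well-known cutoff rule (observe roughly n/e applicants without accepting, then accept the first applicant who beats all of them), and the function name, n = 50, and the trial count are illustrative choices.

```python
import math
import random

def simulate_cutoff_rule(n, cutoff, trials, rng):
    """Estimate how often the cutoff rule ends up with the best applicant."""
    wins = 0
    for _ in range(trials):
        ranks = list(range(n))          # n - 1 denotes the best applicant
        rng.shuffle(ranks)              # random interview order
        best_seen = max(ranks[:cutoff], default=-1)
        chosen = ranks[-1]              # if nobody qualifies, we are left with the last one
        for r in ranks[cutoff:]:
            if r > best_seen:           # uses only relative ranks seen so far
                chosen = r
                break
        wins += (chosen == n - 1)       # payoff 1 for the best applicant, 0 otherwise
    return wins / trials

rng = random.Random(0)
n = 50
cutoff = round(n / math.e)
success_rate = simulate_cutoff_rule(n, cutoff, 20000, rng)
print(cutoff, success_rate)
```

With these settings the estimated success rate should land in the neighborhood of 1/e, which is the classical limit for this rule.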

Example - Marriage Market: legacy domain (search “pioneers”)

[Figure: density f(x) of lifetime utility]

Statistics Reminder

Given a continuous random variable X, we denote:
– The probability density function, pdf, as f(x) (its discrete analogue is the probability mass function).
– The cumulative distribution function, cdf, as F(x).

The pdf and cdf give a complete description of the probability distribution of a random variable

PDF

The pdf of X is a function f(x) such that for two numbers a and b with a ≤ b:

$$P(a \le X \le b) = \int_{a}^{b} f(x)\,dx$$

That is, the probability that X takes on a value in the interval [a, b] is the area under the density function from a to b.

CDF

The cdf is a function F(x), defined for a number x by:

$$F(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$$

That is, for a given value x, F(x) is the probability that the observed value of X will be at most x.

Uniform distribution: example

[Figure: uniform density f(x) = 0.01 on [200, 300]]

$$F(x) = \begin{cases} 0 & x < 200 \\ 0.01\,(x - 200) & 200 \le x \le 300 \\ 1 & x > 300 \end{cases}$$

Discrete distribution

Instead of f(x) we speak of P(x); for example, when rolling a die, P(2) = 1/6

Sampling from the distribution

– Draw a random value from a uniform distribution
– Take the value for which the CDF equals the value drawn

[Figure: density f(t) with levels f1–f4 and probabilities P1–P4 over the points x1–x5]
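The two sampling steps can be sketched in a few lines. This is a minimal illustration, assuming the uniform price distribution on [200, 300] used elsewhere in these slides, whose CDF inverts in closed form; the function names are hypothetical.

```python
import random

def uniform_inverse_cdf(u, low=200.0, high=300.0):
    """Inverse of F(x) = (x - low) / (high - low) on [low, high]."""
    return low + u * (high - low)

def sample(inverse_cdf, rng):
    u = rng.random()            # step 1: uniform draw in [0, 1)
    return inverse_cdf(u)       # step 2: the x for which F(x) equals the draw

rng = random.Random(1)
draws = [sample(uniform_inverse_cdf, rng) for _ in range(10000)]
mean_draw = sum(draws) / len(draws)
print(min(draws), max(draws), mean_draw)
```

For a distribution whose CDF has no closed-form inverse, the same idea still applies, but the inversion would have to be done numerically (for example by bisection on F).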

Fitting a Distribution

Visualize the observed data (decide on how to divide the data into bins)

Come up with possible theoretical distributions

Test goodness-of-fit and p-values based on the empirical distribution function (EDF):
– Kolmogorov-Smirnov
– Chi-Square
– Anderson-Darling

These tests are measures of discrepancy between the empirical distribution function and the cumulative distribution function based on a specified distribution
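As an illustration of such a discrepancy measure, the sketch below computes the Kolmogorov-Smirnov statistic (the largest gap between the EDF and a candidate CDF) using only the standard library. The price data are synthetic stand-ins, and the two candidate distributions are assumptions chosen for the example; turning the statistic into a p-value is left out.

```python
import random

def ks_statistic(samples, cdf):
    """Largest gap between the empirical distribution function and a candidate CDF."""
    xs = sorted(samples)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs):
        f = cdf(x)
        # The EDF jumps from i/n to (i+1)/n at x; check both sides of the jump.
        d = max(d, (i + 1) / n - f, f - i / n)
    return d

def uniform_cdf(x, low=200.0, high=300.0):
    return min(1.0, max(0.0, (x - low) / (high - low)))

rng = random.Random(2)
prices = [rng.uniform(200, 300) for _ in range(50)]   # synthetic stand-in for collected prices
d_good = ks_statistic(prices, uniform_cdf)                            # candidate: U(200, 300)
d_bad = ks_statistic(prices, lambda x: uniform_cdf(x, 200.0, 400.0))  # candidate: U(200, 400)
print(d_good, d_bad)
```

The poorly matched candidate produces a much larger statistic, which is what a goodness-of-fit test would then flag.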

Comparison Shopping Agents (CSAs)

Shopbots and comparison-shopping agents:
– Automatically query multiple vendors for price information
– Growing market, growing interest

Comparison Shopping Agents (CSAs)

Offline - central DB of prices (daily updated):

[Figure: requests from the UI are answered by a query to the DB, which receives timely updates]

Real-time querying upon receiving a request:

[Figure: each request from the UI triggers multiple queries]

Real-Time Querying (CSAs)

• Ever-increasing frequency of price updates

• Dynamic pricing theories (based on competitors’ prices) [Greenwald and Kephart, 1999]

• “Hit and run” sales strategies (short term price promotions at unpredictable intervals) [Baye et al, 2004]

Assumption: Future CSAs will use real-time (costly) querying

Exercise

Select 5 different products (preferably electronics, computers etc.)

Collect prices for these products over the internet – build their empirical distribution (at least 50 prices for each)

Fit to a known distribution or describe the empirical distribution obtained

Calculate the optimal search rule

Send all the data with your file

Example - Marriage Market: legacy domain (search “pioneers”)

[Figure: density f(x) of lifetime utility]

Should I try to do better?

Can we do better?

Yes we can!
However, it has a cost
Thus a search strategy is needed

Strategy: (opportunities, time, cost) -> (terminate, resume)

Search Characteristics

A distribution of plausible opportunities

The searcher is interested in exploiting one opportunity

Unknown value of specific opportunities

Search costs

Searching What?

Application                    Cost                         Opportunity
Marriage market                Time / money / loneliness    Better partner
Job market                     Time / money / confidence    Better job
Product                        Time / money                 Better price / performance
Parking                        Time                         Closer parking space
Looking for a thesis advisor   Working with him a little    More interesting thesis

Anyone searched for an apartment in her life? What made you take the one you are living in?

Anyone sold an apartment in her life? What made you accept the “winning” bid?

The key concept – don’t attempt to find the best opportunity, instead find the best policy

The search strategy

After each draw, the searcher has a choice:
– Keep what he has
– Draw another opportunity from the distribution F(), at a cost c

Notice: the net profit is a random variable whose value depends both on the actual draws and on his decisions to accept or reject particular opportunities

The Goal

Maximize the expected value of the net profit

The optimal strategy

Let V* be the expected profit if following the optimal strategy

Clearly the searcher should never accept an opportunity with a value less than V*

If he rejects the opportunity, he is in the same situation as a searcher who is starting anew: expected profit V*

Therefore:

$$V^* = -c + \int_{y} \max[y, V^*]\, f(y)\,dy$$
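The recursive equation for V* can be solved numerically by repeated substitution. The sketch below is an illustration under assumptions that are not in the slides: utilities uniform on [0, 1], search cost c = 0.02, a midpoint-rule integral, and hypothetical function names. For this particular case the fixed point also has the closed form 1 - sqrt(2c).

```python
def expected_max(v, pdf, low, high, steps=2000):
    """Midpoint-rule approximation of the integral of max(y, v) * pdf(y) over [low, high]."""
    h = (high - low) / steps
    total = 0.0
    for i in range(steps):
        y = low + (i + 0.5) * h
        total += max(y, v) * pdf(y) * h
    return total

def solve_v_star(c, pdf, low, high, tol=1e-9, max_iter=1000):
    """Iterate V <- -c + E[max(y, V)] until successive values agree to within tol."""
    v = low
    for _ in range(max_iter):
        v_next = -c + expected_max(v, pdf, low, high)
        if abs(v_next - v) < tol:
            return v_next
        v = v_next
    return v

# Assumed example: uniform density on [0, 1], search cost 0.02.
v_star = solve_v_star(c=0.02, pdf=lambda y: 1.0, low=0.0, high=1.0)
print(v_star)
```

The iteration converges here because the update moves V by at most a fraction F(V) < 1 of the previous error; other distributions may need a different starting point or more iterations.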

Example - Marriage Market

[Figure: density f(x) of lifetime utility, with the reservation value x marked]

Should I try to do better?

In a simple infinite horizon model, the reservation value doesn’t depend on history

What is a reservation value?

It’s a threshold for decision making!

Example: “Krovim Krovim”

The reservation property of the optimal search rule is a consequence of the stationarity of the search problem (a searcher discarding an opportunity is in exactly the same position as before starting the search)

Example - Marriage Market

[Figure: density f(x) of lifetime utility, with the reservation value x splitting the axis into a “Resume Search - sample one more” region and a “Terminate Search” region]

Should I try to do better?

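The optimal Reservation Value

[Figure: density f(x) of lifetime utility, with the reservation value x splitting the axis into a “Resume Search - sample one more” region and a “Terminate Search” region]

$$V(x) = -c + \int_{y=x}^{\infty} y f(y)\,dy + F(x)\,V(x)$$

$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$

Here f(x) and F(x) are the distribution of utilities in the environment (p.d.f. / c.d.f.), c is the search cost, and V(x) is the expected utility when using reservation value x.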

The Reservation Value Concept

$$V(x) = -c + \int_{y=x}^{\infty} y f(y)\,dy + F(x)\,V(x)$$

$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$

Here f(x) and F(x) are the distribution of utilities in the environment (p.d.f. / c.d.f.), c is the search cost, and V(x) is the expected utility when using reservation value x.

What is the x that maximizes V(x)?

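The Reservation Value Concept

$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$

$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)}{(1 - F(x))^2} = 0$$

Substituting $-c + \int_{y=x}^{\infty} y f(y)\,dy = V(x)\,(1 - F(x))$:

$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\,V(x)\,(1 - F(x))}{(1 - F(x))^2} = 0$$

$$x = V(x)$$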
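The result x = V(x) can be checked numerically. The sketch below assumes, purely for illustration, utilities uniform on [0, 1] and c = 0.02 (neither is from the slides); in that case the integral in V(x) has a closed form, and the maximizer can be found by a plain grid search.

```python
import math

def v_uniform01(x, c):
    """V(x) = (-c + integral of y from x to 1) / (1 - F(x)) for utilities uniform on [0, 1]."""
    return (-c + (1.0 - x * x) / 2.0) / (1.0 - x)

c = 0.02
grid = [i / 10000 for i in range(10000)]              # candidate reservation values in [0, 1)
x_best = max(grid, key=lambda x: v_uniform01(x, c))   # x that maximizes V(x) on the grid
v_best = v_uniform01(x_best, c)
print(x_best, v_best, 1 - math.sqrt(2 * c))
```

On this example the maximizing x and the value V(x) it attains coincide (up to grid resolution), and both match the closed form 1 - sqrt(2c) for the uniform case.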

Example - Marriage Market

[Figure: density f(x) of lifetime utility, with the reservation value x splitting the axis into a “Resume Search - sample one more” region and a “Terminate Search” region]

Should I try to do better?

Accepting only partners better than the optimal reservation value yields an expected overall utility equal to the utility of the “lowest” partner I’m willing to accept

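Some more interesting interpretations

$$V(x^*) = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + F(x^*)\,V(x^*)$$

$$V(x^*) = x^*$$

$$x^* = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + F(x^*)\,x^*$$

$$x^* = -c + \int_{y=x^*}^{\infty} y f(y)\,dy + x^* \int_{y=-\infty}^{x^*} f(y)\,dy$$

$$x^* = -c + x^* + \int_{y=x^*}^{\infty} (y - x^*)\, f(y)\,dy$$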

Some more interesting interpretations (2)

$$x^* = -c + x^* + \int_{y=x^*}^{\infty} (y - x^*)\, f(y)\,dy$$

Left-hand side: stop searching and keep x*

Right-hand side: search exactly one more time

Myopic rule

Important property of the optimal search rule – myopic:
– The searcher will never decide to accept an opportunity he has rejected beforehand
– The searcher cares only about whether or not he wants the opportunity now
– Therefore, we don’t care for the recall option

Also notice that…

$$V(x) = \frac{-c + \int_{y=x}^{\infty} y f(y)\,dy}{1 - F(x)}$$

and:

$$\frac{dx^*}{dc} < 0$$

Bernoulli trial is an experiment whose outcome is random and can be either of two possible outcomes, "success" and "failure".

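Calculating the optimal RV

$$\frac{dV(x)}{dx} = \frac{-x f(x)\,(1 - F(x)) + f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)}{(1 - F(x))^2} = 0$$

$$x f(x)\,(1 - F(x)) = f(x)\left(-c + \int_{y=x}^{\infty} y f(y)\,dy\right)$$

$$x\,(1 - F(x)) = -c + \int_{y=x}^{\infty} y f(y)\,dy$$

Notice that:

$$\int_{y=x}^{\infty} y f(y)\,dy = \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$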

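Calculating the optimal RV

$$\int_{y=x}^{\infty} y f(y)\,dy = \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$

Therefore:

$$x\,(1 - F(x)) = -c + \Big[\, y F(y) \Big]_{y=x}^{\infty} - \int_{y=x}^{\infty} F(y)\,dy$$

$$c = \Big[\, y F(y) \Big]_{y=x}^{\infty} - x\,(1 - F(x)) - \int_{y=x}^{\infty} F(y)\,dy$$

Since F(y) → 1 as y → ∞, the boundary terms combine to $\int_{y=x}^{\infty} dy$, so:

$$c = \int_{y=x}^{\infty} (1 - F(y))\,dy$$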
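The final condition, c equal to the integral of 1 - F(y) from x upward, lends itself to a numerical solution, because the right-hand side decreases as x grows. The sketch below uses bisection and is only an illustration: it assumes a distribution with bounded support [low, high], a midpoint-rule integral, and an example with utilities uniform on [0, 1] and c = 0.02; the function names are hypothetical.

```python
def expected_gain(x, cdf, high, steps=2000):
    """Midpoint-rule approximation of the integral of (1 - F(y)) from x to high."""
    h = (high - x) / steps
    return sum((1.0 - cdf(x + (i + 0.5) * h)) * h for i in range(steps))

def reservation_value(c, cdf, low, high, tol=1e-9):
    """Bisection for the x at which expected_gain(x) equals the search cost c."""
    lo, hi = low, high
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if expected_gain(mid, cdf, high) > c:
            lo = mid        # one more draw is still worth more than it costs
        else:
            hi = mid
    return (lo + hi) / 2.0

# Assumed example: uniform utilities on [0, 1], search cost 0.02.
x_star = reservation_value(c=0.02, cdf=lambda y: y, low=0.0, high=1.0)
print(x_star)
```

The search assumes the gain at `low` exceeds c; if searching is too expensive for that to hold, the loop simply returns a value at the lower end of the support.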

CS economic search domains

– CSAs
– Job scheduling
– Searching for free space in disks
– Searching for media in P2P

Classical tradeoff – time it takes to process vs. time it takes to find a strong processor

The Scheduling Problem

[Figure: a scheduling process contacts a proxy, which queries processors 1, …, N at costs c1, …, cN; each processor returns a price quote q drawn from f(q)]

WorkFlow

Receive a job
Contact proxy to learn about available processors
Query processors by using the proxy
– Each query delays you by c_i seconds
– Each query will return the temporary load on the server (this value will not change as long as the current job is not scheduled)
Keep on querying until you are ready to schedule your job

The Goal is…

To schedule the job in a way that minimizes the EXPECTED overall delay
– Overall delay = all delays due to queries + the time the job waits in the queue of the selected processor

Problem 1

You are about to purchase an iPod touch over the internet

You estimate the price distribution of the product over the different sellers to be uniform between 200-300 dollars

You can search by yourself, by visiting different web-sites – the cost of time for obtaining a price quote is $1

How will you search? What will be your expected cost? What’s the mean of the number of merchants you’ll visit?

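Solution

[Figure: uniform density f(x) = 0.01 on [200, 300]]

• Sequential search:

$$c = \int_{y=0}^{x} F(y)\,dy$$

(left-hand side: cost of search; right-hand side: marginal benefit)

$$F(x) = \begin{cases} 0 & x < 200 \\ 0.01\,(x - 200) & 200 \le x \le 300 \\ 1 & x > 300 \end{cases}$$

$$V(x) = c + \int_{y=200}^{x} y f(y)\,dy + (1 - F(x))\,V(x)$$

$$F(x)\,V(x) = 1 + \int_{y=200}^{x} 0.01\,y\,dy$$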

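Find the minimum cost

$$F(x)\,V(x) = 1 + \int_{y=200}^{x} 0.01\,y\,dy$$

$$V(x) = \frac{1 + 0.005\,\big[\,y^2\big]_{y=200}^{x}}{0.01\,(x - 200)} = \frac{1 + 0.005\,x^2 - 200}{0.01\,(x - 200)}$$

$$V'(x) = \frac{0.01\,x \cdot 0.01\,(x - 200) - 0.01\,(0.005\,x^2 - 199)}{\big(0.01\,(x - 200)\big)^2} = 0$$

$$0.01\,x\,(x - 200) = 0.005\,x^2 - 199$$

$$x = 214.14,\ 185.86$$

Only x = 214.14 lies inside the price range [200, 300].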

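Verification

V(x) = x?

$$V(214.14) = \frac{1 + 0.005 \cdot 214.14^2 - 200}{0.01\,(214.14 - 200)} = 214.14$$

Mean number of merchants visited:

$$N = \frac{1}{F(x)} = \frac{1}{0.01\,(214.14 - 200)} = 7.07$$

Mean payment to merchant: 214.14 - 7.07 = 207.07 (notice it’s less than the expected minimum price from sampling 7 merchants)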
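The figures for Problem 1 can be recomputed directly. The sketch below takes the uniform price distribution on [200, 300] and the $1 cost per quote from the problem statement; the variable and function names are illustrative, and the closed form for the reservation price comes from solving c = 0.005 (x - 200)^2.

```python
import math

LOW, HIGH, C = 200.0, 300.0, 1.0
DENSITY = 1.0 / (HIGH - LOW)                      # f(x) = 0.01 on [200, 300]

def expected_total_cost(x):
    """V(x) = (c + integral of y f(y) from 200 to x) / F(x) for a reservation price x."""
    return (C + DENSITY * (x * x - LOW * LOW) / 2.0) / (DENSITY * (x - LOW))

# The other root of 0.005 x^2 - 2 x + 199 = 0 falls below 200, so keep the larger one.
x_star = LOW + math.sqrt(2.0 * C / DENSITY)
mean_visits = 1.0 / (DENSITY * (x_star - LOW))    # each visit succeeds with probability F(x*)
mean_payment = (LOW + x_star) / 2.0               # mean of a uniform price on [200, x*]

print(round(x_star, 2), round(expected_total_cost(x_star), 2),
      round(mean_visits, 2), round(mean_payment, 2))
```

The expected total cost equals the reservation price, and it splits into the mean payment plus C times the mean number of visits.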

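Alternative Solution

[Figure: uniform density f(x) = 0.01 on [200, 300]]

• Sequential search:

$$c = \int_{y=0}^{x} F(y)\,dy$$

(left-hand side: cost of search; right-hand side: marginal benefit)

$$F(x) = \begin{cases} 0 & x < 200 \\ 0.01\,(x - 200) & 200 \le x \le 300 \\ 1 & x > 300 \end{cases}$$

$$1 = \int_{y=200}^{x} (0.01\,y - 2)\,dy$$

$$1 = \Big[\,0.005\,y^2 - 2y\,\Big]_{y=200}^{x}$$

$$1 = 0.005\,x^2 - 2x + 200$$

$$x = 214.14$$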