Nearly optimal scheduling under time varying departure probabilities · 2011-02-10 · Nearly...


Nearly optimal scheduling under time varying departure probabilities

Martin Erauskin

UPV / EHU

Martin Erauskin (UPV / EHU) Nearly optimal scheduling 1 / 20


Outline of talk

Introduction

Problem description

A nearly optimal solution

Numerical results

Conclusions and future work


Multi armed bandit problem

A sequential decision problem where at each time slot the agent must choose one of K available options.

Depending on the chosen action, the agent receives a payoff at the end of the time slot.

Goal: maximize the present value of the future payoffs by choosing the right sequence of actions.


Motivation: wireless application

In each time slot, the base station selects a customer to serve.

Channel conditions vary due to fading and interference effects.

Each state represents different channel conditions.

In each channel condition, the probability of completing the job in one time slot is different.


Problem description

Time is slotted.

$K$ customers waiting for service, $k \in \mathcal{K} = \{1, 2, \dots, K\}$.

$\mathcal{N}_k = \{1, 2, \dots, N_k\}$: set of possible states for customer $k$.

$\forall n \in \mathcal{N}_k$, $\mu_{k,n}$: departure probability for customer $k$, if served, when it is in state $n$.

$\forall n \in \mathcal{N}_k$, $q_{k,n}$: probability that customer $k$ is in state $n$.

$c_k$: holding cost of customer $k$ per slot spent waiting for service.

$0 \le \mu_{k,1} \le \mu_{k,2} \le \dots \le \mu_{k,N_k} \le 1$.

The state of a customer is independent of its own state history, and the current states of different customers are independent of each other.


A particular case: cµ rule

Let $|\mathcal{N}_k| = 1$ for all $k \in \mathcal{K}$. Then:

Theorem

The policy that serves the customer $k^*$, where

$$k^* = \arg\max_{k \in \mathcal{K}} c_k \mu_k,$$

minimizes the expected total cost incurred by the system.

Remark: This policy also minimizes the one-period expected cost incurred by the system.
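As an illustration of the rule in the theorem above, here is a minimal Python sketch. The function name `c_mu_choice` and the numerical values are hypothetical and are not taken from the talk; ties are broken in favour of the first customer, which is also an assumption.

```python
from typing import Sequence


def c_mu_choice(c: Sequence[float], mu: Sequence[float]) -> int:
    """Return the 0-based position of the customer with the largest c_k * mu_k.

    c[k]  : holding cost per slot of customer k
    mu[k] : departure probability of customer k when served (single state)
    """
    if len(c) != len(mu) or len(c) == 0:
        raise ValueError("c and mu must be non-empty and of equal length")
    # max() returns the first maximizer, so ties go to the lowest position.
    return max(range(len(c)), key=lambda k: c[k] * mu[k])


# Hypothetical example: three customers with one state each.
holding_costs = [1.0, 2.0, 1.5]
departure_probs = [0.9, 0.3, 0.5]
chosen = c_mu_choice(holding_costs, departure_probs)
print(chosen)
```

With these made-up numbers the products are 0.9, 0.6 and 0.75, so the first customer is served.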


MDP formulation

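$\mathcal{A} = \{0, 1\}$: action space. Action 0 means 'not serving', action 1 means 'serving'.

The expected one-period reward earned by customer $k$ in state $n$, depending on whether it is served or not, is given by

$$R^1_{k,n} = -c_k (1 - \mu_{k,n}), \qquad R^0_{k,n} = -c_k.$$

$X_k(\cdot)$: state process of customer $k$.

$a_k(\cdot)$: action process of customer $k$.

We define the following $\beta$-average operator for $0 < \beta < 1$:

$$\mathbb{B}^{\pi}_0\left[Q^{a(\cdot)}_{X(\cdot)}, \beta\right] := \lim_{T \to \infty} \frac{\sum_{t=0}^{T-1} \beta^t \, \mathbb{E}^{\pi}_0\left[Q^{a(t)}_{X(t)}\right]}{\sum_{t=0}^{T-1} \beta^t}$$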
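To make the β-average operator concrete, the following Python sketch evaluates the ratio inside the limit for a finite horizon T. It assumes the per-slot expectations are already available as a list; the function name `beta_average` is hypothetical. The sketch also accepts β = 1, which reduces to a plain time average.

```python
from typing import Sequence


def beta_average(expected_values: Sequence[float], beta: float) -> float:
    """Finite-horizon beta-average: sum_t beta^t v_t / sum_t beta^t, t = 0..T-1.

    expected_values[t] stands for the expectation of the quantity at slot t.
    """
    if not 0.0 < beta <= 1.0:
        raise ValueError("beta must lie in (0, 1]")
    if len(expected_values) == 0:
        raise ValueError("need at least one time slot")
    weight = 1.0       # beta^t, starting at t = 0
    numerator = 0.0
    denominator = 0.0
    for value in expected_values:
        numerator += weight * value
        denominator += weight
        weight *= beta
    return numerator / denominator


# A constant sequence has beta-average equal to that constant.
print(beta_average([2.0] * 50, 0.9))
# Earlier slots weigh more when beta < 1.
print(beta_average([1.0, 0.0], 0.5))
```

Because the weights are normalized, a constant reward has β-average equal to that constant for any β. This normalization is what lets the relaxed constraint later be written as "equal to 1".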


MDP formulation

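Let $\Pi$ be the set of admissible policies.

The optimization problem can be described as follows:

$$\max_{\pi \in \Pi} \mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} R^{a_k(\cdot)}_{k,X_k(\cdot)}\right] \qquad \text{(P)}$$

$$\text{subject to } \sum_{k \in \mathcal{K}} a_k(t) = 1, \text{ for all } t \in \mathcal{T}$$

The original problem can be solved neither analytically nor numerically.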


Relaxations (P. Whittle (1988))

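We relax the constraint: serve 1 customer on average.

$$\sum_{k \in \mathcal{K}} a_k(t) = 1 \implies \mathbb{E}^{\pi}\left[\sum_{k \in \mathcal{K}} a_k(t)\right] = 1, \quad \forall t \in \mathcal{T}$$

We relax this constraint again, to the $\beta$-average constraint:

$$\mathbb{E}^{\pi}\left[\sum_{k \in \mathcal{K}} a_k(t)\right] = 1, \ \forall t \in \mathcal{T} \implies \mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} a_k(\cdot)\right] = 1$$

We obtain the following relaxed problem:

$$\max_{\pi \in \Pi} \mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} R^{a_k(\cdot)}_{k,X_k(\cdot)}\right] \qquad \text{(RP)}$$

$$\text{subject to } \mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} a_k(\cdot)\right] = 1$$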


Solution: Potential Improvement rule

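The relaxed problem can be approached using Lagrangian methods. Attaching a multiplier $\nu$ to the constraint $\mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} a_k(\cdot)\right] = 1$ gives

$$\max_{\pi \in \Pi} \mathbb{B}^{\pi}_0\left[\sum_{k \in \mathcal{K}} R^{a_k(\cdot)}_{k,X_k(\cdot)} - \nu \sum_{k \in \mathcal{K}} a_k(\cdot)\right] + \nu$$

The constant term $\nu$ does not depend on the policy, so we decompose this problem into $K$ subproblems, one per customer:

$$\max_{\tilde{\pi}_k \in \Pi_k} \mathbb{B}^{\tilde{\pi}_k}_0\left[R^{a_k(\cdot)}_{k,X_k(\cdot)} - \nu a_k(\cdot)\right] \qquad \text{(SRP)}$$

We solve the $K$ subproblems and obtain the joint optimal policy for the relaxed problem by combining their solutions.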


Solution: Potential Improvement rule

Theorem

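Let

$$\nu_{k,n} = \frac{c_k \mu_{k,n}}{(1 - \beta) + \beta \sum_{m > n} q_{k,m} (\mu_{k,m} - \mu_{k,n})} \ \text{ for } n \neq N_k, \qquad \nu_{k,N_k} = \infty$$

Then:

If $\nu \le \nu_{k,n}$, it is optimal to serve customer $k$ in state $n \in \mathcal{N}_k$;

If $\nu \ge \nu_{k,n}$, it is optimal not to serve customer $k$ in state $n \in \mathcal{N}_k$.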

Sketch of the proof.

By solving the dynamic programming equation.
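The index in the theorem is straightforward to evaluate. The sketch below computes it for every state of one customer, assuming states are stored in order of increasing departure probability. The function name `pi_index` and the sample parameters are hypothetical. Returning infinity when the denominator is zero is an assumption added here to avoid a division by zero; it is not part of the talk.

```python
import math
from typing import List, Sequence


def pi_index(c: float, mu: Sequence[float], q: Sequence[float],
             beta: float) -> List[float]:
    """Index nu_{k,n} for each state n of one customer.

    c    : holding cost per slot
    mu   : departure probabilities, sorted in increasing order
    q    : probability of being in each state
    beta : discount parameter in (0, 1]
    """
    if len(mu) != len(q) or len(mu) == 0:
        raise ValueError("mu and q must be non-empty and of equal length")
    n_states = len(mu)
    indices = []
    for n in range(n_states):
        if n == n_states - 1:
            # Best state: the theorem sets the index to infinity.
            indices.append(math.inf)
            continue
        # Expected gain in departure probability from waiting for a better state.
        gain = sum(q[m] * (mu[m] - mu[n]) for m in range(n + 1, n_states))
        denominator = (1.0 - beta) + beta * gain
        # Assumption: treat a zero denominator as an infinite index.
        indices.append(c * mu[n] / denominator if denominator > 0 else math.inf)
    return indices


# Hypothetical customer with three channel states.
example = pi_index(1.0, [0.2, 0.5, 0.8], [0.3, 0.4, 0.3], beta=0.9)
print(example)
```

The index grows with the state: a customer whose channel can barely improve has little to gain from waiting and therefore ranks high.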


Solution: Potential Improvement rule

We construct a feasible policy for the original problem, using the optimal solution of the relaxed problem:

Potential Improvement rule: gives service at time $t$ to job $k^*(t)$ such that:

$$k^*(t) := \arg\max_{k \in \mathcal{K}} \nu_{k,X_k(t)}$$

Not necessarily optimal for the original problem.

For $\beta = 1$, we have the time-average index:

$$\nu_{k,n} = \frac{c_k \mu_{k,n}}{\sum_{m > n} q_{k,m} (\mu_{k,m} - \mu_{k,n})} \ \text{ for } n \neq N_k, \qquad \nu_{k,N_k} = \infty$$
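A minimal sketch of the Potential Improvement rule with the time-average index might look as follows. The names `time_average_pi_index` and `pi_rule`, the tuple layout `(c, mu, q)`, the tie-breaking (first maximizer wins) and all numbers are assumptions made for illustration only.

```python
import math
from typing import Sequence, Tuple

Customer = Tuple[float, Sequence[float], Sequence[float]]  # (c, mu, q)


def time_average_pi_index(c: float, mu: Sequence[float],
                          q: Sequence[float], n: int) -> float:
    """Time-average (beta = 1) index of 0-based state n; mu sorted increasingly."""
    if n == len(mu) - 1:
        return math.inf  # best state
    gain = sum(q[m] * (mu[m] - mu[n]) for m in range(n + 1, len(mu)))
    # Assumption: no possible improvement is treated as an infinite index.
    return c * mu[n] / gain if gain > 0 else math.inf


def pi_rule(customers: Sequence[Customer], states: Sequence[int]) -> int:
    """Return the position of the customer served in the current slot."""
    return max(
        range(len(customers)),
        key=lambda k: time_average_pi_index(*customers[k], states[k]),
    )


# Hypothetical pair of customers, both currently in their worst state.
customers = [
    (1.0, [0.5, 0.9], [0.5, 0.5]),    # can improve a lot by waiting
    (1.0, [0.4, 0.45], [0.5, 0.5]),   # can barely improve
]
print(pi_rule(customers, [0, 0]))
print(pi_rule(customers, [1, 0]))
```

In the first call the second customer is served although its current departure probability is lower (0.4 versus 0.5), because the first customer has much more to gain from waiting for its better state. In the second call the first customer is in its best state and is served.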


Scheduling disciplines

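cµ index:

$$\nu^{c\mu}_{k,n} := c_k \mu_{k,n}, \ \text{ for } n \in \mathcal{N}_k;$$

Score Based index (T. Bonald, 2004):

$$\nu^{SB}_{k,n} := c_k \sum_{m=1}^{n} q_{k,m}, \ \text{ for } n \in \mathcal{N}_k.$$

Relatively Best index (Qualcomm 3G standard, 2000):

$$\nu^{RB}_{k,n} := \frac{c_k \mu_{k,n}}{\sum_{m=1}^{N_k} q_{k,m} \mu_{k,m}}, \ \text{ for } n \in \mathcal{N}_k.$$

Potential Improvement index:

$$\nu^{PI}_{k,n} = \frac{c_k \mu_{k,n}}{\sum_{m > n} q_{k,m} (\mu_{k,m} - \mu_{k,n})} \ \text{ for } n \neq N_k, \qquad \nu^{PI}_{k,N_k} = \infty$$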
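The four indices can be compared side by side for a single customer. The sketch below is only an illustration: the function name `all_indices`, the dictionary keys and the parameter values are hypothetical, and states are assumed to be sorted by increasing departure probability.

```python
import math
from typing import Dict, List, Sequence


def all_indices(c: float, mu: Sequence[float],
                q: Sequence[float]) -> Dict[str, List[float]]:
    """Per-state values of the c-mu, Score Based, Relatively Best and PI indices."""
    n_states = len(mu)
    mean_mu = sum(qm * mm for qm, mm in zip(q, mu))  # average departure probability
    pi = []
    for n in range(n_states):
        gain = sum(q[m] * (mu[m] - mu[n]) for m in range(n + 1, n_states))
        # The sum is empty in the best state, which yields the infinite index.
        pi.append(c * mu[n] / gain if gain > 0 else math.inf)
    return {
        "c_mu": [c * m for m in mu],
        "SB": [c * sum(q[: n + 1]) for n in range(n_states)],
        "RB": [c * m / mean_mu for m in mu],
        "PI": pi,
    }


# Hypothetical customer with three channel states.
table = all_indices(1.0, [0.2, 0.5, 0.8], [0.3, 0.4, 0.3])
for name, values in table.items():
    print(name, [round(v, 3) for v in values])
```

Only the PI index assigns an infinite value to the best state, so a customer in its best state always takes priority over customers that are not.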


Problem with arrivals of new customers

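We consider different classes of customers, $k \in \mathcal{K}$.

$\lambda_k$: probability, in each time slot, that a new customer of class $k$ arrives.

Definition

A system is called stable if the number of customers does not explode.

Consider

$$\varrho_k = \frac{\lambda_k}{\mu_{k,N_k}}, \qquad \varrho = \sum_{k \in \mathcal{K}} \varrho_k$$

Theorem (S. Aalto, P. Lassila (2010))

If any customer in its best state is preferred over any other customer that is not in its best state, then the policy is stable for every $\varrho < 1$.

Remark: The PI rule is stable for every $\varrho < 1$, since its index is infinite in the best state and therefore always gives priority to customers in their best state.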
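The load used in the stability condition is easy to compute: each class contributes its arrival probability divided by its best-state departure probability. The following sketch uses the hypothetical name `system_load` and made-up parameters.

```python
from typing import Sequence


def system_load(arrival_probs: Sequence[float],
                best_departure_probs: Sequence[float]) -> float:
    """Total load: sum over classes of lambda_k / mu_{k,N_k}."""
    if len(arrival_probs) != len(best_departure_probs):
        raise ValueError("one arrival and one best-state probability per class")
    return sum(lam / mu_best
               for lam, mu_best in zip(arrival_probs, best_departure_probs))


# Hypothetical two-class system.
load = system_load([0.2, 0.3], [0.8, 0.6])
print(load, load < 1.0)
```

With these numbers the load is 0.25 + 0.5 = 0.75, which is below 1, so a policy satisfying the condition of the theorem would be stable.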


Numerical simulations: scenario 1

Two classes, $k \in \mathcal{K} = \{1, 2\}$.

$\lambda_2$ fixed.

Departure probabilities fixed for both classes of customers.

$c_1 = c_2 = 1$.

We vary $\lambda_1$ so that $\varrho$ ranges from 0.5 to 1.

Figure: Mean number of customers in the system versus $\varrho$, Scenario 1.


Numerical simulations: scenario 1

Figure: Sample path of the number of customers in the system in Scenario 1, $\varrho = 0.95$.


Numerical simulations: scenario 1

Mean number of class-2 customers versus mean number of class-1 jobs.

Indifference curves link points with the same value of $\varrho$.


Numerical simulations: scenario 2

λ1 and λ2 fixed.

Departure probabilities fixed for class-2 customers.

We vary the departure probabilities for class-1 customers proportionally, moving $\varrho$ between 0.50 and 1.

Figure: Mean number of customers in the system versus $\varrho$, Scenario 2.


Stochastic dominance

Definition

Two random variables $X$ and $Y$ are stochastically ordered (denoted as $X \le_{st} Y$) if and only if $P(X \le z) \ge P(Y \le z)$, $\forall z$.

Simulations suggest stochastic dominance of the Potential Improvement rule over the other rules.

X: Number of jobs in the system.
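One way to check the definition on simulation output is to compare empirical distribution functions. The sketch below is an assumption about how such a check could be done, not the procedure used in the talk; the names `ecdf` and `empirically_st_smaller` and the sample data are hypothetical.

```python
from bisect import bisect_right
from typing import Sequence


def ecdf(sorted_sample: Sequence[float], z: float) -> float:
    """Empirical P(X <= z) for a sample that is already sorted."""
    return bisect_right(sorted_sample, z) / len(sorted_sample)


def empirically_st_smaller(xs: Sequence[float], ys: Sequence[float]) -> bool:
    """True if the empirical CDF of xs is >= that of ys at every point.

    Both CDFs are step functions that only change at sample values,
    so checking the pooled sample values is enough.
    """
    xs_sorted, ys_sorted = sorted(xs), sorted(ys)
    grid = sorted(set(xs_sorted) | set(ys_sorted))
    return all(ecdf(xs_sorted, z) >= ecdf(ys_sorted, z) for z in grid)


# Hypothetical samples of the number of jobs under two policies.
jobs_policy_a = [0, 1, 1, 2]
jobs_policy_b = [1, 2, 2, 3]
print(empirically_st_smaller(jobs_policy_a, jobs_policy_b))
print(empirically_st_smaller(jobs_policy_b, jobs_policy_a))
```

A finite-sample check of this kind can only suggest an ordering; it does not prove it.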


Conclusions and future work

Main conclusions:

Depending on the value of the parameters, RB or cµ might outperform the others.

PI consistently outperforms all the other policies (or is equivalent to the best one).

Simulations strongly suggest stochastic dominance of PI over the other rules.

The stability region is the maximum for the PI rule, while it is not for the cµ and RB rules.

Future work:

Include correlations between the states of different jobs in the model.

Overload analysis: the slope of the sample paths when the system is unstable.

Stability region for different policies.
