Optimal threshold policies in a two-class preemptive …Using the dual approach, RIGHTER [7] extends...

Optimal threshold policies in a two-class preemptive priorityqueue with admission and termination controlCitation for published version (APA):Brouns, G. A. J. F., & Wal, van der, J. (2002). Optimal threshold policies in a two-class preemptive priority queuewith admission and termination control. (SPOR-Report : reports in statistics, probability and operations research;Vol. 200216). Technische Universiteit Eindhoven.

Document status and date:Published: 01/01/2002

Document Version:Publisher’s PDF, also known as Version of Record (includes final page, issue and volume numbers)

Please check the document version of this publication:

• A submitted manuscript is the version of the article upon submission and before peer-review. There can beimportant differences between the submitted version and the official published version of record. Peopleinterested in the research are advised to contact the author for the final version of the publication, or visit theDOI to the publisher's website.• The final author version and the galley proof are versions of the publication after peer review.• The final published version features the final layout of the paper including the volume, issue and pagenumbers.Link to publication

General rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain • You may freely distribute the URL identifying the publication in the public portal.

If the publication is distributed under the terms of Article 25fa of the Dutch Copyright Act, indicated by the “Taverne” license above, pleasefollow below link for the End User Agreement:www.tue.nl/taverne

Take down policyIf you believe that this document breaches copyright please contact us at:[email protected] details and we will investigate your claim.

Download date: 09. Nov. 2020

https://research.tue.nl/en/publications/optimal-threshold-policies-in-a-twoclass-preemptive-priority-queue-with-admission-and-termination-control(f8b184ad-90dc-42c7-a476-4f1e62f25435).html

TU/e technische universiteit eindhoven

SPOR-Report 2002-16

Optimal threshold policies in a two-class preemptivepriority queue with admission and terminationcontrol

G.A.J.F. BrounsJ. van der Wal

SPOR-ReportReports in Statistics, Probability and Operations Research

Eindhoven, November 2002The Netherlands

/ department of mathematics and computing science

SPaR-ReportReports in Statistics, Probability and Operations Research

Eindhoven University of TechnologyDepartment of Mathematics and Computing ScienceProbability theory, Statistics and Operations researchP.O. Box 5135600 MB Eindhoven - The Netherlands

Secretariat: Main Building 9.10Telephone: + 3140 247 3130E-mail: [email protected]: http://www.win.tue.nVmathlbslcosor.html

ISSN 1567-5211

Optimal threshold policies in a two-class preemptivepriority queue with admission and termination control

Giclo A.J.F. Braunsace@win. tue.nl

Jan van cler Waljan. v.d. [email protected]

Eindhoven University of TechnologyDepartment of Mathematics and Computing Science

PO Box 513 / 5600 MB Eindhoven / The Netherlands

Nov 4,2002

Abstract

We consider a two-class M A" A2 IMI'11 preemptive priority queue in which there are

two essential on-line decisions that have to be taken. The first is the decision to eitheraccept or reject new type-lor type-2 jobs. The second is the decision to abort jobs, i.e.,to remove any type-lor type-2 jobs from the system. We show that there exist optimalthreshold policies for these two types of decisions.

Key words: priority queues, admission and termination control, optimal thresholdpolicies, Markov decision processes.

1 Introduction

There is an extensive literature on the optimal dynamic control of queueing systems.Comprehensive overviews are given by, e.g., STIDHAM AND WEBER [8] and TEGHEM [9],who provide an extensive list of references to literature devoted to the analysis of specificworkload models. Topics include optimal control of service rates, optimal admission control,optimal routing control, optimal server allocation and optimal scheduling in networks ofqueues. Emphasis is laid on the characterization of the structure of optimal control policies.

In the light of the model we study in this paper, we mention, as an example, the two-classpreemptive priority queue (see, e.g., the recent work of GROENEVELT, KOOLE AND NAIN [4]).This model concerns a single server serving two customer classes with holding and switchingcosts. The corresponding control problem involves the objective to switch between classes insuch a way that the sum of expected holding and switching costs is minimized.

An important characteristic of almost all optimal control problems studied in literature, eitherwith or without admission control, is that admission is final, Le., once new work has beenaccepted for service, it must be processed by the system, and must be processed to a finish,before it can be considered to be out of the system. Models subject to clearing control count

1

as an exception. These are models in which at any time it may be decided to instantaneouslyremove all work from the system.

A new field of application of workload control models is workflow management. In workflowcontrol problems, e.g., tax control, handling insurance claims and crime investigation, thecapacity is insufficient to deal with all jobs and to treat all jobs to the full extent. It must bedecided which jobs to serve and when to stop. The types of control studied in literature do notcover this type of decision. Clearing, i.e., either removing the complete workload or keepingall work in the system, is far too rigorous. Workflow problems call for a more subtle controlwith respect to the admission and disposal of jobs. In this paper, we consider a basic modelin which these two decisions, accepting or rejecting new jobs and removing or maintaining ajob, are present.

An initial effort to model the disposal of jobs is given by XU AND SHANTHIKUMAR [11], whointroduce a new approach for determining the optimal admission control policy in a FCFSMIMlm ordered-entry queueing system with nonidentical servers. The idea of this approachis to construct a dual system: a preemptive LCFS MIMlm ordered-entry system withoutadmission control, but with expulsion control. A system is subject to expulsion control ifcustomers-which may not be denied entry to the system-may be expelled from the system,with the restriction that one can only expel customers-one after another-from the end ofthe queue. It is shown that the two systems induce the same probabilistic behaviour for thedeparture process and the number of customers in the system under any given policy. Hence,the optimal policy in the original system agrees with its counterpart in the dual system.

Xu [10] employs the dual approach to determine the optimal admission and scheduling controlpolicy in a FCFS MIMI2 queueing system with nonidentical servers. The corresponding dualsystem is subject to expulsion and scheduling control.

Using the dual approach, RIGHTER [7] extends the results of [10] to an MIMI2 queueingsystem with nonidentical servers and multiple classes of customers, where preemption isallowed. Further extensions are given to models with finite buffers and models with deadlinesfor customer service completion.

In the aforementioned papers, expulsion control models were used as a tool rather than agoal. Within the framework of workflow control, expulsion control is too restrictive, since onemay only expel a job from the end of the queue and not, for example, the job currently inservice. Apart from either serving a job completely or not at all, there is no control of theservice times of the jobs in the system.

BRaUNS AND VAN DER WAL [1] introduce the concept of termination control, studying aFCFS single server one-class workload model in which the service of a job may be abortedbefore the job has received full service, and in which work may be removed from the queueas well, at any point in time. Under certain regularity conditions, they show that there existoptimal threshold policies for the decision to accept or reject a new job and the decision tocontinue or abort the service of a job.

JOHANSEN AND LARSEN [5] also consider a FCFS single server one-class workload model inwhich a key feature of the control policy is its ability to let the service time of a job dependon the actual number of jobs in the system, and to remove jobs from the queue. Their controlpolicy is less dynamic in the sense that a job entering service is assigned a service time in

2

advance, which may not be altered during service. So, service may not be aborted before thepre-assigned service time has elapsed and service may not be extended either.

Our model: The model we study is a two-class M Aj,

A2 IMtL\1 queue with unrestricted(preemptive resume) service order and two decision features. The first type of decisionconcerns admission control: for each arrival we have to decide to accept or reject it. Thesecond type concerns termination control: at any time, we may decide to remove jobs fromthe system. Formally, there is also a third decision feature: the service order. However, itwill be shown presently that this third type is not a real issue-the system is essentially apriority queue in which type-2 jobs have priority over type-l jobs.

We assume that the decision maker knows the number of type-l jobs present and the numberof type-2 jobs present. We will show that both the optimal admission control policy and theoptimal termination control policy have a threshold structure.

The remainder of this paper is organized as follows. In section 2 we describe the model indetail. We also reduce the model by recognizing that type-2 jobs are preferable to type-ljobs and should be given priority over type-l jobs. Section 3 gives an overview of the mainresults for the reduced model, and the line of proof. Section 4 contains the proofs for thefinite horizon case. Section 5 discusses the extension to the infinite horizon. Finally, section 6discusses three model extensions and section 7 contains our conclusions.

2 Model description

The basic model we study is a two-class queueing system with infinite buffer capacity. Type-ijobs, i = 1,2, arrive at this station according to a Poisson process with arrival rate Ai ~ O.The workload of a job is exponential with mean service time 1/j.t, independent of which ofthe two classes the job belongs to. The service discipline is unrestricted. Queued jobs may berearranged at any time and at any time the service of a job may be interrupted-and resumedlater, if so desired-in order to commence the service of another job. The system is controlledin three ways: one has to decide to accept or reject new arrivals, one has to decide to removejobs from the system or to maintain them, and one has to decide what job to serve. Recallthat the decision maker knows the number of type-l jobs and the number of type-2 jobs inthe system. The structure of the system is that of a (semi-)Markovian decision process. Itcan be described as follows.

States: The state of the system is described by the tuple (x, y), where x is the number oftype-l jobs in the system and y is the number of type-2 jobs in the system. A two-dimensionalstate space suffices, because service may be interrupted at any time and because service timesare exponential. The question what job is served is part of the decision, not of the state. Wealso use the intermediate states (x, y, arr/l) and (x, y, arr/2) immediately after the arrival ofa type-l or type-2 job, respectively.

Events: We distinguish two possible events: (i) the arrival of a new job and (ii) a servicecompletion.

Decisions: If the event is an arrival, then first it has to be decided whether to accept (decisionaccept) or reject (decision rej act) the newly arrived job. If it concerns a type-l job, then

3

this changes the state (x, y, arr/l) into (x + 1, y) or (x, y), respectively. If, alternatively,it concerns a type-2 job, then this changes the state (x, y, arr/2) into (x, y + 1) or (x, y),respectively. Next, it is decided either to maintain all jobs in the system (decision continue)or to remove one or more jobs from the system (decision abort), and it is decided what jobto serve. If this is not the job already in service, then the job in service is either removedfrom the system or put back in the queue. The service of a job that has been placed back inthe queue can be resumed later-it need not be started all over again.

If the event is a service completion, then only the continue/abort decision and the decisionwhat job to serve have to be taken.

Costs and rewards: The reward for a type-i job, i = 1,2, is ri, where 1'2 > 1'1 > O. Thisreward is earned upon service completion. Jobs that do not complete their service, e.g.,because they are rejected upon arrival or removed while awaiting service, receive a reward ofzero. Removing jobs from the system is free of charge.

Apart from these rewards there are holding costs for the jobs residing in the system, eitherawaiting service or being served. We assume these costs are linear in the number of jobs andclass-independent, namely, mh ;::: 0 per unit of time when there are m = x + y jobs present.In addition, each time a job is admitted to the system, class-independent consideration costsc ;::: 0 are incurred. Rejecting jobs is free of charge. We assume 1'1 > c + h/f.l, otherwise itwill not be interesting to serve any type-I jobs.

Finally, there are no switchover costs, i.e., no costs are incurred if we start serving a type-2job if the previous job in service was a type-1 job and vice versa.

Discounting: We discount at a rate a ;::: 0, i.e., rewards and costs at time t are to bemultiplied by exp(-at). We treat the discount rate a as the rate by which the processvanishes. In other words, the process will live for an exponential time with rate a, afterwhich there will be no more arrivals, service completions, rewards or costs.

Criterion: The objective is to maximize the expected (discounted) reward over an n-periodtime horizon. We allow A1 + A2 > f.l, as well as A1 = A2 = O. In the latter case, there aretwo batches, one consisting of type-I jobs awaiting service and one consisting of type-2 jobsawaiting service, without any future arrivals.

Uniformization: The system evolves at arrival times, at service completion times, andeventually at the time the process vanishes. Applying the uniformization method, we canconsider that transitions occur at the jump times of a Poisson process with rate A1 +A2 +f.l+a > O. By scaling time, we take A1 + A2 + f.l + a = 1 without loss of generality. Then, withprobability Ai ;::: 0, i = 1,2, a transition concerns the arrival of a type-i job, with probabilityf.l > 0 it concerns a service completion and with probability a ;::: 0 the process vanishes. Aservice completion is either a real service completion or an artificial service com pletion if theserver idles, which occurs when the system is empty. In the latter case the state of the systemstays (0,0) and we take the continue decision per definition.

As a result, the times between consecutive events are identically distributed. Such times arecalled periods and if we reverse the direction of time, we can consider the number n of periodsleft until the process hits time zero. If the process vanishes before n =0, at n = no say, thenthe state of the system will see no more changes during the remaining no periods, and therewill be no more rewards and costs.

4

Uniformization enables us to use induction on the remaining number of periods to prove ourresults for any finite time horizon. These results can then be extended to the infinite timehorizon case; cr. section 5.

The dynamic programming approach takes a prominent position in our research. In addition,we will occasionally make use of sample path arguments. For an exposition of the samplepath approach, see LID, NAIN AND TOWSLEY [6].

2.1 Dynamic programming formulation

In this section, we summarize and complete the model in terms of a mathematical formulation.We first give the general model and then promptly reduce the model by showing that it isoptimal to always give type-2 jobs priority over type-l jobs. After that, we successively stateand prove our main theorem.

Recapitulating, x and y denote the number of type-l and type-2 jobs in the system,respectively, and (x, y) is the state of the system for x, y ~ O. We will use the followingnotation:

• Wn(x, y) will denote the maximum expected n-period a-discounted reward when thecurrent state, just before the next continue/abort decision, is (x, V). State (x, y) may bethe result of an arrival immediately after the accept/reject decision.

• Wn(x, y; 1r) denotes the maximum expected n-period a-discounted reward when thecurrent state, just before the next continue/abort decision, is (x, V), and given thatdecision 1r is chosen in that state, where 1r E {continue, abort} if either x = 0 or y = 0and 1r E {continue/I, continue/2, abort/I, abort/2} if x, y > O. Here, continue/imeans we take the continue decision-i.e., it is decided to maintain all jobs currentlypresent-and commence the service of a type-i job, i = 1,2, and abort/i means weremove a type-i job from the system, i = 1,2, after which we make a transition to state(x - 1, y) if i = 1 and a transition to state (x, y - 1) if i = 2. Let 1r* denote the optimaldecision, so Wn(x, y) = Wn(x, y; 7l"*). Note that in the notation 1r* the dependence onx, y and n is suppressed. We also note that we use commas in our notation to separatestate characteristics and a semi-colon to separate the decision from the state.

• Wn(x, y, arr/i) denotes the maximum expected n-period a-discounted reward when thecurrent state is (x, y), given that at this very point in time an arrival event occurs,concerning a type-i job, i = 1,2.

• Wn(x, y, arr/i; 1r) denotes the maximum expected n-period a-discounted reward whenthe current state is (x, V), given that at this very point in time an arrival event occurs,concerning a type-i jobs, i = 1,2, and given that decision 1r is chosen in that state, where1r E {accept, reject}. Again, 1r* denotes the optimal decision, so Wn(x,y,arr/i) =Wn(x, y, arr/i; 1r*).

• Finally, when time hits zero, all jobs currently in the system yield a reward of zero fornot having completed service.

5

Proposition 1 For all n 2: °and x, y 2: 0,

Proof. We first consider the left-hand inequality. Consider two n-period process instances ofour model, instance I l starting in (x + 1, y) and instance I 2 starting in (x, y + 1). We coupleall jobs, all events and all decisions. Instance I l follows the optimal policy and instance I 2

copies all actions taken in I l . In particular, we let the additional type-2 job in I 2 go throughexactly the same as the additional type-I job in I l . I.e., if I l aborts its additional type-I job,then I 2 aborts its additional type-2 job, and if II takes the additional type-I job into service,then I 2 takes the additional type-2 job into service.

As long as the additional job does not complete its service, the rewards and costs are the samefor both instances. So, if the additional job never completes its service, then the difference inreward between the two instances is zero. If, alternatively, the additional job completes itsservice at some point in time, generating a reward of fl in I l and a reward of r2 in I 2 , thenI l and I 2 become identical immediately after this service completion, so that the differencein reward between the two instances is r2 - fl > 0.

The reasoning is almost the same for the right-hand inequality. Again, let instance I l startin (x + 1, y) and let instance I 2 start in (x, y + 1). But now let I 2 follow the optimal policyand let I l copy all actions taken in I 2 •

o

Corollary 1 Type-2 jobs are preferred to type-i jobs. If x, y > 0, tllen decision abort/i instate (x, y) is at least as good as decision abort/2.

Proof. Immediate, by noting that

Wn(x, y; abort/I) == Wn(x - 1, y) 2: Wn(x, y - 1) == Wn(x, y; abort/2).

o

Corollary 1 makes the use of the notation abort/1 and abort/2 redundant. We only need thenotation abort, where it is determined by the state (x, y) what type of job will be removed:a type-I job if x > 0 and a type-2 job if x == o.

Corollary 2 Type-2 jobs are preferred to type-i jobs. For all n 2: 1 and x, y > 0, decisioncontinue/2 in state (x, y) is at least as good as decision continue/i.

Proof. Immediate, from

Wn(x,y;continue/2)

Wn(x,y;continue/I)

2:~=l AiWn-l(X, y, arr/i) + fL[Wn-d x , y - 1) + r2] - (x + y)h,

2:~=l AiWn- l (x, y, arr/i) + fL[Wn- l (x - 1, y) + rd - (x + y)h,

and the right-hand inequality of Proposition 1.

o

6

Corollary 2 makes the use of the notation continue/1 and continue/2 redundant. We onlyneed the notation continue, where it is determined by the state (x, y) what type of job willbe served: a type-2 job if y > 0 and a type-l job if y = O.

Then our model is defined by the following Dynamic Programming Equations (DPEs). Tosave space, we will usually write ab for abort and co for continue in formal expressions (andalso ac for accept and rj for reject).

Wo(X, y) = 0

For n > 0:

Wn(X, y, arr/l) = max{Wn(x + 1, y) - c, Wn(x, y)}

Wn(x, y, arr/2) = max{Wn(x, y + 1) - c, Wn(x, y)}

and for n> 1:

Wn(O,O) = Wn(O, 0; co)

x, y 2: 0

x, y 2: 0

Wn(O, 0; co) = 2:;=1 Ai Wn-1 (0, 0, arr/i) + IlWn-1(0, 0)

Wn(x, y) = max{Wn(x, y; co), Wn(x, y; ab)} x + Y > 0

Wn(x, y; co) = 2:;=1 Ai Wn-1(X, y, arr/i) + Il[Wn- 1(x, Y -1) + r2] - (x + y)h x 2: 0, y > 0

Wn(x, 0; co) = 2:;=1 Ai Wn-dx, 0, arr/i) + Il[Wn- 1(x - 1,0) + r1] - xh X> 0

Wn(X,Yiab) = Wn(x -I,y) X> O,y 2: 0

Wn(O,y;ab) = Wn(O,y-l) y> °

3 Overview of the results

We will prove the following theorem.

Theorem 1 {CHARACTERIZATION OF THE OPTIMAL ADMISSION/TERMINATION POLICY}

Let the remaining number of periods be n. Then the optimal admission/termination policycan be c11aracterized as follows:

1. a. If it is optimal to reject an arriving type-l job in state (x, y), then it is optimalas well to reject it in all states (x, y + j) with j > 0 and in all states (x + j, y - j)with 0 < j ~ y, and thus in all states (x + j, y) with j > O.

b. If it is optimal to reject an arriving type-2 job in state (x, y), then it is optimalas well to reject it in all states (x +j, y) with j > 0 and in all states (x - j, Y + j)with 0 < j ~ x, and thus in all states (x,y+j) with j > O.

7

2. If it is optimal to abort in state (x, y), then it is optimal as well to abort in all states(x, y + j) with j > 0, in all states (x + j, y - j) with 0 < j ~ y, and thus in all states(x + j, y) with j > O.

3. If it is optimal to reject an arriving type-2 job in state (x, y), then it is optimal as wellto reject an arriving type-1 job in state (x, y).

For a graphical representation of the structure of a typical admission/termination policy, werefer to Figure 1. In the optimal termination policy, the hollow dots represent states in whichwe continue and the solid dots represent states in which we abort. The polyline marks thetermination region. In the optimal admission policy, the hollow dots represent states in whichwe accept any new job, the half-filled dots represent states in which we only accept a new jobif it is a type-2 job and the solid dots represent states in which we reject any new job.

Remark 1 Part 3 of Theorem 1 follows immediately from Proposition 1 and the DPEs forWn(x, y, arr/l) and Wn(x, y, arr/2).

Remark 2 If c = 0, then it is always optimal to accept an arriving job, since it may beaborted at the same moment in time at no additional cost.

Example 1 Consider the following instance of our model: J.l = 0.45, Al = 0.35, A2 = 0.2,a = 0, h = 0.5, c = 5, rl = 10 and r2 = 25. Let n = 5. The optimal admission policy-shouldthere be an arrival at this point in time-and the optimal termination policy are given byFigure 1.

ty

2

1

o

. .••••• ••••••••• ••••••••• ••••() () () () () () .

() () () () () () ..

012

ty

2

1

o

.................• •••••••• ••••••• ••••••••••

o 1 2

Figure 1: Optimal admission (lhs) and termination (rhs) policies for Example 1

3.1 The line of proof

The main technique to prove parts 1 and 2 of Theorem 1 will be to use induction on theremaining number of periods. In order to establish these parts of the theorem, we will provethe following monotonicity results, which will be interpreted directly below.

8

Proposition 2 {KEY PROPOSITION}

For all n ~ 0 and x, y ~ 0,

Wn(X + 1, y) - Wn(x, y) > Wn(x + 2, y) - Wn(x + 1, y), (1)

Wn(x,y+ 1) - Wn(x,y) > Wn(x,y+2) - Wn(x,y+ 1), (2)

Wn(x + 1, y) - Wn(x, y) > Wn(x + 1, y + 1) - Wn(x, y + 1), (3)

Wn(x + 1, Y+ 1) - Wn(x, y + 1) > Wn(x + 2, y) - Wn(x + 1, y), (4)

Wn(x + 1, Y+ 1) - Wn(x + 1, y) > Wn(x, y + 2) - Wn(x, y + 1), (5)

Wn(O, y + 1) - Wn(O, y) > Wn(x + 1, y + 1) - Wn(x, y + 1). (6)

In addition, inequalities (1) to (6) hold:

• at arrival times oftype-1 jobs; the inequalities are then referred to as (1 arr/l) to (6 arr/ 1),

• at arrival times oftype-2 jobs; the inequalities are then referred to as (1 arr/2) to (6 arr/2),

• given that we take the (not necessarily optimal) decision continue in each of the statesthat appear in the respective inequality; the inequalities are then referred to as (1 CO)

to (6 CO).

Inequality (1) states that Wn(x, y) is concave in x, i.e., the value of an additional type-l jobis non-increasing in x for fixed y. Inequality (2) states that Wn(x, y) is concave in y as well.Inequality (3) states that the value of an additional type-1 job is non-increasing in y for fixedx. Inequality (4) states that the value of an additional type-l job is non-increasing in x for afixed total number of jobs in the system. Inequality (5) states that the value of an additionaltype-2 job is non-increasing in y for a fixed total number of jobs in the system. Inequality (6)is an auxiliary inequality, which is used in our proofs of the other inequalities for certainboundary states.

Remark 3 Adding (4) to (3), we obtain (1), and adding (5) to (3), we obtain (2).Furthermore, for x, y ~ 0, (6) can be obtained through

~ {Proposition I}~ {(I) x times}~ {(3)}

W(I, y) - Wn(O, y)W(x + 1, y) - Wn(x, y)W(x + 1, Y+ 1) - Wn(x, y + 1).

Therefore, it suffices to prove the set of inequalities (3) to (5). However, it will be convenientin our proofs to make use of (1), (2) and (6) as well. One may easily verify that theseimplications also apply at arrival times and given that we take the continue decision in eachstate appearing in the respective inequality.

One might conjecture the reverse of (4), i.e., that the value of an additional type-l job isnon-decreasing in x for a fixed total number of jobs in the syste~. This would imply that (4)holds by equality for all x, y ~ O. However, the conjecture translates to

(7)

which does not hold in general. For example, in the instance considered in Example 1,W5 (2, 1) - W5 (l, 1) is 1.0863 ± 0.0002 (the accuracy in the calculations is 0.0002), whereasW5 (1, 2) - W5 (0, 2) is 1.3216 ± 0.0002, so (7) is surely violated.

9

However, if we let n -+ 00 in the instance considered in Example 1, then for each pair (x +1, y)with 0 :S x :S Sand 0 :S Y :S 14 (following the optimal policy, the number of type-1 jobs inthe system will never exceed S and the maximum number of type-2 jobs in the system willnever exceed 14) the left-hand side and right-hand side of (7) converge to the same value.We have numerically analyzed a variety of other instances and have always found the sameresult. This leads to the following conjecture.

Conjecture 1 Let the total number of jobs in the system be fixed and at least 1. Then forn -+ !Xl the value of an additional type-l job is constant in x.

The conjecture implies that for n -+ !Xl and x ~ 1, the decision to abort or not to abort isdetermined solely by the total number of jobs in the system, Le., x +Y, and not by x and Yindividually.

4 Proof of the Key Proposition

The proof of the Key Proposition uses induction on the remaining number of periods andruns as follows.

Step 0: Observe that (3), (4) and (S) hold for n = O.

Step 1: Assuming (3) to (5) to hold for some n ~ 0, prove (3 arr/ l ) to (Sarr/l) for n, as wellas (3 arr/2) to (5 arr/2) for n.

Step 2: Using this result, prove that (3 CO) to (S CO) hold for n + 1.

Step 3: Finally, prove that (3) to (5) also hold for n + 1.

In the proof we make use the following lemma.

Lemma 1 Let either 8 m = (xm,Ym) orSm = (xm,Ym,arrji), m = 1, ... ,4 and i = 1,2, andlet ¢ and 'lj; be authorized decisions, given Sm, and recall that 11"* denotes the optimal decisionin a state. Then

(8)

implies

o

We will use Lemma 1 in the following way: when distinguishing between all possiblecombinations of optimal decisions in certain states 82 and S3, we choose ¢ and 'lj; such that (8)holds.

10

Proof of the Key Proposition.

Step o. Inequalities (3) to (S) hold by definition for n = O.

Induction hypothesis. Assume that for some n ~ 0, inequalities (3) to (S) hold for allx, y ~ O. This will be our induction hypothesis.

Step 1. Under the induction hypothesis, we show that (3 arr/I) to (S arr/I) and (3 arr/2)

to (Sarr/2) hold for n.

Let x, y ~ O. Let us first consider (3 arr/I), and thus the arrival of a type-I job.

The next decision, dl say, prescribed by the (optimal) policy corresponding to Wn(x, y, arr/l),is either to accept or to reject the new (type-I) job. Clearly, this also holds for the nextdecision, d2 say, prescribed by the (optimal) policy corresponding to Wn(x + 1, Y + 1, arr/I).There are at most four joint cases (d l , d2). These cases can be presented as follows, where Aindicates that accept is optimal and R indicates that accept is not optimal:

AA Wn(x + I,y) - c ~ Wn(x,y) 1\ Wn(x + 2,y+ 1) - c ~ Wn(x + I,y+ 1),AR Wn(x + 1, y) - c ~ Wn(x, y) 1\ Wn(x + 2, y + 1) - c < Wn(x + 1, Y + 1),RA Wn(x+I,y)-c<Wn(x,y) 1\ Wn(x+2,y+I)-c~Wn(x+I,y+I),

Rn Wn(x + 1, y) - c < Wn(x, y) 1\ Wn(x + 2, y + 1) - c < Wn(x + 1, y + 1).

We will show that inequality (3 arr/I) holds for each case separately (irrespective ofthe question whether that case can actually occur). This is done by choosing anappropriate decision that is to be taken in the state corresponding to the leftmost term ofinequality (3 arr/l), i.e., state (x+ 1, y, arr/1), and an appropriate decision that is to be taken inthe state corresponding to the rightmost term of the inequality, i.e., state (x, y+I, arr/I), suchthat we obtain an inequality that holds under the induction hypothesis, and by subsequentlyapplying Lemma 1.

E.g., under AA,

Wn(X + 1, y, arr/l; ac) - Wn(x, y, arr/I) = Wn(x + 2, y) - c - [Wn(x + 1, y) - c]

Wn(x + 2, y) - Wn(x +1, y)

> {induction hypothesis; (3)}

Wn(x + 2,y+ 1) - Wn(x + I,y+ 1)

Wn(x + 1, Y + 1, arr/I) - Wn(x, y + 1, arr/I; ac),

to which we apply Lemma 1 to obtain the desired result for this case.

It is easy to see that the reasoning for case AA is similar for inequalities (4 arr/ l ) and (Sarr/I),and (3 arr/2) to (Sarr/2). For each inequality (jarr/i), j = 3,4,S, i = 1,2, case AA can bedealt with by choosing accept in the other two states as well and by then using inequality (j),which holds under the induction hypothesis. Similarly, case RR can always be dealt with bychoosing rej ect in the other two states as well.

The remaining cases An and RA are somewhat more complicated. We have convenientlysummarized the analysis of these two cases in Table 1. For each inequality we give twodecisions that can be inserted such that an inequality is obtained that holds, either under the

11

induction hypothesis or because its left-hand side is identical to its right-hand side. In eachcase, Lemma 1 can then be applied to obtain the desired result.

inequality case result by

(3 arr/l) rj AR ac Ihs=rhs=crj RA ac induction hypothesis; (3); (1)

(4 arr/1) rj AR ac Ihs=rhs=crj RA ac induction hypothesis; (4); (1)

(5 arr/l) rj AR ac induction hypothesis; (4); (5)rj RA ac induction hypothesis; (2)

(3 arr/2) ac AR rj Ihs=rhsrj RA ac induction hypothesis; (3) twice

(4 arr/ 2) rj AR ac induction hypothesis; (5); (4)rj RA ac induction hypothesis; (1)

(5 arr/ 2)rj AR ac Ihs=rhs=crj RA ac induction hypothesis; (5); (2)

Table 1: Analysis of cases AR and RA

Step 2. Assuming (3) to (5), (3 arr/1) to (5 arr/ 1) and (3 arr/2) to (Sarr/2) forn, we show that(3 CO) to (5 CO) hold for n + 1. We will use the following lemma.

Lemma 2 For all n ~ °and x, y ~ 0,

Proof. By coupling and a sample path argument, cr. the proof of Proposition 1. Let instanceII start in (x + 1, y) and instance I o in (x, y). Couple all events and all decisions. Instance IIfollows the optimal policy and instance I o copies all actions taken in II, until either II abortsits additional type-l job or the additional type-l job in II completes its service. This occursat time T, say. Until then, the difference in reward between II and I o is at most zero. At timeT, either II aborts its additional type-l job, generating a reward of zero, or this additionaltype-l job completes its service, generating a reward of rl' Immediately afterwards, II andI o become identical. So the difference in reward between the two instances is at most rl.

o

Now consider (3 CO). We distinguish the following three cases, which cover all possible states(x,y): (I) x ~ O,y > 0, (II) x> O,y= °and (III) x = O,y= 0.

Case (I) gives

Wn+dx + 1, y; co) - Wn+dx, Yi co)

2:~=1 Ai[Wn(X + 1, y, arr/i) - Wn(x, y, arr/i)] + JL[Wn(x + 1, y - 1) - Wn(x, y - 1)] - h

> {induction hypothesis; (3 arr/1); (3 arr/2); (3)}

2:~=1 Ai[Wn(X + 1, Y + 1, arr/i) - Wn(x, y + 1, arr/i)] + JL[Wn(x + 1, y) - Wn(x, y)] - h

Wn+l (x + 1, Y+1; co) - Wn+1 (x, Y+ 1; co).

12

Case (II) gives

Wn+i (x + 1,0; co) - Wn+i (x, 0; co)

E~=l/\[Wn(x + 1,0, arr/i) - Wn(x, 0, arr/i)] + fL[Wn(X, 0) - Wn(x - 1,0)] - h

> {induction hypothesis; (3 arr/i); (3 arr/2); (I)}

E;=i Ai[Wn(X + 1,1, arr/i) - Wn(x, 1, arr/i)] + fL[Wn(x + 1,0) - Wn(x, 0)] - h

Wn+i (x + 1,1; co) - Wn+l (x, 1; co).

Case (III) gives

Wn+i (1,0; co) - Wn+l (0,0; co)

E;=i Ai[Wn(l, 0, arr/i) - Wn(O, 0, arr/i)] + fLT'l - h

> {induction hypothesis; (3 arr/i); (3 arr/2); Lemma 2}

E;=i Ai[Wn(l, 1, arr/i) - Wn(O, 1, arr/i)] + fL[Wn(l, 0) - Wn(O, 0)] - h

Wn+d1, 1; co) - Wn+i(O, 1; co).

Next, consider (4 CO), and distinguish the following two cases, covering all possible states(x,y): (I) x ~ O,y > 0 and (II) x ~ O,y=O.

For case (I), the derivation is analogous to the one for (3 CO) for case (I), using the inductionhypothesis and inequalities (4 arr/ i ), (4 arr/2) and (4).

For case (II), we have

Wn+l (x + 1, 1; co) - Wn+l (x, 1; co)

E~=i Ai[Wn(X + 1,1, arr/i) - Wn(x, 1, arr/i)] + fL[Wn(X + 1,0) - Wn(x, 0)] - h

> {induction hypothesis; (4 arr/ i ); (4 arr/ 2)}

E~=i Ai[Wn(X + 2,0, arr/i) - Wn(x +1,0, arr/i)] + fL[Wn(X + 1,0) - Wn(x, 0)] - h

= Wn+l (x + 2, OJ co) - Wn+i (x +1,0; co).

Finally, consider (S co), and distinguish the same two cases as considered for (4 CO).

For case (I), the derivation is analogous to the one for (3 CO) for case (I), using the inductionhypothesis and inequalities (Sarr/i), (Sarr/2) and (S).

For case (II), we have

Wn+1(x + 1,1; co) - Wn+1(x + 1,0; co)

= E;=l >';[Wn(x + 1, 1, arr/i) - Wn(x + 1,0, arr/i)] + J.t[Wn(x + 1,0) - Wn(x, 0) + T'2 - T'1] - h

> {induction hypothesis; (5 arr/!); (5 arr/2); Proposition I}

2:;=1 >';[Wn(x, 2, arr/i) - Wn(x, 1, arr/i)] + J.t[Wn(x, 1) - Wn(x, 0)] - hWn+l(x, 2; co) - Wn+t{x, 1; co).

Step 3. Assuming (3) to (S), (3 arr/ i ) to (Sarr/i) and (3 arr/2) to (Sarr/2) for n, and (3 CO )to (S CO) for n + 1, we show that (3) to (S) hold for n +1.

13

The line of reasoning resembles the one we followed in Step 1 of our proof. For any of thethree inequalities (3), (4) and (5) for n+l, we distinguish all possible combinations of optimaldecisions in two of the four states, namely, the states 82 and 83 in Lemma 1.

The results are summarized in Table 2, where for each relevant situation appropriatearguments are given, including the choices for the decisions in states 81 and 84 (cr. Lemma 1).If no decision is shown, then we take the optimal decision in that state. The abbreviation 'ih'used in the table indicates that the induction hypothesis is used there.

In the notation, C indicates that continue is strictly optimal, A that abort is optimal andX that it is of no interest whether continue or abort is optimal. Recall that A is not anoption in state (0,0). Furthermore, if it is optimal to first abort j jobs and then to continue,then this is denoted by Ai and we write ab i for the not necessarily optimal copied decision.So, C == AD and A equals Ai for some j > O.

As an example of the reasoning and the notation used in Table 2, we consider the second casefor inequality (3), i.e., AkC for 0 ::; k ::; x. Then

W n+1 (x + 1, y; ab k) - W n+1(x, y) W n+1(x + 1 - k, y; co) - Wn+I (x - k, y; co)

> {induction hypothesis; (1 CO) k times}

Wn+I (x + 1, y; co) - W n+1 (x, y; co)

> {induction hypothesis; (3 CO)}

Wn+I (x + 1, Y + 1; co) - Wn+I (x, Y + 1; co)

Wn+I (x + 1, Y + 1) - W n+1 (x, Y + 1; co),

to which we apply Lemma 1 to obtain the desired result for this case.

For a ::; j ::; k ::; x, a < l ::; y, 0 < m ::; y + 1, a < p ::; y +2 and k < q ::; x + 1:

inequality

(3)

(4)

(5)

abab k

ab x +[

abab k

ab x +m

case

XAx+p

AiAk

AqAk

Ax+I+[Ak

coco

cocoabx+(p-l)

ab J

ab k

ab k

result byIhs=rhs=Oih; (1 CO) k times; (3 CO)ih; (2 CO) l times; (6 CO)Ihs=rhs=Oih; (1 CO) k times; (4 CO)ih; (2 CO) m - 1 times; (6 CO); (4 CO)Proposition 1; rhs=Oih; (4 CO) k - j times; (5 CO)ih; (3 CO) q - k times; (5 CO)ih; (2 CO) l + 1 times; (3 CO) x - k times

Table 2: Analysis of inequalities (3), (4) and (5) for n + 1

With Table 2 we conclude our proof of the Key Proposition.

o

We now derive parts 1 and 2 of Theorem 1 from the Key Proposition by means of threecorollaries. Note that Corollaries 3 and 4 correspond to parts la and Ib of Theorem 1,respectively, and that Corollary 5 corresponds to part 2 of Theorem 1.

14

Corollary 3 Let n ~ 0 and x, y ~ O. If it is optimal to reject an arriving type-l job instate (x, y), then it is optimal to reject it in all states (x, y +j) with j > 0 and in all states(x+j,y-j) with 0 < j:::; y.

Proof. Let n ~ 0 and x ~ O. It suffices to show that

Wn(x,y)~Wn(x+l,y)-c ==> Wn(x,y+l)~Wn(x+l,y+l)-c, y~O, (9)

Wn(x, y) ~ Wn(x + 1, y) - c ==> Wn(x + 1, Y - 1) ~ Wn(x + 2, y - 1) - c, y> O. (10)

One can easily verify that implications (9) and (10) are immediate from inequalities (3)and (4), respectively.

o

Corollary 4 Let n ~ 0 and x, y ~ O. If it is optimal to reject an arriving type-2 job instate (x, y), then it is optimal to reject it in all states (x + j, y) with j > 0 and in all states(x - j, y + j) with 0 < j :::; x.

Proof. Let n ~ 0 and y ~ O. It suffices to show that

Wn(x, y) ~ Wn(x, Y + 1) - c ==> Wn(x + 1, y) ~ Wn(x + 1, Y + 1) - c, x ~ 0, (11)

Wn(x, y) ~ Wn(x, Y+ 1) - c ==> Wn(x - 1, Y + 1) ~ Wn(x - 1, Y + 2) - c, x > O. (12)

One can easily verify that implications (11) and (12) are immediate from inequalities (3)and (5), respectively.

o

Corollary 5 Let n ~ O. If it is optimal to abort in state (x, y), then it is optimal to abortin all states (x, y +j) with j > 0 and in all states (x +j, y - j) with 0 < j :::; y.

Proof. Let n ~ O. It suffices to show that

Wn(x - 1, y) = Wn(x, y) ==> Wn(x - 1, Y + 1) f. Wn(x, y + 1),

Wn(O, y - 1) = Wn(O, y) ==> Wn(O, y) f. Wn(O, y + 1),

Wn(x - 1, y) = Wn(x, y) ==> Wn(x, Y - 1) f. Wn(x + 1, Y - 1),

Wn(O, y - 1) = Wn(O, y) ==> Wn(O, y - 1) f. Wn(1, y - 1),

x> 0,

y> 0,

x,y> 0,

y> O.

(13)

(14)

(15)

(16)

One can easily verify that implications (13), (14), (15) and (16) are immediate frominequalities (3), (2) and (4) and Proposition 1, respectively.

o

15

5 Infinite time horizon

So far we only considered a finite time horizon, i.e., a finite number of periods. For h > 0,the extension of the threshold structure to the optimal strategy for the infinite horizon modelis fairly standard. First, note that h > 0 implies that the system can be reduced to a finitestate system, because jobs will not be accepted if the number of jobs in the system is toolarge. To see this, consider a job that arrives when there are already m jobs of the sameor higher priority in the system. Suppose we accept it and that it will (eventually) go intoservice. Without loss of generality we can say that we discard jobs from the queue in theorder 'lowest priority and latest arrivals discarded first'. Then, if our job goes into service, allthe jobs in front of it must have gone into service as well. Hence it takes at least m periodsbefore our job goes into service, and its total holding costs are at least mho If the job is atype-i job, then its reward is at most rio This implies that its total contribution is negativeif m > riJh. So the total number of type-i jobs in the system will not exceed riJh and thesystem is essentially a finite state system.

For this finite state system there are only finitely many stationary strategies. For each numberof periods n we get an optimal threshold policy in. Then there must be a subsequence {int}and a policy 1* with int = 1* for all 1. In the discounted reward case, i.e., a > 0, this policyis optimal, because Wn converges. In the average reward case, Le., 0: = 0, we can use the factthat for all policies the resulting Markov chain has only one recurrent class (state (0,0) willalways be reached) and is aperiodic. Thus Wn - ng*, with g* the optimal average reward,converges for all initial states. Thus 1* (which we know has the threshold structure) will beaverage reward optimal (see, e.g., DENARDO [3]).

The case h = 0, C > 0 is somewhat different. In the average reward case, we accept on theaverage as many jobs as needed to fully occupy the server, if possible, serving type-2 jobswhenever possible. In the discounted reward case, we can again reduce the system to a finitestate system, because for the investment c to be of interest with respect to some job, its rewardhas to come soon enough, and with more jobs in front of it, the expected return decreases.

6 Extensions

In this section we discuss three extensions of our model. The first concerns heterogeneousconsideration costs. The second and third concern the extension of our model to the generalmulti(~I)-server case and the general multi(~2)-class case.

6.1 Heterogeneous consideration costs

In our basic model we considered class-independent consideration costs c ~ O. It is readilyverified that Proposition 1 as well as the Key Proposition and its proof stay intact if weconsider class-dependent consideration costs Cl, C2 ~ O. As a result, parts 1 and 2 of Theorem 1remain valid if the consideration costs are class-dependent. If Cl ~ C2, Le., if the considerationcosts are at least as high for type-I jobs as for type-2 jobs, then part 3 of Theorem 1 remainsvalid as well, by Proposition 1 and the new DPEs for Wn(x, y, arr/I) and Wn(x, y, arr/2).

16

6.2 General multi-server model

In our basic model we considered a single-server queue. The extension of our results for thesingle-server model to the general multi-server case is not straightforward. If we follow thesame approach as for the single-server model, then it turns out that (7) is required in order toestablish (5) for all x, y for the multi-server model. But in Example 1 we have seen that (7)need not hold. Under the restrictive assumption that C2 = 0 or that type-2 jobs may not berejected upon arrival, it can be shown that (7) holds for the single-server model, so that thevalue of an additional type-l job only depends on the total number of jobs in the system, andnot on the number of type-I jobs and the number of type-2 jobs individually. In this case ourresults can be extended to the multi-server model; see BROUNS AND VAN DER WAL [2].

6.3 General multi-class model

In our basic model we considered two classes of jobs. The extension of our results for thetwo-class model to the general multi-class case also causes difficulties. Under the restrictiveassumption that Cj = 0 for all j = 1, ... , J (where J is the number of classes of jobs) or thatall jobs must be accepted upon arrival, our monotonicity results and characterization of theoptimal termination policy can be extended to the multi-class model. For details, see [2].

7 Conclusions

We have considered a two-class MAl 'A2 1MJ.L 11 preemptive priority queue. For this queue wehave dealt with two additional decision features. First, one has to decide upon arrival of a jobto accept or reject the new job. Second, at any point in time, one may decide to remove anynumber of jobs from the system. We have shown that the optimal strategy for both types ofdecisions is characterized by threshold policies.

References

[1] BROUNS, G.A.J .F. AND J. VAN DER WAL, Optimal threshold policies in a workload model witha variable number of service phases per job, Math. Methods Oper. Res., to appear.

[2] BROUNS, G.A.J.F. AND J. VAN DER WAL, Optimal threshold policies in a multi-classmulti-server preemptive priority queue with termination control, working paper, EindhovenUniv. of Tech.

[3] DENARDO, E.V., A Markov decision problem, Mathematical programming (Proc. Adv. Sem.,Univ. Wisconsin, 1972), ed. by T. Hu and S. Robinson, Academic Press, 1973,33-68.

[4] GROENEVELT, R., G. KOOLE AND P. NAIN, On the bias vector of a two-class preemptivepriority queue, Math. Methods Oper. Res. 55,2002, 107-120.

[5] JOHANSEN, S.G. AND C. LARSEN, Computation of a near-optimal service policy for a

single-server queue with homogeneous jobs, European J. Oper. Res. 134, 2001, 648-663.

[6] Lm, Z., P. N AIN AND D. TOWSLEY, Sample path methods in the control of queues, QueueingSystems Theory Appl. 21, 1995, 293-335.

17

[7] RIGHTER, R., Expulsion and scheduling control for multic1ass queues with heterogeneous servers,Queueing Systems Theory Appl. 34, 2000, 289-300.

[8] STIDHAM JR., S. AND R.R. WEBER, A survey of Markov decision models for control ofnetworksof queues, Queueing Systems Theory Appl. 13, 1993, 291-314.

[9] TEGHEM JR., J., Control of the service process in a queueing system, European J. Oper. Res.23, 1986, 141-158.

[10] Xu, S. H., A duality approach to admission and scheduling controls ofqueues, Queueing SystemsTheory Appl. 18, 1994, 273-300.

[11] Xu, S.H. AND J .G. SHANTHIKUMAR, Optimal expulsion control-a dual approach to admissioncontrol of an ordered-entry system, Oper. Res. 41, 1993, 1137-1152.

18

Optimal threshold policies in a two-class preemptive …Using the dual approach, RIGHTER [7] extends...

Documents

Transcript of Optimal threshold policies in a two-class preemptive …Using the dual approach, RIGHTER [7] extends...