The Optimizing-Simulator: Merging Optimization and Simulation Using Approximate Dynamic Programming


© 2005 Warren B. Powell Slide 1

The Optimizing-Simulator: Merging Optimization and Simulation Using Approximate Dynamic Programming

Winter Simulation Conference, December 5, 2005

Warren Powell, CASTLE Laboratory, Princeton University

http://www.castlelab.princeton.edu


© 2005 Warren B. Powell Slide 2

Yellow Freight System


© 2005 Warren B. Powell Slide 3

Yellow Freight System


© 2005 Warren B. Powell Slide 4

The fractional jet ownership industry

© 2005 Warren B. Powell Slide 5

NetJets Inc.

© 2005 Warren B. Powell Slide 6

© 2005 Warren B. Powell Slide 7

© 2005 Warren B. Powell Slide 8

Schneider National

© 2005 Warren B. Powell Slide 9

Schneider National

© 2005 Warren B. Powell Slide 10

© 2005 Warren B. Powell Slide 11

Air Mobility Command

(Figure: an airbase diagram showing fuel, cargo handling, ramp space, maintenance, and cargo holding.)

© 2005 Warren B. Powell Slide 13

The challenges

Needs for simulation:
» Are we using the right mix of people and equipment?
» What is the effect of new policies regarding the management of people and equipment?
» What is the marginal contribution from serving customers?
» What is the effect of last-minute demands on the system?

© 2005 Warren B. Powell Slide 14

The challenges

We need simulation technology that accomplishes the following:
» Decisions have to handle high-dimensional states and actions (assigning different types of resources to different types of tasks).
» The simulator has to produce "good" behaviors not just at a point in time, but over time (decisions have to think about the future).
» Performance statistics must match historical performance.

© 2005 Warren B. Powell Slide 15

Outline

Modeling and problem representation

© 2005 Warren B. Powell Slide 16

Modeling

Resources can have a number of attributes:

a = (Location, Equipment type)

a = (Location, ETA, Equipment type, Train priority, Pool, Due for maint, Home shop)

a = (Location, ETA, A/C type, Fuel level, Home shop, Crew, Eqpt1, ..., Eqpt100)

a = (Location, ETA, Bus. segment, Single/team, Domicile, Drive hours, Duty hours, 8 day history, Days from home)

© 2005 Warren B. Powell Slide 17

Modeling

The attribute vector:

a_t = (a_1, a_2, ..., a_n)

The resource state variable:

R_ta = number of resources with attribute a at time t
R_t = (R_ta), a ∈ A
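As a concrete illustration (not part of the original slides), here is a minimal Python sketch of an attribute vector and the resource state variable; the field names and example values are invented purely for illustration.

```python
from collections import Counter
from typing import NamedTuple

# One possible encoding of an attribute vector "a" (fields are illustrative,
# not taken from the slides).
class Attribute(NamedTuple):
    location: str
    eta: int            # time periods until the resource is available
    equipment_type: str

def resource_state(resources):
    """R_t: R_t[a] = number of resources with attribute vector a at time t."""
    return Counter(resources)

R_t = resource_state([
    Attribute("Colorado", 0, "C-17"),
    Attribute("Colorado", 0, "C-17"),
    Attribute("Germany", 2, "C-5"),
])
print(R_t[Attribute("Colorado", 0, "C-17")])  # -> 2
```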

© 2005 Warren B. Powell Slide 18

Modeling

Decision set function:

D(a) = set of decision types we can use to act on a resource with attribute a.

Applying a decision d to a resource with attribute vector a_t = (a_1, a_2, ..., a_n) produces a modified resource label a_{t+1}.

© 2005 Warren B. Powell Slide 19

Modeling

The "modify" function:

M(a_{t-1}, d, W_t) = (a_t, c_t)

The information process:

W_t = vector of information arriving during time interval t.
Ex: new customer requests, equipment failures, weather delays.
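A hedged sketch of the modify function, continuing the Attribute type from the earlier sketch; the decision encoding and the transition logic are assumptions made for illustration only.

```python
# Sketch of M(a_{t-1}, d, W_t) = (a_t, c_t): applying decision d to a resource
# with attribute a, given the new information W_t, returns the modified
# attribute and the contribution/cost of the decision.
def modify(a, d, W):
    if d[0] == "move":                              # d = ("move", destination)
        travel_time = W.get(("travel_time", a.location, d[1]), 1)
        a_next = a._replace(location=d[1], eta=travel_time)
        cost = -1000.0 * travel_time                # repositioning cost (illustrative)
    else:                                           # d = ("hold",)
        a_next = a._replace(eta=max(a.eta - 1, 0))
        cost = 0.0
    return a_next, cost
```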

© 2005 Warren B. Powell Slide 20

Modeling

Decisions

The decision function:

x_t = X^π(I_t), where π ∈ Π = set of decision functions (policies), and I_t is the information available for making a decision.

x_tad = number of resources with attribute a that we can act on with decision d using the information available at time t.
x_t = (x_tad), a ∈ A, d ∈ D
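For concreteness, a minimal sketch of a decision function X^π(I_t); the "policy" here is simply myopic, and decision_set and contribution stand in for the user-supplied decision set D(a) and one-period contribution.

```python
def decision_function(R_t, decision_set, contribution):
    """Return x_t as a dict with x[(a, d)] = number of resources with
    attribute a acted on with decision d (a simple myopic choice)."""
    x = {}
    for a, count in R_t.items():
        best_d = max(decision_set(a), key=lambda d: contribution(a, d))
        x[(a, best_d)] = count
    return x
```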

© 2005 Warren B. Powell Slide 21

Approximate dynamic programming

Information and decision processes:

(Figure: a timeline in which the decisions x_0, x_1, ..., x_6, determined by a policy, alternate with the exogenous information process W_1, W_2, ..., W_6.)

© 2005 Warren B. Powell Slide 22

Modeling

System dynamics (classical view):

Given a decision function (policy) X^π(S_t) and exogenous information process W_t, we can model the evolution of the state of our system using:

S_{t+1} = f(S_t, X^π(S_t), W_{t+1})
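As an illustration (not from the slides), a sketch of this transition equation rolled into a simulation loop; policy, transition, and sample_information are placeholders for the user-supplied decision function, modify/transition logic, and exogenous information process.

```python
# A sketch of the classical simulation loop S_{t+1} = f(S_t, X^pi(S_t), W_{t+1}).
def simulate(S0, policy, transition, sample_information, T):
    S = S0
    history = [S0]
    for t in range(T):
        x_t = policy(t, S)                   # decision from the current state
        W_next = sample_information(t + 1)   # exogenous information W_{t+1}
        S = transition(S, x_t, W_next)       # S_{t+1} = f(S_t, x_t, W_{t+1})
        history.append(S)
    return history
```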

© 2005 Warren B. Powell Slide 23

Modeling

(Figure: the state S_t is acted on by the decision function X^π(S_t) to give the post-decision state S_t^x; the new information W_{t+1} then produces S_{t+1}.)

© 2005 Warren B. Powell Slide 24

Modeling

User provides a model of the physical system:
» Data: the resource vector R_t and the information process W_t
» Software: the decision set function D(a) and the modify function M(a_t, d, W_{t+1})

Our research goal: the decision function X^π(I_t).

© 2005 Warren B. Powell Slide 25

Outline

The optimizing simulator

© 2005 Warren B. Powell Slide 26

Optimizing over time

Resources

© 2005 Warren B. Powell Slide 27

Optimizing over time

Tasks

© 2005 Warren B. Powell Slide 28

Optimizing over time

(Figure: assignment networks over periods t, t+1, t+2, contrasting optimizing at a point in time with optimizing over time.)

© 2005 Warren B. Powell Slide 29

The optimizing simulator

(Flowchart: set t = 0; make a decision at time t; update the system state at t+1; set t = t+1; repeat while t < T.)

Classical simulation:
» Simple
» Extremely flexible

But . . .
» Limited solution quality
» Often requires extensive user-defined tables to guide the simulation.
» Can respond to changes in inputs in an unpredictable way.

© 2005 Warren B. Powell Slide 30

The optimizing simulator

Optimization:
» Intelligent
» Responds naturally to new datasets.

But . . .
» Struggles to handle the complexity of real operations.
» Does not model the evolution of information.
» Might be “too intelligent”?

min Σ_t c_t x_t
subject to:  A_t x_t − B_{t−1} x_{t−1} = b_t
             D_t x_t ≤ u_t
             x_t ≥ 0

© 2005 Warren B. Powell Slide 31

Multicommodity flow

(Figure: a multicommodity flow network over time, space, and type.)

© 2005 Warren B. Powell Slide 32

The optimizing simulator

Simulation
» Strengths
• Extremely flexible
• High level of detail
» Weaknesses
• Low level of “intelligence”
• Lower solution quality
• May have difficulty “behaving” properly with new scenarios.
• Difficulty adapting to random outcomes.

Optimization
» Strengths
• High level of intelligence
• System behaves “optimally” even with new datasets
• Reduces data set preparation.
» Weaknesses
• Strict rules on problem structure
• Low level of detail
• Inflexible!

To simulate or to optimize . . .

. . . Why are we asking this question?

© 2005 Warren B. Powell Slide 33

Decision-making technologies

Cost-based
» The standard assumption of math programming.
» Easily handles tradeoffs.
» Easily handles high dimensions.
» Can be difficult to tune to get the right behavior.

Rule-based
» Typically associated with AI.
» Very flexible.
» Difficult to code tradeoffs.
» Struggles with higher-dimensional states.

© 2005 Warren B. Powell Slide 34

The four information classes:
» Knowledge K_t
» Forecasts of exogenous events Ω_t
» Forecasts of impacts on others V̄_t
» Expert knowledge ρ

© 2005 Warren B. Powell Slide 35

The four information classes

» Knowledge K_t

© 2005 Warren B. Powell Slide 36

Knowledge

Rule-based: one aircraft and one requirement

(Figure: a map of aircraft and requirements at California, Germany, New Jersey, Colorado, Taiwan, England, and New Jersey.)

© 2005 Warren B. Powell Slide 37

Knowledge

Cost based: one requirement and multiple aircraft

(Figure: a map of aircraft and requirements at California, Germany, New Jersey, Colorado, Taiwan, England, and New Jersey.)

© 2005 Warren B. Powell Slide 38

Knowledge

Costs allow you to make tradeoffs (e.g., for an assignment between California and Germany):

Issue: “cost”/“bonus”
Repositioning cost: -$17,000
Appropriate a/c type: +5,000
Utilization: +8,000
Requires modifications: -3,000
Special maintenance at airbase: -1,000
Total “cost”: -8,000
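As simple arithmetic, the tradeoff on this slide can be written as a sum of the issue-level costs and bonuses; the numbers below are the illustrative values from the slide.

```python
# Each issue contributes a "cost" or "bonus"; the assignment is scored by their sum.
assignment_terms = {
    "Repositioning cost": -17_000,
    "Appropriate a/c type": +5_000,
    "Utilization": +8_000,
    "Requires modifications": -3_000,
    "Special maintenance at airbase": -1_000,
}
print(sum(assignment_terms.values()))  # -> -8000, the total "cost" on the slide
```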

© 2005 Warren B. Powell Slide 39

Knowledge

Cost based: multiple requirements and aircraft

(Figure: a map of aircraft and requirements at California, Germany, New Jersey, Colorado, Taiwan, England, and New Jersey.)

© 2005 Warren B. Powell Slide 40

The information classes

» Knowledge K_t
» Forecasts of exogenous events Ω_t

© 2005 Warren B. Powell Slide 41

Forecasts of exogenous information

(Figure: aircraft and requirements at California, Germany, New Jersey, Colorado, Taiwan, England, and New Jersey.)

X^π(I_t) involves solving a linear program/network model.

Resources that are known now…

© 2005 Warren B. Powell Slide 42

Forecasts of exogenous information

(Figure: the assignment network now contains the aircraft and requirement locations twice, once for resources that are known now and once for those that are forecasted.)

X^π(I_t) involves solving a linear program/network model.

Resources that are known now…

© 2005 Warren B. Powell Slide 43

Forecasts of exogenous information

(Figure: R_t collects the aircraft and requirements that are known now, while R_tt' for t' > t collects those forecasted for future times.)

… and are forecasted for the future.

© 2005 Warren B. Powell Slide 44

The information classes

» Knowledge K_t
» Forecasts of exogenous events Ω_t
» Forecasts of impacts on others V̄_t

© 2005 Warren B. Powell Slide 45

Approximate dynamic programming

Decisions now may need to know the impact on future decisions:
» What is the cost of assigning this type of aircraft to move a requirement?
» What is the value of having a certain number of aircraft in a region?
» Should this requirement be satisfied now? Later? Never?

For these questions, it is important that we optimize over time.

(Figure: at time t, resources with attributes a_1 and a_2 are linked to downstream attributes with values V(a'_1) and V(a'_2).)

© 2005 Warren B. Powell Slide 48

The optimization challenge

?

© 2005 Warren B. Powell Slide 49

State variables

Systems evolve through a cycle of exogenous and endogenous information

(Figure: ω = (R̂_1, R̂_2, ..., R̂_6) is the exogenous information process, interleaved with the decisions x_0, x_1, ..., x_6.)

© 2005 Warren B. Powell Slide 50

State variables

Systems evolve through a cycle of exogenous and endogenous information

(Figure: the exogenous information R̂_1, ..., R̂_6, the decisions x_0, ..., x_6, and the resulting resource states R_0, R_1, ..., R_6 over time.)

© 2005 Warren B. Powell Slide 51

Approximate dynamic programming

Using this state variable, we obtain the optimality equations:

V_t(R_t) = max_{x_t ∈ X_t} { C_t(R_t, x_t) + E[ V_{t+1}(R_{t+1}) | R_t ] }

Problem: the curse of dimensionality. In fact, there are three curses:
» State space
» Outcome space
» Action space (feasible region)

© 2005 Warren B. Powell Slide 52

Approximate dynamic programming

The computational challenge:

V_t(R_t) = max_{x_t ∈ X_t} { C_t(R_t, x_t) + E[ V_{t+1}(R_{t+1}) | R_t ] }

» How do we find V_{t+1}(R_{t+1})?
» How do we compute the expectation?
» How do we find the optimal solution?

© 2005 Warren B. Powell Slide 53

Approximate dynamic programming

A possible approximation strategy:

We start with:
V_t(R_t) = max_{x_t} { C_t(R_t, x_t) + E[ V_{t+1}(R_{t+1}) | R_t ] }
(We can't compute this, and we don't know what V_{t+1} is; we need to approximate V.)

We solve this for a sample realization ω:
Ṽ_t(R_t, ω) = max_{x_t} { C_t(R_t, x_t) + V_{t+1}(R_{t+1}(ω)) }

Now substitute in a function approximation:
Ṽ_t(R_t, ω) = max_{x_t} { C_t(R_t, x_t) + V̄_{t+1}(R_{t+1}(ω)) }

© 2005 Warren B. Powell Slide 54

Approximate dynamic programming

One big problem….

Ṽ_t(R_t, ω) = max_{x_t} { C_t(R_t, x_t) + V̄_{t+1}(R_{t+1}(ω)) }

Seeing R_{t+1} is cheating!

© 2005 Warren B. Powell Slide 55

Approximate dynamic programming

Alternative: Change the definition of the state variable:

(Figure: the same timeline of information R̂_t and decisions x_t, with the state now measured after the decision is made: the post-decision resource state.)

© 2005 Warren B. Powell Slide 56

Approximate dynamic programming

Now our optimality equation looks like (the expectation sits outside of the “max” operator, around the post-decision state variable):

V^x_{t-1}(R^x_{t-1}) = E{ max_{x_t ∈ X_t} [ C_t(R_t, x_t) + V^x_t(R^x_t(R_t, x_t)) ] | R^x_{t-1} }

We drop the expectation and solve the conditional problem for a sample ω:

Ṽ^x_{t-1}(R^x_{t-1}, ω) = max_{x_t ∈ X_t(ω)} [ C_t(R_t(ω), x_t) + V^x_t(R^x_t(ω), x_t) ]

Finally, we substitute in our (“convenient”) value function approximation:

Ṽ^x_{t-1}(R^x_{t-1}, ω) = max_{x_t ∈ X_t(ω)} [ C_t(R_t(ω), x_t) + V̄^x_t(R^x_t(ω), x_t) ]

© 2005 Warren B. Powell Slide 57

Approximate dynamic programming

Approximating the value function:
» We choose approximations of the form:

Linear (in the resource state):
V̄_t(R_t) = Σ_{a ∈ A} v̄_ta · R_ta
Best when assets are complex, which means that R_ta is small (typically 0 or 1).

Piecewise linear, separable:
V̄_t(R_t) = Σ_{a ∈ A} V̄_ta(R_ta)
Best when assets are simple, which means that R_ta may be larger.
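A minimal sketch of these two approximation forms (an illustration, not the production implementation), assuming R_t is a dict mapping an attribute a to the count R_ta, v_bar maps a to a scalar slope, and slopes maps a to a list of marginal values for the 1st, 2nd, ... unit of that attribute.

```python
def linear_vfa(R_t, v_bar):
    """Linear in the resource state: V(R_t) = sum_a v_bar[a] * R_ta."""
    return sum(v_bar.get(a, 0.0) * R_ta for a, R_ta in R_t.items())

def separable_pwl_vfa(R_t, slopes):
    """Piecewise linear, separable: V(R_t) = sum_a V_a(R_ta), with V_a concave
    and piecewise linear on the integers."""
    total = 0.0
    for a, R_ta in R_t.items():
        total += sum(slopes.get(a, [])[: int(R_ta)])
    return total
```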

© 2005 Warren B. Powell Slide 58

Approximate dynamic programming

A myopic decision rule (policy):

x^n_t(ω) = arg max_{x_t ∈ X_t(ω)} C_t(R_t(ω), x_t)

A decision rule that looks into the future:

x^n_t(ω) = arg max_{x_t ∈ X_t(ω)} [ C_t(R_t(ω), x_t) + V̄^x_t(R^x_t(ω), x_t) ]
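As a sketch (with placeholder helper functions that are assumptions, not part of the slides), the two decision rules differ only in whether the value function approximation of the post-decision state is added to the one-period contribution.

```python
def myopic_rule(R_t, feasible_decisions, contribution):
    # Maximize the one-period contribution only.
    return max(feasible_decisions(R_t), key=lambda x: contribution(R_t, x))

def lookahead_rule(R_t, feasible_decisions, contribution, post_decision_state, vfa):
    # Add the value function approximation of the post-decision resource state.
    return max(
        feasible_decisions(R_t),
        key=lambda x: contribution(R_t, x) + vfa(post_decision_state(R_t, x)),
    )
```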

© 2005 Warren B. Powell Slide 59

Approximate dynamic programming

Simulating a myopic policy (periods t, t+1, t+2):

© 2005 Warren B. Powell Slide 60

Approximate dynamic programming

A myopic decision rule (policy):

x^n_t(ω) = arg max_{x_t ∈ X_t(ω)} C_t(R_t(ω), x_t)

A decision rule that looks into the future:

x^n_t(ω) = arg max_{x_t ∈ X_t(ω)} [ C_t(R_t(ω), x_t) + V̄^x_t(R^x_t(ω), x_t) ]

© 2005 Warren B. Powell Slide 61

Approximate dynamic programming

(Figure: resources a_1 and a_2 linked to downstream values V̄(a'_1) and V̄(a'_2).)

© 2005 Warren B. Powell Slide 62

Option 1: Send directly to customers
Option 2: Send to regional depots
Option 3: Send to classification yards

Classification yards

© 2005 Warren B. Powell Slide 64

Approximate dynamic programming

Two-stage resource allocation under uncertainty

© 2005 Warren B. Powell Slide 65

Approximate dynamic programming

We obtain piecewise-linear recourse functions for each region.

© 2005 Warren B. Powell Slide 66

Approximate dynamic programming

The function is piecewise linear on the integers. We approximate the value of cars in the future using a separable approximation.

(Figure: profits as a piecewise-linear function of the number of vehicles at a location, 0-5.)

© 2005 Warren B. Powell Slide 67

Approximate dynamic programming

To capture nonlinear behavior, each link captures the marginal reward of an additional car.

© 2005 Warren B. Powell Slide 68

Approximate dynamic programming

© 2005 Warren B. Powell Slide 69

Approximate dynamic programming

© 2005 Warren B. Powell Slide 70

Approximate dynamic programming

(Figure: resource vectors R^n_1, ..., R^n_5 feeding the assignment network at iteration n.)

© 2005 Warren B. Powell Slide 71

Approximate dynamic programming

We estimate the functions by sampling from our distributions.

(Figure: for a sample ω, the resource vectors R^n_1, ..., R^n_5 are assigned to sampled demands D^n_1(ω), D^n_2(ω), D^n_3(ω); the assignment yields the marginal values v^n_1(ω), ..., v^n_5(ω).)
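A hedged sketch of this sampling step: for a sampled demand realization ω, the marginal value of one more unit of attribute a is approximated by a finite difference of the subproblem objective and smoothed into the running estimate. Here solve_subproblem and sample_omega are placeholders for the time-t assignment problem and the demand distribution.

```python
def sample_marginal_values(R, solve_subproblem, sample_omega, v_bar, stepsize=0.1):
    omega = sample_omega()
    base = solve_subproblem(R, omega)
    for a in R:
        R_plus = dict(R)
        R_plus[a] += 1                       # one more unit of attribute a
        v_hat = solve_subproblem(R_plus, omega) - base
        # Smooth the sampled marginal value into the running estimate.
        v_bar[a] = (1 - stepsize) * v_bar.get(a, 0.0) + stepsize * v_hat
    return v_bar
```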

© 2005 Warren B. Powell Slide 72

Approximate dynamic programming

The time t subproblem:

(Figure: resources R_t1, R_t2, R_t3 at time t flow into downstream nodes such as (i-1, t+3), (i, t+1), and (i+1, t+5), which carry the value function approximations V̄^n_ta; the subproblem produces left and right gradients (v̂^n_t-, v̂^n_t+) for each resource.)

© 2005 Warren B. Powell Slide 73

Approximate dynamic programming

Left and right gradients are found by solving flow-augmenting path problems.

(Figure: the gradient v̂^n_t3 for resource R_t3 at node (i-1, t+3).)

The right derivative (the value of one more unit of that resource) is a flow-augmenting path from that node to the supersink.

© 2005 Warren B. Powell Slide 74

Approximate dynamic programming

Left and right derivatives are used to build up a nonlinear approximation of the subproblem.

(Figure: the approximation V̄_t(R_1t) built around the observed point R^k_1t.)

© 2005 Warren B. Powell Slide 75

Approximate dynamic programming

Left and right derivatives are used to build up a nonlinear approximation of the subproblem.

(Figure: the left derivative v^{k-}_t and right derivative v^{k+}_t give the slopes of V̄_t(R_1t) to the left and right of R^k_1t.)

© 2005 Warren B. Powell Slide 76

Approximate dynamic programming

Each iteration adds new segments, as well as refining old ones.

(Figure: at iteration k+1, new slopes v^{(k+1)-}_t and v^{(k+1)+}_t are added around the new point R^{k+1}_1t.)
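One simple way to fold sampled left and right derivatives into a piecewise-linear approximation is sketched below; this is an assumption for illustration, not necessarily the exact update used in the talk.

```python
# slopes[k] estimates the marginal value of the (k+1)-th unit; v_minus and v_plus
# are the sampled left and right derivatives at the observed level R_obs.
def update_pwl(slopes, R_obs, v_minus, v_plus, stepsize=0.1):
    while len(slopes) <= R_obs:              # make room for the updated segments
        slopes.append(slopes[-1] if slopes else 0.0)
    if R_obs >= 1:
        slopes[R_obs - 1] = (1 - stepsize) * slopes[R_obs - 1] + stepsize * v_minus
    slopes[R_obs] = (1 - stepsize) * slopes[R_obs] + stepsize * v_plus
    # One smoothing pass that pushes the slopes back toward concavity
    # (marginal values nonincreasing in the count).
    for k in range(1, len(slopes)):
        if slopes[k] > slopes[k - 1]:
            slopes[k] = slopes[k - 1] = 0.5 * (slopes[k] + slopes[k - 1])
    return slopes
```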

© 2005 Warren B. Powell Slide 77

Approximate dynamic programming

(Figure: the approximate value function vs. the number of resources for f(s) = ln(1+s); the approximation after 1, 2, 5, 10, 15, and 20 iterations is compared with the exact function.)

© 2005 Warren B. Powell Slide 78

Simulating a myopic policy

Approximate dynamic programming


© 2005 Warren B. Powell Slide 79

Simulating a myopic policy

Approximate dynamic programming

© 2005 Warren B. Powell Slide 80

Using value functions to anticipate the future

Approximate dynamic programming

(Figure: the “here and now” assignment problem at time t linked to its downstream impacts.)

© 2005 Warren B. Powell Slide 81

Approximate dynamic programming

Using value functions to anticipate the future

© 2005 Warren B. Powell Slide 82

Approximate dynamic programming

Using value functions to anticipate the future

© 2005 Warren B. Powell Slide 83

Approximate dynamic programming

Using value functions to anticipate the future

© 2005 Warren B. Powell Slide 84

© 2005 Warren B. Powell Slide 85

© 2005 Warren B. Powell Slide 86

© 2005 Warren B. Powell Slide 87

© 2005 Warren B. Powell Slide 88

Approximate dynamic programming

Approximate DP vs. LP

(Figure: percent of the objective upper bound (80-100%) vs. iteration number (1-100) for Agg_PWLinear_1, Agg_PWLinear_2, Agg_PWLinear_3, DisAgg_Linear, DisAgg_PWLinear, and Decomp_Location, compared against the mathematical optimum.)

© 2005 Warren B. Powell Slide 89

Downloadable at www.castlelab.princeton.edu

© 2005 Warren B. Powell Slide 90

The information classes

» Knowledge K_t
» Forecasts of exogenous events Ω_t
» Forecasts of impacts on others V̄_t
» Expert knowledge ρ

© 2005 Warren B. Powell Slide 91

Low dimensional patterns

Old modeling approach: engineering costs

x* = arg min c x
subject to: A x = b, x ≥ 0

The cost function expresses the objectives and any desired “behavior”; the constraints express the “physics”.

© 2005 Warren B. Powell Slide 92

Flows from history

© 2005 Warren B. Powell Slide 93

Flows from history

Flows from the model

© 2005 Warren B. Powell Slide 94

Low dimensional patterns

Bottom up/top down modeling:

» Patterns: specify the behaviors you want at a general level.
» Engineering: specify costs, driver availability, work rules, routing preferences, load availability.

© 2005 Warren B. Powell Slide 95

Low dimensional patterns

Pattern matching

x* = arg min { c x + θ H(x, ρ) }

Here cx is the cost function and the second term captures “behavior”: H is the “happiness” function, which measures the degree to which model behavior agrees with a knowledgeable expert,

H(x, ρ) = || G(x) − ρ ||, where G(x) is an aggregation function.
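A minimal sketch of this objective, using an L1 norm for the happiness function for concreteness (the slide does not specify the norm); the flows x, costs c, aggregation G, and expert pattern ρ are all illustrative placeholders.

```python
def pattern_objective(c, x, G, rho, theta):
    """c.x + theta * ||G(x) - rho||_1"""
    engineering_cost = sum(c[k] * x[k] for k in x)
    gx = G(x)                                   # aggregate flows into pattern space
    happiness_penalty = sum(abs(gx[p] - rho.get(p, 0.0)) for p in gx)
    return engineering_cost + theta * happiness_penalty
```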

© 2005 Warren B. Powell Slide 96

Low dimensional patterns

Patterns and aggregation:
» What we do:
• We define patterns based on an aggregation of the attributes of a single vehicle.
• Patterns indicate the desirability of a single decision.
» Patterns can be expressed at different levels of aggregation, simultaneously:
• Don’t send C-5’s into Saudi Arabia.
• Don’t send C-5’s needing maintenance into Saudi Arabia.
• Don’t send C-5’s needing maintenance loaded with freight to southeast Asia into Saudi Arabia.
» Patterns are not hard rules: they express desirable or undesirable patterns of behavior.

© 2005 Warren B. Powell Slide 97

Flows from history

Flows from the model

© 2005 Warren B. Powell Slide 98

Flows from history

Flows from the model

© 2005 Warren B. Powell Slide 99

Low dimensional patterns

Length of haul calibration (teams)

(Figure: length of haul over iterations 1-10 for solo capacity with and without the pattern, plotted against the historical minimum and maximum; the vertical scale runs from 600 to 850.)

© 2005 Warren B. Powell Slide 100

Low dimensional patterns

Patterns can come from history:

© 2005 Warren B. Powell Slide 101

Low dimensional patterns

… or an expert:

© 2005 Warren B. Powell Slide 102

The information classes

» Knowledge K_t
» Forecasts of exogenous events Ω_t
» Forecasts of impacts on others V̄_t
» Expert knowledge ρ

© 2005 Warren B. Powell Slide 103

The military airlift problem

© 2005 Warren B. Powell Slide 104

Decision functions and information classes (the optimizing simulator, ordered by increasing information sets):

» (RB:R-A) Rule-based: I_t = R_tt
» (MP:R-AL/KNAN) Myopic cost-based, one requirement to a list of aircraft, known now and actionable now: I_t = (R_tt, c_t)
» (MP:R-AL/KNAF) Myopic cost-based, one requirement to a list of aircraft, known now and actionable in the future: I_t = ((R_tt')_{t' ≥ t}, c_t)
» (MP:RL-AL/KNAN) Myopic cost-based, a list of requirements to a list of aircraft, known now and actionable now: I_t = (R_tt, c_t)
» (MP:RL-AL/KNAF) Myopic cost-based, a list of requirements to a list of aircraft, known now and actionable in the future: I_t = ((R_tt')_{t' ≥ t}, c_t)
» (RH) Rolling horizon: I_t = { ((R_t't'')_{t'' ≥ t'}, c_t') | t' ∈ T^ph }
» (ADP) Approximate dynamic programming: I_t = { ((R_t't'')_{t'' ≥ t'}, c_t', V̄_t') | t' ∈ T^ph }
» (EK) Expert knowledge: I_t = { ((R_t't'')_{t'' ≥ t'}, c_t', V̄_t', ρ) | t' ∈ T^ph }

© 2005 Warren B. Powell Slide 105

Costs of different policies

(Figure: costs in millions of dollars (0-250) for the (RB:R-A), (MP:RL-AL/KNAN), and (ADP) policies, broken into transportation cost, late delivery cost, repair cost, and total cost; annotations mark the rule-based policy, the choice of aircraft, actionable-now and actionable-future information, and value functions along the axis of increasing information sets.)

© 2005 Warren B. Powell Slide 106

Throughput curves of policies

(Figure: cumulative pounds delivered (millions, 0-50) over time periods 0-210, comparing the cumulative expected throughput with the (RB:R-A), (MP:R-AL/KNAN), (MP:RL-AL/KNAN), (MP:RL-AL/KNAF), and (ADP) policies, which use increasing information sets.)

© 2005 Warren B. Powell Slide 107

Throughput curves of policies

(Figure: the same cumulative throughput curves as on the previous slide.)

© 2005 Warren B. Powell Slide 108

Areas between the cumulative expected thruput curve and different policy thruput curves

(Figure: the area between the cumulative expected throughput curve and each policy's throughput curve, in pound-days (millions, 0-400), for the (RB:R-A), (MP:R-AL/KNAN), (MP:RL-AL/KNAN), (MP:RL-AL/KNAF), and (ADP) policies, which use increasing information sets.)

© 2005 Warren B. Powell Slide 109

Outline

Recent experiments with modeling airlift operations

© 2005 Warren B. Powell Slide 110

Random demands and equipment failures

© 2005 Warren B. Powell Slide 111

Pilots

Aircraft

Customers

© 2005 Warren B. Powell Slide 112

Case study

Questions:

» What is the effect of uncertain demands on a military airlift schedule?

» What is the effect of equipment failures?

» How does adaptive learning change the effect of randomness on the performance of the simulation?

» What is the effect of advance information?

© 2005 Warren B. Powell Slide 113

(Figure: total contribution (roughly 250,000-330,000) over about 100 learning iterations for eight cases crossing deterministic vs. random demands, with vs. without equipment failures, and with vs. without learning.)

© 2005 Warren B. Powell Slide 114

(Figure: the same learning curves, highlighting the case with deterministic demands and no failures, with and without learning.)

© 2005 Warren B. Powell Slide 115

(Figure: the same learning curves, highlighting the case with deterministic demands and equipment failures, with and without learning.)

© 2005 Warren B. Powell Slide 116

(Figure: the same learning curves, highlighting the case with random demands and no failures, with and without learning.)

© 2005 Warren B. Powell Slide 117

(Figure: the same learning curves, highlighting the case with random demands and equipment failures, with and without learning.)

© 2005 Warren B. Powell Slide 118

Effect of advance notice

(Figure: effect of advance booking: percent coverage (86-100%) for prebooking 0, 2, and 6 hours, without learning.)

© 2005 Warren B. Powell Slide 119

Effect of advance notice

(Figure: effect of advance booking: percent coverage (86-100%) for prebooking 0, 2, and 6 hours, with and without learning.)

© 2005 Warren B. Powell Slide 120

Midair refueling: initial solution

© 2005 Warren B. Powell Slide 121

Midair refueling: initial solution

Path followed by tanker (moves up and down Atlantic).

© 2005 Warren B. Powell Slide 122

Midair refueling: initial solution

Second plane crashes

First plane refuels

Green: full of fuel. Yellow to red: nearing empty. Black: empty (plane crashes).

© 2005 Warren B. Powell Slide 123

Midair refueling: exploration

Learning over many iterations.

© 2005 Warren B. Powell Slide 124

Planes learn to meet in the middle so both can refuel.

Midair refueling: final solution

© 2005 Warren B. Powell Slide 125

Outline

Calibrating a model for a major truckload motor carrier

© 2005 Warren B. Powell Slide 126

Schneider National

© 2005 Warren B. Powell Slide 127

Schneider National

© 2005 Warren B. Powell Slide 128

© 2005 Warren B. Powell Slide 129

Truckload trucking

Questions for the model:
» What types of drivers should they hire?
• Domicile?
• Single drivers vs. teams?
» What is the value of knowing about customer requests farther in the future?
» What is the profitability of different customers?
» What is the value of increasing terminal capacity?

© 2005 Warren B. Powell Slide 130

Truckload trucking

Length of haul (LOH)

(Figure: length of haul (0-1600) by capacity category (US_SOLO, US_IC, US_TEAM), comparing the simulation with the historical minimum and maximum.)

© 2005 Warren B. Powell Slide 131

Truckload trucking

(Figure: revenue per work unit (WU) and utilization by capacity category (US_SOLO, US_IC, US_TEAM), comparing the simulation with the historical minimum and maximum.)

© 2005 Warren B. Powell Slide 132

Truckload trucking

Challenge:
» We want to know the marginal value of each type of driver.
» A driver type is determined by its attribute vector:

a = (Location, Domicile, Driver type), with 100 locations, 100 domiciles, and 3 driver types.

» There are 30,000 driver “types”!!!
» We need to take the “derivative” of our simulation for each type.
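As simple arithmetic, the 30,000 figure is the product of the attribute cardinalities on this slide; the dictionary keys below are just labels for those three attributes.

```python
from math import prod

attribute_sizes = {"location": 100, "domicile": 100, "driver_type": 3}
print(prod(attribute_sizes.values()))  # -> 30000 driver "types"
```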

© 2005 Warren B. Powell Slide 133

Multistage problems

(Figure: resource state (by type) over times t, t+1, t+2; the decision function X^π(R_t) at time t yields marginal value estimates v̂^n_t1, v̂^n_t2, v̂^n_t3.)

© 2005 Warren B. Powell Slide 134

Multistage problems

(Figure: at time t+1, the decision function X^π(R_{t+1}) yields marginal value estimates v̂^n_{t+1,1}, v̂^n_{t+1,2}, v̂^n_{t+1,3}.)

© 2005 Warren B. Powell Slide 135

Multistage problems

(Figure: at time t+2, the decision function X^π(R_{t+2}) yields marginal value estimates v̂^n_{t+2,1}, v̂^n_{t+2,2}, v̂^n_{t+2,3}.)

© 2005 Warren B. Powell Slide 136

Multistage problems

(Figure: the forward pass at time t, with X^π(R_t) and the marginal values v̂^n_t1, v̂^n_t2, v̂^n_t3.)

© 2005 Warren B. Powell Slide 137

Multistage problems

(Figure: the forward pass at time t+1, with X^π(R_{t+1}) and the marginal values v̂^n_{t+1,1}, v̂^n_{t+1,2}, v̂^n_{t+1,3}.)

© 2005 Warren B. Powell Slide 138

Multistage problems

(Figure: the forward pass at time t+2, with X^π(R_{t+2}) and the marginal values v̂^n_{t+2,1}, v̂^n_{t+2,2}, v̂^n_{t+2,3}.)

© 2005 Warren B. Powell Slide 139

Backward pass

(Figure: the sequence of decision functions X^π(R_t), X^π(R_{t+1}), X^π(R_{t+2}) across the horizon.)

© 2005 Warren B. Powell Slide 140

Backward pass

(Figure: the backward pass at time t+2, showing the marginal value v̂^n_{t+2,1}.)

© 2005 Warren B. Powell Slide 141

Backward pass

(Figure: the backward pass at time t+1, showing the marginal value v̂^n_{t+1,2}.)

© 2005 Warren B. Powell Slide 142

Backward pass

(Figure: the backward pass at time t, showing the marginal value v̂^n_t3.)

© 2005 Warren B. Powell Slide 143

Backward pass

(Figure: the completed backward pass from time t+2 back to time t.)

© 2005 Warren B. Powell Slide 144

Driver fleet optimization

(Figure: simulation objective function (1,800,000-1,900,000) vs. number of drivers (580-650) for ten sample paths s1-s10, their average, and the predicted value; annotations mark the base case and +5, +10, +20, +30, +40, +50, and +60 resources.)

© 2005 Warren B. Powell Slide 145

Driver fleet optimization

(Figure: the same objective-function curves: sample paths s1-s10, their average, and the prediction.)

© 2005 Warren B. Powell Slide 146

Driver fleet optimization

(Figure: the same objective-function curves, with the average across sample paths highlighted.)

© 2005 Warren B. Powell Slide 147

Driver fleet optimization

(Figure: estimated marginal values (roughly -500 to 3,500) for 20 driver types.)

© 2005 Warren B. Powell Slide 148

Add drivers

© 2005 Warren B. Powell Slide 149

Reduce drivers

© 2005 Warren B. Powell Slide 150

Questions?