2011 02-04 - d sallier - prévision probabiliste

Paris, ENGREF, 4 February 2011

Probabilistic demand forecasting

Prepared & presented by Daniel SALLIER
Traffic Data & Forecasting Director
Aéroports de Paris

Content

Foreground
• The "classical" forecasting approach
• Drawbacks of the "classical" forecasting approach
• 2 generic sources of uncertainty in any forecast

How to cope with the intrinsic technical uncertainty
• What we are looking for …
• Let's go back to the very basics
• Step #1: model determination
• Step #2: determination of the law of probability of the model parameters
• Step #3: determination of the law of probability of the model output: Y
• Step #4: determination of the law of probability of the future values

Content (continued)

The data aggregation / break-up issue
• The data aggregation issue
• The data break-up issue

Part of the prospective uncertainty: the residual issue
• What are residuals?
• Taking into account part of the prospective risk

Further developments and applications
• Vertical cuts for most of the short-term utilisation
• Horizontal cuts for most of the mid- & long-term utilisation

Conclusions
• So many advantages, so few drawbacks

Foreground


The "classical" forecasting approach

• Econometric or time-series models most of the time;
• Assumptions on the future values of the inputs, leading to:
  - a single forecast value (base case?);
  - a scenario-based forecast.
• "Post-processing" of the model outputs by the experts and/or the management.

[Figure: historical traffic followed by base-case, high-case and low-case forecasts; x-axis: Year (1950 to 2020); y-axis: Passengers (M)]

Drawbacks of the "classical" forecasting approach

• The "cheating/forgery" risk:
  - "political" figures decided by the management, to be "scientifically" justified by the forecasting team;
  - experts eager to be as consensual as possible with the rest of the community: better to be wrong together than right alone!
  It ends up with self-deception in the company.
• The never-ending "what if …" questions asked by a management afraid of having to make a decision;
• The forecasting team implicitly deciding what level of risk the company should incur;
• A single figure, or even a set of scenario-related figures, does not make any sense from a mathematical and statistical point of view.

2 generic sources of uncertainty in any forecast

• The intrinsic technical uncertainty:
  - the assumptions on the future values of the inputs (GDP, population, fares, …);
  - the very nature of the forecasting model (linear law, exponentiation law, log law, …);
  - the uncertainty on the values of the parameters of the forecasting models;
  - the residuals: the differences between actual values and estimates.
• The prospective uncertainty: any "abnormal" event which may happen in the future.

The techniques developed by ADP's R&D team mostly address the 1st type of generic uncertainty: the intrinsic technical uncertainty.

How to cope with the intrinsic technical uncertainty

What is the output we are looking for …

Probability theory provides the tools to address most of the issues raised by the measurement of present and future uncertainty:

Probability … for the demand to be … than … (dummy data)

                 High fare                  Mid fare                   Low fare
Probability …    Pax (M)   % 2003 vs 2002   Pax (M)   % 2003 vs 2002   Pax (M)   % 2003 vs 2002
95% greater …    65.6      -8.1%            66.4      -7.1%            67.6      -5.4%
90% greater …    66.0      -7.7%            66.7      -6.6%            68.0      -4.9%
80% greater …    66.1      -7.4%            66.9      -6.4%            68.1      -4.6%
50% greater …    66.7      -6.6%            67.5      -5.5%            68.8      -3.7%
80% lower …      67.8      -5.2%            68.5      -4.1%            69.8      -2.3%
90% lower …      68.3      -4.4%            69.1      -3.3%            70.4      -1.5%
95% lower …      68.5      -4.1%            69.3      -3.0%            70.6      -1.2%
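A table of this kind can be read directly off a set of simulated demand values. The sketch below is only an illustration under stated assumptions: the samples are synthetic (a Normal law centred on 67 M pax, chosen arbitrarily) and the function name `exceedance_table` is hypothetical, not part of ADP's tools.

```python
import numpy as np

def exceedance_table(samples, levels=(0.95, 0.90, 0.80, 0.50)):
    """Map each probability label to the demand value it corresponds to."""
    samples = np.asarray(samples, dtype=float)
    table = {}
    for p in levels:
        # P(demand >= v) = p  <=>  v is the (1 - p) quantile of the samples
        table[f"{p:.0%} greater"] = float(np.quantile(samples, 1.0 - p))
    for p in levels[:-1][::-1]:
        # P(demand <= v) = p  <=>  v is the p quantile of the samples
        table[f"{p:.0%} lower"] = float(np.quantile(samples, p))
    return table

rng = np.random.default_rng(0)
samples = rng.normal(67.0, 1.0, 10_000)   # synthetic demand samples, M pax (assumption)
table = exceedance_table(samples)
for label, value in table.items():
    print(f"{label:12s} {value:6.1f}")
```

The rows come out in the same order as in the table, from "95% greater" down to "95% lower", so the values increase from top to bottom.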

… how to proceed?

Let's go back to the very basics

The full story always starts with a cloud of dots, out of which one should find one or several laws/models to be further used as forecasting model(s):

[Figure: cloud of dots of the actual data]


Step #1: model determination

• One or several models can fit the data. The way the models are determined is not important (econometric models, behavioural models, etc.).
• Unless one has a precise reason to select a specific model, there is no reason to keep just one of them and discard all the others. Each model is given an equal chance.
• R&D work is in progress to address this issue: the ADN engine, for Alexander's Drift Net.

[Figure: three panels showing the actual data fitted with the 1st model; with the 1st and 2nd models; with the 1st, 2nd and 3rd models]
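As an illustration of keeping several candidate models rather than a single one, here is a minimal sketch. The data are synthetic, and the three candidate forms (linear, quadratic, log) are assumptions made for the example; they are not the models used in the presentation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(1.0, 20.0, 40)
y = 5.0 + 2.0 * x + rng.normal(0.0, 1.5, x.size)   # synthetic "cloud of dots"

# Each candidate is linear in its coefficients once X is transformed,
# so ordinary least squares can fit all of them the same way.
candidates = {
    "linear":    lambda x: np.column_stack([np.ones_like(x), x]),
    "quadratic": lambda x: np.column_stack([np.ones_like(x), x, x**2]),
    "log":       lambda x: np.column_stack([np.ones_like(x), np.log(x)]),
}

fits = {}
for name, design in candidates.items():
    coef, *_ = np.linalg.lstsq(design(x), y, rcond=None)
    fits[name] = coef

# No model is discarded: each one is given an equal chance.
weights = {name: 1.0 / len(fits) for name in fits}
print(fits, weights)
```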

Step #2: determination of the law of probability of the model parameters

Let's take the 1st model, for instance.

• Its equation is:

[Equation not recoverable from the extracted text. Surviving fragments: Y expressed as a function of X, a COS term, and a residual term associated with a Normal law.]

• Bootstrap techniques make it possible to determine the laws of probability of the different parameters of the model, which are strongly correlated with each other.

[Figure: two histograms of parameter values; x-axes from 17.5 to 20.5 and from 0.08 to 0.18; y-axis: Probability, 0% to 9%]

Example of draws of random samples of the model parameters (one row per draw, one column per parameter)

844.450   18.715   0.139   27.066   -0.329
843.870   18.829   0.135   27.137   -0.301
846.836   18.175   0.160   28.604   -0.202
843.995   18.801   0.136   27.171   -0.315
846.836   18.175   0.160   28.604   -0.202
843.016   19.032   0.128   26.271   -0.373
845.921   18.400   0.151   27.810   -0.279
840.439   19.633   0.107   24.506   -0.443
845.442   18.483   0.148   27.915   -0.263
843.995   18.801   0.136   27.171   -0.315
845.921   18.400   0.151   27.810   -0.279
844.450   18.715   0.139   27.066   -0.329
844.450   18.715   0.139   27.066   -0.329
847.198   17.975   0.169   29.259   -0.086
847.198   17.975   0.169   29.259   -0.086
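The slides do not say which bootstrap variant is used, so the following is only a sketch under assumptions: a pairs (case-resampling) bootstrap, applied to a simple straight-line model on synthetic data. It shows why each draw must be kept as a whole row: the parameters of one draw are correlated with each other.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0.0, 10.0, 60)
y = 3.0 + 1.5 * x + rng.normal(0.0, 1.0, x.size)   # synthetic history (assumption)

def fit(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    b, a = np.polyfit(x, y, 1)   # polyfit returns the highest degree first
    return a, b

n_boot = 2000
draws = np.empty((n_boot, 2))    # one row per bootstrap draw: (a, b)
for i in range(n_boot):
    idx = rng.integers(0, x.size, x.size)   # resample (x, y) pairs with replacement
    draws[i] = fit(x[idx], y[idx])

# Intercept and slope are strongly (negatively) correlated across draws,
# so later steps must sample whole rows, never each column separately.
corr = np.corrcoef(draws[:, 0], draws[:, 1])[0, 1]
print(draws[:5], corr)
```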

Step #3: determination of the law of probability of the model output: Y

At this stage we have all the probabilistic components of the forecasting model. That is where the Monte Carlo technique proves to be useful:

• Take a future deterministic or sampled value of X;
• Draw a random sample of the model parameters;
• Compute the corresponding value of Y;
• Save the value of Y;
• Start the process again until a sufficient number of Ys has been collected;
• Compute the frequency/probability law of Y.

[Figure: forecasting model #2 plotted over the actual data (X axis, Y axis), showing the curve giving a 50% probability for Y to be greater or equal, and the band within which Y lies with a 98% probability]
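The loop described above can be sketched as follows. Everything specific here is an assumption made for the example: the model form Y = a + b·X, the joint parameter draws (generated synthetically in place of a real bootstrap output) and the law chosen for the future X.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical joint parameter draws (a, b) standing in for the bootstrap output.
param_draws = rng.multivariate_normal(
    mean=[3.0, 1.5], cov=[[0.06, -0.009], [-0.009, 0.002]], size=2000
)

def simulate_y(x_future, param_draws, n_sim, rng):
    """Monte Carlo loop: sample X, sample the parameters, compute and save Y."""
    ys = np.empty(n_sim)
    for k in range(n_sim):
        x = x_future(rng)                                    # deterministic or sampled X
        a, b = param_draws[rng.integers(len(param_draws))]   # one joint parameter draw
        ys[k] = a + b * x                                    # corresponding value of Y
    return ys

ys = simulate_y(lambda rng: rng.normal(12.0, 0.5), param_draws, 10_000, rng)

# Frequency law of Y summarised by the 98% band and the 50% level.
low, median, high = np.quantile(ys, [0.01, 0.50, 0.99])
print(low, median, high)
```

Repeating this for every future value of X gives the band and the median curve shown on the chart.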

Step #4: determination of the law of probability of the future values

At this stage of the process we have the probabilistic future values of each forecasting model.

That is where the Monte Carlo technique is used once again, to combine all these values and get the final probabilistic forecast.

Each model is given an equal probability of occurring.

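[Figure: final probabilistic forecast plotted over the actual data (X axis, Y axis), showing the curve giving a 50% probability for Y to be greater or equal, and the band within which Y lies with a 98% probability]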
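Combining the models with equal probabilities amounts to sampling from an equally weighted mixture of their outputs. A minimal sketch, assuming three hypothetical models whose Monte Carlo outputs for one future X are replaced here by synthetic Normal samples:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 10_000

# Hypothetical Monte Carlo outputs of three models for the same future X.
model_samples = {
    "model_1": rng.normal(20.0, 1.0, n),
    "model_2": rng.normal(22.0, 1.5, n),
    "model_3": rng.normal(21.0, 0.8, n),
}

names = list(model_samples)
picked = rng.integers(0, len(names), n)            # each model is equally likely
stacked = np.column_stack([model_samples[m] for m in names])
combined = stacked[np.arange(n), picked]           # one value taken from the picked model

low, median, high = np.quantile(combined, [0.01, 0.50, 0.99])
print(low, median, high)
```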

The data aggregation / break-up issue

The data aggregation issue

Let's suppose that we are interested in the forecast demand of French residents, which depends on the French GDP.

For a given value of the French GDP, we can calculate a forecast demand to/from the UK, to/from the USA, to/from Japan, etc. It means that, from a statistical point of view, the different flows of traffic from/to France cannot be regarded as independent variables.

A straightforward application of the Monte Carlo technique would mix all the random samples throughout the computation as if they were fully independent, which they are not.

The data aggregation issue (continued)

This problem can be overcome by "flagging" each value of the explanatory variables (e.g. French GDP, British GDP, etc.) and "sticking" the flag value(s) to the intermediate or final random samples which share the same value of the explanatory variable(s).

Instead of "mixing around" the whole data set, the Monte Carlo engine just "mixes around" the random samples which share the same flag.

French domestic demand   From/to UK     From/to Germany   From/to France
Flag   Pax               Flag   Pax     Flag   Pax        Flag   Pax
1      11.0              1      4.9     1      6.4        1      22.4
1      10.9              1      5.2     1      7.0        1      22.9
1      13.3              1      5.9     1      7.9        1      23.9
1      13.4              1      6.7     1      7.9        1      23.8
2      14.2              2      6.1     2      8.2        1      22.6
2      12.4              2      5.7     2      7.7        1      23.2
2      13.6              2      6.7     2      7.2        1      24.1
3      12.0              3      5.2     3      6.7        1      24.1
3      12.1              3      5.9     3      7.2        1      23.3
3      10.8              3      4.7     3      6.9        1      23.9
3      14.0              3      6.8     3      7.8        1      24.8
4      12.7              4      5.6     4      7.3        1      24.8
4      10.1              4      4.6     4      5.7        1      24.1
4      11.8              4      5.8     4      6.2        1      24.7
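The effect of the flags can be illustrated with a small experiment. This is a sketch under assumptions, not ADP's engine: four hypothetical GDP values serve as flags, and each flow is an arbitrary linear function of GDP plus noise. Mixing samples within a flag keeps the dependence between flows; mixing them across the whole set loses it and understates the spread of the total.

```python
import numpy as np

rng = np.random.default_rng(5)
gdp_scenarios = np.array([0.5, 1.5, 2.5, 3.5])   # hypothetical GDP growth values; index = flag
n_per_flag = 500

def flow_samples(base, sensitivity):
    """Samples of one flow, shape (n_flags, n_per_flag); each row shares one GDP value."""
    noise = rng.normal(0.0, 0.3, (gdp_scenarios.size, n_per_flag))
    return base + sensitivity * gdp_scenarios[:, None] + noise

flows = {
    "domestic": flow_samples(11.0, 0.8),
    "uk": flow_samples(5.0, 0.4),
    "germany": flow_samples(6.5, 0.5),
}

# Flag-aware aggregation: shuffle within each flag (row) only, then add up.
total_flagged = sum(rng.permuted(f, axis=1) for f in flows.values()).ravel()

# Naive aggregation: shuffle across the whole sample set, ignoring the flags.
total_naive = sum(rng.permutation(f.ravel()) for f in flows.values())

print(total_flagged.std(), total_naive.std())
```

Both totals have the same mean, but the naive one is too narrow because high-GDP samples of one flow get added to low-GDP samples of another.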

The data break-up issue (continued)

Let's suppose that the overall business level of risk has been set, for instance, to an 80% probability for the overall demand to be greater or equal. How does it cascade down? What is the corresponding level of risk for each traffic flow?

One should bear in mind that, unfortunately, 1 + 1 ≠ 2 when dealing with probabilities; 1 + 1 could make 1.9! In other words, the 80% values of the individual flows do not add up to the 80% value of the overall demand.

Flagging the random samples of each traffic flow is one of the solutions to trace back which ones have been used in the final computation.

[Figure: two cumulated distributions of probabilities. Left: overall demand, with the 80% level separating the set of samples to be discarded from the set of samples to be elected. Right: demand of the traffic flow #i, with the frequency law of the elected samples and the corresponding level, 74%.]
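The slides do not spell out how the samples are "elected", so the sketch below is one possible reading of the figure, stated as an assumption: keep the overall-demand samples lying close to the chosen 80% level, trace back the flow values that produced them, and locate those values in each flow's own distribution. The two flows and their common driver are synthetic.

```python
import numpy as np

rng = np.random.default_rng(6)
n = 20_000
common = rng.normal(0.0, 1.0, n)                     # shared driver, e.g. GDP (assumption)
flows = {
    "flow_1": 12.0 + 0.5 * common + rng.normal(0.0, 1.0, n),
    "flow_2": 6.0 + 0.5 * common + rng.normal(0.0, 1.0, n),
}
total = sum(flows.values())      # sample k of every flow was used to build total[k]

p = 0.80                                              # P(total >= threshold) = 80%
lo, hi = np.quantile(total, [1 - p - 0.01, 1 - p + 0.01])
elected = (total >= lo) & (total <= hi)               # samples around the 80% level

flow_levels = {}
for name, values in flows.items():
    centre = np.median(values[elected])               # typical flow value among elected samples
    flow_levels[name] = float(np.mean(values >= centre))   # its own "greater or equal" probability

print(flow_levels)
```

With these synthetic inputs each flow ends up at a level below 80%, which is the "1 + 1 ≠ 2" effect: an 80% level on the total corresponds to a less demanding level on each component.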

Part of the prospective uncertainty: the residual issue

Taking into account part of the prospective risk

A very simple and straightforward idea:

• Determination of the law of probability of the residuals.
• Addition of the residual effects to the "regular" probabilistic forecast, which can be achieved with a new round of Monte Carlo simulations.

By doing so we can take into account part of the prospective risks, i.e. the risks linked to "unusual" events which have already happened in the past and may happen again.

Of course, there are no statistical or probabilistic methods to estimate the effects of future events which have never happened before; that is where scenario-based approaches can be brought back to the front of the stage.

This approach answers the amplitude and likelihood questions about the "unusual" events. It does not answer the when and how long questions: it just measures a "latent risk".
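A minimal sketch of this extra Monte Carlo round, under assumptions: the residuals are expressed as a fraction of total pax, their law of probability is taken to be the empirical one (resampling of hypothetical past residuals, one of which is an "unusual" year), and the "regular" forecast is a synthetic sample.

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical past residuals (actual / estimate - 1), including one "unusual" year.
past_residuals = np.array([0.01, -0.02, 0.03, 0.00, -0.01, 0.02, -0.15, 0.01, -0.03, 0.02])
regular_forecast = rng.normal(70.0, 1.5, 10_000)      # "regular" forecast samples, M pax

# New Monte Carlo round: attach one resampled residual to each forecast sample.
drawn = rng.choice(past_residuals, size=regular_forecast.size, replace=True)
with_residuals = regular_forecast * (1.0 + drawn)

band_regular = np.quantile(regular_forecast, [0.01, 0.99])
band_with_residuals = np.quantile(with_residuals, [0.01, 0.99])
print(band_regular, band_with_residuals)
```

The 98% range widens, mostly downwards, because the past "unusual" event can now recur in any simulated year; nothing is said about when it would occur.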

Taking into account part of the prospective risk (continued)

There is ground here for the development of specific financial / management / industrial tools and policies to cover part of this latent risk.

[Figure: probability distribution of the residuals; x-axis: Residuals (% of total pax), from -25% to 15%; y-axis: Probability, 0% to 25%]

[Figure: traffic/demand (M pax, 0 to 18) by year, 1986 to 2024: actual traffic data, the 50% probability curve for the demand to be greater or equal and the 98% probability range, first with no residuals, then with residuals included]

Further developments and applications

Vertical cuts for most of the short-term utilisation

[Figure: traffic/demand by year, 1986 to 2024, with the actual capacity. A vertical cut at a given year yields three curves, each with the capacity threshold marked: probability for the demand to be greater or equal vs. demand/traffic (million pax); probability for the turnover to be greater or equal vs. turnover (million €); probability for the operating profit to be greater or equal vs. operational profit (million €), with the €0 million mark; etc.]

To be used for:
• (Human) resources dimensioning
• Budget, cash flow
• Future financial ratios analysis
• Short-term risk assessment
• etc.
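A vertical cut takes all the simulated values for one given year and turns them into a "probability to be greater or equal" curve. The sketch below assumes synthetic demand samples, a hypothetical capacity threshold and a made-up revenue of 10 € per passenger; none of these figures comes from the presentation.

```python
import numpy as np

rng = np.random.default_rng(8)
demand = rng.normal(88.0, 3.0, 10_000)   # hypothetical demand samples for one year, M pax
capacity = 92.0                          # hypothetical capacity threshold, M pax

def exceedance_curve(samples):
    """Return the sorted values v and the empirical P(X >= v) for each of them."""
    v = np.sort(samples)
    prob = 1.0 - np.arange(v.size) / v.size
    return v, prob

values, prob = exceedance_curve(demand)
p_over_capacity = float(np.mean(demand >= capacity))

# Turnover derived sample by sample; traffic cannot exceed capacity.
turnover = 10.0 * np.minimum(demand, capacity)        # million EUR (assumed 10 EUR per pax)
turnover_values, turnover_prob = exceedance_curve(turnover)
print(p_over_capacity, np.quantile(turnover, 0.2))
```

Because every financial quantity is computed sample by sample, its curve inherits the full uncertainty of the demand forecast.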

Horizontal cuts for most of the mid- & long-term utilisation

[Figure: traffic/demand by year, 1986 to 2024, with the actual capacity and the planned capacity; below, operating profit by year, showing the 50% probability curve and the 98% centred probability range for both the actual capacity and the planned capacity; etc.]

To be mostly used for optimal dimensioning and planning of mid- and long-term capacity growth: heavy investments.
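One plausible reading of a horizontal cut, stated here as an assumption since the slide only shows the chart, is to fix a level (a capacity, for instance) and look at when the simulated demand reaches it. The demand paths below are synthetic: 60 M pax in 2011 with an uncertain annual growth rate.

```python
import numpy as np

rng = np.random.default_rng(9)
years = np.arange(2011, 2025)
n_paths = 5_000

# Hypothetical demand paths, one uncertain growth rate per path.
growth = rng.normal(0.03, 0.01, n_paths)
paths = 60.0 * (1.0 + growth[:, None]) ** (years - years[0])   # shape (n_paths, n_years)

capacity = 75.0                                   # hypothetical capacity level, M pax
reached = paths >= capacity
ever = reached.any(axis=1)                        # paths reaching capacity within the horizon
first_idx = reached.argmax(axis=1)                # index of the first year at/above capacity
saturation_year = np.where(ever, years[first_idx], -1)   # -1: never reached

# Probability that the capacity has been reached by each year.
p_reached_by_year = reached.cumsum(axis=1).astype(bool).mean(axis=0)
for year, p in zip(years, p_reached_by_year):
    print(year, f"{p:.0%}")
```

The resulting law of probability of the saturation year is the kind of input a capacity investment plan needs.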

Conclusions

So many advantages, so few drawbacks

• A quite simple idea, but a rather complex and computer-time-consuming approach;
• It puts an end to the times when forecasters were regarded as fortune-tellers, gurus, devious crooks or scientific alibis for their boss's misbehaviour (or that of their boss's boss);
• It brings the risk-taking decision back to where it should always have been: the top management. In addition, it offers the exhaustive set of data required by risk assessment tools;
• It is likely to offer better legal protection to the forecasters in case of litigation with the shareholders or the financial markets;
• Our own experience is that bankers are fond of this way of forecasting. Aren't they mostly risk traders?
• We (ADP's forecasting team) are fond of it too, since it saves us a lot of forecast post-processing time, and no more pressure is put on us to find "convenient figures".
