2011 02-04 - d sallier - prévision probabiliste
Paris, ENGREF, 4 February 2011
Probabilistic demand forecasting
Prepared & presented by Daniel SALLIER, Traffic Data & Forecasting Director, Aéroports de Paris (ADP)
Content
Foreground
  The "classical" forecasting approach
  Drawbacks of the "classical" forecasting approach
  2 generic sources of uncertainty in any forecast
How to cope with the intrinsic technical uncertainty
  What we are looking for …
  Let's go back to the very basics
  Step #1: model determination
  Step #2: determination of the law of probability of the model parameters
  Step #3: determination of the law of probability of the model output: Y
  Step #4: determination of the law of probability of the future values
The data aggregation / break-up issue
  The data aggregation issue
  The data break-up issue
Part of the prospective uncertainty: the residual issue
  What are residuals?
  Taking into account part of the prospective risk
Further developments and applications
  Vertical cuts for most of the short term utilisation
  Horizontal cuts for most of the mid & long term utilisation
Conclusions
  So many advantages, so few drawbacks
Foreground
The "classical" forecasting approach
Econometric or time-series models most of the time;
Assumptions on the future values of the inputs, leading to:
  a single forecast value (base case?);
  a scenario-based forecast.
"Post-processing" of the model outputs by the experts and/or the management.
[Figure: passengers (M) by year, 1950 to 2020: historical traffic followed by base case, high case and low case forecasts.]
Drawbacks of the "classical" forecasting approach
The "cheating/forgery" risk:
  "political" figures decided by the management, to be "scientifically" justified by the forecasting team;
  experts eager to be as consensual as possible with the rest of the community: better to be wrong together than right alone!
It ends up with self-deception in the company.
The never-ending "what if …" questions asked by a management afraid of having to make a decision;
The forecasting team implicitly deciding what level of risk the company should incur;
A single figure, or even a set of scenario-related figures, does not make any sense from a mathematical and statistical point of view.
2 generic sources of uncertainty in any forecast
The intrinsic technical uncertainty:
  the assumptions on the future values of the inputs (GDP, population, fares, …);
  the very nature of the forecasting model (linear law, exponentiation law, log law, …);
  the uncertainty on the values of the parameters of the forecasting models;
  the residuals: the difference between actual values and estimates.
The prospective uncertainty: any "abnormal" event which may happen in the future.
The techniques developed by ADP's R&D team mostly address the 1st type of generic uncertainty: the intrinsic technical uncertainty.
How to cope with the intrinsic technical uncertainty
What is the output we are looking for …
The theory of probabilities provides the tools to answer most of the issues raised by the measurement of present and future uncertainty:

Probability … for the demand to be … than … (dummy data)

                 High fare                 Mid fare                  Low fare
                 Pax (M)  % 2003 vs 2002   Pax (M)  % 2003 vs 2002   Pax (M)  % 2003 vs 2002
95% greater …    65.6     -8.1%            66.4     -7.1%            67.6     -5.4%
90% greater …    66.0     -7.7%            66.7     -6.6%            68.0     -4.9%
80% greater …    66.1     -7.4%            66.9     -6.4%            68.1     -4.6%
50% greater …    66.7     -6.6%            67.5     -5.5%            68.8     -3.7%
80% lower …      67.8     -5.2%            68.5     -4.1%            69.8     -2.3%
90% lower …      68.3     -4.4%            69.1     -3.3%            70.4     -1.5%
95% lower …      68.5     -4.1%            69.3     -3.0%            70.6     -1.2%

… how to proceed?
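As an illustration of what such a table contains, here is a minimal Python sketch that reads the rows off a set of simulated demand values. It is not taken from the presentation: the function names, the simulated samples and the reference traffic of 71.4 M pax are all hypothetical. The only idea it relies on is that the level exceeded with probability p is the (1 - p) empirical quantile of the samples.

```python
import random

def exceedance_level(samples, probability):
    """Level that the demand is greater than or equal to with this probability."""
    ordered = sorted(samples)
    # The level exceeded with probability p is the (1 - p) empirical quantile.
    index = int(round((1.0 - probability) * (len(ordered) - 1)))
    return ordered[index]

def probability_table(samples, reference):
    """Rows of (label, level, % change versus the reference-year traffic)."""
    rows = []
    for label, p in [("95% greater", 0.95), ("90% greater", 0.90),
                     ("80% greater", 0.80), ("50% greater", 0.50),
                     ("80% lower", 0.20), ("90% lower", 0.10),
                     ("95% lower", 0.05)]:
        level = exceedance_level(samples, p)
        rows.append((label, level, 100.0 * (level / reference - 1.0)))
    return rows

rng = random.Random(0)
# Hypothetical simulated demand (million pax); these are not the author's figures.
demand_samples = [rng.gauss(67.5, 1.0) for _ in range(10_000)]
for label, level, change in probability_table(demand_samples, reference=71.4):
    print(f"{label:<12} {level:6.1f} {change:+6.1f}%")
```

Note that "80% lower" is the same reading as "20% greater": both point to the level below which 80% of the samples fall.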
Let's go back to the very basics
The full story always starts with a cloud of dots out of which one should find one or several laws/models to be further used as forecasting model(s).
[Figure: scatter plot of the actual data.]
Step #1: model determination
One or several models can fit the data. The way the models are determined is not important (econometric models, behavioural models, etc.).
Unless one has a precise reason to select a specific model, there is no reason to keep just one of them and discard all the others. Each model is given an equal chance.
R&D work is in progress to address this issue: the ADN engine, for Alexander's Drift Net.
[Figure: the actual data fitted successively by a 1st, a 2nd and a 3rd model.]
Step #2: determination of the law of probability of the model parameters
Let's take the 1st model for instance.
• Its equation is:
  [Equation lost in extraction. The surviving fragments indicate a model of Y as a function of X that includes a cosine term, plus a residual following a Normal law.]
• Bootstrap techniques make it possible to determine the laws of probability of the different parameters of the model, which are strongly correlated with one another.
[Figures: two histograms giving the probability distribution of a model parameter, one over the range 17.5 to 20.5, the other over the range 0.08 to 0.18.]
Example of drawings of random samples of the model parameters (one row per drawing, one column per parameter):
844.450   18.715   0.139   27.066   -0.329
843.870   18.829   0.135   27.137   -0.301
846.836   18.175   0.160   28.604   -0.202
843.995   18.801   0.136   27.171   -0.315
846.836   18.175   0.160   28.604   -0.202
843.016   19.032   0.128   26.271   -0.373
845.921   18.400   0.151   27.810   -0.279
840.439   19.633   0.107   24.506   -0.443
845.442   18.483   0.148   27.915   -0.263
843.995   18.801   0.136   27.171   -0.315
845.921   18.400   0.151   27.810   -0.279
844.450   18.715   0.139   27.066   -0.329
844.450   18.715   0.139   27.066   -0.329
847.198   17.975   0.169   29.259   -0.086
847.198   17.975   0.169   29.259   -0.086
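The bootstrap step can be sketched in Python as follows. This is an illustration under stated assumptions, not the author's implementation: since the model equation is not recoverable from the transcript, a hypothetical straight line y = a + b * x stands in for it, the data are made up, and the variant shown is the "pairs" bootstrap (resampling the observations with replacement and refitting). Each refit yields one joint drawing of the parameters, so the correlation between parameters is kept.

```python
import random

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    return mean_y - b * mean_x, b

def bootstrap_parameters(xs, ys, n_draws, rng):
    """Refit the model on resampled (x, y) pairs; each refit is one joint draw."""
    n = len(xs)
    draws = []
    while len(draws) < n_draws:
        picks = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        bx = [xs[i] for i in picks]
        if len(set(bx)) < 2:  # degenerate resample: slope undefined, redraw
            continue
        draws.append(fit_line(bx, [ys[i] for i in picks]))
    return draws

rng = random.Random(0)
xs = [float(x) for x in range(20)]
ys = [5.0 + 2.0 * x + rng.gauss(0.0, 1.0) for x in xs]  # hypothetical data
draws = bootstrap_parameters(xs, ys, n_draws=500, rng=rng)
for a, b in draws[:3]:
    print(f"intercept={a:.3f} slope={b:.3f}")
```

Keeping each drawing as a tuple, rather than storing one list per parameter, is what preserves the dependence between parameters when the drawings are reused later.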
Step #3: determination of the law of probability of the model output: Y
At this stage we have all the probabilistic components of the forecasting model. That is where Monte Carlo techniques prove useful:
  Take a future deterministic or sampled value of X;
  Draw a random sample of the model parameters;
  Compute the corresponding value of Y;
  Save the value of Y;
  Start the process again until a sufficient number of Ys has been collected;
  Compute the frequency/probability law of Y.
[Figure: forecasting model #2, Y versus X: actual data, the curve with a 50% probability for Y to be greater or equal, and the band within which Y has a 98% probability of lying.]
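The loop described above can be sketched in Python as below. Everything named here is an assumption made for the example: the linear model, the three parameter drawings, the sampler for the future value of X and the helper names are hypothetical, and the empirical quantiles stand in for the "frequency/probability law" of Y.

```python
import random

def simulate_output(x_sampler, parameter_draws, model, n_runs, rng):
    """Monte Carlo loop: sample X, draw parameters, compute and save Y."""
    ys = []
    for _ in range(n_runs):
        x = x_sampler(rng)                    # future deterministic or sampled X
        params = rng.choice(parameter_draws)  # one joint drawing of the parameters
        ys.append(model(x, params))           # compute and save Y
    return sorted(ys)                         # sorted, ready for quantile reading

def quantile(ordered, q):
    """Empirical quantile of an already sorted list."""
    return ordered[int(round(q * (len(ordered) - 1)))]

def linear_model(x, params):
    a, b = params
    return a + b * x

rng = random.Random(0)
# Hypothetical joint parameter drawings (intercept, slope).
parameter_draws = [(5.0, 2.00), (4.6, 2.05), (5.4, 1.95)]
ys = simulate_output(lambda r: r.gauss(25.0, 0.5), parameter_draws,
                     linear_model, n_runs=5000, rng=rng)
print("median:", round(quantile(ys, 0.50), 2))
print("98% band:", round(quantile(ys, 0.01), 2), "to", round(quantile(ys, 0.99), 2))
```

A deterministic future X is simply a sampler that always returns the same value, for example `lambda r: 25.0`.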
Step #4: determination of the law of probability of the future values
At this stage of the process we have the probabilistic future values of each forecasting model.
That is where the Monte Carlo technique is used once again, to combine all these values and get the final probabilistic forecast.
Each model is given an equal probability of occurring.
[Figure: Y versus X: actual data, the curve with a 50% probability for Y to be greater or equal, and the band within which Y has a 98% probability of lying.]
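One way to read this combination step is as an equal-weight mixture: pick a model at random, then pick one of its simulated values. The Python sketch below illustrates that reading; the function name and the per-model samples are hypothetical, and the presentation does not say that this is exactly how the combination is implemented.

```python
import random

def combine_models(samples_by_model, n_runs, rng):
    """Equal-weight mixture: pick a model at random, then one of its Y samples."""
    names = sorted(samples_by_model)
    combined = []
    for _ in range(n_runs):
        name = rng.choice(names)  # each model is given an equal probability
        combined.append(rng.choice(samples_by_model[name]))
    return sorted(combined)

rng = random.Random(0)
# Hypothetical simulated future values of Y for three candidate models.
samples_by_model = {
    "model 1": [rng.gauss(55.0, 1.0) for _ in range(2000)],
    "model 2": [rng.gauss(57.0, 1.5) for _ in range(2000)],
    "model 3": [rng.gauss(60.0, 2.0) for _ in range(2000)],
}
final = combine_models(samples_by_model, n_runs=6000, rng=rng)
print("median of the combined forecast:", round(final[len(final) // 2], 1))
```

Choosing the model first keeps the weights equal even when the models do not come with the same number of samples.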
The data aggregation / break-up issue
The data aggregation issue
Let's suppose that we are interested in the forecast demand of French residents, which depends on the French GDP.
For a given value of the French GDP, we can calculate a forecast demand to/from the UK, to/from the USA, to/from Japan, etc. It means that, from a statistical point of view, the different traffic flows from/to France cannot be regarded as independent variables.
A straightforward application of the Monte Carlo technique would mix all the random samples along the computation process as if they were fully independent, which they are not.
The data aggregation issue (continued)
This problem can be overcome by "flagging" each value of the explanatory variables (e.g. French GDP, British GDP, etc.) and "sticking" the flag value(s) to the intermediate or final random samples which share the same value of the explanatory variable(s).
Instead of "mixing around" the whole data set, the Monte Carlo engine just "mixes around" the random samples which share the same flag.

French domestic demand   From/to UK     From/to Germany   From/to France
Flag   Pax               Flag   Pax     Flag   Pax        Flag   Pax
1      11.0              1      4.9     1      6.4        1      22.4
1      10.9              1      5.2     1      7.0        1      22.9
1      13.3              1      5.9     1      7.9        1      23.9
1      13.4              1      6.7     1      7.9        1      23.8
2      14.2              2      6.1     2      8.2        1      22.6
2      12.4              2      5.7     2      7.7        1      23.2
2      13.6              2      6.7     2      7.2        1      24.1
3      12.0              3      5.2     3      6.7        1      24.1
3      12.1              3      5.9     3      7.2        1      23.3
3      10.8              3      4.7     3      6.9        1      23.9
3      14.0              3      6.8     3      7.8        1      24.8
4      12.7              4      5.6     4      7.3        1      24.8
4      10.1              4      4.6     4      5.7        1      24.1
4      11.8              4      5.8     4      6.2        1      24.7
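The flagging idea can be sketched in Python as follows. The function name and data layout are hypothetical, and one extra assumption is made that the slides do not state: the flag itself is drawn uniformly among the flags shared by all flows. The sample values reuse a few rows of the dummy table above.

```python
import random
from collections import defaultdict

def aggregate_by_flag(flows, n_runs, rng):
    """flows: {flow name: [(flag, pax), ...]}. Returns [(flag, total pax), ...]."""
    by_flag = {}
    for name, samples in flows.items():
        grouped = defaultdict(list)
        for flag, pax in samples:
            grouped[flag].append(pax)
        by_flag[name] = grouped
    # Only flags present in every flow can be combined.
    shared = sorted(set.intersection(*(set(g) for g in by_flag.values())))
    totals = []
    for _ in range(n_runs):
        flag = rng.choice(shared)  # assumption: flags are equally likely
        # Mix samples across flows, but only among those sharing this flag.
        total = sum(rng.choice(grouped[flag]) for grouped in by_flag.values())
        totals.append((flag, total))
    return totals

flows = {
    "French domestic": [(1, 11.0), (1, 10.9), (2, 14.2), (2, 12.4)],
    "From/to UK": [(1, 4.9), (1, 5.2), (2, 6.1), (2, 5.7)],
    "From/to Germany": [(1, 6.4), (1, 7.0), (2, 8.2), (2, 7.7)],
}
for flag, total in aggregate_by_flag(flows, n_runs=5, rng=random.Random(0)):
    print(f"flag {flag}: total {total:.1f} M pax")
```

Each total carries its flag, so it can itself be combined further down the computation with other samples that share the same value of the explanatory variable.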
The data break-up issue
Let's suppose that the overall business level of risk has been set, for instance, to an 80% probability for the overall demand to be greater or equal. How does it cascade down? What is the corresponding level of risk of each traffic flow?
One should bear in mind that, unfortunately, 1 + 1 ≠ 2 when dealing with probabilities; 1 + 1 could make 1.9!
Flagging the random samples of each traffic flow is one of the solutions to trace back which ones have been used in the final computation.
[Figure: cumulated distribution of probabilities of the overall demand (0%, 80%, 100%), showing the set of samples to be elected and the set of samples to be discarded; frequency law of the elected samples; cumulated distribution of probabilities of the demand of traffic flow #i (0%, 74%, 100%).]
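The sketch below is one possible reading of this break-up procedure, written in Python; it is an interpretation, not the author's algorithm. It assumes that each simulation run keeps the values of all flows together (which is what the flags make possible), "elects" the runs whose overall demand sits in a narrow window around the chosen risk level, and then reads where the median of the elected flow values falls in that flow's own distribution. The window width, the function names and the simulated data are all hypothetical.

```python
import random

def exceedance_probability(values, level):
    """Share of the values that are greater than or equal to the level."""
    return sum(1 for v in values if v >= level) / len(values)

def cascade_risk(draws, flow, overall_probability, window=0.02):
    """draws: one dict {flow name: pax} per simulation run, kept together."""
    totals = [sum(d.values()) for d in draws]
    ordered = sorted(totals)
    last = len(ordered) - 1
    lo = ordered[max(0, round((1.0 - overall_probability - window) * last))]
    hi = ordered[min(last, round((1.0 - overall_probability + window) * last))]
    # "Elect" the runs whose overall demand sits around the chosen risk level.
    elected = sorted(d[flow] for d, t in zip(draws, totals) if lo <= t <= hi)
    typical = elected[len(elected) // 2]  # median of the elected flow samples
    return exceedance_probability([d[flow] for d in draws], typical)

rng = random.Random(0)
draws = []
for _ in range(5000):
    driver = rng.gauss(0.0, 1.0)  # shared explanatory variable (e.g. GDP)
    draws.append({
        "UK": 6.0 + 0.5 * driver + rng.gauss(0.0, 0.4),
        "Germany": 7.0 + 0.4 * driver + rng.gauss(0.0, 0.6),
    })
level = cascade_risk(draws, "UK", overall_probability=0.80)
print(f"UK flow: {level:.0%} probability for its demand to be greater or equal")
```

Because the flows are only partly correlated, the probability read on an individual flow generally differs from the 80% set on the total, which is the "1 + 1 ≠ 2" effect mentioned above.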
Part of the prospective uncertainty: the residual issue
Taking into account part of the prospective risk
A very simple and straightforward idea:
  Determination of the law of probability of the residuals;
  Addition of the residual effects to the "regular" probabilistic forecast, which can be achieved with a new round of Monte Carlo simulations.
By doing so we can take into account part of the prospective risks, i.e. the risks linked to "unusual" events which have already happened in the past and may happen again.
Of course, there are no statistical or probabilistic methods to estimate the effects of future events which have never happened yet; that is where scenario-based approaches can be brought back to the front stage.
This approach answers the amplitude and likelihood questions about the "unusual" events. It does not answer the "when" and "how long" questions: it just measures a "latent risk".
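A minimal Python sketch of this extra Monte Carlo round is given below. It assumes that the law of probability of the residuals is represented by the empirical set of past residuals, expressed as a fraction of total traffic, and that one residual drawn at random is applied multiplicatively to each forecast sample. The function name and all the figures are hypothetical.

```python
import random

def add_residual_effects(forecast_samples, past_residuals, rng):
    """Apply one past relative residual, drawn at random, to each sample."""
    return sorted(y * (1.0 + rng.choice(past_residuals)) for y in forecast_samples)

rng = random.Random(0)
# Hypothetical "regular" probabilistic forecast for one future year (M pax).
forecast = [rng.gauss(14.0, 0.4) for _ in range(5000)]
# Hypothetical past residuals, as a fraction of total pax (actual vs estimate).
past_residuals = [-0.20, -0.08, -0.03, 0.0, 0.0, 0.01, 0.02, 0.03, 0.05]
with_residuals = add_residual_effects(forecast, past_residuals, rng)
plain = sorted(forecast)
print("1% quantile without / with residuals:",
      round(plain[50], 2), "/", round(with_residuals[50], 2))
```

Since the residual is drawn independently of the year, the result says how large and how likely such a deviation is, not when it would occur, in line with the "latent risk" reading above.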
Taking into account part of the prospective risk (continued)
There is ground here for the development of specific financial / management / industrial tools and policies to cover part of this latent risk.
[Figure: probability distribution of the residuals (residuals as % of total pax, from -25% to +15%).]
[Figure: traffic/demand (M pax) by year, 1986 to 2024: actual traffic data, the curve with a 50% probability for the demand to be greater or equal, and the 98% probability range, shown without residuals, with residuals included, and with both overlaid.]
Further developments and applications
Vertical cuts for most of the short term utilisation
[Figure: traffic/demand by year, 1986 to 2024, with the actual capacity. A vertical cut at a given year gives, with the capacity threshold marked on each curve, the probability for the demand/traffic (million pax) to be greater or equal, the probability for the turnover (million €) to be greater or equal, and the probability for the operating profit (million €) to be greater or equal (with the €0 million point marked), etc.]
To be used for:
• (human) resources dimensioning
• Budget, cash flow
• Future financial ratios analysis
• Short term risk assessment
• etc.
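A vertical cut can be illustrated with the short Python sketch below: for one given year, the simulated demand is capped by capacity to obtain traffic, traffic is turned into turnover, and an exceedance curve is read for each quantity. The function names, the capacity of 15 M pax, the revenue of 12 € per passenger and the simulated demand are hypothetical, and the linear link between traffic and turnover is a simplifying assumption.

```python
import random

def exceedance_curve(samples, levels):
    """For each level, the probability that the quantity is >= that level."""
    n = len(samples)
    return [(level, sum(1 for s in samples if s >= level) / n) for level in levels]

def vertical_cut(demand_samples, capacity, revenue_per_pax):
    """Traffic is demand capped by capacity; turnover follows from traffic."""
    traffic = [min(d, capacity) for d in demand_samples]
    turnover = [t * revenue_per_pax for t in traffic]
    return traffic, turnover

rng = random.Random(0)
# Hypothetical simulated demand for one given year (million pax).
demand = [rng.gauss(14.0, 1.0) for _ in range(5000)]
traffic, turnover = vertical_cut(demand, capacity=15.0, revenue_per_pax=12.0)
for level, p in exceedance_curve(turnover, [150.0, 160.0, 170.0, 180.0]):
    print(f"P(turnover >= {level:.0f} M EUR) = {p:.0%}")
```

The capacity threshold shows up as a ceiling: no turnover above capacity times revenue per passenger can be reached, whatever the demand.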
Horizontal cuts for most of the mid & long term utilisation
[Figure: traffic/demand by year, 1986 to 2024, with the actual capacity and the planned capacity; operating profit by year: 50% probability and 98% centred probability curves, for the actual capacity and for the planned capacity.]
To be mostly used for optimal dimensioning and planning of mid and long term capacity growth (heavy investments), etc.
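A horizontal cut follows one probability level across the years. The Python sketch below illustrates it by building, year by year, the demand level exceeded with a given probability and by reporting the first year in which that level reaches a capacity. The function names, the growth assumptions and the capacity of 16 M pax are hypothetical.

```python
import random

def quantile_curve(samples_by_year, probability):
    """Per year, the demand level exceeded (>=) with the given probability."""
    curve = {}
    for year, samples in samples_by_year.items():
        ordered = sorted(samples)
        index = int(round((1.0 - probability) * (len(ordered) - 1)))
        curve[year] = ordered[index]
    return curve

def saturation_year(samples_by_year, probability, capacity):
    """First year whose curve value reaches the capacity, or None."""
    curve = quantile_curve(samples_by_year, probability)
    for year in sorted(curve):
        if curve[year] >= capacity:
            return year
    return None

rng = random.Random(0)
# Hypothetical simulated demand (million pax): growing mean, widening spread.
samples_by_year = {
    year: [rng.gauss(12.0 + 0.4 * (year - 2011), 0.3 + 0.1 * (year - 2011))
           for _ in range(2000)]
    for year in range(2011, 2025)
}
for p in (0.80, 0.50, 0.20):
    year = saturation_year(samples_by_year, p, capacity=16.0)
    print(f"{p:.0%} probability curve reaches 16 M pax in: {year}")
```

Comparing the years obtained at different probability levels gives a range of dates for a capacity investment rather than a single date.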
Conclusions
So many advantages, so few drawbacks
A quite simple idea, but a rather complex and computer-time-consuming approach;
It puts an end to the times when forecasters were regarded as fortune-tellers, gurus, devious crooks or scientific alibis for their boss's misbehaviour (or that of their boss's boss);
It brings the risk-taking decision back to where it should always have been: the top management. In addition, it offers the exhaustive set of data required by risk assessment tools;
It is likely to offer better legal protection to the forecasters in case of litigation with the shareholders or the financial markets;
Our own experience is that bankers are fond of this way of forecasting. Aren't they mostly risk traders?
We (ADP's forecasting team) are fond of it too, since it saves us a lot of forecast post-processing time and no more pressure is put on us to find "convenient figures".