“Renewable Energy Modelling and Forecasting”
Pierre Pinson
Technical University of Denmark.
DTU Electrical Engineering - Centre for Electric Power and Energy
mail: [email protected] - webpage: www.pierrepinson.com
23 April 2018
31761 - Renewables in Electricity Markets 1
Learning objectives
Through this lecture and additional study material, students should be able to:
1 Describe generalities of techniques that can be used to generate renewable energy forecasts
2 Build and estimate fairly simple models to be used as a basis for forecasting
3 Have enough material and motivation to improve your forecast competition entries for Assignment 3!
The MIT Technology Review
The MIT Technology Review:
founded at MIT in 1899
daily review/analysis of technological innovation worldwide
impact: 580.000 members and 2.400.000 website visitors per month!
The 10 breakthrough technologies 2014:
genome editing
microscale 3D printing
neuromorphic chips
brain mapping
etc.
renewable energy analytics (!), and more particularly forecasting...
[See link: MIT Technology Review - Smart Wind and Solar Power]
Outline
1 Test case and general considerations
2 The benchmark forecast approaches
persistence
climatology
etc.
3 Going further in a regression-based framework
regression in a few slides (or, how to do it without a Ph.D. in Statistics)
regression for forecasting
an example: combination of persistence and climatology
4 Digging in the data
measurements
weather forecasts
extra features?
1 Test case and general considerations
Basis for the lecture(s)
Wind Energy
Wave Energy (same ideas can be used)
... Also for Solar Energy, the same concepts can be applied!
Test case: the Klim wind farm
The wind farm:
full name: Klim Fjordholme
onshore/offshore: onshore
year of commissioning: 1996
nominal capacity (Pn): 21 MW
number of turbines in farm: 35
average annual electricity generation: 49 GWh
data available: 1999-2003 (for some researchers)
temporal resolution: 5 mins, and hourly averages
weather forecasts: wind speed and direction, temperature
A link to the online description: Vattenfall’s Klim wind farm
The wind farm is being recommissioned these days: NordJyske online article
Remember that we normalize power generation - in practice, $y_t \in [0, 1], \ \forall t$
General considerations
Forecasting is about the future! Lead times within 0-48 hours, in line with market-based operations
When at time t and aiming to generate a forecast for time t + k, only knowledge available at time t can be used...
observations up to time t: power generation, meteorological measurements, etc.
weather forecasts for the period of interest
Forecasts will always contain some error: accept it, and try to minimize it
2 The benchmark forecast approaches
No need to make it difficult...
What is the easiest way to predict wind power generation?
Data-free approaches:
making random guesses (it could actually work...)
making educated guesses (works fine in certain places and seasons, e.g., summer in Crete, all-year-round in Egypt)
Data-based approaches:
persistence
climatology
simple statistical models, etc.
The random guess approach
At time t, we make a random guess for all lead times t + k, k = 1, . . . , 48
This translates to
$\hat{y}_{t+k|t} = u_k, \ \forall k,$
where $u_k \sim U[0, 1]$
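A minimal sketch of this strategy in Python, using only the standard library. The function name `random_guess_forecast` and the `seed` argument are illustrative assumptions, not part of the lecture material.

```python
import random


def random_guess_forecast(horizon=48, seed=None):
    """Draw one value u_k ~ U[0, 1] for each lead time k = 1, ..., horizon."""
    rng = random.Random(seed)  # local generator, so that runs can be reproduced
    return [rng.random() for _ in range(horizon)]


# One forecast series in normalized power units (p.u.), one value per lead time
forecast = random_guess_forecast(horizon=48, seed=2002)
```

Every issue time gets a fresh set of draws, so the forecast ignores both the data and the lead time.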
Right: Example of a random guess forecast for Klim, issued on 28 April 2002, 00:00 UTC
[Figure: forecasts and observations; x-axis: lead time k (1 to 48), y-axis: power [p.u.] (0 to 1)]
Let us apply that forecast strategy for a whole sample year (2002), and analyse its performance...
Evaluation of the random guess approach
The quality of the forecasts is summarized in terms of bias, MAE and RMSE
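The slides do not spell out how the three scores are computed. The sketch below assumes a common convention: the error is the observation minus the forecast, the bias is the mean error, the MAE is the mean absolute error, and the RMSE is the square root of the mean squared error. Function names and the toy numbers are illustrative only. With normalized power, multiplying a score by 100 expresses it in % of Pn.

```python
import math


def forecast_errors(observations, forecasts):
    """Errors under the assumed convention: observation minus forecast."""
    return [y - f for y, f in zip(observations, forecasts)]


def bias(observations, forecasts):
    errors = forecast_errors(observations, forecasts)
    return sum(errors) / len(errors)


def mae(observations, forecasts):
    errors = forecast_errors(observations, forecasts)
    return sum(abs(e) for e in errors) / len(errors)


def rmse(observations, forecasts):
    errors = forecast_errors(observations, forecasts)
    return math.sqrt(sum(e * e for e in errors) / len(errors))


# Toy example for a single lead time: three forecast/observation pairs (p.u.)
obs = [0.2, 0.4, 0.6]
fcst = [0.3, 0.3, 0.3]
scores = (bias(obs, fcst), mae(obs, fcst), rmse(obs, fcst))
```

To obtain curves as a function of lead time, the same functions would be applied separately to the pairs collected for each k.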
[Figure: bias, MAE and RMSE of the random guess forecasts as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
What does that look like?
The persistence approach
At time t, the persistence forecast (“what you see is what you get”) for all lead times t + k, k = 1, . . . , 48 is based on the idea that your best guess is your latest piece of information...
This translates to
$\hat{y}_{t+k|t} = y_t, \ \forall k,$
where $y_t$ is the latest measurement available
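As a small illustration, persistence can be sketched as follows, assuming the measurements are stored oldest first in a list (the function name and the toy values are hypothetical).

```python
def persistence_forecast(history, horizon=48):
    """Repeat the latest measurement y_t for every lead time k = 1, ..., horizon."""
    if not history:
        raise ValueError("at least one past measurement is needed")
    return [history[-1]] * horizon


# Toy normalized power measurements, oldest first; the last entry plays the role of y_t
measurements = [0.10, 0.50, 0.30]
forecast = persistence_forecast(measurements, horizon=4)
```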
Right: Example of a persistence forecast for Klim, issued on 28 April 2002, 00:00 UTC
[Figure: forecasts and observations, with the observation at time t marked; x-axis: lead time k (0 to 48), y-axis: power [p.u.] (0 to 1)]
Let us similarly apply that strategy for a whole sample year (2002), and analyse its performance...
Evaluation of the persistence approach
Similar scores: bias, MAE and RMSE
[Figure: bias, MAE and RMSE of the persistence forecasts as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
Such score values can be explained by the “inertia” in wind power dynamics
A generalization: m-point averaging approach
There might be a gain in considering more than the last observation only...
At time t, the m-point averaging forecast, for all lead times t + k, k = 1, . . . , 48, is based on an average of recent information
This translates to
$\hat{y}_{t+k|t} = \frac{1}{m} \sum_{i=1}^{m} y_{t-i+1}, \ \forall k,$
where $y_{t-i+1}$ is the $i$th latest measurement available (so that m = 1 gives back persistence)
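A possible sketch of the m-point averaging forecast, assuming the average is taken over the m latest measurements, the one at time t included (which makes m = 1 coincide with persistence). The function name and the toy values are illustrative.

```python
def m_point_forecast(history, m, horizon=48):
    """Average the m latest measurements and repeat that value for all lead times."""
    if m < 1 or m > len(history):
        raise ValueError("m must lie between 1 and the number of measurements")
    level = sum(history[-m:]) / m
    return [level] * horizon


# Toy normalized power measurements, oldest first
measurements = [0.1, 0.2, 0.3, 0.7]
forecast_m3 = m_point_forecast(measurements, m=3, horizon=2)  # mean of 0.2, 0.3, 0.7
forecast_m1 = m_point_forecast(measurements, m=1, horizon=2)  # same as persistence
```

A larger m smooths out the latest fluctuations, which is the short-term versus longer-term trade-off mentioned in the evaluation.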
Right: Example of an m-point averaging (with m = 3) forecast for Klim, issued on 28 April 2002, 00:00 UTC
[Figure: forecasts and observations, with the observation average at time t marked; x-axis: lead time k (−2 to 48), y-axis: power [p.u.] (0 to 1)]
Let us similarly apply that strategy for a whole sample year (2002), and analyse its performance...
Evaluation of the m-point averaging approach
Focus on RMSE only
[Figure: RMSE as a function of lead time k (1 to 48) for m = 1 (persistence), m = 3 and m = 20; y-axis: error [% of Pn]]
There is a compromise to be made between short-term and longer-term forecast quality
The limiting case: Climatology
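Climatology is for the case where $m \to \infty$
At time t, the climatology forecast, for all lead times t + k, k = 1, . . . , 48, is based on an average of all information ever available (= wind farm capacity factor)
This translates to
$\hat{y}_{t+k|t} = \lim_{m \to \infty} \frac{1}{m} \sum_{i=1}^{m} y_{t-i+1}, \ \forall k,$
where $y_{t-i+1}$ is the $i$th latest measurement available (in practice, the average runs over all measurements recorded up to time t)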
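In an operational setting the long-run average does not need to be recomputed from scratch at every time step. The sketch below keeps a running mean that is updated with each new measurement; the class name `RunningClimatology` and the toy values are assumptions made for illustration.

```python
class RunningClimatology:
    """Keep the average of all measurements seen so far, updated one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def update(self, measurement):
        # Incremental mean: new_mean = old_mean + (y - old_mean) / count
        self.count += 1
        self.mean += (measurement - self.mean) / self.count

    def forecast(self, horizon=48):
        if self.count == 0:
            raise ValueError("no measurement available yet")
        return [self.mean] * horizon


climatology = RunningClimatology()
for y in [0.2, 0.4, 0.6]:  # toy normalized power measurements
    climatology.update(y)
forecast = climatology.forecast(horizon=3)
```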
Right: Example of a climatology forecast for Klim, issued on 28 April 2002, 00:00 UTC
[Figure: forecasts and observations, with the average of all past observations marked; x-axis: lead time k (0 to 48), y-axis: power [p.u.] (0 to 1)]
Let us similarly apply that strategy for a whole sample year (2002), and analyse its performance...
Evaluation of the climatology forecast approach
Similar scores: bias, MAE and RMSE
[Figure: bias, MAE and RMSE of the climatology forecasts as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
So, it is like random guessing, but somewhat better!
A few conclusions at this stage
Even though these forecasting strategies do not look very smart...
They are difficult to beat!
Especially:
Persistence is difficult to outperform for lead times between 0 and 6 hours ahead
Climatology is difficult to outperform for the furthest lead times (say, after 24 hours ahead)
Still, we may be able to do something better
based on more dynamic approaches
extracting more information within available data
3 Going further in a regression-based framework
What is (linear) regression?
In the simplest case, data is available for:
$y_i$ ($i = t-n, \ldots, t$), the response variable, i.e., the variable we will want to predict, eventually
$x_i$ ($i = t-n, \ldots, t$), an explanatory variable, i.e., a variable that can help us predict y
At this stage, imagine that $x_i$ and $y_i$ are your most recent wind speed and corresponding power observations up to current time t
Example set with the last n = 120 observations of
wind speed $x_i$, and
corresponding power generation $y_i$
[Figure: two time-series panels over times t−120 to t: wind speed [m/s] and power [MW]]
What is (linear) regression? (continued)
The aim is to uncover a relationship between these explanatory and response variables
We first do that visually...
Same example, with the last n = 120 observations of
wind speed $x_i$, and
corresponding power generation $y_i$
In this scatterplot, there seems to be a (linear) relationship between wind speed and power
[Figure: scatterplot of power [MW] against wind speed [m/s]]
What is (linear) regression? (continued)
Such a linear relationship between x and y can be written as
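$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i, \quad i = t-n, \ldots, t$
where
$\beta_0$ and $\beta_1$ are the model parameters (called intercept and slope)
$\varepsilon_i$ is a noise term, which you may see as our forecast error we want to minimize
The linear regression model can be reformulated in a more compact form as
$y_i = \boldsymbol{\beta}^\top \mathbf{x}_i + \varepsilon_i, \quad i = t-n, \ldots, t$
with
$\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad \mathbf{x}_i = \begin{bmatrix} 1 \\ x_i \end{bmatrix}$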
It is often easier to deal with such compact formulations...
Least Squares (LS) estimation
Now we need to find the best value of β that describes this cloud of points
Under a number of assumptions, which we overlook here, the (best) model parameters β can be readily obtained with Least-Squares (LS) estimation
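The Least-Squares (LS) estimate $\hat{\boldsymbol{\beta}}$ of the linear regression model parameters is given by
$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \sum_i \varepsilon_i^2 = \arg\min_{\boldsymbol{\beta}} \sum_i \left( y_i - \boldsymbol{\beta}^\top \mathbf{x}_i \right)^2 = (\mathbf{X}^\top \mathbf{X})^{-1} \mathbf{X}^\top \mathbf{y}$
with
$\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 1 & x_{t-n} \\ 1 & x_{t-n+1} \\ \vdots & \vdots \\ 1 & x_t \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_{t-n} \\ y_{t-n+1} \\ \vdots \\ y_t \end{bmatrix}$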
Even better: some functions in R/Matlab can do it for you!
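The same can of course be done in Python. For a single explanatory variable, the matrix expression for the LS estimate reduces to the usual slope and intercept formulas, which the sketch below implements with the standard library only; in practice a library routine such as `numpy.linalg.lstsq` would be the natural choice. The function name and the data are illustrative: the power values are made up so that they lie exactly on a line with intercept −3.9 and slope 9.2.

```python
def fit_simple_ls(x, y):
    """Return (beta0, beta1) minimizing sum_i (y_i - beta0 - beta1 * x_i)^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    s_xx = sum((xi - x_mean) ** 2 for xi in x)
    s_xy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    if s_xx == 0:
        raise ValueError("x must take at least two distinct values")
    beta1 = s_xy / s_xx               # slope
    beta0 = y_mean - beta1 * x_mean   # intercept
    return beta0, beta1


# Made-up wind speeds [m/s]; powers placed exactly on the line y = -3.9 + 9.2 x
wind_speed = [2.0, 4.0, 6.0, 8.0]
power = [-3.9 + 9.2 * u for u in wind_speed]
beta0, beta1 = fit_simple_ls(wind_speed, power)
```

With noisy data the estimates would no longer match the generating line exactly, but would minimize the sum of squared residuals.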
The resulting (linear) regression
For the same example set with the last n = 120 observations of
wind speed $x_i$, and
corresponding power generation $y_i$
The LS-estimate of the model parameters is:
$\hat{\boldsymbol{\beta}} = \begin{bmatrix} -3.9 \\ 9.2 \end{bmatrix}$
[Figure: scatterplot of power [MW] against wind speed [m/s], with the fitted line $y = -3.9 + 9.2x$]
This type of model and estimation can then be incorporated within a forecasting approach
Forecasting in a (linear) regression framework
At a given time t, you (as a forecaster) identified a good model:
$y_i = \beta_0 + \sum_{j=1}^{m} \beta_j x_{j,i} + \varepsilon_i, \quad i = t-n, \ldots, t$
(or, equivalently: $y_i = \boldsymbol{\beta}^\top \mathbf{x}_i + \varepsilon_i$)
where
$y_i$ is still your response variable (say, wind power generation) observed at time i
$x_{j,i}$ is the observation at time i for the jth explanatory variable (j = 1, . . . , m)
$\beta_j$ is the model parameter for the jth explanatory variable
$\varepsilon_i$ is a noise term, which you may see as our forecast error we want to minimize
Based on the last n observations, you obtain an LS-estimate $\hat{\boldsymbol{\beta}}_t$, valid at time t
And you can issue forecasts using these estimates $\hat{\boldsymbol{\beta}}_t$, for any new values of the explanatory variables, i.e.
$\hat{y}_{t+k|t} = \hat{\boldsymbol{\beta}}_t^\top \mathbf{x}_{t+k}$
Potential problem here: we do not know future values of the x variable (e.g., wind speed)!
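As a worked illustration, take the LS-estimate from the earlier wind speed example (intercept −3.9, slope 9.2) and assume, purely for the sake of the example, that a wind speed value of 6 m/s is available for time t + k, for instance from a weather forecast. The forecast is then, on the MW scale of that example:

```latex
\hat{y}_{t+k|t}
  = \hat{\boldsymbol{\beta}}_t^\top \mathbf{x}_{t+k}
  = \begin{bmatrix} -3.9 & 9.2 \end{bmatrix}
    \begin{bmatrix} 1 \\ 6 \end{bmatrix}
  = -3.9 + 9.2 \times 6
  = 51.3
```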
Example application: combining persistence and climatology
Persistence and climatology were shown to be good benchmarks (difficult to outperform), though
persistence is good for short lead times
climatology is good for longer lead times
Why not combine them, as a function of the lead time k?
Reminder of the quality of the persistence and climatology forecasts,
in terms of RMSE
as a function of the lead time k
[Figure: RMSE of persistence and climatology forecasts as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
Proposal of a combination model and estimation
Our proposed model has the following form, for a given lead time k,
$y_i = \beta_{k,\text{pers}} \, \hat{y}^{(p)}_{i|i-k} + \beta_{k,\text{clim}} \, \hat{y}^{(c)}_{i|i-k} + \varepsilon_i, \quad i = t-n, \ldots, t$
where
$\hat{y}^{(p)}_{i|i-k}$ and $\hat{y}^{(c)}_{i|i-k}$ are the persistence and climatology forecasts, issued at time i − k for time i
$\beta_{k,\text{pers}}$ and $\beta_{k,\text{clim}}$ are the weights to be given to persistence and climatology forecasts, respectively (the model has no intercept)
εi is a noise term, which you may see as our forecast error we want to minimize
It therefore combines persistence and climatology forecasts
The weight given to each of these forecasts can change with the lead time k
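A minimal sketch of the estimation for one lead time k, assuming the persistence forecasts, the climatology forecasts and the matching observations have already been aligned in three lists. With two regressors and no intercept, the LS normal equations form a 2 × 2 system that can be solved directly. Function names and toy numbers are illustrative assumptions; the observations are constructed with weights 0.7 and 0.3 so that the result can be checked.

```python
def fit_combination(pers, clim, obs):
    """LS weights (beta_pers, beta_clim) for obs ~ beta_pers * pers + beta_clim * clim."""
    s_pp = sum(p * p for p in pers)
    s_pc = sum(p * c for p, c in zip(pers, clim))
    s_cc = sum(c * c for c in clim)
    s_py = sum(p * y for p, y in zip(pers, obs))
    s_cy = sum(c * y for c, y in zip(clim, obs))
    det = s_pp * s_cc - s_pc * s_pc
    if abs(det) < 1e-12:
        raise ValueError("the two forecast series are (nearly) collinear")
    beta_pers = (s_py * s_cc - s_pc * s_cy) / det
    beta_clim = (s_pp * s_cy - s_pc * s_py) / det
    return beta_pers, beta_clim


def combine(beta_pers, beta_clim, pers_value, clim_value):
    """Combined forecast for one time step."""
    return beta_pers * pers_value + beta_clim * clim_value


# Toy data for a single lead time k (normalized power)
pers = [0.10, 0.50, 0.90, 0.30]
clim = [0.30, 0.32, 0.31, 0.33]
obs = [0.7 * p + 0.3 * c for p, c in zip(pers, clim)]
beta_pers, beta_clim = fit_combination(pers, clim, obs)
```

Repeating the fit for k = 1, ..., 48 gives one pair of weights per lead time.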
Estimation of the model coefficients
For the example of the 28 April 2002 (as in first slides),
the necessary vectors and matrices are formed, with n = 200 last values
LS estimates $\hat{\beta}_{k,\text{pers}}$ and $\hat{\beta}_{k,\text{clim}}$ are computed for every lead time k (k = 1, . . . , 48)
Right: Evolution of the estimated model parameters $\hat{\beta}_{k,\text{pers}}$ and $\hat{\beta}_{k,\text{clim}}$ as a function of the lead time k
persistence is given less weight for further lead times
climatology is given more weight instead
[Figure: coefficient value [p.u.] of βpers and βclim against lead time k (0 to 48)]
The resulting forecast
For the example of the 28 April 2002 (as in first slides),
LS estimates $\hat{\beta}_{k,\text{pers}}$ and $\hat{\beta}_{k,\text{clim}}$ are used to combine the available persistence and climatology forecasts
The combination is different for every lead time k (k = 1, . . . , 48)
Right: Example of a combined forecast for Klim (persistence and climatology), issued on 28 April 2002, 00:00 UTC
[Figure: persistence, climatology and combined forecasts, with observations; x-axis: lead time k (0 to 48), y-axis: power (0 to 1)]
Let us similarly apply that strategy for a whole sample year (2002), and analyse its performance
Evaluation of the combined forecasts
RMSE only, for persistence, climatology, and the combined forecasts
[Figure: RMSE of persistence, climatology and combined forecasts as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
With this combination strategy, we are getting the best out of the original simple benchmarks!
A few (more) conclusions at this stage
We have now learned to handle more variables and data
The forecasting approaches still do not look impressive
What could we do?
extract more information from the available data
go further than using simple linear relationships only (well, we will not do it today...)
4 Digging in the data
What to look for in the data?
What do we have here...?
measurements, i.e.,
power measurements ($y_t$) - Remember that only past measurements can be used!
weather forecasts, i.e.,
wind speed forecasts ($u_{t+k|t}$)
wind direction forecasts ($\theta_{t+k|t}$)
temperature forecasts ($T_{t+k|t}$)
different variations of those could be used since the relationship between meteorological variables and power is nonlinear, e.g.,
powers of wind speed: $u^2_{t+k|t}$, $u^3_{t+k|t}$, etc.
harmonics of wind direction: $\cos\left(\frac{2\pi\theta_{t+k|t}}{360}\right)$, $\sin\left(\frac{2\pi\theta_{t+k|t}}{360}\right)$, etc.
we also know the hour of the day ($h_t$), or the lead time k, which could be useful... (though not used here)
Let us call all these variables $x_j$ (j = 1, . . . , m), and also nickname them “features”
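One possible way to assemble such features into a single vector is sketched below. The function name, the ordering of the entries and the leading constant 1 (to carry an intercept) are assumptions made for illustration; the wind direction is taken in degrees, as in the harmonics above.

```python
import math


def build_features(wind_speed, wind_dir_deg, temperature, last_power):
    """Feature vector: constant, u, u^2, u^3, cos/sin of direction, T, latest power."""
    angle = 2 * math.pi * wind_dir_deg / 360  # degrees to radians
    return [
        1.0,               # constant term (intercept)
        wind_speed,        # wind speed forecast u
        wind_speed ** 2,   # u^2
        wind_speed ** 3,   # u^3
        math.cos(angle),   # first harmonic of wind direction (cosine)
        math.sin(angle),   # first harmonic of wind direction (sine)
        temperature,       # temperature forecast T
        last_power,        # latest power measurement available at issue time
    ]


# Hypothetical forecast values for one lead time: 2 m/s, 90 degrees, 10 degrees C,
# with a latest normalized power measurement of 0.5
features = build_features(2.0, 90.0, 10.0, 0.5)
```

Using the cosine and sine together avoids the artificial jump between 359 and 0 degrees that the raw direction would introduce.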
We can still write a linear regression...
Remember that a linear relation between the xj variables and y can be written as
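$y_i = \beta_0 + \sum_{j=1}^{m} \beta_j x_{j,i} + \varepsilon_i, \quad i = t-n, \ldots, t$
(or, equivalently: $y_i = \boldsymbol{\beta}^\top \mathbf{x}_i + \varepsilon_i$)
where
$y_i$ is still your response variable (say, wind power generation) observed at time i
$x_{j,i}$ is the corresponding value for the jth explanatory variable (j = 1, . . . , m; for example, a wind speed forecast used as input)
$\beta_j$ is the model parameter for the jth explanatory variable
$\varepsilon_i$ is a noise term, which you may see as our forecast error we want to minimize
This linear regression model can be reformulated in a more compact form as
$y_i = \boldsymbol{\beta}^\top \mathbf{x}_i + \varepsilon_i, \quad i = t-n, \ldots, t$
with
$\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{bmatrix}, \quad \mathbf{x}_i = \begin{bmatrix} 1 \\ x_{1,i} \\ \vdots \\ x_{m,i} \end{bmatrix}$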
Estimation and feature selection
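We need to find the best value of β that describes this cloud of points, but also using a minimum of variables (parsimony principle)
LS-estimation is not very good for that, as the number of variables becomes high... the LASSO version should be used instead
The LASSO estimate $\hat{\boldsymbol{\beta}}$ of the linear regression model parameters is given by
$\hat{\boldsymbol{\beta}} = \arg\min_{\boldsymbol{\beta}} \ \frac{1}{\sqrt{n}\,\lambda} \sum_i \varepsilon_i^2 + \sum_j |\beta_j|$
with $\lambda$ a so-called regularization parameter, and
$\boldsymbol{\beta} = \begin{bmatrix} \beta_0 \\ \beta_1 \\ \vdots \\ \beta_m \end{bmatrix}, \quad \mathbf{X} = \begin{bmatrix} 1 & x_{1,t-n} & \ldots & x_{m,t-n} \\ 1 & x_{1,t-n+1} & \ldots & x_{m,t-n+1} \\ \vdots & \vdots & \ddots & \vdots \\ 1 & x_{1,t} & \ldots & x_{m,t} \end{bmatrix}, \quad \mathbf{y} = \begin{bmatrix} y_{t-n} \\ y_{t-n+1} \\ \vdots \\ y_t \end{bmatrix}$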
As before, some functions in R/Matlab can do it for you!
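To give an idea of what such functions do internally, here is a small coordinate-descent sketch in plain Python. It is an illustration under stated assumptions, not the routine used for the lecture results: it minimizes the sum of squared residuals plus `alpha` times the sum of absolute coefficients, penalizes every column alike (including a constant column, if one is present), and runs a fixed number of sweeps. Multiplying the slide's criterion by the positive constant √n λ suggests that `alpha` would correspond to √n λ. All names and data are hypothetical.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, returning 0.0 inside the dead zone."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_coordinate_descent(X, y, alpha, n_sweeps=200):
    """Minimize sum_i (y_i - x_i . beta)^2 + alpha * sum_j |beta_j| by cyclic updates."""
    n_features = len(X[0])
    beta = [0.0] * n_features
    for _ in range(n_sweeps):
        for j in range(n_features):
            rho = 0.0  # correlation of column j with the partial residual
            z = 0.0    # squared norm of column j
            for row, target in zip(X, y):
                partial = sum(row[l] * beta[l] for l in range(n_features) if l != j)
                rho += row[j] * (target - partial)
                z += row[j] ** 2
            beta[j] = soft_threshold(rho, alpha / 2) / z if z > 0 else 0.0
    return beta


# Toy design with two orthogonal features: the second one carries little signal
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1, 2.0, 0.1]
beta_lasso = lasso_coordinate_descent(X, y, alpha=1.0)  # weak feature set to zero
beta_ls = lasso_coordinate_descent(X, y, alpha=0.0)     # no penalty: plain LS
```

The penalty shrinks the strong coefficient and sets the weak one exactly to zero, which is what makes the LASSO useful for feature selection.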
Example based on a set of features
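Our proposed model has the following form, for a given lead time k,
$y_i = \beta_{k,0} + \beta_{k,1} u_{i|i-k} + \beta_{k,2} u^2_{i|i-k} + \beta_{k,3} u^3_{i|i-k} + \beta_{k,4} \cos\left(\frac{2\pi\theta_{i|i-k}}{360}\right) + \beta_{k,5} \sin\left(\frac{2\pi\theta_{i|i-k}}{360}\right) + \beta_{k,6} T_{i|i-k} + \beta_{k,7} y_{i|i-k} + \varepsilon_i$
where
$y_{i|i-k}$ stands for the latest power measurement available at time i − k (i.e., the persistence forecast for time i)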
we have 8 model parameters to estimate, for each lead time k
the weight given to each of these features therefore varies with the lead time k
εi is a noise term, which you may see as our forecast error we want to minimize
Estimation of the model coefficients
For the example of the 28 April 2002 (as for the other examples),
the necessary vectors and matrices are formed, with n = 600 last values
LASSO estimates $\hat{\beta}_{k,j}$ are computed for every lead time k (k = 1, . . . , 48)
Right: Evolution of the estimated model parameters as a function of the lead time k
Only a few features have parameters significantly different from 0: $u_{t+k|t}$, $u^2_{t+k|t}$ and $y_t$
[Figure: coefficient value [p.u.] of β0 to β7 against lead time k (0 to 48)]
The resulting forecast
For the example of the 28 April 2002 (as before),
the necessary vectors and matrices are formed, with n = 600 last values
Right: Example of a forecast for Klim with our more advanced model, issued on 28 April 2002, 00:00 UTC
[Figure: forecasts and observations; x-axis: lead time k (1 to 48), y-axis: power [p.u.] (0 to 1)]
Let us similarly apply that strategy for a whole sample year (2002), and analyse its performance
Evaluation of more advanced forecasts
Various criteria: bias, MAE, RMSE (and RMSE comparison with our previous best forecasts)
[Figure: bias, MAE and RMSE of the more advanced forecasts, together with the reference RMSE, as a function of lead time k (1 to 48); y-axis: error [% of Pn]]
It seems we have substantially improved... Could still do better!!
Final remarks
What else may you need to forecast, in an electricity market context?
power generation from other renewables
electric load
market prices
imbalance sign
offers from other (important) players
import/export (for your market zone of interest)
clearing outcomes of neighboring markets, etc.
And, in terms of more advanced forecasting:
make the forecast probabilistic
build more complex models (i.e., not relying on regression only)
dig more into the data...
Thanks for your attention! - Contact: [email protected] - web: pierrepinson.com