D4.4 Predictive Energy Production and Demand Algorithms

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 768619

D4.4 Predictive Energy Production and Demand Algorithms

The RESPOND Consortium 2020

Integrated Demand REsponse SOlution Towards Energy POsitive NeighbourhooDs

WP 4 – ICT enabled cooperative Demand Response model

T4.4: ENERGY PRODUCTION AND DEMAND FORECASTING

Ref. Ares(2020)1858036 - 31/03/2020

WP4: ICT enabled cooperative Demand Response model

D.4.4 Predictive Energy Production and Demand Algorithms

PROJECT ACRONYM: RESPOND

DOCUMENT: D4.4 Predictive Energy Production and Demand Algorithms

TYPE (DISTRIBUTION LEVEL): ☐ Public, ☑ Confidential, ☐ Restricted

DELIVERY DUE DATE: 31/03/2020

DATE OF DELIVERY: 31/03/2020

STATUS AND VERSION: 1.0

DELIVERABLE RESPONSIBLE: TEK

AUTHOR(S): Iker Esnaola (TEK), Francisco Javier Diez (TEK), Meritxell Gomez (TEK), Dea Pujic (IMP), Marko Jelic (IMP), Nikola Tomasevic (IMP)

OFFICIAL REVIEWER(S): Carlos Lopez (FEN)

DOCUMENT HISTORY

ISSUE   DATE         CONTENT AND CHANGES
V0.1    14/02/2020   Table of contents
V0.2    01/03/2020   Contributions from TEK
V0.3    10/03/2020   Contributions from IMP
V0.4    13/03/2020   Unofficial review from TEK
V1.0    30/03/2020   Official review from FEN

EXECUTIVE SUMMARY

The potential of DR programmes is particularly promising in the residential sector. Towards the implementation of optimal DR programmes, RESPOND aims at developing services that allow the estimation of produced and consumed energy in dwellings and neighbourhoods within the pilot sites. Although most energy forecasting approaches are data-driven due to their high performance, there are also physics-based models. In this deliverable both approaches are followed, choosing the most suitable one for each case.

The Energy Production Forecasting service develops models to estimate the production of the RES generation systems available at the RESPOND pilot sites, that is, PV panels (in Aarhus and the Aran Islands) and Solar Thermal Collectors (STCs, in Madrid). For PV panels, Random Forest models were used, whilst for the STCs Neural Networks were chosen. For the Aran Islands, however, a physical model was employed, as it requires only parameters that can commonly be found in PV cell data sheets.

Likewise, an Energy Demand Forecasting service was found necessary to accurately forecast short-term electricity demand. Furthermore, this service also covers the estimation of DHW consumption in the Spanish neighbourhood. However, the DHW consumption case proved rather complicated and is yet to be adequately solved. These predictive models are based on kNN-type algorithms due to their high performance.

All the developed models have been deployed and their execution is currently automated.

TABLE OF CONTENTS

1. Introduction
1.1 Aims and objectives
1.2 Relation to other project activities
1.3 Deliverable Structure
2. Data Mining for Energy Forecasting
2.1 The Knowledge Discovery in Databases
2.2 Training, Validating and Testing Predictive Models
2.3 Putting Models into Production
3. Energy Production Forecasting
3.1 SoA review
3.2 Data availability
3.3 Methodology
3.4 Results and discussion
3.5 Service Deployment
4. Energy Demand Forecasting
4.1 SoA review
4.2 Data Availability
4.3 Methodology
4.4 Service Deployment
5. Conclusions
Annex 1

LIST OF FIGURES

Figure 1: An overview of the steps that compose the KDD Process.
Figure 2: A visualization of train, validation and test dataset splits (Source: https://tarangshah.com/blog/2017-12-03/train-validation-and-test-sets/)
Figure 3: Increase of renewable production with years
Figure 4: Dependency between the PV production and solar intensity
Figure 5: Example of GUI for PV production in Aarhus
Figure 6: Part of Madrid data point list relevant for measuring STC production
Figure 7: Madrid topology
Figure 8: Example of Aarhus PV forecaster performance
Figure 9: Example of Aran Island PV forecaster performance.
Figure 10: Example of Madrid STC production forecaster performance
Figure 11: Example of part of the weather forecasted data stored in MySQL
Figure 12: Example of production forecast values MySQL table data
Figure 13: ARIMA's Electric Consumption prediction of a period of 10 days in March for Madrid_02.
Figure 14: ARIMA's Electric Consumption prediction of a period of 24 hours for Madrid_02
Figure 15: Linear Regression's Electric Consumption prediction for Madrid_02.
Figure 16: Electricity Consumption prediction obtained with an SVR model.
Figure 17: Real vs forecasted energy consumption of a predictive model with a good performance
Figure 18: Real vs forecasted energy consumption of a predictive model with a bad performance
Figure 19: Residuals of a predictive model with a good performance.
Figure 20: Residuals of a predictive model with a bad performance.
Figure 21: Forecasted vs actual electric consumption in Madrid (neighbourhood level)
Figure 22: Electric Consumption prediction of a period of 10 days in March for Madrid_02.
Figure 23: Electric Consumption prediction of a period of 10 days in March for Aarhus_11.
Figure 24: Initial predictions obtained for Madrid Neighbourhood DHW consumption.
Figure 25: Predictions obtained for Madrid Neighbourhood DHW consumption with predictive model trained with data up to November 2019.
Figure 26: Predictions obtained for Madrid Neighbourhood DHW consumption with predictive model trained with data up to March 2020.
Figure 27: Deployment of the Energy Demand Forecasting Services.

LIST OF TABLES

Table 1: Comparison between different ML approaches for PV production forecasting.
Table 2: Constants.
Table 3: Static parameters.
Table 4: Comparison between different ML approaches for STC production forecasting
Table 5: Demand data availability for Madrid.
Table 6: Demand data availability for the Aran Islands.
Table 7: Demand data availability for Aarhus.
Table 8: Comparison between ARIMA and SARIMA models.
Table 9: DHW Consumption Prediction Confusion Matrix.

ABBREVIATIONS AND ACRONYMS

DR Demand Response

MAE Mean Absolute Error

ML Machine Learning

PV Photovoltaic

RES Renewable Energy Source

STC Solar Thermal Collector

SVR Support Vector Regression

1. INTRODUCTION

1.1 AIMS AND OBJECTIVES

Buildings’ energy consumption has increased dramatically over the last decade due to factors including population growth, the increase in time spent indoors, and the increased demand for building functions and indoor quality [1]. As a matter of fact, buildings account for more than 35% of global energy use and nearly 40% of energy-related CO2 emissions [2]. However, significant energy savings can be achieved in buildings if they are properly operated.

The residential sector is characterized by many end consumers with relatively low individual energy demand, but with very high demand when considered in terms of home clusters, districts and residential communities. For example, in 2016 the residential sector represented 25.4% of final energy consumption and 17.4% of gross inland energy consumption in the EU [3]. In this sector, space heating and water heating are the major end-uses, followed by appliances, cooking and lighting [4]. Therefore, the potential of DR programmes is particularly promising for this sector.

Being able to accurately predict the amount of energy to be produced over a period of time, and knowing in advance when demand peaks will occur, can contribute to better management of the mismatch between production and demand, thus allowing the most suitable DR programmes to be suggested to end-users. This is precisely the aim of RESPOND’s Task 4.4: the development of services that allow the estimation of produced and consumed energy in dwellings and neighbourhoods within the RESPOND pilot sites.

1.2 RELATION TO OTHER PROJECT ACTIVITIES

With regard to the interaction between Task 4.4 and the rest of the RESPOND project activities, the main interactions are listed below:

• WP2: T4.4 builds on the data collected by the central IoT platform designed in T2.1, the early deployment described in T2.4, and the actual platform deployment in T2.5.

• WP4: T4.4 supports the optimized control within T4.5.

• WP5: T4.4 results are leveraged by the RESPOND mobile app developed in T5.4.

• WP6: T4.4 results will be validated in T6.2 with the methods and criteria defined in T6.1.

1.3 DELIVERABLE STRUCTURE

The rest of the deliverable is structured as follows. Section 2 introduces the energy forecasting topic. Section 3 focuses on the development of Energy Production Forecasting services, while Section 4 focuses on the development of Energy Demand Forecasting services. Finally, conclusions of this task are collected in Section 5.

2. DATA MINING FOR ENERGY FORECASTING

Energy forecasting is crucial for planning optimal energy consumption. Therefore, numerous approaches for energy forecasting have been proposed in the literature.

Autoregressive Integrated Moving Average (ARIMA) models are a general and widely used class of models for predicting future values of a time series. These models aim to describe the autocorrelations in the data, and they forecast a time series by exploiting its past values.
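For reference, an ARIMA(p, d, q) model can be written compactly in backshift notation. This is the standard textbook form and is not taken from this deliverable; the symbols $p$, $d$, $q$, $\phi_i$, $\theta_j$, $c$ and $\varepsilon_t$ are introduced here only for illustration.

```latex
\left(1 - \sum_{i=1}^{p} \phi_i B^{i}\right) (1 - B)^{d}\, y_t
  = c + \left(1 + \sum_{j=1}^{q} \theta_j B^{j}\right) \varepsilon_t ,
\qquad B\, y_t = y_{t-1}
```

Here $y_t$ is the observed series, $d$ is the number of differencing steps, the $\phi_i$ weight the $p$ past (differenced) values, the $\theta_j$ weight the $q$ past forecast errors, and $\varepsilon_t$ is white noise.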

The common conclusion in the literature is that the highest performance is achieved with machine learning (ML) models. Therefore, various ML approaches are presented in this section, starting with neural networks (NNs).

Neural networks are models generally used for modelling complex dependencies between inputs and outputs. They are capable of extracting relevant features, even ones that have not been discovered by experts, and have therefore been used in several fields such as image processing, speech recognition and spell checking. Unfortunately, most of those features are not explainable, which is the biggest drawback of most ML approaches, e.g. k-Nearest Neighbours and support vector regression.

On the other hand, the group of more explainable techniques that have been used includes linear regression, regression trees and the random forest algorithm. Linear regression is a linear function between inputs and outputs, which gives larger weights to the more important inputs (provided the inputs are on comparable scales). It could be suitable, for example, for photovoltaic production modelling, given the high correlation between solar irradiance and produced energy. Regression trees are models that split the space of input variables into regions, each of which is assigned a certain output estimate. A random forest estimate is obtained by averaging the outputs of different regression trees that model the same function.
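The relationship between regression trees and a random forest can be illustrated with a minimal sketch. It is a deliberate simplification and not the implementation used in RESPOND: each "tree" is a single-split stump on one input, the trees differ only through bootstrap resampling (a full random forest also samples input variables at each split), and the irradiance/production values are synthetic.

```python
import random
from statistics import mean

def fit_stump(xs, ys):
    """Fit a one-split regression tree (a stump) on a single input variable.

    Returns (threshold, left_mean, right_mean) for the split that minimises
    the summed squared error around the two region means.
    """
    best = None
    for t in sorted(set(xs))[1:]:  # every threshold leaves both regions non-empty
        left = [y for x, y in zip(xs, ys) if x < t]
        right = [y for x, y in zip(xs, ys) if x >= t]
        lm, rm = mean(left), mean(right)
        sse = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    if best is None:  # all inputs identical: no split is possible
        m = mean(ys)
        return (xs[0], m, m)
    return best[1:]

def predict_stump(stump, x):
    """Return the output estimate of the region that x falls into."""
    threshold, left_mean, right_mean = stump
    return left_mean if x < threshold else right_mean

def fit_forest(xs, ys, n_trees=25, seed=0):
    """Fit each stump on a bootstrap resample of the training data."""
    rng = random.Random(seed)
    n = len(xs)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        forest.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return forest

def predict_forest(forest, x):
    """Random forest estimate: the average of the individual tree outputs."""
    return mean(predict_stump(stump, x) for stump in forest)

# Synthetic data: production grows with solar irradiance (made-up numbers).
irradiance = [0, 100, 200, 300, 400, 500, 600, 700, 800]
production = [0.15 * g for g in irradiance]
forest = fit_forest(irradiance, production)
low, high = predict_forest(forest, 0), predict_forest(forest, 800)
```

Each stump on its own yields a coarse two-level estimate; averaging many of them smooths the output, which is the effect the paragraph above describes.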

Finally, what all of the previously presented approaches have in common is that their parameters are determined using supervised learning techniques; in other words, a large amount of real-world data is used to reach the optimal set of model parameters. Therefore, the quality of those data strongly influences the model’s performance and is one of the most relevant aspects when ML approaches are considered. However, raw data commonly contain numerous errors, which are inevitable due to problems with communications, sensors, harsh weather conditions, etc. With all of this in mind, data preprocessing is indispensable in order to exploit the full potential of the proposed ML approaches.

2.1 THE KNOWLEDGE DISCOVERY IN DATABASES

KDD (Knowledge Discovery in Databases) is a process leading to the extraction of useful knowledge from raw data [5]. This process is composed of the following five steps: Data Selection, Preprocessing, Transformation, Data Mining and Interpretation. It is an interactive and iterative process rather than a strict workflow: it involves numerous loops and many decisions made between any two of the mentioned steps. The need for such a flexible process arises from the wide range of methods and parameter selections that can be applied in each step. An overview of the flow of the KDD process steps is illustrated in Figure 1.

Figure 1: An overview of the steps that compose the KDD Process.

Next, each KDD step is explained.

• Data Selection. It consists of selecting the datasets and the subset of variables or data samples on which the knowledge discovery is going to be performed. With the advent of new paradigms such as IoT (Internet of Things) or LD (Linked Data), data analysts may get lost in today's chaotic information universe. As a matter of fact, much of the available data may be redundant, which hinders knowledge extraction and makes it more time and resource consuming. Therefore, in order to ease the upcoming KDD phases, data analysts need to put their domain knowledge to work to select the datasets and variables used in the analysis.

• Preprocessing. Different methods are applied to ensure the quality of the data and prepare it for subsequent analysis. Nowadays, datasets are prone to suffer from noise, outliers, missing values and inconsistencies, due to their typically large size and their likely origin in multiple, heterogeneous sources. Not only do these data quality issues compromise the performance of knowledge extraction algorithms, but they may also have a negative impact on decision-making processes.

• Transformation. The data is changed into a form that data mining algorithms can work with and that improves their performance. This phase comprises different tasks, two of which are particularly relevant: feature generation and feature selection. These two tasks are related and often applied consecutively, because it is useful to post-process the set of created features and discard those that have little value.

• Data Mining. The data analysis or discovery algorithm that best matches the data analyst's goals is applied to search for hidden patterns in the data. The data analyst's role in this phase consists of selecting a suitable algorithm and fine-tuning it with appropriate parameters. Furthermore, as each algorithm's performance may vary depending on the input data, the data analyst's expertise, and at times even intuition, play a role in this phase.

• Interpretation. This is the final phase, where the results, patterns and models derived are used to support decision-making processes. This phase also relies on the data analyst's knowledge of the domain at hand, and even for a domain expert this task may end up being challenging in certain scenarios.
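The five KDD steps listed above can be sketched as a chain of small functions. This is an illustrative toy only: the readings, the field names (`hour`, `kwh`, `sensor_fw`), the plausibility threshold and the night-time definition are all hypothetical, and the "model" is simply a group mean.

```python
from statistics import mean

# Hypothetical raw readings: hour of day, consumption in kWh, and an unrelated field.
RAW = [
    {"hour": 2, "kwh": 0.2, "sensor_fw": "1.0"},
    {"hour": 3, "kwh": None, "sensor_fw": "1.0"},    # missing value
    {"hour": 9, "kwh": 1.1, "sensor_fw": "1.0"},
    {"hour": 13, "kwh": 1.3, "sensor_fw": "1.1"},
    {"hour": 14, "kwh": 250.0, "sensor_fw": "1.1"},  # implausible spike
    {"hour": 20, "kwh": 1.5, "sensor_fw": "1.1"},
    {"hour": 23, "kwh": 0.4, "sensor_fw": "1.1"},
]

def select(rows):
    """Data Selection: keep only the variables relevant to the analysis."""
    return [{"hour": r["hour"], "kwh": r["kwh"]} for r in rows]

def preprocess(rows, max_kwh=50.0):
    """Preprocessing: drop missing values and implausible readings."""
    return [r for r in rows if r["kwh"] is not None and 0.0 <= r["kwh"] <= max_kwh]

def transform(rows):
    """Transformation: generate a night-time flag from the hour of day."""
    return [{**r, "night": r["hour"] < 7 or r["hour"] >= 22} for r in rows]

def mine(rows):
    """Data Mining: a deliberately trivial model, the mean consumption per group."""
    groups = {}
    for r in rows:
        groups.setdefault(r["night"], []).append(r["kwh"])
    return {night: mean(values) for night, values in groups.items()}

clean = preprocess(select(RAW))
model = mine(transform(clean))

# Interpretation: compare the group means to support a decision.
print(f"night mean: {model[True]:.2f} kWh, day mean: {model[False]:.2f} kWh")
```

In practice the process is iterative: an unexpected result at the interpretation stage would typically send the analyst back to adjust the selection, the cleaning rules or the generated features.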

2.2 TRAINING, VALIDATING AND TESTING PREDICTIVE MODELS

Data-driven predictive models are highly dependent not only on the quality but also on the amount of data available. However, in order to ensure an adequate performance of the developed predictive model, the available data needs to be split for training, validation and testing purposes.

• Training Dataset: The sample of data used to fit the model. The developed model sees and learns from this data.

• Validation Dataset: The sample of data used to provide an unbiased evaluation of a model fit on the training dataset while tuning model hyperparameters. Hence, the model occasionally sees this data, but it never learns from it.

• Test Dataset: The sample of data used to provide an unbiased evaluation of a final model fit on the training dataset. The test dataset provides the gold standard used to evaluate the model. It is only used once a model is completely trained (using the training and validation datasets).

Finding an adequate ratio for splitting the available data into training, validation and test sets may depend on two factors: the total number of samples in the available data, and the actual model being trained. Some models need substantial data to train upon, so in those cases larger training sets are needed. Models with few hyperparameters might be easier to validate and fine-tune, so in these cases the validation set may be reduced. On the contrary, models with more hyperparameters may need a large validation set. Furthermore, a model may have no hyperparameters, or ones that cannot be easily tuned, in which case a validation set may not be necessary. Overall, similar to many other aspects of Machine Learning, the train-validation-test split ratio (shown in Figure 2) is specific to the use case.
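As a minimal sketch of such a split for time series data, the function below cuts a time-ordered sequence into three consecutive parts without shuffling, so that validation and test samples are always later in time than the training samples. The 70/15/15 ratio and the function names are assumptions chosen for illustration, not values prescribed by this deliverable. A mean absolute error (MAE) helper is included because MAE is the metric used later to compare models.

```python
def chronological_split(samples, train_frac=0.7, val_frac=0.15):
    """Split time-ordered samples into train, validation and test parts.

    The order is preserved (no shuffling), so the validation and test
    parts always lie later in time than the training part.
    """
    if not 0 < train_frac < 1 or not 0 <= val_frac < 1 or train_frac + val_frac >= 1:
        raise ValueError("fractions must leave room for a test set")
    n = len(samples)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    return samples[:train_end], samples[train_end:val_end], samples[val_end:]

def mean_absolute_error(actual, predicted):
    """Average of the absolute differences between actual and predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical series of 100 hourly measurements, represented by their index.
hourly = list(range(100))
train, val, test = chronological_split(hourly)
```

Hyperparameters would be tuned by comparing `mean_absolute_error` on `val`, and the figure reported for the final model would be computed once on `test`.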

Figure 2: A visualization of train, validation and test dataset splits (Source: https://tarangshah.com/blog/2017-12-03/train-validation-and-test-sets/)

2.3 PUTTING MODELS INTO PRODUCTION

The deployment of predictive models is the process of making them available in production environments, where they can provide predictions to other software systems. It is only once these models are deployed to production that they start adding value, which makes deployment a crucial step. However, the deployment of machine learning models is not without complexity.

There are two main ways to get predictions from predictive models put into production: online (or real-time) predictions and batch predictions. When deciding which of the two to choose, several factors need to be considered.

Load Implications

Choosing a real-time prediction approach requires managing peak load. Depending on the approach taken and how the prediction is going to be used, a real-time approach might require a machine with extra computing power available in order to provide a prediction within a certain Service Level Agreement (SLA). On the contrary, in a batch approach, the computation of predictions can be spread throughout the day based on the available capacity.

Infrastructure Implications

Selecting a real-time approach entails a much higher operational responsibility. There is a need to monitor how the system is working, to generate alerts when there are issues, and to give some consideration to failover. For batch prediction, however, the operational obligation is much lower. Some monitoring is needed, and alerting is desirable, but the need to be aware of arising issues immediately is much lower.

Cost Implications

Real-time predictions also have implications from a cost point of view. The need for more computing power, without the ability to spread the load throughout the day, can force the purchase of more computing capacity than would otherwise be needed, or the payment of increased spot prices. Depending on the approach taken and the requirements, there might also be extra cost because of the need for more powerful computing capacity to meet SLAs. Additionally, there may be a larger infrastructure footprint when choosing real-time predictions. One potential exception is when predictions are computed within the application itself (in-app prediction): in that specific scenario, the cost might end up being lower than for a batch approach.

Evaluation Implications

Evaluating prediction performance in real time can be more challenging than for batch predictions, and evaluating and debugging real-time prediction models is significantly more complex to manage. It requires a log collection mechanism that records the different predictions, together with the features that yielded each score, for further evaluation.
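One possible shape for such a log collection mechanism is sketched below: a thin wrapper that stores every feature vector together with the score it produced. The class name, the record layout and the JSON-lines file format are assumptions made for illustration and do not describe the RESPOND deployment.

```python
import json
import time

class LoggedPredictor:
    """Wrap a prediction function and keep a record of every call."""

    def __init__(self, predict_fn, log_path=None):
        self.predict_fn = predict_fn
        self.log_path = log_path  # optional JSON-lines file for later evaluation
        self.records = []         # in-memory copy for quick inspection

    def predict(self, features):
        score = self.predict_fn(features)
        record = {"ts": time.time(), "features": features, "score": score}
        self.records.append(record)
        if self.log_path is not None:
            with open(self.log_path, "a", encoding="utf-8") as fh:
                fh.write(json.dumps(record) + "\n")
        return score

def toy_model(features):
    """Hypothetical stand-in for a deployed model (made-up coefficient)."""
    return 0.15 * features["ghi"]

predictor = LoggedPredictor(toy_model)
first = predictor.predict({"ghi": 400.0})
second = predictor.predict({"ghi": 0.0})
```

Once the actual measurements become available, the logged records can be joined with them by timestamp to compute error metrics offline.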

3. ENERGY PRODUCTION FORECASTING

In the 20th century, electrical energy was produced mainly from fossil fuels. However, this raised environmental concerns, primarily about greenhouse gas emissions, global warming and climate change. Therefore, renewable energy sources (RES), such as photovoltaic (PV) panels, solar thermal collectors (STCs) and wind turbines (WTs), have lately been incorporated into energy production in order to decrease the use of fossil fuels, as shown in Figure 3 from [6]. Nonetheless, renewable production depends strongly on weather conditions, so this change has significantly contributed to the destabilization of the grid. With the aim of improving the stability and quality of grid systems, it became necessary to plan consumption and production ahead, which in turn made it necessary to develop an energy production forecaster, the main focus of this section.

Figure 3: Increase of renewable production with years

3.1 SOA REVIEW

Before explaining the developed models, this subsection gives a brief summary of the state of the art. As stated in the literature, PV forecasting approaches can be divided into three groups, physical models, statistical models and hybrid models [7,8], depending on the approach used to estimate the production from the required inputs. What all of these methodologies have in common are the inputs, as they all model the dependency of renewable production on weather conditions.

Physical approaches were the first to be presented, and they consist of a set of mathematical equations and physical laws that model the renewable system. Even though they have been replaced by novel data-driven approaches in the field of PV forecasting, where they are usually outperformed, physical models are practically the only ones presented in the literature regarding STC production [9]. Since their estimation is based on mathematical modelling of the system, their main advantage is that they do not need any historical data, so in use cases where historical data is inaccessible they are the only applicable option. However, the application of these methodologies requires numerous physical parameters. This is a significant drawback, given that these characteristics are usually hard to access.

On the other hand, data-driven models, both regressive (AR, ARMA, ARIMA, ARMAX, NARMAX, etc.) and machine learning (neural networks, support vector machines, random forests, kNNs, etc.), require a large amount of historical data, but are capable of much more precise modelling, which significantly improves performance. Additionally, no physical parameters are required to implement this approach. Finally, hybrid approaches combine the benefits of the previously explained models in order to improve on them further. As a part of this task, several models were developed, covering both data-driven and physical ones, and the results and their comparison are given further on in this deliverable.

3.2 DATA AVAILABILITY

As renewable production forecasting is mainly motivated by the requirements of planning and rescheduling production and consumption, this task was created with the aim of providing the required inputs for the planning and optimization carried out as a part of Tasks 4.2 and 4.3. Therefore, the horizon and time resolution of the forecasted output correspond to those defined in the previous deliverables D4.2 and D4.3: day-ahead forecasting and optimization with hourly resolution.

As a part of the RESPOND project there are three pilots for which production forecasters were to be developed: Aarhus (Denmark), the Aran Islands (Ireland) and Madrid (Spain). In Aarhus and the Aran Islands PV panels are present, while in Madrid STCs are installed. As already explained, current state-of-the-art solutions for day-ahead PV production forecasting are mostly based on data-driven techniques, so a brief analysis of the necessary data is given here.

Given that the production of renewable energy sources depends strongly on weather conditions, it was necessary to provide forecasted weather parameters with a horizon and time resolution corresponding to those of the forecaster. Additionally, if data-driven models were to be considered, it was necessary to provide historical weather data as well. Since the correlation between PV and STC production and solar radiation is extremely high (Figure 4 from [10]), it was necessary to find a weather service that provides information about radiation. The weather forecasting service that fulfilled all of these requirements, and that has been used as a part of this task, is WeatherBit (https://www.weatherbit.io/). The relevant data obtained through this weather service has been stored in a MySQL database as a part of the RESPOND platform.

Figure 4: Dependency between the PV production and solar intensity

3.3 METHODOLOGY

As described above, current state-of-the-art solutions are primarily focused on Machine Learning (ML) approaches. However, their use depends on data availability. For the Aarhus pilot case, production data for all the neighbouring buildings taking part as pilots in the RESPOND project is available online (https://evishine.dk/ALBOA) for the 3 previous years, as shown in Figure 5. Therefore, all the data relevant for training ML models was available, and it was decided to use ML models for PV production forecasting in Denmark.

Figure 5: Example of GUI for PV production in Aarhus

Unlike the Aarhus pilot case, where a couple of pilot buildings share a single PV plant, the Aran Islands pilot is formed of different, geographically separated houses, some of which have their own PV production. However, production measurement data were available in the InfluxDB for only 2 out of 6 of them. Given that these panels differ from one another, so that the available data might not adequately represent the missing data, it was decided to employ a physical modelling approach for the Aran Islands pilot case. Additionally, this created room for benchmarking and comparing various techniques in similar scenarios.

Finally, for the Madrid pilot, various sensors were deployed in the boiler room for temperature and heat metering, as shown in Figure 6, with the measurement with ID “TEK-0000001-009” corresponding to the STC production input of the optimizer. Namely, as explained in D4.2 and shown in Figure 7, the topology of the hot water system is modelled through the Energy Hub, which takes the forecast of the previously mentioned measurement as its input. Taking all of the above into consideration, it was decided to use an ML approach, as enough data was available. Additionally, the application of different ML techniques to STC production forecasting is an unexplored field, which provided one more argument for this choice.

Figure 6: Part of Madrid data point list relevant for measuring STC production

3.4 RESULTS AND DISCUSSION

Aarhus pilot case

As previously described, a machine learning approach was deployed for the Danish pilot. In order to achieve the highest possible performance, several ML methodologies were tested: support vector regression (SVR), linear regression (LR), neural networks (NN), k-nearest neighbours (kNN) and random forest (RF). For all of them, the following input parameters were chosen: relative humidity, wind speed, pressure, dew point, UV, wind direction, temperature, cloud coverage and global horizontal irradiance (GHI). The output of each model was a single value representing the production at the timestamp of the input weather parameters. In other words, for a day-ahead hourly production forecast, 24 different arrays of forecasted weather parameters were fed to the model to estimate 24 different outputs.
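The per-timestamp mapping described above can be sketched as follows. This is a minimal illustration rather than the project's code: `day_ahead_forecast`, `FEATURES` (including the feature order) and the toy `ghi_only_model` are hypothetical names, and any trained regressor that maps one weather vector to one production value could take the place of the toy model.

```python
from typing import Callable, List, Sequence

# The nine weather inputs listed in the text; their order here is an assumption.
FEATURES = ["relative_humidity", "wind_speed", "pressure", "dew_point", "uv",
            "wind_direction", "temperature", "cloud_coverage", "ghi"]


def day_ahead_forecast(model: Callable[[Sequence[float]], float],
                       hourly_weather: Sequence[Sequence[float]]) -> List[float]:
    """Apply a single-output model to each of the 24 hourly weather vectors."""
    if len(hourly_weather) != 24:
        raise ValueError("expected 24 hourly weather vectors")
    for row in hourly_weather:
        if len(row) != len(FEATURES):
            raise ValueError("each vector needs one value per feature")
    return [model(row) for row in hourly_weather]


def ghi_only_model(row: Sequence[float]) -> float:
    """Toy stand-in for a trained model: output proportional to normalized GHI."""
    return 0.8 * row[FEATURES.index("ghi")]


# 24 normalized weather vectors with a crude daylight-shaped GHI profile.
weather = [[0.5] * 8 + [max(0.0, 1 - abs(hour - 12) / 6)] for hour in range(24)]
forecast = day_ahead_forecast(ghi_only_model, weather)
```

Each of the 24 outputs depends only on the weather vector of its own hour, which is why the forecast horizon is limited by the weather forecast rather than by the model.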

Table 1: Comparison between different ML approaches for PV production forecasting.

| Model | MAE [%] (Aarhus) |
| --- | --- |
| SVR - RBF | 8.99 |
| SVR - linear | 8.87 |
| SVR - sigmoid | 8.79 |
| Linear regression | 9.63 |
| Neural network | 8.5 |
| KNN | 8.75 |
| KNN weighted | 8.45 |
| Random forest | 8.61 |

Figure 7: Madrid topology

After the inputs and output had been adequately normalized, the models were developed and trained in Python. For each approach, the optimal set of hyperparameters was obtained using grid search, and an example of the results is shown in Table 1. For linear regression, for instance, two hyperparameters were considered: the polynomial degree of the input, n, and the regularization factor, alpha. For each combination of parameters, the Mean Square Error (MSE) and Mean Absolute Error (MAE) were obtained on the training, validation and testing sets for 5 independent training runs, and the optimal parameters were selected according to the mean validation performance.

Finally, once the optimal set of parameters had been established for each methodology, the methodologies were compared against each other, as shown in Table 1. In terms of MAE, neural networks are among the best-performing models (8.5%, marginally above the 8.45% of weighted kNN), and they were chosen as the final model for the Aarhus day-ahead hourly production forecaster. An example of this model's forecast for one day is given in Figure 8. Together with the MAE of just 8.5%, it can be concluded that this model is adequate for deployment on the RESPOND platform.

Figure 8: Example of Aarhus PV forecaster performance

Aran Island pilot case

In the Aran Island pilot site, due to the lack of data, it was necessary to employ a physical model for production forecasting for the 6 houses with PVs: h1, h2, h3, h4, h5 and h12. The model presented in [11] was selected because the physical data it requires are widely available in PV cell data sheets, which makes it applicable in practice. All constants and static parameters relevant for the model are listed in Table 2 and Table 3, and are given in basic SI units unless noted otherwise. All static parameters were requested from the pilot coordinator, and for those for which data was not available, the most common values were taken (e.g. the optimal angle was adopted for $\beta$). Apart from the static parameters, the model requires 5 dynamic parameters regarding time and weather: GHI, temperature, the number of the day in the year, cloud coverage and current time. The model equations are given next.

The final estimate of the PV power output $P_{PV}$ is given as

$$P_{PV} = Y_{PV} f_{PV} \frac{G_t}{G_{t,STC}} \left(1 + a_p (T_c - T_{c,STC})\right)$$

where the estimated cell temperature $T_c$ is given as

$$T_c = \frac{T_a + (T_{c,NOCT} - T_{a,NOCT}) \dfrac{G_t}{G_{t,NOCT}} \left(1 - \dfrac{\eta_{mp,STC} (1 - a_p T_{c,STC})}{\tau\alpha}\right)}{1 + (T_{c,NOCT} - T_{a,NOCT}) \dfrac{G_t}{G_{t,NOCT}} \dfrac{a_p \eta_{mp,STC}}{\tau\alpha}}$$

with $T_a$ the ambient temperature, while $\eta_{mp,STC}$, defined as the maximum power point efficiency under standard test conditions, is given as

$$\eta_{mp,STC} = \frac{Y_{PV}}{A_{PV} G_{t,STC}}$$

and the solar radiation incident on the PV array, $G_t$, as

$$G_t = (G_b + G_d A_i) R_b + G_d (1 - A_i) \frac{1 + \cos\beta}{2} \left(1 + f \sin^3\frac{\beta}{2}\right) + G \rho_g \frac{1 - \cos\beta}{2}$$

The ratio of the beam radiation on the tilted surface to the beam radiation on the horizontal surface, $R_b$, is given as

$$R_b = \frac{\cos\vartheta}{\cos\vartheta_z}$$

where $\vartheta$ is the angle of incidence and $\vartheta_z$ the zenith angle (both in °), given by

$$\cos\vartheta = \sin\delta \sin\phi \cos\beta - \sin\delta \cos\phi \sin\beta \cos\gamma + \cos\delta \cos\phi \cos\beta \cos\omega + \cos\delta \sin\phi \sin\beta \cos\gamma \cos\omega + \cos\delta \sin\beta \sin\gamma \sin\omega$$

$$\cos\vartheta_z = \cos\phi \cos\delta \cos\omega + \sin\phi \sin\delta$$

where $\omega$ is the average hour angle, given as the arithmetic mean of the hour angles $\omega_1$ and $\omega_2$ at the beginning and ending timestamps $t_{c1}$ and $t_{c2}$

$$\omega = \frac{\omega_1 + \omega_2}{2}$$

$$\omega_1 = 15 (t_{s1} - 12)$$

$$\omega_2 = 15 (t_{s2} - 12)$$

that is, 15° per hour away from solar noon, where $t_{s1}$ and $t_{s2}$ are the beginning and ending timestamps in solar time

$$t_{si} = t_{ci} + \frac{\lambda}{15} - Z_c + E$$

$R_b$ is limited to the interval $[R_{b,MIN}, R_{b,MAX}]$, where $R_{b,MIN} = -1$ and $R_{b,MAX} = 1$ were determined experimentally.

The solar declination $\delta$, the factor $B$ that depends on the Earth's position with respect to the sun, and the equation of time $E$ are given as

$$\delta = 23.45 \sin\left(\frac{360 (284 + n)}{365}\right)$$

$$B = \frac{360 (n - 1)}{365}$$

$$E = 3.82 (0.000075 + 0.001868 \cos B - 0.032077 \sin B - 0.014615 \cos 2B - 0.04089 \sin 2B)$$

Additionally, the diffuse $G_d$, beam $G_b$, extraterrestrial horizontal $G_o$ and extraterrestrial normal $G_{on}$ radiation are calculated as follows

$$G_d = \begin{cases} G (1 - 0.09 k_t), & k_t \le 0.22 \\ G (0.9511 - 0.1604 k_t + 4.388 k_t^2 - 16.638 k_t^3 + 12.336 k_t^4), & 0.22 < k_t \le 0.8 \\ 0.165\, G, & k_t > 0.8 \end{cases}$$

$$G_b = G - G_d$$

$$G_o = \frac{12}{\pi} G_{on} \left(\cos\phi \cos\delta (\sin\omega_2 - \sin\omega_1) + \frac{\pi (\omega_2 - \omega_1)}{180} \sin\phi \sin\delta\right)$$

$$G_{on} = G_{sc} \left(1 + 0.033 \cos\frac{360 n}{365}\right)$$

where

$$G = \left(offset + (1 - offset)(1 - cloud\ coverage)\right) \cdot ghi$$

and

$$k_t = \frac{G}{G_o}$$

Finally, the horizon brightening factor $f$ and the anisotropy index $A_i$ are given as

$$f = \sqrt{\frac{G_b}{G}}$$

$$A_i = \frac{G_b}{G_o}$$
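The time-dependent and cloud-dependent terms of the model can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not the deployed service: the function names are hypothetical, cloud coverage is assumed to be a fraction in [0, 1], angles are handled in degrees and converted to radians only inside the trigonometric calls, and the hour angle is taken as 15° per hour away from solar noon (the standard convention).

```python
import math


def declination_deg(n: int) -> float:
    """Solar declination (degrees) for day-of-year n."""
    return 23.45 * math.sin(math.radians(360.0 * (284 + n) / 365.0))


def equation_of_time_h(n: int) -> float:
    """Equation of time E (hours) for day-of-year n."""
    b = math.radians(360.0 * (n - 1) / 365.0)
    return 3.82 * (0.000075 + 0.001868 * math.cos(b) - 0.032077 * math.sin(b)
                   - 0.014615 * math.cos(2 * b) - 0.04089 * math.sin(2 * b))


def solar_time_h(civil_time_h: float, longitude_deg: float,
                 zone_offset_h: float, n: int) -> float:
    """Solar time t_s = t_c + lambda/15 - Z_c + E (hours)."""
    return civil_time_h + longitude_deg / 15.0 - zone_offset_h + equation_of_time_h(n)


def hour_angle_deg(solar_time: float) -> float:
    """Hour angle (degrees): 15 degrees per hour away from solar noon."""
    return 15.0 * (solar_time - 12.0)


def cloud_adjusted_ghi(ghi: float, cloud_coverage: float, offset: float = 0.2) -> float:
    """G = (offset + (1 - offset)(1 - cloud coverage)) * ghi."""
    return (offset + (1.0 - offset) * (1.0 - cloud_coverage)) * ghi


def diffuse_radiation(g: float, kt: float) -> float:
    """Diffuse part G_d of G as a piecewise function of the clearness index k_t."""
    if kt <= 0.22:
        return g * (1.0 - 0.09 * kt)
    if kt <= 0.8:
        return g * (0.9511 - 0.1604 * kt + 4.388 * kt ** 2
                    - 16.638 * kt ** 3 + 12.336 * kt ** 4)
    return 0.165 * g


# Example: longitude and time zone offset of house H1, 13:00 civil time, day 172.
t_s = solar_time_h(13.0, -9.686, 1, 172)
omega = hour_angle_deg(t_s)
g = cloud_adjusted_ghi(ghi=600.0, cloud_coverage=0.5)
```

The remaining quantities ($G_o$, $R_b$, $G_t$) follow the same pattern and combine these helper values according to the equations above.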

Table 2: Constants.

| Label | Description | Value | Unit |
| --- | --- | --- | --- |
| $G_{SC}$ | Solar constant | 1367 | W/m² |
| $\rho_g$ | Ground reflectance | 0.2 | - |
| $T_{a,NOCT}$ | The ambient temperature at which the nominal operating cell temperature is defined | 20 | °C |
| $G_{t,NOCT}$ | The solar radiation at which the nominal operating cell temperature is defined | 800 | W/m² |
| $G_{t,STC}$ | The incident radiation at standard test conditions | 1000 | W/m² |
| $T_{c,STC}$ | The cell temperature at standard test conditions | 25 | °C |
| $f_{PV}$ | The derating factor | 0.8 | - |
| $\tau\alpha$ | The product of the solar transmittance and solar absorptance | 0.9 | - |
| $offset$ | Parameter for GHI calculation in accordance with the cloud coverage | 0.2 | - |

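Table 3: Static parameters.

| Label | Description | Unit | H1 | H2 | H3 | H4 | H5 | H12 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $\lambda$ | Longitude | ° | -9.686 | -9.662 | -9.687 | -9.685 | -9.685 | -9.663 |
| $\phi$ | Latitude | ° | 53.131 | 53.101 | 53.129 | 53.129 | 53.129 | 53.124 |
| $Z_c$ | Time zone offset | h | 1 | 1 | 1 | 1 | 1 | 1 |
| $\beta$ | Slope of the PV cell surface | ° | 32 | 32 | 32 | 32 | 32 | 32 |
| $\gamma$ | Azimuth of the PV cell surface | ° | 0 | 0 | 0 | 0 | 0 | 0 |
| $Y_{PV}$ | Rated capacity of the PV array | W | 2000 | 4000 | 2000 | 2000 | 2000 | 2000 |
| $a_p$ | Temperature coefficient | 1/°C | -0.004 | -0.004 | -0.004 | -0.004 | -0.004 | -0.004 |
| $A_{PV}$ | Surface area of the PV cell | m² | 13.04 | 26.08 | 13.04 | 13.04 | 13.06 | 13.04 |
| $T_{c,NOCT}$ | Nominal operating cell temperature | °C | 45.3 | 45.3 | 45.3 | 45.3 | 45.3 | 45.3 |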
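With the constants of Table 2 and the static parameters of Table 3, the power and cell temperature equations can be evaluated directly. The following is a minimal sketch, assuming that the incident radiation $G_t$ has already been computed; the `PVArray` class and the function names are hypothetical and are not taken from the RESPOND code base.

```python
from dataclasses import dataclass

# Constants from Table 2.
G_T_STC = 1000.0   # incident radiation at standard test conditions [W/m2]
T_C_STC = 25.0     # cell temperature at standard test conditions [degC]
G_T_NOCT = 800.0   # radiation at which NOCT is defined [W/m2]
T_A_NOCT = 20.0    # ambient temperature at which NOCT is defined [degC]
F_PV = 0.8         # derating factor
TAU_ALPHA = 0.9    # transmittance-absorptance product


@dataclass(frozen=True)
class PVArray:
    """Static parameters of one house (Table 3)."""
    rated_capacity_w: float          # Y_PV
    area_m2: float                   # A_PV
    temp_coeff: float = -0.004       # a_p [1/degC]
    noct_cell_temp_c: float = 45.3   # T_c,NOCT


def efficiency_stc(pv: PVArray) -> float:
    """Maximum power point efficiency under standard test conditions."""
    return pv.rated_capacity_w / (pv.area_m2 * G_T_STC)


def cell_temperature(pv: PVArray, ambient_c: float, g_t: float) -> float:
    """Estimated cell temperature T_c for ambient temperature T_a and radiation G_t."""
    k = (pv.noct_cell_temp_c - T_A_NOCT) * (g_t / G_T_NOCT)
    eta = efficiency_stc(pv)
    numerator = ambient_c + k * (1.0 - eta * (1.0 - pv.temp_coeff * T_C_STC) / TAU_ALPHA)
    denominator = 1.0 + k * pv.temp_coeff * eta / TAU_ALPHA
    return numerator / denominator


def pv_power(pv: PVArray, ambient_c: float, g_t: float) -> float:
    """Estimated PV power output P_PV in watts."""
    t_c = cell_temperature(pv, ambient_c, g_t)
    return (pv.rated_capacity_w * F_PV * (g_t / G_T_STC)
            * (1.0 + pv.temp_coeff * (t_c - T_C_STC)))


h1 = PVArray(rated_capacity_w=2000.0, area_m2=13.04)  # house H1
power_w = pv_power(h1, ambient_c=20.0, g_t=800.0)
```

Because the temperature coefficient is negative, a cell that runs hotter than 25 °C produces less than the derated, irradiance-scaled capacity, while with no irradiance the cell temperature reduces to the ambient temperature and the output is zero.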

Finally, an example of one-day production forecasting performance for house 3 is given in Figure 9. The estimate deviates significantly more from the real production than the Aarhus model does, which was expected because data-driven models usually achieve higher performance. However, since an ML approach was not applicable in the Aran use case, the MAE of 21% achieved on house 3 can be considered acceptable.

Figure 9: Example of Aran Island PV forecaster performance.

Finally, it should be pointed out that in the Aran case the production forecast is calculated separately for each house with renewable energy sources. Apart from the fact that the panels of different households differ from one another, the optimization uses different energy hubs because the topology differs between houses (e.g. with or without electrical storage, or with different load profiles because of an electric vehicle). As a result, separate production and demand forecasts are needed for these hubs. More details are given in the integration subsection.

Madrid pilot case

Similarly to the Aarhus case, the Madrid production forecasting model was ML based and involved benchmarking various approaches (SVR, LR, NNs, kNNs and RF), with hyperparameters optimized using grid search (shown in Annex 1). The output of these models was the estimate of renewable production at the same timestamp as the corresponding input weather parameters, which included relative humidity, wind speed, pressure, dew point, UV, wind direction, temperature, cloud coverage, global horizontal irradiance (GHI), diffuse horizontal irradiance (DHI) and direct normal irradiance (DNI). However, the measurements of this output stored in InfluxDB are obtained using a sensor that generates a pulse for each 1 kWh of energy, which means that the measurements are highly imprecise. Therefore, in order to compensate for the lack of precision in the measurements with higher model precision, previous STC production was added as an additional input. Since the correlation between the STC production at times $t$ and $t + \Delta t$ increases as $\Delta t$ decreases, 5 more inputs were included, corresponding to the production in the 5 previous hours. Apart from this change, the approach for developing the STC forecasting model was the same as for Aarhus: hyperparameters were optimized using grid search, and the final comparison between the different models is given in Table 4. It leads to the conclusion that the random forest algorithm, with an MAE of 6.2%, is best suited for STC production forecasting. An example of the model's estimate for one day is given in Figure 10, corroborating the conclusion that this model is adequate for deployment on the RESPOND platform.
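The extension of the input vector with the production of the 5 previous hours can be sketched as below. The helper name `add_lagged_production` and the data layout (one weather vector and one production value per hour) are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def add_lagged_production(weather_rows: Sequence[Sequence[float]],
                          production: Sequence[float],
                          n_lags: int = 5) -> Tuple[List[List[float]], List[float]]:
    """Append the production of the n_lags previous hours to each weather vector.

    Returns (X, y), starting at the first hour that has n_lags predecessors.
    """
    if len(weather_rows) != len(production):
        raise ValueError("weather and production series must have equal length")
    features, targets = [], []
    for t in range(n_lags, len(production)):
        lags = [production[t - k] for k in range(1, n_lags + 1)]  # t-1, ..., t-n_lags
        features.append(list(weather_rows[t]) + lags)
        targets.append(production[t])
    return features, targets


# Toy series: one weather feature per hour and an hourly production counter.
weather = [[float(hour)] for hour in range(8)]
production = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
X, y = add_lagged_production(weather, production)
```

The first `n_lags` hours are dropped from the training set because their lagged values are not available.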

| Methodology | MAE [%] (Madrid) |
| --- | --- |
| SVR - RBF | 8.52 |
| SVR - linear | 9.17 |
| SVR - sigmoid | 9.01 |
| Linear regression | 8.60 |
| Neural network | 7.63 |
| KNN | 7.90 |
| KNN weighted | 7.85 |
| Random forest | 6.2 |

Table 4: Comparison between different ML approaches for STC production forecasting
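The hyperparameter selection used for both the Aarhus and the Madrid models (grid search, 5 independent training runs per combination, choice by mean validation error) can be sketched as follows. This is a generic illustration: `grid_search` and `toy_train_and_validate` are hypothetical, and the toy function only mimics a validation MAE instead of training a real model. The `degree` and `alpha` names follow the two linear regression hyperparameters mentioned in the text.

```python
import itertools
import random
import statistics


def grid_search(param_grid, train_and_validate, n_runs=5, seed=0):
    """Return (best_params, best_mean_validation_mae) over all grid combinations."""
    names = sorted(param_grid)
    best_params, best_score = None, float("inf")
    for combo in itertools.product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        # Several independent trainings per combination, each with its own seed.
        scores = [train_and_validate(params, seed + run) for run in range(n_runs)]
        mean_score = statistics.mean(scores)
        if mean_score < best_score:
            best_params, best_score = params, mean_score
    return best_params, best_score


def toy_train_and_validate(params, seed):
    """Stand-in for 'train a model, return its validation MAE' (not a real model)."""
    rng = random.Random(seed)
    noise = rng.uniform(0.0, 0.01)  # run-to-run variation between trainings
    return abs(params["degree"] - 2) + abs(params["alpha"] - 0.1) + noise


grid = {"degree": [1, 2, 3], "alpha": [0.01, 0.1, 1.0]}
best_params, best_mae = grid_search(grid, toy_train_and_validate)
```

Averaging over several runs reduces the chance of selecting a combination that performed well only because of a favourable random initialization.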


3.5 SERVICE DEPLOYMENT

As the last step in the development of the production forecast services, integration and deployment were carried out. A forecaster service was developed on top of the developed models. It was designed to obtain all the necessary inputs, perform the calculation, and store the outputs, which are then used by other parts of the RESPOND platform (e.g. the optimizer).

Regarding input data collection, the relevant weather parameters are obtained from the MySQL database in which WeatherBit data is stored, as shown in Figure 11. Apart from the weather data, previous STC production is obtained from InfluxDB as a dynamic input. For all the pilots, the relevant input data are obtained with a horizon of 1 day and hourly resolution, matching the resulting output. The output is an array of 24 values stored in the MySQL DB. Depending on the pilot, the outputs correspond either to the neighbourhood level (Aarhus, Madrid) or to the house level (Aran), as predefined by the optimizer's requirements. It should be pointed out that, for all pilot sites, both neighbourhood-level and household-level values can be calculated from the stored values, either by proportionally downscaling or by summing up the estimates. Examples of stored values in ‘production_forecast_values’ are presented in Figure 12, where it can be seen that each forecasted production value ‘value’ for the time interval between ‘timestamp_start’ and ‘timestamp_end’ is labelled with a ‘load_type_id’ (electrical load, thermal load, DHW) and a ‘location_id’ (Aarhus, Aran, Madrid and individual households). Finally, this service is deployed on the server using OpenWhisk, and its execution is orchestrated by the master scheduler, which controls the order of services in the control loop.

Figure 10 - Example of Madrid STC production forecaster performance

Figure 11 - Example of part of the weather forecasted data stored in MySQL

Figure 12 - Example of production forecast values MySQL table data
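The shape of the stored output (24 hourly values, each with a start and end timestamp and two identifiers) can be sketched as below. The sketch uses an in-memory SQLite database purely as a stand-in for the platform's MySQL database; the column types, the identifier values and the helper `build_forecast_rows` are assumptions, and only the table and column names come from the text.

```python
import sqlite3
from datetime import datetime, timedelta


def build_forecast_rows(day_start, values, load_type_id, location_id):
    """Turn 24 hourly forecast values into rows with start/end timestamps."""
    if len(values) != 24:
        raise ValueError("a day-ahead hourly forecast needs 24 values")
    rows = []
    for hour, value in enumerate(values):
        start = day_start + timedelta(hours=hour)
        end = start + timedelta(hours=1)
        rows.append((start.isoformat(), end.isoformat(), float(value),
                     load_type_id, location_id))
    return rows


conn = sqlite3.connect(":memory:")  # stand-in for the MySQL database
conn.execute(
    "CREATE TABLE production_forecast_values ("
    "timestamp_start TEXT, timestamp_end TEXT, value REAL, "
    "load_type_id INTEGER, location_id INTEGER)"
)

# Hypothetical forecast: no production at night, constant production by day.
values = [0.0] * 6 + [1.5] * 12 + [0.0] * 6
rows = build_forecast_rows(datetime(2020, 3, 6), values, load_type_id=1, location_id=3)
conn.executemany("INSERT INTO production_forecast_values VALUES (?, ?, ?, ?, ?)", rows)
conn.commit()

stored = conn.execute(
    "SELECT COUNT(*), SUM(value) FROM production_forecast_values"
).fetchone()
```

Summing the stored values of several locations, or scaling one value proportionally, then gives the neighbourhood-level or household-level figures mentioned above.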


4. ENERGY DEMAND FORECASTING

The ability to accurately forecast short-term electricity demand can assist power system operators and market participants in making sustainable electricity planning decisions and in securing electricity supply for consumers [12]. Unlike consumption in commercial buildings, which is fairly regular, residential electrical consumption is expected to be more irregular. In fact, electricity usage at individual household level shows high variance, since it depends on users' lifestyle, occupancy behaviour, building characteristics and calendar information [13,14].

4.1 SOA REVIEW

There is extensive research on energy demand forecasting. One study investigates electricity consumption forecasting for fifteen anonymous individual households using a Support Vector Regression (SVR) modelling approach, applied at both daily and hourly data granularity [15]. In this experiment, the households' occupancy, dwelling properties and socioeconomic status were unknown. Therefore, aggregating hourly consumption to daily consumption was an effective way to mitigate the impact of randomness in the hourly behaviour of family members.

Under the assumption that there usually exists an intrinsic low-dimensional structure governing the data

recorded from a collection of residential houses and that using this structure in load forecasting can help

improve the forecasting performance, a compressive load forecasting approach incorporating both

temporal and spatial information is presented in another study [16]. The proposed method is called

nonuniform CST-LF as it is inspired by CS (Compressive Sensing) and structured-sparse recovery

algorithms, and it is tested against various benchmark models using real and high-quality data, showing

that the proposed approach improves the short-term electric demand forecasting.

Another study showed how calendar effects, forecasting granularity and the length of the training set affect the accuracy of a day-ahead load forecast for residential customers [17]. Regression trees, neural networks and support vector regression were tested, and the first of these (regression trees) obtained the best results. When historical load profiles with daily and weekly seasonality are combined with weather data, the explicit calendar effects are left with very low predictive power. In the setting studied in the article, it was shown that forecast errors can be reduced by using a coarser forecast granularity. It was also found that one year of historical data is enough to develop a load forecast model for residential customers, as a further increase in the training dataset brings only a marginal benefit.

However, the energy consumption prediction field is not limited to electricity. Forecasting DHW (Domestic Hot Water) consumption has also proved to be of interest, as it has the potential to reduce the energy consumption of hot water systems. In this regard, one study proposed a recurrent neural network trained with the measured DHW consumption of a 40-unit residential building in Quebec City (Canada) to predict future consumption [18]. It was found that the water consumption profile of the building changed from day to day throughout the year and that it had an important noise component. The predictive model developed in that work was obtained by pairing a recurrent neural network, which predicts the filtered domestic hot water demand, with a random forest, which predicts the noise signal.

All this evidence reinforces the need to develop a service that accurately forecasts energy consumption at the RESPOND pilot sites, with a view to optimising it.

4.2 DATA AVAILABILITY

Prior to the development of RESPOND's Energy Demand Forecasting service, the available energy data sources were analysed.

Time series data was collected on InfluxDB, and in order to ease its analysis, a Java-based application has

been developed using the influxdb-java client library. This application (which is known as

influxdbClient.jar) can be executed on any system with access to the database, and it allows the execution

of queries and saving of the results in different file formats.

Furthermore, the application is configurable: both the database connection and the query parameters can be set by the user. Regarding the connection settings, apart from the endpoint parameters, secure communication and authorized access can be configured. Regarding the query settings, the write and read timeouts and the maximum number of records expected to be retrieved from the database can be configured. The InfluxDB settings are configured as follows:

influxdb.ip=

influxdb.port=

influxdb.enablessl=

keystore.path=

keystore.passwd=

influxdb.database=

influxdb.user=

influxdb.password=

influxdb.connectTimeout=

influxdb.writeTimeout=

influxdb.readTimeout=

influxdb.maxRecords=

Finally, to export InfluxDB query results, the application can be executed with the InfluxDB query and the name of the file where the results will be saved as arguments. For example:

sudo java -jar influxdbClient.jar "InfluxDBquery" JSONFile

This application was leveraged to export the available data in JSON format. In order to evaluate the quality

of this data, the following indicators have been assessed:

• Completeness. It refers to the degree of presence of attributes in the data set, that is, the percentage of data available. Three metrics are calculated in relation to data loss. Completeness provides the number of data points that are lost. The other two indicators are the complements of the percentages of observations and variables lost, respectively.

• Time uniqueness. Repetitions in the values of the temporal variable received from the sensors are quantified. This metric gives the percentage of unique dates.

• Precision. It measures the degree to which the data are representative, that is, how correct the available data are for each variable. Given an upper and a lower limit, the percentage of values received outside these limits is calculated. This metric is called Range. Note that outliers are detected in this process. Furthermore, for the attributes of interest, the deviation from average values is measured and the data dispersion is quantified by means of three different metrics: Consistency, Typicity and Moderation.

• Timeliness. It checks the punctuality of the data by calculating the uniformity of the temporal variable. Usually, in time series, data is expected to be received at uniform time intervals. This indicator calculates the percentage of waiting times between observations that exceed the expected one.

• Format. It refers to the percentage of data received in a format different from the one expected for the information to be consistent.
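Some of the indicators listed above can be illustrated with a short Python sketch. It is not the R script used in the project: the function `quality_indicators` is hypothetical, every score is expressed as a quality percentage (100 means that no issue was found), and the exact definitions used by the authors may differ.

```python
from datetime import datetime, timedelta


def quality_indicators(timestamps, values, expected_step, lower, upper):
    """Return quality scores in percent.

    Assumes at least two distinct timestamps and one non-missing value.
    """
    n = len(timestamps)
    present = [v for v in values if v is not None]
    ordered = sorted(set(timestamps))
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    on_time = sum(1 for gap in gaps if gap <= expected_step)
    in_range = sum(1 for v in present if lower <= v <= upper)
    return {
        "completeness": 100.0 * len(present) / n,      # share of non-missing values
        "time_uniqueness": 100.0 * len(ordered) / n,   # share of unique timestamps
        "range": 100.0 * in_range / len(present),      # share of values within limits
        "timeliness": 100.0 * on_time / len(gaps),     # share of waits not exceeded
    }


start = datetime(2020, 3, 1)
minutes = [0, 15, 30, 30, 75]              # one repeated timestamp, one long wait
stamps = [start + timedelta(minutes=m) for m in minutes]
readings = [1.0, None, 2.0, 2.0, 5000.0]   # one missing value, one out-of-range value
scores = quality_indicators(stamps, readings, timedelta(minutes=15), 0.0, 3000.0)
```

In this toy series each defect lowers exactly one indicator, which makes it easy to see which symptom each indicator is meant to capture.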

The evaluation of these indicators was performed by means of an R script. The results show that the Completeness, Format and Time Uniqueness indicators reach 100% quality in all cases. Precision fluctuates between 95% and 100% due to small data variability. We do not consider these values to be alarming, since the Range indicator always reaches 100%. It is worth mentioning that missing data are not identified by the Completeness indicator: when a sensor fails for whatever reason, it stops sending data, including the time value. Such sensor failures instead appear as waiting times in the time variable, which is reflected in the low values of the Timeliness indicator.

The following tables summarize the data available for RESPOND's three pilot sites. Each table shows the initial and final dates of the registered measurements, as well as the percentage of lost data. Rows in red indicate houses that were considered to have insufficient data to make acceptable predictions. This limit was set at 30% of missing data, which is considered a significant share, as it matches the proportion of data normally held out for testing in Machine Learning model development. This analysis was last performed on 06/03/2020.

| House | % Missing Values | Initial Date | Final Date |
| --- | --- | --- | --- |
| Madrid_00 | 0.85% | 2019-07-06 | 2020-02-16 |
| Madrid_01 | 0.85% | 2019-07-06 | 2020-02-16 |
| Madrid_02 | 1.48% | 2019-07-06 | 2020-02-16 |
| Madrid_03 | 1.48% | 2019-01-01 | 2020-03-05 |
| Madrid_04 | 1.48% | 2019-01-01 | 2020-03-05 |
| Madrid_05 | 0.85% | 2019-07-06 | 2020-02-16 |
| Madrid_06 | 2.41% | 2019-06-20 | 2020-03-05 |
| Madrid_07 | 2.41% | 2019-06-20 | 2020-03-05 |
| Madrid_10 | 0.85% | 2019-07-06 | 2020-02-16 |
| Madrid_12 | 1.48% | 2019-01-01 | 2020-03-05 |
| Madrid_13 | 1.48% | 2019-01-01 | 2020-03-05 |

Table 5: Demand data availability for Madrid.


| House | % Missing Values | Initial Date | Final Date |
| --- | --- | --- | --- |
| Aran_01 | 6.30% | 2019-01-01 | 2019-07-20 |
| Aran_02 | 8.54% | 2020-02-06 | 2020-03-05 |
| Aran_03 | 1.93% | 2019-04-02 | 2020-03-05 |
| Aran_04 | 24.25% | 2019-05-02 | 2020-03-05 |
| Aran_05 | 2.08% | 2019-04-26 | 2020-03-05 |
| Aran_06 | 12.75% | 2019-09-04 | 2020-03-05 |
| Aran_08 | 6.30% | 2019-08-06 | 2020-03-05 |
| Aran_10 | 3.97% | 2019-09-04 | 2020-03-05 |
| Aran_12 | 40.48% | 2019-11-05 | 2020-03-05 |

Table 6: Demand data availability for the Aran Islands.

| House | % Missing Values | Initial Date | Final Date |
| --- | --- | --- | --- |
| Aarhus_01 | 49.28% | 2019-03-27 | 2019-03-05 |
| Aarhus_02 | 36.17% | 2019-04-03 | 2020-03-05 |
| Aarhus_03 | 16.33% | 2019-03-26 | 2020-03-05 |
| Aarhus_04 | 62.29% | 2019-03-27 | 2020-03-05 |
| Aarhus_05 | 13.50% | 2019-03-25 | 2020-03-05 |
| Aarhus_06 | 3.59% | 2019-03-26 | 2020-03-05 |
| Aarhus_07 | 22.15% | 2019-04-07 | 2020-03-05 |
| Aarhus_08 | 2.17% | 2019-03-26 | 2020-03-05 |
| Aarhus_09 | 4.74% | 2019-03-29 | 2020-03-05 |
| Aarhus_10 | 58.41% | 2019-03-26 | 2020-03-05 |
| Aarhus_11 | 2.12% | 2019-03-26 | 2020-03-05 |
| Aarhus_12 | 1.28% | 2019-03-28 | 2019-07-20 |
| Aarhus_13 | 21.96% | 2019-03-14 | 2020-03-05 |
| Aarhus_14 | 3.53% | 2019-03-26 | 2020-03-05 |
| Aarhus_15 | 32.72% | 2019-03-26 | 2020-03-05 |
| Aarhus_16 | 57.75% | 2019-03-28 | 2020-03-05 |
| Aarhus_17 | 21.88% | 2019-04-03 | 2020-03-05 |
| Aarhus_18 | 45.07% | 2019-03-27 | 2020-03-05 |
| Aarhus_19 | 99.40% | 2019-03-26 | 2020-02-13 |
| Aarhus_20 | 30.25% | 2019-04-03 | 2020-03-05 |

Table 7: Demand data availability for Aarhus.

Looking at the data availability results in Table 5, Table 6 and Table 7, it can be concluded that data availability differs between the pilot sites. The Spanish pilot site has the lowest data loss percentage among its participant houses, losing less than 3% of data in the worst case. For four of the participant houses, there is historical data for a period longer than one year. In the Aran Islands case, data loss is rather heterogeneous among the participants, being less than 2% in the best case and exceeding 40% in the worst case. It is worth mentioning that only two of the nine participant houses have more than 20% data loss. Finally, the Aarhus pilot site is the one most affected by data loss. These results can be attributed to the deployment problems explained in RESPOND's periodic reports. Only six of the twenty participant houses have a data loss below 5%, while another six houses have more than 40% of their data missing. In total, nine houses have lost at least one quarter of their measurements.

4.3 METHODOLOGY

Traditionally, energy demand forecasting has been addressed with data-driven algorithms because of their high performance. Therefore, RESPOND's Energy Demand Forecasting Service has targeted these algorithms, with a view to achieving the best possible performance.

Electric Energy Forecasting

First, we decided that the explanatory input variables of the predictive models would be derived from the time variable. On the one hand, this choice keeps the models simple and allows the results to be explained. On the other hand, continuity in time allows missing data to be imputed in case of sensor failure.

Before creating the models, we identified outliers, that is, values that far exceed the typical values of electrical consumption. After observing the behaviour of the consumption data for the different houses, we concluded that a common pattern would lack precision. Finally, we decided to remove values greater than 3000 kWh. These values are considered meaningless and were possibly caused by a failure in the data collection method.

In the process of finding the best predictive model, we started with Autoregressive Integrated Moving Average (ARIMA(p,d,q)) models. These models are fitted to time series data to predict future points, and are suited to data that show evidence of non-stationarity. A time series can be made stationary by differencing it d times. Once the series was stationary, we used the classic explanatory methods to choose the orders p and q, based on a comparison of the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). An autoregressive model of order p, AR(p), forecasts the variable of interest using a linear combination of p past values of the variable. AR(p) can be written as

$$y_t = c + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \dots + \phi_p y_{t-p} + \varepsilon_t$$

where $c$ and $\phi_i$, $i = 1, \dots, p$, are the regression coefficients, estimated by maximum likelihood estimation (MLE), $y_{t-i}$, $i = 1, \dots, p$, are the p lagged values of $y_t$ used as predictors, and $\varepsilon_t$ is white noise. A moving average model of order q, MA(q), specifies that the variable of interest depends on the lagged values of a stochastic term, according to the equation

$$y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q}$$

where $\mu$ is the mean of the series, $\theta_i$, $i = 1, \dots, q$, are the parameters estimated by MLE, and $\varepsilon_t, \dots, \varepsilon_{t-q}$ are the current and the q previous values of white noise.
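The differencing step and the AR(p) recursion can be illustrated with a small Python sketch. The project fitted its models in R; the functions below are hypothetical, use hand-picked coefficients instead of MLE estimates, and only show how a fitted AR(p) model produces forecasts (the noise term is replaced by its expected value, zero).

```python
def difference(series, d=1):
    """Difference a series d times (the 'I' part of ARIMA)."""
    result = list(series)
    for _ in range(d):
        result = [later - earlier for earlier, later in zip(result, result[1:])]
    return result


def ar_one_step(history, c, phi):
    """One-step AR(p) forecast: c + phi_1 * y[t-1] + ... + phi_p * y[t-p]."""
    p = len(phi)
    if len(history) < p:
        raise ValueError("history must contain at least p values")
    return c + sum(phi[i] * history[-1 - i] for i in range(p))


def ar_forecast(history, c, phi, steps):
    """Multi-step forecast obtained by feeding each prediction back as input."""
    extended = list(history)
    forecasts = []
    for _ in range(steps):
        prediction = ar_one_step(extended, c, phi)
        forecasts.append(prediction)
        extended.append(prediction)
    return forecasts


first_diff = difference([1, 4, 9, 16])
two_steps = ar_forecast([1.0, 2.0, 3.0], c=0.5, phi=[0.5, 0.25], steps=2)
```

Beyond the first step, the forecast is built on earlier predictions rather than on measurements, so errors can accumulate as the horizon grows.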

Due to the large amount of data, the search for the optimal p and q was neither simple nor satisfactory. The models were implemented using the statistical software R, which has multiple functions for the treatment of time series. Specifically, we used a method that finds the best Seasonal Autoregressive Integrated Moving Average (SARIMA) model. The idea is that SARIMA models are ARIMA(p, d, q) models whose residuals are ARIMA(P, D, Q). Table 8 compares the poor results obtained with both models, using data from the second house in Madrid.

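| Model | p | d | q | P | D | Q | Seasonal | RMSE |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| ARIMA | 2 | 1 | 4 | - | - | - | - | 1540.07 |
| SARIMA | 3 | 1 | 4 | 0 | 0 | 2 | 24 | 1294.81 |

Table 8: Comparison between ARIMA and SARIMA models.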

The problem is the same in both cases: the further ahead the forecast, the less accurate the estimate. Figure 13 shows the prediction for a period of 10 days in March for the second house in Madrid. The graph corresponds to the SARIMA model prediction. Figure 14 shows the estimated values for 24 hours, where it can be observed that the estimate is not accurate.

Figure 13: ARIMA's Electric Consumption prediction of a period of 10 days in March for Madrid_02.


Figure 14: ARIMA's Electric Consumption prediction of a period of 24 hours for Madrid_02

Then, we used Machine Learning techniques to develop the energy demand forecasting models. One of the most important steps was deciding which input variables to use. It is worth mentioning that the home-related data was captured over periods that never exceeded one year. For that reason, information about minutes, hours, days and months is included in the input variables, but the year is not.

Having decided that the training variables would be the ones mentioned above, we took into account that this type of data is inherently cyclical. We used a sinusoidal transformation into 2 dimensions to include these variables in the model. This way, two new features are created from each variable, a sine transformation and a cosine transformation, taking the variable's periodicity into account. This method is not applied to the day variable because each month has a different number of days. We consider that finding a method to generalize the periodicity of this variable would complicate the problem, and we do not believe that it would provide relevant information.

For example, the transformation applied to the minute variable is shown below.

\[
f_{min} : [0, 60) \to [-1, 1] \times [-1, 1], \qquad
x \mapsto \left( \cos\left(\frac{2\pi x}{60}\right), \sin\left(\frac{2\pi x}{60}\right) \right)
\]

In the domain of the minute variable there are sixty natural values, 0 to 59. The Euclidean distance between two consecutive elements of this set is always 1:

\[
d(x, x + 1) = 1, \quad \forall x \in \{0, 1, \dots, 58\}
\]

Since minute 0 follows minute 59, the distance between these two points should be 1 as well. This is not the case for the untransformed variable:

\[
d(59, 0) = 59
\]


With the sine and cosine transformation, a homogeneous Euclidean distance is obtained between all consecutive points in the two-dimensional space, including between 59 and 0.
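The property described above can be checked numerically. The following is a minimal Python sketch (the deliverable's models were built in R; the function names `encode_cyclical` and `euclidean` are illustrative and not part of the RESPOND code base). It maps each minute onto the unit circle and measures the distance between every pair of consecutive minutes, including the wrap from 59 to 0.

```python
import math


def encode_cyclical(value, period):
    """Map a periodic value onto the unit circle as (cos, sin)."""
    angle = 2.0 * math.pi * value / period
    return (math.cos(angle), math.sin(angle))


def euclidean(a, b):
    """Euclidean distance between two points in the plane."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


# One point per minute value 0..59.
minute_points = [encode_cyclical(m, 60) for m in range(60)]

# Distance between each minute and the next one; (m + 1) % 60 makes
# the last entry the distance between minute 59 and minute 0.
consecutive = [
    euclidean(minute_points[m], minute_points[(m + 1) % 60])
    for m in range(60)
]

print(min(consecutive), max(consecutive))
```

All sixty distances agree up to floating-point error and equal the chord length 2 sin(π/60), roughly 0.105, so the jump from 59 to 0 is no longer treated differently from any other step.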

Additionally, other categorical variables are created from the time variable. We included information about the season of the year and the day of the week. We observed that electricity consumption behaviour differs between working days and holidays. For this reason, we created a new dichotomous variable that indicates the holidays in each place.
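As an illustration of how such calendar variables could be derived, the sketch below builds the season, weekday and working-day values from a date. Several points are assumptions and not taken from the deliverable: the meteorological definition of the seasons for the northern hemisphere, the treatment of weekends as non-working days, and the `HOLIDAYS` set, which is a hypothetical placeholder for the local calendar of each pilot site.

```python
from datetime import date

# Hypothetical holiday calendar; real dates would come from the local
# calendar of each pilot site.
HOLIDAYS = {date(2019, 12, 25), date(2020, 1, 1)}

# Assumption: meteorological seasons, northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def calendar_features(day, holidays=HOLIDAYS):
    """Return the categorical calendar variables for one date."""
    weekday = day.weekday()  # Monday = 0 ... Sunday = 6
    # Assumption: weekends and listed holidays are non-working days.
    working = weekday < 5 and day not in holidays
    return {
        "season": SEASON_BY_MONTH[day.month],
        "weekday": weekday,
        "workingday": int(working),
    }


print(calendar_features(date(2020, 1, 1)))
print(calendar_features(date(2020, 3, 12)))
```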

The first supervised machine learning algorithm that we tested was Linear Regression. A linear relationship between the independent variables and the dependent variable was found. Figure 15 shows the estimated values of electric consumption using the best linear model.

[Figure: original minute variable (linear) and cyclical transformation of the minute variable (axes: Cosinus, Sinus).]

day sin(month) cos(month) sin(hour) cos(hour) Season Weekday Workingday RMSE

x x x x x x x x 540.22

x x x x x x x 541.48

x x x x x 558.07


Figure 15: Linear Regression's Electric Consumption prediction for Madrid_02.

Although the RMSE was significantly lower than that obtained with ARIMA, the coefficient of determination (R²) was less than 0.3 in all fitted models.

Another supervised learning algorithm that we tested was Support Vector Regression (SVR). SVR uses the same principles as the Support Vector Machine (SVM) but is applied to regression, so it can work with continuous values. We leveraged the caret library available in R, which contains functions to train machine learning models, and used cross-validation as the resampling method. This procedure chooses the best combination of hyperparameters for the model being trained.
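The selection logic behind cross-validated hyperparameter tuning can be sketched independently of caret. The Python code below is not the caret implementation and does not train an SVR: `fit_shrunken_mean` is a deliberately trivial stand-in model (a mean predictor with a penalty `lam`), and the interleaved fold assignment, the grid and the data are assumptions chosen only to keep the example small.

```python
import math


def rmse(actual, predicted):
    """Root mean squared error."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def k_fold_indices(n_samples, n_folds):
    """Yield (train, validation) index lists; fold f holds i % n_folds == f."""
    for fold in range(n_folds):
        val = [i for i in range(n_samples) if i % n_folds == fold]
        train = [i for i in range(n_samples) if i % n_folds != fold]
        yield train, val


def cross_validated_rmse(fit, X, y, params, n_folds=5):
    """Mean validation RMSE of one hyperparameter combination."""
    scores = []
    for train, val in k_fold_indices(len(X), n_folds):
        model = fit([X[i] for i in train], [y[i] for i in train], **params)
        predictions = [model(X[i]) for i in val]
        scores.append(rmse([y[i] for i in val], predictions))
    return sum(scores) / len(scores)


def grid_search(fit, X, y, grid, n_folds=5):
    """Return the combination with the lowest mean validation RMSE."""
    return min(
        grid, key=lambda params: cross_validated_rmse(fit, X, y, params, n_folds)
    )


def fit_shrunken_mean(X, y, lam):
    """Stand-in model: intercept-only ridge fit, sum(y) / (n + lam)."""
    level = sum(y) / (len(y) + lam)
    return lambda x: level


# Made-up data, not RESPOND measurements.
X = list(range(10))
y = [10, 12, 11, 9, 10, 12, 11, 9, 10, 11]
grid = [{"lam": 0.0}, {"lam": 1.0}, {"lam": 10.0}]
best = grid_search(fit_shrunken_mean, X, y, grid)
print(best)
```

Any model exposing the same `fit(X, y, **params)` interface could be passed to `grid_search`; for time-ordered data a fold scheme that respects chronology may be preferable to the interleaved folds used here.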

In some cases SVR predicted negative electricity consumption, which makes no sense: electric consumption can never be less than zero kWh. Although the RMSE obtained was lower than with the previous method, SVR was rejected because this problem could not be controlled. An example of this situation is shown in Figure 16.

day sin(month) cos(month) sin(hour) cos(hour) Season Weekday Workingday RMSE

x x x x x x x x 571.92

x x x x x x x 471.92


Figure 16: Electricity Consumption prediction obtained with an SVR model.

Finally, we used the K-nearest neighbors algorithm (KNN). KNN is a supervised machine learning algorithm that can be used to solve regression problems. In this case, we also used the caret library, and the value chosen for the parameter k was 5 for all houses.
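To make the interaction between KNN regression and the cyclical features concrete, the following Python sketch predicts the consumption at hour 0 from its 5 nearest neighbours, once on (cos, sin) hour features and once on the raw hour. It is not the caret model used in RESPOND; the function names and the synthetic consumption profile (a late-night peak spanning midnight) are made up for illustration.

```python
import math


def hour_features(hour):
    """Encode the hour of day on the unit circle as (cos, sin)."""
    angle = 2.0 * math.pi * hour / 24.0
    return (math.cos(angle), math.sin(angle))


def knn_predict(train_X, train_y, query, k=5):
    """Average the targets of the k training points closest to `query`."""
    ranked = sorted(
        range(len(train_X)), key=lambda i: math.dist(train_X[i], query)
    )
    nearest = ranked[:k]
    return sum(train_y[i] for i in nearest) / len(nearest)


# Synthetic profile (not RESPOND data): high consumption from 22:00 to 01:59.
hours = list(range(24))
consumption = [100.0 if (h >= 22 or h <= 1) else 20.0 for h in hours]

# Neighbours of hour 0 on the circle are 22, 23, 0, 1 and 2.
cyclical_X = [hour_features(h) for h in hours]
cyclical_prediction = knn_predict(cyclical_X, consumption, hour_features(0))

# Neighbours of hour 0 on the raw scale are 0, 1, 2, 3 and 4.
raw_X = [(float(h),) for h in hours]
raw_prediction = knn_predict(raw_X, consumption, (0.0,))

print(cyclical_prediction, raw_prediction)
```

With the cyclical features, hours 22 and 23 count as neighbours of hour 0, so the late-night peak is recognised; with the raw hour they are the farthest points and the prediction is pulled towards the early-morning values.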

The performance of the forecasters varies depending on data availability. Therefore, predictive models are periodically retrained, as their performance is expected to improve as more historical data becomes available.

The predictive models developed in October 2019 can be classified into three categories. First, predictive models with good accuracy, which are able to predict daily schedules and routines, with a MAE below 90 kW (see Figure 17). Second, less accurate predictive models, with errors of different magnitudes and MAEs over 125 kW. Finally, predictive models with poor accuracy (see Figure 18), with MAEs over 250 kW. The latter two types of models do not adjust well to the dweller's routine at certain time intervals, as there is not enough data to learn from. More historical data is required to train and adjust the models, which is foreseen to be achieved during the last six months of T4.4.
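For reference, the error measures used in this section (MAE and RMSE) and the residuals can be computed as in the short Python sketch below. The sample values are made up and are not RESPOND measurements.

```python
import math


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean squared error; penalises large errors more than MAE."""
    return math.sqrt(
        sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)
    )


def residuals(actual, predicted):
    """Observed minus predicted values."""
    return [a - p for a, p in zip(actual, predicted)]


# Made-up example values.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [110.0, 190.0, 330.0, 400.0]

print(mae(actual, predicted))
print(rmse(actual, predicted))
print(residuals(actual, predicted))
```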

Figure 17 shows a predictive model with good accuracy for predicting the electric consumption of a

dwelling in Madrid. It can be seen that red dots (predicted consumptions) are rather close to blue dots

(real consumption). As for Figure 18, it compares the predictions made by a predictive model with a worse

performance with the real consumption of another house in Madrid. It can be concluded that these

predictions are not as accurate as the ones shown in Figure 17.


Figure 17: Real vs forecasted energy consumption of a predictive model with a good performance

Figure 18: Real vs forecasted energy consumption of a predictive model with a bad performance

Figure 19 and Figure 20 show the residuals of a predictive model with enough data, and a predictive model

developed with scarce data respectively. It can be seen that the quality of the former model is better than

the latter, as most differences between observed and predicted values are closer to 0.

Figure 19: Residuals of a predictive model with a good performance.


Figure 20: Residuals of a predictive model with a bad performance.

The neighbourhood consumption is also forecasted, leveraging the predictive models already developed. As can be seen in Figure 21, the developed predictive models have good accuracy, which is expected to improve as more data becomes available.

Figure 21: Forecasted vs actual electric consumption in Madrid (neighbourhood level)

When more historical data became available, the predictive models were retrained. Figure 22 and Figure 23 show the electricity consumption estimation for a 10-day period in March by predictive models trained in March 2020. The estimation is more accurate than the predictions made in October.


Figure 22: Electric Consumption prediction of a period of 10 days in March for Madrid_02.

Figure 23: Electric Consumption prediction of a period of 10 days in March for Aarhus_11.

DHW Forecasting

An initial study of the performance of the DHW consumption estimations was made using the kNN method. DHW consumption is measured in m³, and the water meter does not provide decimals. Since this unit is too large for the hourly consumption of an apartment building, the values of the variable to be predicted are the natural numbers 0, 1 and 2. Therefore, a classification algorithm was considered more adequate. The consumption variable is introduced as a factor variable that takes three values. Figure 24 shows the classification returned by the algorithm.


Figure 24: Initial predictions obtained for Madrid Neighbourhood DHW consumption.

As can be seen, all values are classified as 1. To avoid this, the values were treated as numerical rather than as factors. After obtaining the prediction, a manual rounding of the decimal values was applied: values less than 0.9 are assigned 0, values between 0.9 and 1.1 are assigned 1 and values greater than 1.1 are assigned 2. Finally, the values are converted to a factor. Figure 25 shows the results obtained with a predictive model trained with data until November 2019.
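The manual rounding rule described above can be written as a small function. This is a Python sketch of the rule as stated in the text; the function name `to_dhw_class` is illustrative, and treating the boundaries 0.9 and 1.1 as belonging to class 1 is an assumption, since the text does not say how exact boundary values are handled.

```python
def to_dhw_class(prediction):
    """Map a numerical DHW prediction (m³) to the class 0, 1 or 2."""
    if prediction < 0.9:
        return 0
    if prediction <= 1.1:  # assumption: both boundaries fall in class 1
        return 1
    return 2


# Made-up regression outputs.
raw_predictions = [0.2, 0.95, 1.1, 1.4]
classes = [to_dhw_class(value) for value in raw_predictions]
print(classes)
```

The band for class 1 is much narrower than ordinary rounding to the nearest integer would give (0.5 to 1.5), which pushes more predictions into classes 0 and 2.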

Figure 25: Predictions obtained for Madrid Neighbourhood DHW consumption with predictive model trained with data up to November 2019.

Figure 26 corresponds to the results obtained with a predictive model developed with data until March 2020, and the confusion matrix is shown in Table 9.


Figure 26: Predictions obtained for Madrid Neighbourhood DHW consumption with predictive model trained with data up to March 2020.

                 Reference
Prediction     0     1     2
0              4    18     4
1             12    72    39
2              1    19    16

Table 9: DHW Consumption Prediction Confusion Matrix.

The values on the diagonal of the table are those that were correctly classified. Accuracy is a measure of classification quality obtained by dividing the number of correctly classified values by the total number of predicted values. In this case, Accuracy = 92/185 = 0.4973, that is, 49.73 % of the estimated values match the real values. This result is low and shows the poor performance of the predictive model. However, we note that incorrectly classified values are not generally estimated at the opposite extreme: only 1 value that should be 0 is classified as 2, and only 4 values that should be 2 are classified as 0. Other methods will be tried to improve this result.
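The figures quoted above can be reproduced directly from Table 9. The Python sketch below uses the table's values, with rows as predictions and columns as reference classes; the variable names are illustrative.

```python
# Rows: predicted class 0, 1, 2. Columns: reference class 0, 1, 2.
confusion = [
    [4, 18, 4],
    [12, 72, 39],
    [1, 19, 16],
]

# Correct classifications lie on the diagonal.
correct = sum(confusion[i][i] for i in range(len(confusion)))
total = sum(sum(row) for row in confusion)
accuracy = correct / total

# Errors at the opposite extreme: predicted 2 with reference 0,
# and predicted 0 with reference 2.
predicted_2_reference_0 = confusion[2][0]
predicted_0_reference_2 = confusion[0][2]

print(correct, total, round(accuracy, 4))
print(predicted_2_reference_0, predicted_0_reference_2)
```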

4.4 SERVICE DEPLOYMENT

A data-driven predictive model has been developed for forecasting the electric consumption of each dwelling, as well as the DHW consumption at neighbourhood level in the Madrid pilot site. These predictive models were developed in R and exported as *.rds files.

The execution of the predictive models to produce the predictions for the upcoming 24 hours was automated using periodic tasks executed by a crontab daemon. These tasks remotely execute the models deployed on an R server. The tasks accept several parameters that indicate which model is to be executed, the period of time to be forecasted and other input parameters needed by the model to generate an output. This mechanism allows multiple instances of the same model to be executed with different input


parameters, or different models with the same parameters. This generic execution module is deployed within a web service on a Tomcat server. The web service interface (REST API) allows the execution of the models to be managed remotely and dynamically by adding, modifying or deleting tasks and the corresponding predictions.

The tasks were scheduled to be executed daily. When executed, they connect to the RServe hosting the models, perform a prediction and retrieve the result, which is stored in a MySQL database for later use by the optimization service or by the different visualization tools, such as the mobile app or the desktop dashboard. As shown in Figure 27, two of the components (RServe and Tomcat) were deployed using Docker containers.
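The flow described above (parameterised tasks that can be added, modified or deleted, and that run a model and store its output) can be sketched as follows. This is only an in-memory Python illustration and not the actual Task Service, which is a Java web service: every class, field and file name below is hypothetical, `predict` stands in for the remote call to the model server and `store` for the database insert.

```python
from dataclasses import dataclass, field, replace


@dataclass
class ForecastTask:
    """Hypothetical task record; field names are illustrative only."""
    task_id: str
    model_file: str          # which exported model to execute
    horizon_hours: int = 24  # period of time to forecast
    inputs: dict = field(default_factory=dict)  # other model inputs


class TaskRegistry:
    """In-memory stand-in for the add / modify / delete operations."""

    def __init__(self):
        self._tasks = {}

    def add(self, task):
        self._tasks[task.task_id] = task

    def modify(self, task_id, **changes):
        self._tasks[task_id] = replace(self._tasks[task_id], **changes)

    def delete(self, task_id):
        del self._tasks[task_id]

    def run_all(self, predict, store):
        """Execute every task and hand each result to `store`."""
        for task in self._tasks.values():
            store(task.task_id, predict(task))


registry = TaskRegistry()
registry.add(ForecastTask("madrid_02_electricity", "madrid_02.rds"))
registry.modify("madrid_02_electricity", horizon_hours=48)

stored = []
registry.run_all(
    predict=lambda task: [0.0] * task.horizon_hours,  # dummy forecast
    store=lambda task_id, values: stored.append((task_id, len(values))),
)
print(stored)
```

Keeping the model identifier and its inputs in the task record, rather than in the execution code, is what lets the same generic module run several models or several parameterisations of one model.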

Figure 27: Deployment of the Energy Demand Forecasting Services.

The Energy Demand Forecasting service is not closed, which means that new predictions can be added and existing ones modified. If a new prediction is needed, data analysts must develop a predictive model and generate the corresponding *.rds file. This file is then copied into the RServe. A new task must then be added to the task service, configuring the schedule of the task and its input parameters. A typical problem of forecasting services is the need to adjust the behaviour of the model to the latest trends. This adjustment implies retraining the predictive model with the latest historical data available. With this mechanism, the retraining can be done offline by the analyst, and it is only necessary to replace the old R model (.rds file) with the new one.

[Figure 27 components: RServe with the .rds models (Docker), Apache Tomcat with the Task Service (Java) (Docker), and MySQL.]


5. CONCLUSIONS

In 2016, the residential sector represented 25.4% of final energy consumption and 17.4% of gross inland energy consumption in the EU. Therefore, the potential of Demand Side Management activities and DR programmes is particularly promising for this sector. Being able to accurately predict the amount of energy to be produced over a period of time, and knowing in advance when demand peaks will occur, can contribute to better management of the disparity between production and demand, thus allowing the most suitable DR programmes to be suggested to end-users. Furthermore, to improve the stability and quality of the grid systems, consumption and production must be planned ahead. As part of Task 4.4, RESPOND aims at developing services that allow the estimation of produced and consumed energy in dwellings and neighbourhoods within the pilot sites.

Numerous approaches for energy forecasting have recently been proposed in the literature, and although most of them focus on data-driven models due to their high performance, there are also physics-based models. Furthermore, the KDD process leading to the extraction of useful knowledge from raw data is at the core of these predictive models. This deliverable describes the development of predictive models for energy forecasting services.

The Energy Production Forecasting service focuses on the development of models to estimate renewable energy production. Namely, it focuses on the RES generation systems available at the RESPOND pilot sites, that is, on PV panels (in Aarhus and the Aran Islands) and Solar Thermal Collectors (in Madrid). Various Machine Learning approaches were considered and tested using Python for the Aarhus and Madrid pilot sites. In all cases, optimal hyperparameters were chosen using grid search and the MAE was used as the indicator of performance. For forecasting energy coming from PV panels, Random Forest models performed best, whilst for the STC, Neural Networks were chosen. For the Aran Islands, a physical model was employed, as it requires only parameters that can commonly be found in PV cell data sheets.

Likewise, the Energy Demand Forecasting service has been found necessary to accurately forecast short-term electricity demand. Furthermore, this service also covers the estimation of DHW consumption in the Spanish neighbourhood. However, the DHW consumption case is rather complicated due to the aforementioned difficulties and is thus yet to be adequately solved. As with the Energy Production Forecasting models, different ML algorithms have been tested, and the best results were obtained with kNN algorithms, after generating the necessary input variables derived from raw data (e.g. the sine and cosine of the hour data).

All the developed models have been deployed following a well-defined plan with a view to moving from a testing environment to a production environment. They are all designed to retrieve the necessary inputs, execute their estimations and store the results in an automated way.


ANNEX 1

This Annex shows the benchmarking results for the Madrid pilot site’s production forecasting model based

on Machine Learning.

MSE MAE MSE MAE MSE MAE MSE MAE MSE MAE

n = 1 train 0.015604 0.077307 0.011866 0.071406 0.019039 0.094445 0.019159 0.092862 0.018788 0.085482

alpha = 0.0001

val 0.017681 0.084605 0.027365 0.087863 0.020549 0.096541 0.01852 0.094201 0.014967 0.083557

test 0.014385 0.077405 11.60716 0.396226 0.028892 0.103801 0.012492 0.077787 6.456884 0.329657

n = 1 train 0.016153 0.081352 0.015488 0.082038 0.018103 0.085819 0.016712 0.079699 0.01968 0.09203

alpha = 0.001

val 0.058684 0.106863 0.015584 0.083892 0.019242 0.092149 9.92237 0.397753 0.019073 0.096172

test 0.919666 0.177747 0.771831 0.166682 0.052011 0.095048 0.032856 0.080721 0.059233 0.114241

n = 1 train 0.018123 0.090033 0.021513 0.098689 0.019124 0.095226 0.0242 0.100317 0.016998 0.084057

alpha = 0.01 val 0.035563 0.099466 0.010709 0.074461 0.020864 0.101133 0.017806 0.09763 0.057648 0.123404

test 0.437981 0.149553 0.017974 0.095148 0.015575 0.08541 0.014523 0.086034 0.019351 0.088121

n = 1 train 0.01692 0.084723 0.019069 0.085512 0.019851 0.092154 0.018472 0.092493 0.018647 0.088056

alpha = 0.1 val 0.022594 0.101805 0.016687 0.084805 0.022353 0.091452 0.077364 0.107857 0.024052 0.101085

test 0.025846 0.102788 0.020192 0.096981 0.119338 0.125656 0.036669 0.109487 0.023387 0.10182

n = 1 train 0.023403 0.094512 0.020927 0.088634 0.022221 0.093445 0.02121 0.087403 0.017827 0.079256

alpha = 1.0 val 0.021463 0.089171 0.018326 0.083212 0.01356 0.072988 0.01834 0.080193 0.020109 0.086545

test 0.020912 0.08773 0.0224 0.090054 0.021636 0.088713 0.019025 0.084933 0.023306 0.097146

n = 1 train 0.03075 0.120032 0.030608 0.115127 0.034604 0.124481 0.028322 0.115598 0.028443 0.111767

alpha = 10.0 val 0.028408 0.114851 0.121721 0.136753 0.028582 0.117125 0.029281 0.121573 0.031952 0.113387

test 0.029413 0.118319 0.020277 0.089448 0.047586 0.141532 0.034518 0.120342 0.030724 0.118158

n = 1 train 0.057917 0.176015 0.055876 0.178485 0.057861 0.177748 0.062579 0.184571 0.05601 0.175951

alpha = 100.0

val 0.048689 0.164342 0.04847 0.155859 0.040056 0.15767 0.042915 0.161319 0.062508 0.1801

test 0.060069 0.179784 0.056776 0.175048 0.066791 0.192928 0.075922 0.198024 0.046311 0.166136

n = 1 train 0.057971 0.177584 0.063981 0.183494 0.054805 0.169174 0.068137 0.195043 0.082505 0.221369

alpha = 1000.0

val 0.071432 0.199682 0.071316 0.198563 0.079843 0.193329 0.040529 0.166422 0.057975 0.194866

test 0.073934 0.199132 0.058619 0.176699 0.072944 0.189047 0.085304 0.215487 0.067726 0.193628

n = 2 train 0.014728 0.077051 0.014477 0.075603 0.011307 0.064982 0.014875 0.075016 0.016286 0.077226

alpha = 0.0001

val 0.038758 0.094639 0.193498 0.120891 0.020567 0.083094 1.519067 0.208796 0.035789 0.094547

test 0.033382 0.093365 0.012988 0.073036 0.057436 0.103396 0.734483 0.16895 0.029053 0.099904

n = 2 train 0.015758 0.076718 0.012026 0.06467 0.015877 0.075033 0.014561 0.074509 0.017477 0.08086

alpha = 0.001

val 0.015237 0.071581 0.908221 0.168812 0.015757 0.081493 0.028879 0.085538 0.025711 0.105326

test 0.081943 0.107886 6477.874 10.10355 0.049024 0.09881 0.078543 0.117813 0.032679 0.10372

n = 2 train 0.015769 0.073446 0.015372 0.069704 0.015113 0.072818 0.015146 0.073802 0.019149 0.084355

alpha = 0.01 val 0.018303 0.075674 0.030786 0.099222 0.031997 0.110947 0.013514 0.069619 0.039039 0.091434

test 0.016075 0.077992 0.011977 0.067987 0.031151 0.09289 0.029368 0.094891 0.018144 0.083738

n = 2 train 0.020137 0.086962 0.020114 0.088364 0.013428 0.070797 0.017229 0.08161 0.021972 0.095486

alpha = 0.1 val 0.011702 0.067005 0.020395 0.093463 12.36888 0.418027 0.068525 0.091061 0.026103 0.101155

test 0.018059 0.087829 0.019202 0.094073 17.99166 0.600563 0.062305 0.098877 0.014353 0.074587


n = 2 train 0.023241 0.092227 0.023196 0.089923 0.015858 0.077653 0.015882 0.072632 0.020534 0.086139

alpha = 1.0 val 0.023714 0.091385 0.023812 0.089338 0.034385 0.110552 142.4085 1.580592 0.028453 0.10147

test 0.026377 0.103599 0.015324 0.079695 0.016777 0.078964 69.04901 0.814191 0.018051 0.086549

n = 2 train 0.029684 0.1185 0.024663 0.104488 0.030168 0.120471 0.029042 0.116263 0.027432 0.111235

alpha = 10.0 val 0.028717 0.118212 0.052246 0.104358 0.035308 0.116038 0.029117 0.117065 0.026951 0.117635

test 0.034638 0.12703 0.049093 0.117155 0.041769 0.136878 0.024037 0.110271 0.02503 0.106055

n = 2 train 0.057553 0.17866 0.046664 0.161381 0.051134 0.166191 0.055601 0.173524 0.06129 0.181362

alpha = 100.0

val 0.053899 0.175967 255.692 2.467469 0.063941 0.18647 0.048454 0.167391 0.05335 0.173927

test 0.050465 0.167912 0.045308 0.158149 0.042331 0.15798 0.059416 0.176707 0.065907 0.189585

n = 2 train 0.084683 0.220517 0.059773 0.177133 0.075334 0.209385 0.069448 0.197707 0.063886 0.186567

alpha = 1000.0

val 0.062058 0.199555 0.057405 0.179813 0.084677 0.208472 0.056278 0.185964 0.066825 0.18722

test 0.059189 0.195877 0.083678 0.208357 0.071181 0.196271 0.067512 0.196884 0.065541 0.192095

n = 3 train 0.012853 0.068769 0.010312 0.057147 0.011731 0.064107 0.012884 0.067972 0.010676 0.059813

alpha = 0.0001

val 0.05114 0.092453 4843857 275.1617 0.069796 0.093791 0.127365 0.112451 0.022697 0.084824

test 0.030245 0.084275 2175092 129.9162 0.010753 0.057572 0.562719 0.125954 0.032949 0.082636

n = 3 train 0.011149 0.060065 0.014238 0.070737 0.0096 0.055558 0.012528 0.065564 0.013828 0.06857

alpha = 0.001

val 0.039424 0.087392 0.020503 0.077204 42834683 580.7974 0.082825 0.095195 0.024618 0.081059

test 211.176 1.878393 0.045695 0.085411 42274300 572.706 0.086539 0.130524 0.026874 0.09089

n = 3 train 0.01226 0.064443 0.018656 0.081283 0.014806 0.069765 0.015931 0.071716 0.014408 0.072143

alpha = 0.01 val 0.023923 0.09305 0.017465 0.080368 0.025098 0.08863 0.024934 0.09781 0.039203 0.104552

test 0.025839 0.095503 0.031276 0.090754 0.024552 0.09774 0.031219 0.100813 0.013913 0.066506

n = 3 train 0.017571 0.07911 0.013956 0.071843 0.015059 0.075635 0.016442 0.079731 0.018199 0.083721

alpha = 0.1 val 0.029211 0.093638 0.02319 0.091684 0.024962 0.091407 0.022969 0.083869 0.017249 0.07951

test 0.015547 0.082261 0.023402 0.093102 0.017263 0.078878 0.014827 0.073931 0.016571 0.08492

n = 3 train 0.019528 0.085383 0.019657 0.084146 0.021212 0.088956 0.014855 0.073546 0.015753 0.075744

alpha = 1.0 val 0.018259 0.080456 0.017425 0.085082 0.012327 0.074108 0.218056 0.133403 0.025574 0.090497

test 0.020657 0.090132 0.020805 0.094042 0.022401 0.093175 1723.608 5.257491 0.026009 0.094757

n = 3 train 0.025972 0.105891 0.028767 0.118024 0.030056 0.119828 0.031099 0.118052 0.030738 0.116421

alpha = 10.0 val 0.026279 0.10694 0.025127 0.114503 0.029959 0.115594 0.038454 0.135325 0.028378 0.111585

test 0.019804 0.08399 0.027563 0.109835 0.029769 0.117457 0.033353 0.117268 0.031266 0.120131

n = 3 train 0.058149 0.169497 0.048701 0.165837 0.059455 0.177352 0.053515 0.166356 0.054294 0.176048

alpha = 100.0

val 0.068989 0.188172 0.060493 0.176343 0.061297 0.17861 0.064678 0.171871 0.03691 0.149947

test 0.065867 0.186432 0.036586 0.146231 0.063105 0.185239 0.119099 0.244062 0.148751 0.194353

n = 3 train 0.063484 0.187622 0.066588 0.192699 0.071355 0.200093 0.072422 0.203657 0.073394 0.204734

alpha = 1000.0

val 0.056176 0.179571 0.075822 0.202822 0.035294 0.157687 0.085019 0.207361 0.057523 0.180654

test 0.0784 0.19899 0.049647 0.170553 0.082029 0.216929 0.077855 0.20473 0.127743 0.241321

n = 4 train 0.011418 0.063862 0.011446 0.06444 0.009028 0.051963 0.010095 0.058059 0.010984 0.060484

alpha = 0.0001

val 0.110646 0.098555 0.083231 0.084176 0.051947 0.08665 0.087437 0.116062 4.851071 0.294152

test 0.089089 0.092912 0.451649 0.151383 0.391805 0.152709 0.038439 0.107783 3.318369 0.312264

n = 4 train 0.013388 0.064549 0.014268 0.066371 0.012394 0.061218 0.012925 0.062738 0.008141 0.051575

alpha = 0.001

val 0.078296 0.081943 0.441684 0.1359 1.155316 0.205619 0.035628 0.083228 6460798 225.1489

test 0.032086 0.088383 0.022798 0.078513 0.014529 0.065617 0.020784 0.071913 6477845 224.197

n = 4 train 0.016479 0.074809 0.009472 0.05905 0.015198 0.068253 0.013806 0.069892 0.012788 0.062047

alpha = 0.01 val 0.037774 0.110671 233709.1 47.83419 0.023806 0.087169 0.024816 0.098211 0.02774 0.096203


test 0.01713 0.07528 2.13E+11 58424.65 0.019014 0.086193 0.024673 0.081174 0.04649 0.085126

n = 4 train 0.022691 0.092498 0.0124 0.070094 0.019127 0.0898 0.021711 0.092949 0.016867 0.076973

alpha = 0.1 val 0.012225 0.075387 1.6E+08 1119.619 0.0226 0.090141 0.019172 0.088591 0.019793 0.086111

test 0.020044 0.093031 3.21E+08 2231.977 0.019536 0.083035 0.023125 0.0945 0.02768 0.096921

n = 4 train 0.019737 0.087791 0.017436 0.081283 0.017681 0.080424 0.019344 0.083217 0.022222 0.090614

alpha = 1.0 val 0.017524 0.072503 0.016692 0.076532 0.024788 0.089143 0.016842 0.076549 0.020412 0.09425

test 0.020547 0.08333 1271.582 5.418682 0.023468 0.094498 0.034331 0.108018 0.022003 0.086014

n = 4 train 0.024154 0.102926 0.0303 0.120188 0.029631 0.117045 0.028941 0.115619 0.028434 0.113709

alpha = 10.0 val 7.515737 0.475666 0.0236 0.108158 0.026633 0.115686 0.023279 0.108729 0.025996 0.104159

test 0.028043 0.107559 0.021042 0.097219 0.018943 0.09676 0.027274 0.114011 0.025429 0.110044

n = 4 train 0.06367 0.187317 0.049277 0.159735 0.054006 0.169338 0.051691 0.164068 0.061475 0.186641

alpha = 100.0

val 0.086587 0.20719 0.049756 0.164125 0.04733 0.16224 0.064245 0.179995 1961.754 4.141597

test 0.068798 0.187332 0.065577 0.182515 0.051315 0.170723 0.053543 0.166289 1968.294 4.072734

n = 4 train 0.075732 0.205781 0.060416 0.181125 0.076662 0.209624 0.06445 0.187228 0.067958 0.196053

alpha = 1000.0

val 0.093328 0.224861 0.091799 0.219291 0.057026 0.184972 0.068394 0.190095 0.048297 0.166949

test 0.06148 0.190328 0.098415 0.2152 0.079617 0.208535 0.062922 0.189197 0.072311 0.199814

n = 5 train 0.014738 0.068739 0.012312 0.063747 0.011333 0.059891 0.010943 0.0577 0.007952 0.046478

alpha = 0.0001

val 0.117752 0.098965 0.74919 0.177114 0.055447 0.091188 0.054331 0.082093 0.031183 0.083543

test 1.11701 0.174295 0.02188 0.075481 1.182298 0.1967 3.624193 0.295085 3898607 299.1411

n = 5 train 0.011785 0.062231 0.012475 0.066023 0.013094 0.06679 0.008994 0.051265 0.015191 0.06865

alpha = 0.001

val 15445.07 15.67267 0.022173 0.085989 0.043852 0.100594 0.559677 0.173687 0.028335 0.083564

test 0.023088 0.077373 0.039256 0.092053 0.018418 0.077099 5.28E+09 9620.684 0.01981 0.086757

n = 5 train 0.014977 0.072008 0.014407 0.072575 0.013896 0.069461 0.019862 0.08504 0.014425 0.069129

alpha = 0.01 val 189.0897 1.774678 0.018694 0.077841 0.029265 0.086294 0.03463 0.097197 0.016937 0.082529

test 7.9183 0.348368 0.01909 0.092738 0.014368 0.070167 0.01849 0.08203 0.037869 0.100108

n = 5 train 0.014898 0.078158 0.018041 0.079706 0.018484 0.082555 0.020135 0.08891 0.019432 0.090051

alpha = 0.1 val 0.02043 0.081274 0.021353 0.089109 0.017707 0.077925 0.012242 0.06976 0.015578 0.079284

test 0.028556 0.103326 0.028115 0.101376 0.020678 0.086157 0.017463 0.085539 0.015007 0.07645

n = 5 train 0.021273 0.08945 0.023323 0.095119 0.01717 0.07619 0.020045 0.08478 0.026555 0.100627

alpha = 1.0 val 0.020197 0.08971 0.022123 0.083571 0.025296 0.09229 0.018475 0.084123 0.017897 0.086163

test 0.024584 0.092887 0.025819 0.100484 0.017935 0.085162 0.017611 0.084882 0.021931 0.092893

n = 5 train 0.026772 0.10959 0.026 0.109159 0.03125 0.116555 0.024879 0.106579 0.030661 0.117127

alpha = 10.0 val 0.025942 0.099843 0.025083 0.10549 827635.6 80.6446 0.024384 0.102148 0.030651 0.120308

test 0.022345 0.098964 0.029402 0.110514 827767.1 80.38624 0.034334 0.116864 0.025282 0.111504

n = 5 train 0.051445 0.168786 0.045897 0.157817 0.051856 0.167544 0.049741 0.163085 0.045574 0.151348

alpha = 100.0

val 0.049231 0.162911 930.0085 2.862775 0.052947 0.166112 0.074996 0.194954 78.01957 1.266441

test 0.067221 0.185472 2437.234 6.293032 0.06325 0.18068 0.041307 0.152867 34.17256 0.696948

n = 5 train 0.070112 0.19572 0.068903 0.190224 0.070646 0.205056 0.06338 0.188982 0.06625 0.19059

alpha = 1000.0

val 0.06432 0.190414 0.085314 0.21191 0.059819 0.176827 0.048986 0.165468 0.071034 0.202015

test 0.045338 0.168759 0.063928 0.192411 0.092667 0.222073 0.084436 0.212722 0.056736 0.178952

n = 6 train 0.011126 0.060991 0.014274 0.068216 0.00982 0.055993 0.008923 0.052912 0.013802 0.069041

alpha = 0.0001

val 0.0178 0.079376 0.04099 0.08606 3.651646 0.305319 0.039011 0.076156 0.022725 0.07986

test 0.68584 0.136622 6.773182 0.361393 0.026785 0.087962 0.948731 0.229538 0.561786 0.134032

n = 6 train 0.01183 0.062111 0.011481 0.060487 0.011833 0.061141 0.013353 0.063427 0.008481 0.050556


alpha = 0.001

val 0.043859 0.103746 34979.07 23.51521 0.024383 0.077785 0.050186 0.090673 1.55E+11 35027.31

test 0.016609 0.07247 0.019926 0.089675 0.039758 0.098815 0.055275 0.096958 1.55E+11 34715.07

n = 6 train 0.019362 0.087428 0.013551 0.068685 0.017024 0.079305 0.014105 0.069016 0.014513 0.069707

alpha = 0.01 val 0.017647 0.075421 0.04683 0.099036 0.012347 0.069604 0.103855 0.108822 0.023274 0.088962

test 0.0459 0.096467 0.018369 0.075734 0.024005 0.092684 1805.368 5.361991 0.023459 0.093225

n = 6 train 0.016744 0.081191 0.017691 0.0831 0.020386 0.088581 0.018429 0.084722 0.012505 0.064371

alpha = 0.1 val 0.01954 0.094201 0.017086 0.076794 0.012777 0.066012 3096.751 5.011297 5607.02 6.711211

test 0.020749 0.088845 0.019006 0.088198 0.015769 0.078559 3013.917 4.910588 5536.854 6.659998

n = 6 train 0.018037 0.079071 0.019907 0.088897 0.02162 0.092011 0.017523 0.081126 0.022249 0.091206

alpha = 1.0 val 0.018428 0.087384 0.021778 0.088144 0.031287 0.108192 9.21E+10 26819.36 0.015165 0.072403

test 0.027445 0.100244 0.019981 0.085649 0.024872 0.09794 1.91E+11 54403.28 0.021349 0.088959

n = 6 train 0.027123 0.111539 0.025641 0.105909 0.026595 0.109379 0.026169 0.107388 0.026102 0.1022

alpha = 10.0 val 0.031227 0.119803 0.029473 0.114165 0.023019 0.0965 0.029692 0.120703 0.033426 0.118668

test 0.028239 0.110188 0.031058 0.110033 0.037256 0.131595 0.031141 0.114818 0.029376 0.112211

n = 6 train 0.050794 0.166453 0.060148 0.177994 0.052125 0.171638 0.061166 0.188352 0.057851 0.176533

alpha = 100.0 val 0.062454 0.182276 0.060689 0.180774 0.052156 0.169597 0.040089 0.157784 0.074394 0.195952

test 0.04814 0.164845 0.05851 0.176955 0.060201 0.181926 0.049647 0.171865 0.0667 0.18545

n = 6 train 0.075092 0.205366 0.068469 0.194513 0.059653 0.179116 0.076625 0.207724 0.06071 0.179714

alpha = 1000.0 val 0.071898 0.207647 151.1709 1.286638 0.066725 0.189774 0.308376 0.248046 0.074627 0.202684

test 0.065874 0.199929 157.1688 1.283495 0.075903 0.199592 0.278549 0.214486 0.097159 0.221041

n = 7 train 0.010749 0.058839 0.011419 0.06088 0.015132 0.071702 0.01306 0.065735 0.01049 0.058113

alpha = 0.0001 val 0.028521 0.079695 0.166377 0.101929 0.017527 0.073203 0.721494 0.199142 0.423545 0.159125

test 0.027315 0.081156 0.159773 0.137411 0.262904 0.119848 0.210422 0.126822 0.037862 0.094739

n = 7 train 0.012526 0.062602 0.015342 0.071769 0.012705 0.063121 0.011827 0.062205 0.0145 0.070842

alpha = 0.001 val 0.025265 0.080667 0.027148 0.084999 0.018394 0.080301 0.025622 0.096158 0.135312 0.121799

test 0.026493 0.086414 0.390133 0.134968 0.026735 0.088921 0.092463 0.096304 0.012391 0.058566

n = 7 train 0.017949 0.081525 0.017245 0.08161 0.015678 0.075631 0.012719 0.067773 0.014448 0.07086

alpha = 0.01 val 0.028054 0.086731 0.014179 0.07018 0.022522 0.085828 0.027675 0.093749 5.72E+12 211343.8

test 0.029 0.099946 0.031039 0.089627 0.027534 0.09201 0.091285 0.113806 5.65E+12 212415

n = 7 train 0.01561 0.075193 0.018445 0.082444 0.015134 0.072775 0.018321 0.084158 0.012942 0.070096

alpha = 0.1 val 0.022241 0.087604 0.023372 0.085434 2082987 130.2058 0.019641 0.085848 1.28E+18 1.42E+08

test 0.020846 0.082755 0.020401 0.085897 1.1E+12 130648 0.018009 0.082755 3578391 171.4596

n = 7 train 0.023999 0.095434 0.018764 0.08141 0.02136 0.089114 0.024119 0.095406 0.02017 0.085465

alpha = 1.0 val 0.023198 0.089591 0.023225 0.097206 0.019586 0.084045 0.01355 0.071197 0.016161 0.079094

test 0.023458 0.098776 0.020841 0.081252 0.022316 0.092507 0.014596 0.077687 0.024975 0.097208

n = 7 train 0.033343 0.126892 0.029253 0.115064 0.0312 0.117827 0.0262 0.108463 0.032914 0.121996

alpha = 10.0 val 0.031656 0.125361 0.028266 0.117386 0.036001 0.127538 0.023792 0.106726 0.027658 0.120991

test 0.03109 0.115821 0.024261 0.106753 0.024323 0.109131 0.032477 0.115733 0.035073 0.127273

n = 7 train 0.051785 0.16669 0.049318 0.16261 0.060001 0.181963 0.049626 0.159155 0.057555 0.177815

alpha = 100.0 val 0.063649 0.177076 5.982728 0.489701 0.044285 0.16042 0.05673 0.174137 0.044808 0.156522

test 0.047206 0.162407 0.054478 0.171775 0.046184 0.16142 0.061017 0.177544 0.048476 0.164297

n = 7 train 0.055065 0.176643 0.062936 0.181414 0.063053 0.187492 0.072537 0.202079 0.071422 0.204206

alpha = 1000.0 val 0.074099 0.195726 0.073129 0.204946 2.73E+09 6911.944 2.22419 0.32012 0.067765 0.191719

test 0.084173 0.207194 1724.585 6.380832 2.945306 0.335084 2.297418 0.359539 0.079253 0.205518

n = 8 train 0.010397 0.056033 0.010113 0.052401 0.010942 0.057502 0.012714 0.063997 0.012735 0.06329

alpha = 0.0001 val 1.524263 0.225304 0.109166 0.089839 0.017698 0.073623 3.757129 0.237081 0.153219 0.12182

test 0.461029 0.131691 0.14588 0.1236 0.037231 0.094525 0.18442 0.101424 0.488 0.18458

n = 8 train 0.010883 0.057796 0.013864 0.067495 0.012621 0.063036 0.01409 0.06751 0.011949 0.060368

alpha = 0.001 val 0.029136 0.094976 0.107303 0.120941 0.022338 0.085119 0.036957 0.083333 0.406399 0.133629

test 0.929395 0.173486 0.028483 0.0753 0.061401 0.103314 0.027729 0.083762 0.032147 0.0904

n = 8 train 0.016677 0.07723 0.013467 0.066934 0.017482 0.079698 0.015709 0.076244 0.01608 0.078532

alpha = 0.01 val 63120338 993.259 0.035146 0.105941 0.044989 0.087664 2.35E+09 4280.962 0.027098 0.093139

test 0.060145 0.087643 0.134283 0.11967 0.018641 0.080802 3.53E+09 7299.06 0.076807 0.113363

n = 8 train 0.017494 0.082515 0.014845 0.073746 0.021261 0.091348 0.016858 0.079014 0.013993 0.067853

alpha = 0.1 val 0.021219 0.093433 372319.2 71.4042 0.023427 0.096335 0.025399 0.101736 7804229 348.5521

test 0.019574 0.083347 56639.08 21.08045 0.02105 0.086349 0.031305 0.105501 3107981 155.3758

n = 8 train 0.02064 0.081014 0.021534 0.088781 0.02005 0.085588 0.019754 0.084074 0.017458 0.076913

alpha = 1.0 val 0.018765 0.08662 2.79E+08 1475.266 0.024451 0.094251 0.023341 0.098949 0.025534 0.096132

test 0.032267 0.107623 2.68E+08 1440.428 0.017563 0.078115 0.019851 0.084559 0.021773 0.096084

n = 8 train 0.029557 0.118009 0.028543 0.113815 0.030106 0.113798 0.029184 0.112573 0.027249 0.112699

alpha = 10.0 val 0.024905 0.109119 0.043229 0.135357 0.027932 0.110589 0.026179 0.111588 0.03729 0.124039

test 0.027518 0.112641 0.022212 0.107277 0.031654 0.122449 0.026377 0.115792 0.019052 0.097286

n = 8 train 0.05058 0.164944 0.05556 0.172133 0.060467 0.18161 0.053908 0.169366 0.059213 0.183095

alpha = 100.0 val 0.062231 0.183147 0.091223 0.216382 0.061272 0.185828 0.05957 0.178322 0.039309 0.160299

test 0.05261 0.167706 0.052347 0.164788 0.070269 0.197425 0.042717 0.158111 0.080524 0.198308

n = 8 train 0.073642 0.202237 0.065454 0.183037 0.063962 0.185698 0.07599 0.203016 0.071087 0.19456

alpha = 1000.0 val 0.051617 0.178667 0.068983 0.190812 0.089923 0.216465 3.927166 0.385992 0.072653 0.19727

test 0.058266 0.191746 0.087156 0.217635 0.08373 0.198633 3.940423 0.370255 0.08674 0.211365

n = 9 train 0.014946 0.06937 0.012175 0.060427 0.01088 0.061175 0.012665 0.061952 0.010195 0.056877

alpha = 0.0001 val 0.106884 0.098405 0.101691 0.127666 0.061348 0.098394 0.190068 0.105469 0.055084 0.104641

test 0.728727 0.170569 0.262514 0.125584 0.315191 0.159311 0.356836 0.153115 0.022646 0.075654

n = 9 train 0.012998 0.068151 0.01198 0.062919 0.014194 0.072325 0.015892 0.075954 0.013433 0.064401

alpha = 0.001 val 0.455013 0.16174 0.078079 0.11744 0.27075 0.141264 0.137592 0.099273 0.058807 0.09608

test 0.080894 0.093003 0.036223 0.096685 0.027821 0.09113 0.09541 0.106901 0.033713 0.083317

n = 9 train 0.01522 0.072385 0.01358 0.069702 0.018648 0.08118 0.014494 0.070909 0.015748 0.077357

alpha = 0.01 val 0.021803 0.087341 0.025085 0.094096 0.023285 0.086045 0.032052 0.085754 0.015866 0.077182

test 0.090766 0.097009 0.022487 0.085945 0.030537 0.099202 0.027309 0.097063 0.025703 0.097104

n = 9 train 0.021576 0.089086 0.013729 0.071039 0.018325 0.08145 0.022035 0.095312 0.018478 0.081868

alpha = 0.1 val 0.030003 0.100235 0.023995 0.092562 0.026632 0.099749 0.018946 0.087228 0.019003 0.083434

test 0.02862 0.089745 0.029922 0.101217 0.032863 0.10671 0.014275 0.076949 0.020413 0.091038

n = 9 train 0.018121 0.084413 0.020509 0.087653 0.021267 0.084078 0.021379 0.086792 0.022746 0.092503

alpha = 1.0 val 2.36E+10 14275.42 0.02467 0.097303 0.020696 0.090207 1.72E+10 11745.59 0.018954 0.07666

test 2.28E+10 13296.43 0.017909 0.079696 0.018628 0.084776 2.12E+17 65919633 0.015569 0.078677

n = 9 train 0.027182 0.112708 0.03101 0.120364 0.029257 0.112726 0.028248 0.113878 0.02536 0.107209

alpha = 10.0 val 0.029755 0.114105 0.023732 0.103456 0.026832 0.114268 0.032881 0.11968 0.036308 0.117753

test 0.026148 0.10823 0.025041 0.113541 0.023495 0.103027 0.017814 0.098367 0.025957 0.108069

n = 9 train 0.051402 0.168199 0.05158 0.169355 0.05172 0.175401 0.055115 0.176543 0.057452 0.180286

alpha = 100.0 val 6.83E+08 2563.929 1.43E+14 1058290 0.059799 0.178887 3.63E+10 23838.08 1.88E+08 1363.189

test 6.78E+08 2292.086 4.42E+17 82859950 0.05779 0.178441 2.24E+10 13171.98 1.78E+08 1174.863

n = 9 train 0.06319 0.184597 0.075651 0.2063 0.069995 0.200417 0.055433 0.16743 0.063644 0.187161

alpha = 1000.0 val 0.055491 0.175636 0.071921 0.193008 0.071216 0.197009 0.085269 0.212155 0.066784 0.197219

test 0.071378 0.198235 0.06092 0.191866 0.049062 0.173333 0.065013 0.181954 0.067569 0.189894

n = 10 train 0.015841 0.071885 0.010363 0.057149 0.011905 0.061156 0.011177 0.059743 0.010658 0.057841

alpha = 0.0001 val 0.212763 0.115264 1.309118 0.207853 2.673601 0.268988 0.174967 0.143714 2.885116 0.288172

test 1.13194 0.203071 0.08916 0.106506 0.020254 0.07115 0.02861 0.082645 0.179602 0.109063

n = 10 train 0.012443 0.065184 0.012612 0.063399 0.013126 0.067395 0.015412 0.071014 0.013063 0.064179

alpha = 0.001 val 0.022002 0.082607 0.056825 0.092752 0.074383 0.091808 0.860144 0.185729 0.085512 0.111972

test 0.110592 0.113537 0.122826 0.104185 0.01992 0.07758 0.235734 0.123818 0.1972 0.113749

n = 10 train 0.020196 0.084173 0.014959 0.074741 0.012246 0.065196 0.01932 0.082704 0.013961 0.070639

alpha = 0.01 val 0.020366 0.080377 0.021243 0.08034 7.01E+10 26715.22 0.026992 0.095196 0.078345 0.132204

test 0.083721 0.128667 0.101527 0.124841 2.03E+09 3967.591 0.02962 0.087434 1.41E+09 3310.999

n = 10 train 0.017348 0.079596 0.020947 0.089738 0.017217 0.080167 0.018634 0.082284 0.02127 0.09324

alpha = 0.1 val 0.020863 0.088337 0.015854 0.078259 1.99E+09 3941.994 0.030761 0.095965 0.019531 0.086822

test 0.02325 0.092945 0.094677 0.114108 0.019443 0.08501 0.032264 0.107205 0.01871 0.08157

n = 10 train 0.019026 0.079847 0.019973 0.083813 0.019956 0.086916 0.019946 0.080994 0.021985 0.086563

alpha = 1.0 val 0.047158 0.12477 0.020536 0.088086 0.022056 0.082202 0.028593 0.101043 0.027086 0.096144

test 0.024105 0.088725 1.03E+09 2823.259 6.58E+08 2258.785 8.63E+08 2585.916 0.021432 0.091195

n = 10 train 0.029758 0.115585 0.029203 0.115985 0.031338 0.119071 0.028993 0.11228 0.032186 0.119464

alpha = 10.0 val 0.02105 0.106601 0.023044 0.104 0.035399 0.1295 0.028686 0.115765 0.031461 0.119017

test 0.030242 0.124689 0.033298 0.121518 0.022486 0.11002 0.041338 0.132218 0.034269 0.115712

n = 10 train 0.049398 0.162865 0.058701 0.18114 0.05608 0.172344 0.047166 0.154939 0.055018 0.176167

alpha = 100.0 val 0.057391 0.176662 0.056534 0.178086 9.81E+21 8.75E+09 0.057154 0.172014 0.096479 0.189442

test 0.063222 0.178187 0.064331 0.180708 0.069199 0.190883 0.065177 0.180319 2.05E+27 3.98E+12

n = 10 train 0.071145 0.199554 0.073934 0.201463 0.065234 0.191843 0.069928 0.191679 0.060581 0.181691

alpha = 1000.0 val 0.076052 0.204504 6347694 222.8693 2.75E+22 1.47E+10 0.063953 0.175413 7.46E+08 2414.072

test 4403017 184.9439 0.077051 0.202072 6519396 224.9915 0.087514 0.220131 0.068117 0.196761
