Building Electricity Demand Forecasting
SHUBHAM SAINI, PANDARASAMY ARJUNAN, AMARJEET SINGH
As part of the work done at the Mobile and Ubiquitous Computing Group
OVERVIEW
The IIIT – Delhi campus has more than 200 smart meters installed, collecting around 10 electrical parameters every 30 seconds.
It is important to calculate an accurate consumption baseline and to monitor any deviations from it.
A forecasting pipeline is proposed for predicting the power consumption of an electric load at any given point of time.
Motivation
Energy Consumption Increasing Worldwide
India – Energy Forecasting has important role in formulation of effective energy policies
Electricity consumption analysis useful for monitoring environmental issues
FORECASTING MODELS
Auto-Regressive Integrated Moving Average (ARIMA)
Artificial Neural Networks (ANN)
Hybrid ARIMA+ANN
EnerNOC
ARIMA (p,d,q)(P,D,Q)
(p,P) - number of lagged variables
(d,D) - differencing necessary to make the time series stationary
(q,Q) - moving average over the number of last observations
Here yt and εt are the actual value and the random error at time t.
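For reference, the symbols above match the textbook ARMA(p, q) form applied to the differenced series. The equation below is that standard form, assumed here rather than taken from the slide; φ_i (autoregressive coefficients), θ_j (moving-average coefficients) and the constant θ_0 are notation introduced for illustration.

```latex
y_t = \theta_0 + \phi_1 y_{t-1} + \dots + \phi_p y_{t-p}
      + \varepsilon_t - \theta_1 \varepsilon_{t-1} - \dots - \theta_q \varepsilon_{t-q}
```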
Artificial Neural Networks
Popular for flexible non-linear modeling
Single hidden layer feed-forward network
Here wj and wi,j are model parameters called connection weights, p is the number of input nodes and q is the number of hidden nodes.
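A minimal sketch of how such a network turns the p most recent observations into a one-step output. The logistic activation, the bias terms `w0` and `w0j`, and the function name `ann_forecast` are assumptions made for illustration; the slides do not specify them.

```python
import math


def logistic(x):
    """Logistic (sigmoid) activation, assumed here for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def ann_forecast(lags, w0, w, w0j, wij):
    """One-step output of a single-hidden-layer feed-forward network.

    lags: the p most recent observations [y_{t-1}, ..., y_{t-p}]
    w0:   output bias
    w:    q hidden-to-output weights w_j
    w0j:  q hidden-node biases
    wij:  p x q input-to-hidden weights w_{i,j}
    """
    p, q = len(lags), len(w)
    hidden = [
        logistic(w0j[j] + sum(wij[i][j] * lags[i] for i in range(p)))
        for j in range(q)
    ]
    return w0 + sum(w[j] * hidden[j] for j in range(q))
```

With p = 3 input nodes and q = 2 hidden nodes, `wij` is a 3 x 2 list of lists; training the weights is outside the scope of this sketch.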
Hybrid ARIMA+ANN
Power consumption composed of linear and non-linear structure
Yt = Lt + Nt
ARIMA able to model linear component Lt
Residuals modeled by ANN
et = Yt - LFt
Final fitted value:
YFt = LFt + NFt
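The additive structure can be sketched as follows. The two model arguments are hypothetical stand-ins: `naive_last_value` takes the place of ARIMA and `mean_residual` takes the place of the ANN, purely so the example runs without external libraries.

```python
def hybrid_forecast(history, linear_model, residual_model):
    """Forecast the next value as linear forecast + residual forecast."""
    # One-step linear fits for t = 1 .. n-1, each using only earlier data.
    linear_fits = [linear_model(history[:t]) for t in range(1, len(history))]
    # Residuals: what the linear model failed to explain at each step.
    residuals = [history[t] - linear_fits[t - 1] for t in range(1, len(history))]
    linear_part = linear_model(history)         # linear forecast
    nonlinear_part = residual_model(residuals)  # forecast of the next residual
    return linear_part + nonlinear_part


def naive_last_value(series):
    """Stand-in for ARIMA: predict the last observed value."""
    return series[-1]


def mean_residual(residuals):
    """Stand-in for the ANN: predict the mean of past residuals."""
    return sum(residuals) / len(residuals)
```

For a series rising by 2 each step, the stand-in linear model always lags by 2, and the residual model adds that 2 back.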
EnerNOC
Based on averaging the load on X days for each interval
D-3 12-1am 1-2am 2-3am 3-4am 4-5am
D-2 12-1am 1-2am 2-3am 3-4am 4-5am
D-1 12-1am 1-2am 2-3am 3-4am 4-5am
Event Day 12-1am 1-2am 2-3am 3-4am 4-5am
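A minimal sketch of the per-interval averaging, assuming each day is a list with one load value per interval (the function name `average_baseline` is illustrative):

```python
def average_baseline(days):
    """Average the load across days, separately for each interval.

    days: list of equal-length lists, one load value per interval per day.
    Returns one baseline value per interval.
    """
    n_days = len(days)
    n_intervals = len(days[0])
    return [sum(day[i] for day in days) / n_days for i in range(n_intervals)]
```

Each entry of the result is the event-day forecast for the matching interval, for example 12-1am.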
Prediction Pipeline
Multiple models can be learned by using different sub-models at each of these stages.
Initial Parameters - Granularity
Very high resolution data available, sampled every 30 seconds
Both too-small and too-large time intervals are detrimental to a model's performance
Experimented with 1-hour, 30-minute and 15-minute intervals
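Changing granularity amounts to aggregating the 30-second samples into coarser buckets. The sketch below assumes mean aggregation and drops a trailing partial bucket; both choices are assumptions, as the slides do not state how the data was resampled.

```python
def resample_mean(samples, sample_seconds, interval_seconds):
    """Aggregate fixed-rate samples into coarser intervals by averaging.

    samples: readings taken every `sample_seconds` seconds.
    interval_seconds: target granularity, e.g. 900 for 15 minutes.
    A trailing partial bucket is dropped.
    """
    per_bucket = interval_seconds // sample_seconds
    last_start = len(samples) - per_bucket
    return [
        sum(samples[i:i + per_bucket]) / per_bucket
        for i in range(0, last_start + 1, per_bucket)
    ]
```

With 30-second data, a 15-minute bucket holds 30 samples and a 1-hour bucket holds 120.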
Initial Parameters – Forecast Horizon
The forecast horizon is the number of data points a model forecasts into the future.
Days may be divided into working/non-working hours, day/night hours, peak/off-peak hours.
SELECTION OF SIMILAR (Y) DAYS
CRITERIA: Previous Business Days, Previous Same Days
Lookback Window: 4, 7, 10
7 similar days
14 similar days
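A sketch of the two selection criteria. It assumes that business days are Monday to Friday (holidays are ignored) and that "Previous Same Days" means the same weekday in earlier weeks; the slides do not define either term precisely.

```python
from datetime import date, timedelta


def previous_business_days(event_day, count):
    """The `count` most recent Monday-to-Friday days before event_day."""
    days = []
    current = event_day
    while len(days) < count:
        current -= timedelta(days=1)
        if current.weekday() < 5:  # 0 = Monday ... 4 = Friday
            days.append(current)
    return days


def previous_same_days(event_day, count):
    """The same weekday in each of the `count` preceding weeks."""
    return [event_day - timedelta(weeks=k) for k in range(1, count + 1)]
```

Both functions return the most recent day first; `count` corresponds to the lookback window (4, 7 or 10).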
Sub-sampling (X) Days
Criteria:
High X Days
Makes sense for demand-response
Excluding Highest and Lowest Days
Anomalies could be due to load failure, holiday, unpredicted occupancy, etc.
X:Y = 8:10
X:Y = 6:10
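Both sub-sampling criteria can be sketched as a ranking of the Y similar days. Ranking days by their total load is an assumption made here; the slides do not say which statistic defines a "high" day.

```python
def high_x_of_y(daily_loads, x):
    """Keep the x days with the highest total load.

    daily_loads: dict mapping a day label to its list of interval loads.
    """
    ranked = sorted(daily_loads, key=lambda day: sum(daily_loads[day]), reverse=True)
    return ranked[:x]


def drop_extremes(daily_loads):
    """Drop the single highest and single lowest day by total load."""
    ranked = sorted(daily_loads, key=lambda day: sum(daily_loads[day]))
    return ranked[1:-1]
```

For X:Y = 8:10, `high_x_of_y` would be called with 10 similar days and `x=8`.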
Adjustments – ARIMA+ANN
Training data used to forecast future values includes an additional 2-4 hours of data from the event day.
For example, in order to forecast consumption on the event day for 12PM - 5PM, we use 10AM - 5PM data on the X similar days, as well as 10AM - 12PM data on the event day.
This additional data more accurately reflects load conditions on the event day.
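A sketch of how such a training series might be assembled from hourly data. Concatenating the similar days in order with the event-day hours last is an assumption, as are the function name and the hour-index arguments.

```python
def build_training_series(similar_days, event_day, start, split, end):
    """Concatenate training data for forecasting hours split..end of the event day.

    similar_days: list of 24-value hourly load lists, oldest day first.
    event_day:    24-value hourly load list (only hours before `split` are used).
    start, split, end: hour indices, e.g. 10, 12, 17 for 10AM, 12PM, 5PM.
    """
    series = []
    for day in similar_days:
        series.extend(day[start:end])      # 10AM - 5PM on each similar day
    series.extend(event_day[start:split])  # 10AM - 12PM on the event day
    return series
```

For the 12PM - 5PM example above, the call would use `start=10, split=12, end=17`.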
Adjustments - EnerNOC
To adjust the forecasted value of a time interval, for example 12PM - 1PM, the adjustment is computed at 11AM.
The mean of the differences between the actual values and the forecasted values between 8AM - 11AM is added to (or subtracted from) the 12PM - 1PM forecasted value.
Event-day data is not always available!
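The adjustment can be sketched as a mean offset. Returning the unadjusted forecast when no event-day data exists is an assumption about how the missing-data case is handled.

```python
def adjusted_forecast(forecast, actual_window, forecast_window):
    """Shift a forecast by the mean (actual - forecast) error over an earlier window.

    forecast:        baseline forecast for the target interval (e.g. 12PM - 1PM).
    actual_window:   actual event-day values for the window (e.g. 8AM - 11AM).
    forecast_window: baseline forecasts for the same window.
    """
    if not actual_window:
        # No event-day data available: leave the forecast unadjusted.
        return forecast
    diffs = [a - f for a, f in zip(actual_window, forecast_window)]
    return forecast + sum(diffs) / len(diffs)
```

A negative mean difference lowers the forecast, which covers the "subtracted from" case.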
Results
Brute-force approach to find optimal parameters
Over 700 different combinations of parameters tested
Varying Parameters:
1. No. of similar days - 4, 7, 10
2. Similarity Criteria - Previous Business Days, Previous Same Days
3. Sub-sampling: High X of Y
4. X:Y Ratio - 6:10, 8:10
5. Models - Hybrid ARIMA+ANN, EnerNOC, Adjusted EnerNOC
6. Time Duration - 12AM - 12AM, 12AM - 7AM, 7AM - 12PM, 12PM - 5PM
7. Dates - 13-March-2014, 11-March-2014, 5-March-2014, 3-March-2014, 28-February-2014
Results (Contd.)
Load #1: Academic Building - Floor Total - First Floor
Sample Result
Load #1: Academic Building - Floor Total - First Floor
Number of Similar Days (Y) – 7
X : Y Ratio – 0.8
Similarity Criteria - Previous Same Days
Time Duration - 12AM – 7AM
Model - Adjusted EnerNOC
Implementation
Developed using version 3.0 of the R language for statistical computing (RStudio IDE)
Reasons for choosing R over other statistical computing languages like Matlab are:
1. Free and Open-Source
2. Graphics and Data Visualization
3. Flexible statistical analysis toolkit
4. Powerful, cutting-edge analytics
5. Robust, vibrant community
UI Design and Layout
GUI for simple data visualization using Shiny web framework v0.98
Tab layout with a sidebar
Sidebar contains options to set the forecasting parameters
Main window - training data, and output of various forecasting models
Time-Series clustering (In Progress)
Global features extracted from the time series through statistical operations
trend, seasonality, periodicity, serial correlation, skew, kurtosis, chaos, nonlinearity, self-similarity
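A sketch of three of these features (skew, kurtosis and lag-1 serial correlation) computed from population moments. The exact estimators used in the project are not stated, so these definitions are assumptions; the remaining features are omitted.

```python
from statistics import fmean, pstdev


def global_features(series):
    """Skewness, kurtosis and lag-1 autocorrelation of a series.

    Uses population (biased) moment estimators. A constant series has zero
    standard deviation and is not supported by this sketch.
    """
    n = len(series)
    mean = fmean(series)
    std = pstdev(series)
    z = [(value - mean) / std for value in series]  # standardized values
    return {
        "skew": sum(v ** 3 for v in z) / n,
        "kurtosis": sum(v ** 4 for v in z) / n,
        "serial_correlation": sum(z[i] * z[i + 1] for i in range(n - 1)) / n,
    }
```

Each stream is reduced to a fixed-length feature vector, which makes streams of different lengths comparable.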
Time-Series clustering (In Progress)
Clustering – K-Means or Hierarchical
Using global characteristics, group all available streams into optimal number of clusters
For each cluster, find optimal forecasting model (through the prediction pipeline)
For any new stream – assign the stream to one of the clusters and apply the optimal forecasting model
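The assignment step for a new stream can be sketched as a nearest-centroid lookup. Euclidean distance, the cluster names and the centroid values in the usage note are hypothetical; this work is marked as in progress and no method is fixed in the slides.

```python
import math


def assign_to_cluster(features, centroids):
    """Return the name of the cluster whose centroid is nearest to `features`.

    features:  feature vector of the new stream.
    centroids: dict mapping a cluster name to its centroid vector.
    """
    return min(centroids, key=lambda name: math.dist(features, centroids[name]))
```

Given hypothetical centroids `{"A": [0.0, 0.0], "B": [10.0, 10.0]}`, a stream with features `[1.0, 2.0]` is assigned to cluster "A", and the model found optimal for that cluster would then be applied.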
Questions ???