Wind farm power prediction: a data-mining approach

Andrew Kusiak*, Haiyang Zheng and Zhe Song, Department of Mechanical and Industrial Engineering, 3131 Seamans Center, The University of Iowa, Iowa City, IA 52242–1527, USA

In this paper, models for short- and long-term prediction of wind farm power are discussed. The models are built using weather forecasting data generated at different time scales and horizons. The maximum forecast length of the short-term prediction model is 12 h, and the maximum forecast length of the long-term prediction model is 84 h. The wind farm power prediction models are built with five different data mining algorithms. The accuracy of the generated models is analysed. The model generated by a neural network outperforms all other models for both short- and long-term prediction. Two basic prediction methods are presented: the direct prediction model, whereby the power prediction is generated directly from the weather forecasting data, and the integrated prediction model, whereby the prediction of wind speed is generated with the weather data, and then the power is generated with the predicted wind speed. The direct prediction model offers better prediction performance than the integrated prediction model. The main source of the prediction error appears to be the weather forecasting data. Copyright © 2008 John Wiley & Sons, Ltd.

Received 25 April 2008; Revised 14 August 2008; Accepted 22 August 2008

WIND ENERGY. Wind Energ. 2009; 12:275–293. Published online 24 September 2008 in Wiley Interscience (www.interscience.wiley.com). DOI: 10.1002/we.295

Research Article

* Correspondence to: Andrew Kusiak, Department of Mechanical and Industrial Engineering, 3131 Seamans Center, The University of Iowa, Iowa City, IA 52242–1527, USA. E-mail: [email protected]

Introduction

The wind power industry is rapidly expanding, and accurate power forecasting is essential. Wind power forecasts are used as input for various simulation tools, including market operations, unit commitment and economic dispatch. Therefore, the short- and long-term wind farm power predictions are significant in transforming a wind farm into a wind power plant.

A number of different approaches have been used in forecasting wind speed and wind farm power on different time scales. Landberg et al.1 built a model to predict the power produced by a wind farm using the data from the weather prediction model (HIRLAM – High Resolution Limited Area Model) and the local weather model (WASP – Wind Atlas Analysis and Application Program). Mohandes et al.2 compared the performance of neural network and autoregressive models applied for wind speed prediction. The neural network model outperformed the autoregressive model in both prediction graph and root mean squared errors. Lange et al.3 presented various models for short-term wind power prediction, including physics-based, fuzzy and neuro-fuzzy models. Using meteorological data, Barbounis et al.4 constructed a local recurrent neural network model for long-term wind speed and power forecasting. Hourly wind park forecasts for up to 72 h ahead were produced. Damousis et al.5 developed a fuzzy logic model that was trained with a genetic algorithm. The model was then used to forecast wind speed over horizons ranging from 0.5 to 2 h. Sfetsos6 presented a novel method to forecast the mean hourly wind speed based on a time series analysis, and showed that the developed model outperformed the conventional forecasting models. Torres et al.7 built the Auto-Regressive Moving Average (ARMA) model based on time series data after transformation and standardization, and predicted mean hourly wind speed up to 10 h ahead.

Key words: Wind farm power prediction; data mining; neural network; weather forecasting data; long-term prediction; short-term prediction

Physics-based and statistical modeling approaches have been widely used to forecast wind speed and wind farm power. Both methods have advantages and disadvantages. Development of prediction models for wind speed and wind power, for either short-term or long-term horizons, poses a challenge because of the stochastic nature of wind. Even assuming that an accurate wind speed prediction exists, satisfactory wind farm power forecasting cannot be guaranteed, as the status of each wind turbine determines the ultimate power output. Frequent updates of the prediction models for wind speed or wind farm power pose another challenge.

Data mining is a promising approach to model wind farm performance. Numerous successful applications of data mining in manufacturing, marketing, medical informatics and the energy industry have been reported in the literature.8–12 The models trained and built by data mining algorithms can be easily updated.

In this paper, a data-mining approach is applied to build models for the wind farm power prediction over both a short-term horizon (1 to 12 h ahead) and a long-term horizon (3 to 84 h ahead). The short- and long-term prediction models are constructed with the Rapid Update Cycle (RUC) model3,13 and the North American Mesoscale (NAM)3,14 model, respectively. Both the RUC and the NAM are Numerical Weather Prediction models and can generate weather forecasting data. Two different methodologies for power prediction have been compared and analysed. The models are built using historical data collected by Supervisory Control and Data Acquisition (SCADA) systems installed at a wind farm and weather forecasting data records for 16 locations surrounding the wind farm.

Data Description and Methods for Wind Farm Power Prediction

Weather Forecasting Data

The data from two different National Weather Service Forecast models are used for wind farm power prediction. The models provided data for different locations surrounding the wind farm. Figure 1 shows the spatial layout of the 16 model data points around the wind farm considered in this research. The immediate neighbourhood of the wind farm includes data points 6, 7, 10 and 11. Note that data on the specific location, terrain and grid points were not available in this research.

RUC Model and Data

The RUC model is designed to provide accurate short-range numerical forecast guidance to various users. In this research, the RUC model data is the source for constructing a short-term wind farm power prediction model. The basic features of this model are as follows:

Figure 1. Location of the data points surrounding the wind farm



• short-term predictions with a maximum forecast length of 12 h;
• the spacing between model grid points is 20 km;
• a new forecast is issued every hour;
• a new 12 h forecast is issued at 00:00, 03:00, 06:00, 09:00, 12:00, 15:00, 18:00 and 21:00 GMT; and
• at all other hours, a 9 h forecast is issued.

Table 1 describes the parameters of the RUC model. In this research, each model data point has 10 parameters, and there are 16 model data points. Therefore, the RUC data has, in total, 160 variables for predicting short-term wind farm power.

NAM Model and Data

The NAM model is designed to provide day-ahead weather forecast guidance. In this research, the NAM model data is the source for building a long-term wind farm power prediction model. The basic features for this model are as follows:

• day-ahead forecasting with a maximum forecast length of 84 h;
• the spacing between model grid points is 40 km;
• a forecast value is issued every 3 h; and
• a new 84 h forecast is issued four times daily at 00:00, 06:00, 12:00 and 18:00 GMT.

Table 2 describes the parameters of the NAM model. In this research, each model data point has 12 parameters, and there are 16 model data points. Therefore, the NAM model has, in total, 192 variables for predicting long-term wind farm power.

Table 1. Data description of the RUC model

Parameter | Description | Unit
Spd_10m | Wind speed 10 m above the surface | m s−1
Dir_10m | Wind direction 10 m above the surface | deg
Spd_XXmb | Average wind speed in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively) | m s−1
Dir_XXmb | Average wind direction in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively) | deg
AD_30mb | Average air density in the lowest 30 mb of the atmosphere | kg m−3
PTdiff_30mb_sfc | Potential temperature difference between the surface and 30 mb above the surface; measure of atmospheric stability in lower spaces | K

Table 2. Data description of the NAM model

Parameter | Description | Unit
Spd_10m | Wind speed 10 m above the surface | m s−1
Dir_10m | Wind direction 10 m above the surface | deg
Spd_XXmb | Average wind speed in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively) | m s−1
Dir_XXmb | Average wind direction in the lowest XX mb of the atmosphere (XX is 30, 60 and 90, respectively) | deg
AD_30mb | Average air density in the lowest 30 mb of the atmosphere | kg m−3
PTdiff_30mb_sfc | Potential temperature difference between the surface and 30 mb above the surface; measure of atmospheric stability in lower spaces | K
SHTFL | Sensible heat flux at the surface; indicator of surface heating or cooling | W m−2
VEG | Percentage of the surface that is covered by vegetation | %



SCADA Data Description

The data used in this research were generated at a wind farm with dozens of turbines. The data were collected by a SCADA system installed at each wind turbine. Each SCADA system collects data for more than 120 parameters. Though the data are sampled at high frequency, e.g. every 2 s, they are averaged and stored at 10 min intervals (referred to as the 10 min average data). The data used in this research were collected over a period of 3 months for all turbines of the wind farm. Due to current industrial data practices, only a 3 month long data set is available in this research. However, the proposed methodology for short- and long-term power prediction applies to data sets of any size. In this research, the wind speed and wind power are considered as dependent variables for the prediction models, while the weather forecasting data are used as predictors. The wind farm data used in this research were measured by nacelle anemometers and are 10 min average SCADA data.

Feature Selection

To obtain an accurate prediction model with a data-mining approach, the original high-dimension data need to be reduced to low-dimension data. The significant parameters for each of the 16 model data points need to be selected first, as not all the data contribute to an accurate prediction. Data mining is a powerful tool for extracting knowledge and solving problems from voluminous data. Data mining offers different algorithms to perform the feature selection task, e.g. the boosting tree algorithm15,16 and the wrapper approach, integrated with the genetic or the best first search algorithms.17,12

The boosting tree algorithm is used in this research for feature selection. Important predictors are determined by the importance scores generated by the boosting tree algorithm, and predictors with higher importance are selected. It is not surprising that the importance of the weather forecasting predictors is ranked according to the closeness of the model points to the wind farm: the closer the model point is to the wind farm, the more significant the predictor. Based on the results produced by the boosting tree algorithm, the four closest model points 6, 7, 10 and 11 were selected as predictors for the wind farm power. As a result of the feature selection, the original 192-dimension NAM data was reduced to a 48-dimension predictor for wind farm power prediction, while the RUC data was reduced from 160 dimensions to 40 dimensions.
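The dimension counts quoted above can be checked with a short sketch. It does not run the boosting tree algorithm; it only enumerates one variable per (model point, parameter) pair and keeps the four model points reported as selected. The `pNN_` naming scheme is a hypothetical convention introduced here for illustration.

```python
# Parameter names follow Tables 1 and 2; the "pNN_" prefix is a hypothetical naming scheme.
RUC_PARAMS = (
    ["Spd_10m", "Dir_10m"]
    + [f"Spd_{mb}mb" for mb in (30, 60, 90)]
    + [f"Dir_{mb}mb" for mb in (30, 60, 90)]
    + ["AD_30mb", "PTdiff_30mb_sfc"]
)
NAM_PARAMS = RUC_PARAMS + ["SHTFL", "VEG"]

ALL_POINTS = range(1, 17)          # the 16 model data points
SELECTED_POINTS = (6, 7, 10, 11)   # the points closest to the wind farm


def variables(points, params):
    """Return one variable name per (model point, parameter) pair."""
    return [f"p{pt:02d}_{name}" for pt in points for name in params]


ruc_all = variables(ALL_POINTS, RUC_PARAMS)
ruc_kept = variables(SELECTED_POINTS, RUC_PARAMS)
nam_all = variables(ALL_POINTS, NAM_PARAMS)
nam_kept = variables(SELECTED_POINTS, NAM_PARAMS)

print(len(ruc_all), "->", len(ruc_kept))  # 160 -> 40
print(len(nam_all), "->", len(nam_kept))  # 192 -> 48
```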

Principal Component Analysis

Even with the feature selection, the dimensionality of the predictors for power prediction is still high. To gain more insight into the data, the correlation coefficient among the weather forecasting parameters (40-dimension RUC data and 48-dimension NAM data) was computed. The results show that the parameters measured with the same unit are highly correlated. To further reduce the input dimensionality, the principal component analysis (PCA)18 was applied. The PCA expresses the variance–covariance structure of a set of variables by a few linear combinations. The basic steps of the PCA are as follows:18

1. Compute a correlation matrix for all parameters.
2. Compute the eigenvectors and eigenvalues of the correlation matrix.
3. Select the components to form an eigenvector.
4. Derive the new data comprised of the principal component of the original data.
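The four steps can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the authors' implementation: the three-column wind speed sample is invented, only the first principal component is extracted, and power iteration is used here as a simple stand-in for a full eigen-decomposition.

```python
import math


def correlation_matrix(rows):
    """Step 1: standardize each column and return (correlation matrix, z-scores)."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
            for j in range(p)]
    z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in rows]
    corr = [[sum(zr[a] * zr[b] for zr in z) / (n - 1) for b in range(p)]
            for a in range(p)]
    return corr, z


def mat_vec(matrix, vector):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def leading_eigenpair(matrix, iterations=500):
    """Step 2 (first component only): power iteration for the largest eigenpair."""
    v = [1.0 / math.sqrt(len(matrix))] * len(matrix)
    for _ in range(iterations):
        w = mat_vec(matrix, v)
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


# Hypothetical wind speeds (m/s) at three levels for six forecasts; not data from the paper.
speeds = [
    [5.1, 6.0, 6.8],
    [7.3, 8.1, 9.0],
    [3.2, 4.1, 4.5],
    [9.8, 10.9, 12.1],
    [6.4, 7.0, 8.2],
    [4.0, 5.2, 5.6],
]

corr, z = correlation_matrix(speeds)
eigenvalue, eigenvector = leading_eigenpair(corr)

# Step 3: keep the first component; the trace of a correlation matrix equals
# the number of variables, so eigenvalue / p is the explained share of variance.
explained = eigenvalue / len(corr)

# Step 4: project the standardized data onto the retained eigenvector.
scores = [sum(a * b for a, b in zip(row, eigenvector)) for row in z]
print(round(explained, 3), len(scores))
```

Because the three hypothetical columns are almost perfectly correlated, a single component carries nearly all of the variance, which mirrors the observation that parameters measured with the same unit are highly correlated.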

The weather forecasting data have different units, and therefore, the principal components are determined for parameters with the same units. The parameters with different units include: wind speed (m s−1), wind direction (º), air density (kg m−3), temperature difference (K), SHTFL (W m−2) and VEG (%).

Table 3 presents the eigenvalues of the correlation matrix and the related statistics of the RUC wind speed data. Based on the eigenvalue statistics, the first two principal components explain 94.4% of the total variance, and therefore, a subset (here two) of eigenvalues is selected. Thus, the dimensionality of the wind speed data stream (16 inputs) is reduced to 2. The two principal components, which are uncorrelated linear combinations of the 16 original RUC wind speeds, form the new coordinates and serve as input for the wind power prediction model discussed in the following section.
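The cumulative percentages can be recomputed from the eigenvalues listed in Table 3. The sketch below assumes a 90% cumulative-variance cut-off purely for illustration; the paper does not state the threshold that was used.

```python
# Eigenvalues copied from Table 3 (RUC wind speed data, 16 inputs).
eigenvalues = [13.6954, 1.4035, 0.4166, 0.1791, 0.1151, 0.0837, 0.0314, 0.0223,
               0.0194, 0.0116, 0.0072, 0.0048, 0.0037, 0.0031, 0.0014, 0.0008]

# The trace of a 16 x 16 correlation matrix is 16, the total variance.
total_variance = len(eigenvalues)


def components_needed(eigs, total, threshold_pct):
    """Return (number of components, cumulative %) once the threshold is reached."""
    cumulative = 0.0
    for k, value in enumerate(eigs, start=1):
        cumulative += 100.0 * value / total
        if cumulative >= threshold_pct:
            return k, cumulative
    return len(eigs), cumulative


# The 90% threshold is an assumed value, not taken from the paper.
k, pct = components_needed(eigenvalues, total_variance, 90.0)
print(k, round(pct, 1))  # 2 94.4
```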



Following the same steps of the PCA transformation of the RUC wind speed, all other RUC and NAM parameters with the same unit can be transformed into principal components. Figure 2 shows the PCA transformation of wind speed and wind direction. Note that WS is wind speed, WD is wind direction, WSPC is the principal component of wind speed and WDPC is the principal component of wind direction.

Table 4 shows the number of principal components (PCs) transformed by the PCA algorithm. The dimensionality of the original data has been significantly reduced by integrating the feature selection (the boosting tree algorithm in the previous section) and the PCA algorithm. The dimensionality of the RUC data has been reduced from 160 to 6, and the dimensionality of the NAM data has been reduced from 192 to 8.

Table 3. Eigenvalues of the correlation matrix and the related statistics of the RUC wind speed data

Value number | Eigenvalue | Total variance (%) | Cumulative eigenvalue | Cumulative %
1 | 13.6954 | 85.5968 | 13.6955 | 85.5968
2 | 1.4035 | 8.7719 | 15.0989 | 94.3687
3 | 0.4166 | 2.6041 | 15.5156 | 96.9727
4 | 0.1791 | 1.1191 | 15.6947 | 98.0918
5 | 0.1151 | 0.7196 | 15.8098 | 98.8114
6 | 0.0837 | 0.5234 | 15.8936 | 99.3348
7 | 0.0314 | 0.1964 | 15.9249 | 99.5312
8 | 0.0223 | 0.1394 | 15.9473 | 99.6706
9 | 0.0194 | 0.1218 | 15.9668 | 99.7924
10 | 0.0116 | 0.0729 | 15.9784 | 99.8653
11 | 0.0072 | 0.0473 | 15.9861 | 99.9127
12 | 0.0048 | 0.0303 | 15.9909 | 99.9431
13 | 0.0037 | 0.0229 | 15.9945 | 99.9659
14 | 0.0031 | 0.0195 | 15.9976 | 99.9854
15 | 0.0014 | 0.0091 | 15.9991 | 99.9945
16 | 0.0008 | 0.0054 | 16.0001 | 100

Figure 2. PCA transformation of the wind speed and direction

Wind Farm Power Prediction Model

The original dimensionality of both the NAM and the RUC data has been significantly reduced by feature selection and PCA transformation. In this research, the RUC model data is used for building short-term wind farm power prediction, and the NAM model data is used for long-term prediction. Two ways to predict the power generated by the wind farm are proposed in this research. One is to directly use the weather forecasting data (direct prediction model described in Section 2.5.1), and the other is to use the weather forecasting data to predict the future wind speed first, and then use it to predict the wind farm power.

The Direct Prediction Model of Wind Farm Power

The direct prediction model is used to predict wind farm power based on the weather forecasting data. The short-term prediction model is expressed in equation (1).

$$\hat{y}(t+T) = f\left[\mathrm{WSPC}(t+T),\ \mathrm{WDPC}(t+T),\ \mathrm{ADPC}(t+T),\ \mathrm{PTPC}(t+T)\right] \qquad (1)$$

The function f(.) describes the underlying relationship between the RUC data and wind farm power. The function will be learned in Section 3.1 by the data-mining algorithms using the SCADA and RUC data. In equation (1), the variables have the following meanings:

• ŷ is the predicted wind farm power in the future.
• T is the prediction horizon.
• t is the run-time indicating the time at which a model forecast is started; t + T is the timestamp indicating the time a particular forecast is valid.
• WSPC(t + T) denotes the PCs transformed from the RUC wind speed data, and there are two PCs for wind speed; the other predictors in the function f(.) are the PCs of wind direction, air density and PTdiff_30mb_sfc of RUC data, respectively.
• The output of equation (1) is obtained from SCADA while the inputs are the RUC data.

The long-term prediction model is shown in equation (2).

$$\hat{y}(t+T) = f\left[\mathrm{WSPC}(t+T),\ \mathrm{WDPC}(t+T),\ \mathrm{ADPC}(t+T),\ \mathrm{PTPC}(t+T),\ \mathrm{VEGPC}(t+T),\ \mathrm{SHTFLPC}(t+T)\right] \qquad (2)$$

The function in equation (2) is the same as the one in equation (1) except for two minor differences. One is that the predictors in this function are the PCs transformed from NAM data. The other is that two more predictors, VEG and SHTFL, have been added into this function.

The function f(.) in equations (1) and (2) will be learned in Sections 3.1 and 4.1 by data-mining algorithms. To derive an accurate power prediction model for a wind farm, the prediction performance of the models learned by data-mining algorithms will be evaluated based on the accuracy metrics described in Section 2.6.

The Integrated k-Nearest Neighbour (k-NN) and Wind Speed Prediction Model

The basic equation for the wind power density19 is shown in equation (3):

$$P_w = 0.5\,\rho\,v^{3} \qquad (3)$$

where P_w is the power density (W m−2), ρ is the air density (kg m−3) and v is the horizontal component of the mean free-stream wind velocity (m s−1).
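Equation (3) can be evaluated directly, as in the sketch below. The air density of 1.225 kg m−3 is an assumed standard sea-level value and does not come from the paper.

```python
def wind_power_density(air_density, wind_speed):
    """Power density P_w in W m^-2 from equation (3): 0.5 * rho * v**3."""
    return 0.5 * air_density * wind_speed ** 3


RHO = 1.225  # kg m^-3, assumed standard sea-level air density (not from the paper)

low = wind_power_density(RHO, 5.0)    # 76.5625 W m^-2
high = wind_power_density(RHO, 10.0)  # 612.5 W m^-2
print(low, high, high / low)          # doubling the wind speed gives 8x the power density
```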

Table 4. PCA transformation of the weather forecasting data

Parameter | Unit | Original no. of dimensions | Number of PCs
Wind speed | m s−1 | 16 | 2
Wind direction | º | 16 | 2
Air density | kg m−3 | 4 | 1
PTdiff_30mb_sfc | K | 4 | 1
SHTFL | W m−2 | 4 | 1
VEG | % | 4 | 1



As the nacelle of the turbine is usually located at 60 to 80 m above the ground, the air density ρ is considered constant at that height. Though the wind direction changes, the nacelle is controlled to face the wind to capture the maximum energy from the wind. Therefore, the wind speed is a significant predictor of the wind farm power, and thus, a lot of research has been done to predict wind speed for the wind farm. In this research, the wind speed prediction model follows almost the same method as the direct prediction model for wind power in Section 2.5.1; the only difference is that ŷ in equations (1) and (2) becomes the predicted wind speed. The wind speed prediction model is also learned by data-mining algorithms in Sections 3.2.2 and 4.2.

Prediction of the wind farm power curve with the wind speed as input has been discussed in the literature.20 Kusiak et al.20 showed that the k-NN model accurately predicts the wind farm power curve, given the wind speed. In this paper, the wind speed prediction model and the k-NN model are combined to predict wind farm power as shown in Figure 3. The wind speed and wind farm power of the SCADA data are used to train and test the k-NN model. The wind speed is measured by the turbine anemometers, while the weather forecasting wind data is provided by the RUC or the NAM models.

Metrics for Prediction Accuracy

Different data-mining algorithms and two different methods are used to build a prediction model for wind farm power. Two metrics, the mean absolute error (MAE) and the standard deviation of absolute error (Std), are used to measure prediction accuracy. They are computed to select the most accurate models (1) and (2) extracted with data-mining algorithms, and to compare the direct prediction model and the integrated prediction model. The absolute error (AE), MAE and Std are expressed in equations (4)–(6).

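$$\mathrm{AE} = \frac{\left|\hat{y}(t+T) - y(t+T)\right|}{\mathrm{NRP}} \times 100\% \qquad (4)$$

$$\mathrm{MAE} = \frac{\sum_{i=1}^{N} \mathrm{AE}(i)}{N} \qquad (5)$$

$$\mathrm{Std} = \sqrt{\frac{\sum_{i=1}^{N} \left[\mathrm{AE}(i) - \mathrm{MAE}\right]^{2}}{N-1}} \qquad (6)$$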

where ŷ(t + T) is the predicted wind farm power, y(t + T) is the observed (measured) wind farm power, and NRP is the nameplate rating power, i.e. the rated capacity of the wind farm (the sum of the rated power of all turbines on the wind farm), which is a constant. N is the number of test data points used for the prediction model. The data set for the short- and long-term prediction models will be divided into training and test data sets to train the model and test its accuracy, respectively.
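A minimal sketch of the three metrics follows. The nameplate rating and the power values are hypothetical numbers chosen so that the arithmetic is easy to follow; they are not results from the paper.

```python
import math


def absolute_errors(predicted, observed, nameplate_kw):
    """Equation (4): absolute error as a percentage of the nameplate rating power."""
    return [abs(p - o) / nameplate_kw * 100.0 for p, o in zip(predicted, observed)]


def mean_absolute_error(errors):
    """Equation (5): mean of the absolute errors."""
    return sum(errors) / len(errors)


def std_absolute_error(errors):
    """Equation (6): sample standard deviation (N - 1 denominator) of the absolute errors."""
    mae = mean_absolute_error(errors)
    return math.sqrt(sum((e - mae) ** 2 for e in errors) / (len(errors) - 1))


# Hypothetical values in kW; NRP and both series are assumptions for illustration.
NRP = 100_000.0
predicted = [52_000.0, 61_000.0, 70_000.0, 58_000.0]
observed = [50_000.0, 65_000.0, 69_000.0, 66_000.0]

errors = absolute_errors(predicted, observed, NRP)  # about 2, 4, 1 and 8 (%)
mae = mean_absolute_error(errors)                   # 3.75 %
std = std_absolute_error(errors)                    # about 3.10 %
print(round(mae, 2), round(std, 2))
```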

Figure 3. The structure of the integrated prediction model



Case Study of Short-term Prediction

For short-term prediction, the model is expressed in equation (1), T = 1, 2, 3, . . . , 12 h. The prediction model built here follows the same forecasting steps and horizon of the RUC model. To predict wind farm power 1 to 12 h ahead, 12 prediction models will be constructed, one for each prediction horizon. The short-term wind farm power prediction model has the following properties:

• The maximum forecast length is 12 h.
• A new forecast is issued every hour.
• A new 12 h forecast is issued at 00:00, 03:00, 06:00, 09:00, 12:00, 15:00, 18:00 and 21:00 GMT, and a 9 h forecast is issued at all other hours.

Note that the output of the prediction model is hourly power (i.e. the average power over an hour), and ŷ(t + T) is the predicted hourly power between t + (T − 1) and t + T. For example, if the run-time t is 00:00, the short-term model can generate predicted hourly power from 01:00 to 12:00. Figure 4 shows an example of the output of the short-term prediction model issued at 00:00. The model used to generate this output is discussed in Sections 3.1 and 3.2. As each run of the RUC model takes less than 1 h to compute, the operational forecasting horizon of the short-term wind farm power prediction begins at t + 2. In this paper, a 1 h-ahead prediction is selected for testing; however, a 2 h-ahead power prediction can also be realized.
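The relation between the run-time t, the horizon T and the hour covered by each prediction can be sketched as follows. The function names and the calendar date are assumptions made for illustration; the 12 h and 9 h forecast lengths follow the issue schedule listed above.

```python
from datetime import datetime, timedelta


def forecast_length_hours(run_time):
    """12 h forecast at 00, 03, ..., 21 GMT; 9 h forecast at all other hours."""
    return 12 if run_time.hour % 3 == 0 else 9


def hourly_windows(run_time, max_horizon_h):
    """Return (T, window start, window end) for T = 1 .. max_horizon_h.

    The prediction for horizon T covers the hour from t + (T - 1) to t + T.
    """
    return [
        (T, run_time + timedelta(hours=T - 1), run_time + timedelta(hours=T))
        for T in range(1, max_horizon_h + 1)
    ]


run_time = datetime(2006, 5, 29, 0, 0)  # run-time t = 00:00; the date is arbitrary
windows = hourly_windows(run_time, forecast_length_hours(run_time))

for T, start, end in windows[:2]:
    print(f"t + {T}: {start:%H:%M} to {end:%H:%M}")
# t + 1: 00:00 to 01:00
# t + 2: 01:00 to 02:00
```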

Direct Prediction Model of Wind Farm Power

Algorithm Selection

Five data-mining algorithms that appeared to be the most promising have been used to construct the direct prediction model (1) for wind farm power. They include: the support vector machine regression (SVMreg),15,21 multilayer perceptron network (MLP),16,22 radial basis function (RBF) network,16,12 classification and regression (C&R) tree23,24 and random forest25 algorithms. The SVM is a supervised learning algorithm used in classification and regression. It constructs a linear discriminant function that separates instances as widely as possible. The MLP algorithm is usually used to model complex relationships between inputs and outputs or to find patterns in data. The C&R tree builds a decision tree to predict either classes (classification) or Gaussians (regression). The random forest algorithm grows many classification trees to classify a new object from an input vector. Each tree ‘votes’ for every class, and finally, the forest chooses the classification having the most votes over all the trees in the forest. The RBF is usually used in non-linear regression and classification modeling.

Figure 4. Example of the short-term prediction generated at 00:00 (vertical axis: power in kW; horizontal axis: prediction horizon, 1:00 to 12:00)

To fulfil the task of short-term prediction, 12 prediction models with different forecasting horizons (t + 1, t + 2, . . . , t + 12) need to be constructed. In order to select one uniform algorithm to train the 12 prediction models, the prediction model that predicts 6 h ahead (t + 6) was selected to investigate the five data-mining algorithms. As the original power recorded by SCADA data is at 10 min intervals, every six consecutive data points were aggregated into hourly power (the average of six measured power values). The principal components transformed from RUC data for 6 h-ahead prediction and the hourly power from SCADA data resulted in 2250 instances (data set 1 in Table 5). The run-time of data set 1 in Table 5 considered for analysis starts at ‘5/29/06 00:00’ and continues to ‘8/31/06 17:00’. During this time period, the overall wind farm performance was normal. Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 contains 1798 data points and was used to develop a prediction model with data-mining algorithms. Data set 3 includes 452 data points and was used to test the prediction performance of the model learned from data set 2. Note that the time stamp used in Table 5 is the run-time t, which is different from the time stamp t + T.
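The hourly aggregation and the chronological training/test split described above can be sketched as follows. The 10 min values are placeholders, and dropping an incomplete trailing hour is an assumption of this sketch rather than a documented step of the paper.

```python
def hourly_average(ten_min_values):
    """Average each run of six consecutive 10 min values into one hourly value.

    Values left over from an incomplete trailing hour are dropped (an assumption).
    """
    usable = len(ten_min_values) - len(ten_min_values) % 6
    return [sum(ten_min_values[i:i + 6]) / 6.0 for i in range(0, usable, 6)]


def chronological_split(instances, n_train):
    """Keep the first n_train instances for training and the remainder for testing."""
    return instances[:n_train], instances[n_train:]


# Placeholder 10 min power readings covering two hours (hypothetical values).
power_10min = [float(k) for k in range(12)]
hourly_power = hourly_average(power_10min)  # [2.5, 8.5]

# Index stand-ins for the 2250 hourly instances of data set 1.
train, test = chronological_split(list(range(2250)), 1798)
print(hourly_power, len(train), len(test))  # [2.5, 8.5] 1798 452
```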

The MAE in equation (5) and Std in equation (6) were used to select the most suitable algorithm for building the short-term prediction model (1). Small values of the MAE and Std indicate accurate, stable and robust prediction performance. Table 6 summarizes the prediction accuracy of the models trained by different data-mining algorithms on data set 3 of Table 5. The MLP algorithm outperformed the other four algorithms. The C&R tree and random forest algorithms performed worst. The MLP network algorithm was finally selected for building the direct prediction model for short-term wind farm power. The RBF and MLP are actually two different NN (neural network) algorithms; however, the MLP outperformed the RBF in short-term prediction. The NN structure, including different numbers of hidden units and different types of activation functions for hidden and output neurons, has a significant impact on its prediction accuracy. In this research, 100 NNs with different structures have been trained for the MLP and RBF, respectively. Only the one RBF and the one MLP NN with the best prediction performance were retained, and their prediction error statistics are reported in Table 6.

Figure 5 shows the first 100 observed and predicted wind farm power values from data set 3 of Table 5 (the 6 h-ahead prediction). The predicted power follows the trend of the observed power. Some of the predicted values match the observed values closely, while some do not.

Table 5. The data set description of 6 h-ahead prediction

Data set | Start time stamp | End time stamp | Description
1 | 5/29/06 00:00 | 8/31/06 17:00 | Total data set; 2250 observations
2 | 5/29/06 00:00 | 8/12/06 17:00 | Training data set; 1798 observations
3 | 8/12/06 18:00 | 8/31/06 17:00 | Test data set; 452 observations

Table 6. Error statistics of different models based on data set 3 of Table 5

Algorithm | MAE (%) | Std (%)
SVMreg | 16.89 | 17.92
MLP | 10.94 | 9.99
RBF | 20.32 | 19.68
C&R tree | 25.43 | 22.57
Random forest | 22.19 | 19.89

Short-term Prediction Based on MLP Algorithm

The MLP algorithm has been selected for training all 12 prediction models (1) in this research. Following the steps of Section 3.1.1, 12 direct prediction models for short-term wind farm power were built. Table 7 summarizes the prediction accuracy of the models built for each prediction horizon. Note that in Table 7, t + 1 means 1 h-ahead prediction and t + 12 means 12 h-ahead prediction. The direct prediction model for short-term wind farm power from t + 1 to t + 12 can be realized by building 12 MLP prediction models. The results in Table 7 show that the prediction performance is stable and robust at the prediction horizons t + 1 through t + 12.

Figure 6(a)–(c) illustrates the first 100 predicted and observed power values of the models for three prediction horizons: 3, 8 and 12 h ahead, respectively. Figure 7(a) and (b) illustrates the MAE and Std, two important metrics of prediction accuracy, for the 12 forecasting horizons of the short-term MLP prediction model.

The Integrated Prediction Model and Comparison

The k-NN Model for Wind Farm Power Curve Prediction

The previous research20 has shown that the k-NN model is accurate for prediction of wind farm power curve given the wind speed as input. The predictor for the k-NN model is the average wind speed measured at the nacelle of every turbine of the wind farm. Using the average wind speed as input to the k-NN model, the wind farm power can be predicted accurately when the wind farm is operating under normal conditions. The normal conditions exclude wind speed that is too low or high, turbines undergoing maintenance and low power output due to control issues and environment issues.

Figure 5. The 6 h-ahead prediction of wind farm power (observed and predicted power in kW versus hourly-average testing data points 1 to 100)

Table 7. Error statistics of the MLP direct prediction model of short-term wind farm power

Prediction | MAE (%) | Std (%)
t + 1 | 9.28 | 8.12
t + 2 | 9.35 | 8.21
t + 3 | 9.76 | 8.69
t + 4 | 9.36 | 8.32
t + 5 | 9.97 | 8.93
t + 6 | 10.49 | 9.99
t + 7 | 9.82 | 9.19
t + 8 | 10.57 | 9.91
t + 9 | 8.41 | 8.73
t + 10 | 11.06 | 10.63
t + 11 | 11.19 | 9.08
t + 12 | 11.49 | 10.53

Figure 6. Direct prediction of short-term wind farm power by MLP: (a) 3 h-ahead prediction; (b) 8 h-ahead prediction; (c) 12 h-ahead prediction (observed and predicted power in kW versus testing data points 1 to 100)

To predict hourly power, every six consecutive data points were aggregated into hourly power (the average of six measured power values), and the hourly wind speed was aggregated in the same way. The data set used in the analysis is shown in Table 5. Note that to develop a prediction model with the k-NN algorithm, the weather forecasting data in Table 5 are not used; only the hourly power and wind speed of the SCADA data are used. Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 was used to develop a prediction model with the k-NN algorithm. Data set 3 was used to test the prediction performance of the model learned from data set 2. Table 8 shows the error statistics of the k-NN model over data set 3 of Table 5. The prediction model trained by k-NN delivers accurate, stable and robust predictions given the hourly wind speed as predictor.
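A k-NN regression on wind speed alone can be sketched in a few lines. The training pairs below are hypothetical, and k = 3 is used only because the sample is tiny; this is a generic nearest-neighbour average, not the authors' implementation.

```python
def knn_predict(train_speed, train_power, query_speed, k):
    """Predict power as the mean power of the k training points nearest in wind speed."""
    order = sorted(range(len(train_speed)),
                   key=lambda i: abs(train_speed[i] - query_speed))
    nearest = order[:k]
    return sum(train_power[i] for i in nearest) / len(nearest)


# Hypothetical hourly pairs of wind speed (m/s) and wind farm power (kW).
speeds = [3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
powers = [2_000.0, 6_000.0, 13_000.0, 24_000.0, 38_000.0, 55_000.0, 74_000.0, 90_000.0]

# The three nearest speeds to 6.4 m/s are 6.0, 7.0 and 5.0 m/s.
prediction = knn_predict(speeds, powers, 6.4, k=3)
print(prediction)  # 25000.0
```

With a realistic training set, a larger k would average over more neighbours and smooth out measurement noise in the power curve.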



Comparison of the Integrated and Direct Model

Predicting wind farm power curve with the k-NN algorithm calls for a wind speed prediction model. Follow-ing the method described in Section 2.5.2, a wind speed prediction model can also be built following the same procedure of the direct prediction model for wind farm power. The wind speed prediction model uses the same predictors as the direct prediction model expressed in equation (1) of Section 2.5.1, which is the RUC data after feature selection and PCA transformation. However, the output y in equation (1) is the wind speed other than wind farm power. The fi ve data-mining algorithms used in building the direct prediction model trained different wind speed prediction models, and again, the MLP network algorithm was proved to outperform the other four algorithms. The integrated model for the short-term power prediction model is composed of MLP and k-NN algorithms. Therefore, the basic procedure for integrated model prediction is to predict wind speed with the MLP model fi rst, and then use the predicted wind speed to predict the wind farm power with the k-NN model. As the model-building procedure is obvious, the detail process is not shown in this paper. The statistics of the prediction performance of the integrated model are shown in Table 9. In comparing the error

Figure 7. MAE and Std of the direct prediction model for short-term wind farm power by MLP: (a) MAE (%) versus prediction horizon (t+1 to t+12); (b) Std (%) versus prediction horizon (t+1 to t+12)

Table 8. Error statistics of the k-NN model

Algorithm        MAE (%)   Std (%)
k-NN (k = 25)    1.556     1.465



statistics of Tables 7 and 9, it can be found that the integrated model is less accurate than the direct prediction model in all 12 prediction horizons.

The computational experience reported in Section 3.2.1 showed that the k-NN algorithm provided accurate power curve predictions. Although the k-NN model and the wind speed prediction model performed well individually, the integrated model produced a larger error when predicting future power. This could be because the power is a cubic function of the wind speed, as indicated by the wind power density function (3) of Section 2.5.2. The output of the k-NN model is therefore highly sensitive to its wind speed input, so a small error in the wind speed prediction may lead to a large error in the wind farm power prediction. The integration of the two models did not improve prediction accuracy. Even accurate wind speed prediction cannot guarantee accurate wind farm power prediction; therefore, it is better to predict wind farm power directly than to predict wind speed first.
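The sensitivity argument can be made concrete with a short calculation. The sketch below assumes an idealized cubic relation between power and wind speed, which a real power curve follows only below rated wind speed; the function name `relative_power_error` is hypothetical and is used purely for illustration.

```python
def relative_power_error(relative_speed_error):
    """Relative error in power caused by a relative error in wind speed,
    assuming power is proportional to the cube of wind speed."""
    return (1.0 + relative_speed_error) ** 3 - 1.0


# A 10% overestimate of wind speed inflates the power estimate by about 33%,
# while a 10% underestimate deflates it by about 27%.
for speed_error in (0.10, -0.10):
    power_error = relative_power_error(speed_error)
    print(f"speed error {speed_error:+.0%} -> power error {power_error:+.1%}")
```

Under this assumption, a small wind speed error is roughly tripled when it is carried through to power, which is consistent with the larger error observed for the integrated model.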

Case Study of Long-term Prediction

For the long-term prediction model expressed in equation (2), T = 3, 6, 9, . . . , 84 h. The prediction model built here follows the same forecasting steps and horizons as the NAM model. In order to predict wind farm power from 3 to 84 h ahead, 28 prediction models need to be constructed, one for each long-term prediction horizon. The long-term wind farm power prediction model has the following features:

• Day-ahead prediction with a maximum forecast length of 84 h.
• A new 84 h forecast is issued four times daily, at 00, 06, 12 and 18 GMT.
• A forecast value is saved every 3 h.

Note that the output of the prediction model is the 3 h-power (average power over a 3 h interval), and y (t + T ) is the predicted 3 h-power during t + (T − 3) and t + T. For example, if the run-time t is 5/29/06 00:00, the long-term prediction can generate predicted 3 h-power from 5/29/06 03:00 to 6/1/06 12:00. Figure 8 shows an

Table 9. Error statistics of the integrated prediction model for short-term wind farm power

Prediction horizon   MAE (%)   Std (%)
t + 1                10.21      8.93
t + 2                10.28      9.03
t + 3                10.54      9.38
t + 4                10.11      8.98
t + 5                11.17     10.01
t + 6                11.75     11.19
t + 7                10.21      9.56
t + 8                10.99     10.31
t + 9                 9.67     10.04
t + 10               12.72     12.22
t + 11               12.08      9.81
t + 12               12.41     11.37

Figure 8. Example of the long-term prediction generated at 00:00



example of the forecasting horizon of the long-term prediction model at 00:00, and this model will be built in Sections 4.1 and 4.2. Note that not all 28 prediction horizons from t + 3 to t + 84 are shown in Figure 8. As each run of the NAM model takes less than 6 h to compute, the operational forecasting horizon of the long-term wind farm power prediction starts from t + 6. Predictions 6 h ahead and beyond can therefore be realized in practice; in this paper, however, the 3 h-ahead prediction is also considered in order to validate the methodology for long-term prediction.
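The relation between the run-time t, the horizon T and the 3 h averaging window can be sketched as follows. The function name `long_term_windows` is hypothetical; the step of 3 h, the maximum horizon of 84 h and the example run-time are taken from the text.

```python
from datetime import datetime, timedelta


def long_term_windows(run_time, step_h=3, max_h=84):
    """Return (T, window_start, window_end) for every horizon T.

    The model output y(t + T) is the average power over the interval
    from t + (T - step_h) to t + T.
    """
    windows = []
    for horizon in range(step_h, max_h + 1, step_h):
        start = run_time + timedelta(hours=horizon - step_h)
        end = run_time + timedelta(hours=horizon)
        windows.append((horizon, start, end))
    return windows


run_time = datetime(2006, 5, 29, 0, 0)  # run-time used as the example in the text
windows = long_term_windows(run_time)
print(len(windows))   # number of horizons, one model per horizon
print(windows[0])     # first window, ending at t + 3
print(windows[-1])    # last window, ending at t + 84
```

Each of the resulting horizons corresponds to one prediction model, so the list length matches the 28 models mentioned above.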

Direct Prediction Model for Wind Farm Power

Algorithm Selection

Five data-mining algorithms (the same as used previously) were selected to train the direct prediction model (2) for long-term wind farm power. For long-term prediction, 28 prediction models with different forecasting horizons (t + 3, t + 6, . . . , t + 84) need to be constructed. In order to select one uniform algorithm to train the 28 prediction models, the prediction model that predicts 45 h ahead (t + 45) was selected to investigate the five data-mining algorithms.

As the original power data recorded by SCADA are at 10 min intervals, every 18 consecutive data points were aggregated into a 3 h-power (the average of 18 measured power values). The principal components transformed from the NAM data for the 45 h-ahead prediction and the 3 h-power from the SCADA data resulted in 141 instances (data set 1 in Table 10). As only a 6 week long NAM data set was available in this research, and the power record is aggregated from 10 min data to 3 h data, the number of training and testing instances for long-term prediction is much smaller than that for short-term prediction. The run-time of data set 1 in Table 10 starts at ‘5/29/06 00:00 AM’ and continues to ‘7/13/06 6:00 PM’. During this time period, the overall wind farm performance was normal. Data set 1 was divided into two subsets, data set 2 and data set 3. Data set 2 contains 113 data points and was used to develop a prediction model with data-mining algorithms. Data set 3 includes 28 data points and was used to test the prediction performance of the model learned from data set 2. Note that the time stamps used in Table 10 are the run-time t rather than the time stamp t + T.
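The aggregation and the train/test split described above can be sketched as follows. The function names `aggregate_3h` and `chronological_split` are hypothetical, and the handling of an incomplete trailing window (it is dropped) is an assumption not stated in the text. The window length of 18 points and the 113/28 split are taken from the text.

```python
from statistics import mean


def aggregate_3h(power_10min, points_per_window=18):
    """Average consecutive blocks of 10 min power values into 3 h-power.

    Assumption: a trailing block with fewer than points_per_window
    values is discarded.
    """
    n_windows = len(power_10min) // points_per_window
    return [
        mean(power_10min[i * points_per_window:(i + 1) * points_per_window])
        for i in range(n_windows)
    ]


def chronological_split(instances, n_train):
    """Split time-ordered instances into training and test subsets."""
    return instances[:n_train], instances[n_train:]


# Hypothetical 10 min series covering 6 h (36 values), for illustration only.
toy_series = list(range(36))
print(aggregate_3h(toy_series))

# 141 instances split into 113 training and 28 test instances, as in Table 10.
train, test = chronological_split(list(range(141)), n_train=113)
print(len(train), len(test))
```

Keeping the split chronological, rather than random, means the test instances always follow the training instances in time, which matches the start and end time stamps listed in Table 10.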

Using the metrics introduced for the short-term prediction in Section 3.1.1 (MAE in equation (5) and Std in equation (6)), the five data-mining algorithms are compared. Table 11 summarizes the prediction accuracy of the models trained by the different data-mining algorithms on data set 3 of Table 10. The MLP outperformed the other algorithms in long-term prediction, as it did in short-term prediction. Therefore, the MLP network algorithm was selected for building the direct prediction models for long-term wind farm power.

Table 10. The data set description of 45 h-ahead prediction

Data set   Start time stamp   End time stamp   Description
1          5/29/06 00:00      7/13/06 18:00    Total data set; 141 observations
2          5/29/06 00:00      6/26/06 12:00    Training data set; 113 observations
3          6/26/06 18:00      7/13/06 18:00    Test data set; 28 observations

Table 11. Error statistics of different models based on data set 3 of Table 10

Algorithm        MAE (%)   Std (%)
SVMreg           15.93     17.84
MLP              11.87      9.87
RBF              31.51     27.36
C & R tree       28.76     26.94
Random forest    25.69     21.86
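The comparison in Table 11 can be illustrated with a short sketch. Equations (5) and (6) are not reproduced in this section, so the definitions below are assumptions: the absolute error is expressed as a percentage of a reference power `capacity_kw` (a hypothetical parameter), MAE is its mean, and Std is its sample standard deviation. The function name `error_statistics` and the toy values are also hypothetical; only the Table 11 figures come from the paper.

```python
from statistics import mean, stdev


def error_statistics(observed, predicted, capacity_kw):
    """Return (MAE, Std) of the absolute percentage error.

    Assumed definition: error_i = |predicted_i - observed_i| / capacity_kw * 100.
    """
    errors = [
        abs(p - o) / capacity_kw * 100.0
        for o, p in zip(observed, predicted)
    ]
    return mean(errors), stdev(errors)


# Hypothetical toy values (kW), for illustration only.
mae, std = error_statistics([100.0, 200.0, 300.0], [110.0, 190.0, 330.0], 1000.0)
print(round(mae, 3), round(std, 3))

# MAE (%) and Std (%) as reported in Table 11.
table_11 = {
    "SVMreg": (15.93, 17.84),
    "MLP": (11.87, 9.87),
    "RBF": (31.51, 27.36),
    "C & R tree": (28.76, 26.94),
    "Random forest": (25.69, 21.86),
}
# Select the algorithm with the lowest MAE.
best_algorithm = min(table_11, key=lambda name: table_11[name][0])
print(best_algorithm)
```

In Table 11 the algorithm with the lowest MAE also has the lowest Std, so selecting on either metric gives the same choice.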



Figure 9 shows the observed and predicted wind farm power from data set 3 of Table 10 (the 45 h-ahead prediction). The predicted power follows the decreasing and increasing trends of the observed power. Some of the predicted values match the observed values closely, while others do not.

Long-term Prediction Results

The MLP network algorithm was selected to train all 28 prediction models (2) in this research. Following the same steps as in Section 4.1.1, 28 direct prediction models for long-term wind farm power were built. Table 12 summarizes the prediction accuracy of the models built for different prediction horizons. Note that only 16 of the 28 prediction results are shown in this table, as 16 results are sufficient to demonstrate the performance of the methods and models built in this research. The direct prediction of long-term wind farm power from t + 3 to t + 84 can thus be realized by building 28 MLP prediction models. The prediction performance is stable and robust across the different prediction horizons, as illustrated in Table 12.

Figure 10(a)–(c) illustrates the predicted and observed power of the models for the 21, 57 and 81 h-ahead predictions, respectively. Figure 11(a),(b) illustrates the MAE and the standard deviation (Std), two important metrics of prediction accuracy, at the different forecasting horizons of the long-term prediction model.

Figure 9. The 45 h-ahead prediction of the wind farm power by MLP (vertical axis: Power (kW); horizontal axis: Testing data (3-hour average))

Table 12. Error statistics of direct prediction MLP model of long-term wind farm power

Prediction horizon   MAE (%)   Std (%)
t + 3                 5.93      4.23
t + 9                 9.12      8.91
t + 15                9.92      8.04
t + 21                9.39      7.28
t + 27               10.35      6.41
t + 33               11.81     12.24
t + 39               11.63      7.79
t + 42               11.49     10.06
t + 45               12.87     10.23
t + 51               10.97     10.92
t + 57               13.82      9.61
t + 63               11.88      9.95
t + 69                9.56      7.68
t + 75               10.83      9.32
t + 81                6.37      6.19
t + 84               10.57      8.78



Integrated Prediction Model and Comparison

The integrated prediction model for long-term wind farm power follows the same procedure as the short-term model in Section 3.2. The long-term wind speed prediction model uses the same predictors as the direct prediction model expressed in equation (2) of Section 2.5.1, namely the NAM data after feature selection and PCA transformation. However, the output y in equation (2) is the wind speed rather than the wind farm power. The MLP network algorithm outperforms the other four algorithms when training the wind speed prediction model. The integrated model for long-term power prediction is composed of MLP and k-NN algorithms,

Figure 10. Direct prediction of long-term wind farm power by MLP: (a) 21 h-ahead prediction; (b) 57 h-ahead prediction; (c) 81 h-ahead prediction



which predict wind speed with the MLP model first, and then use the predicted wind speed to predict the wind farm power with the k-NN model. The statistics of the prediction performance of the integrated model are shown in Table 13. Comparing the error statistics of Tables 12 and 13 shows that the integrated model does not improve on the prediction accuracy of the direct prediction model in Section 4.1 at any of the long-term prediction horizons.

The reason for the difference in predicting the power output has been discussed in Section 3.2.2. Accurate wind speed prediction can improve wind farm power prediction but cannot guarantee it. Determining power from the predicted wind speed is not a reliable approach to wind farm power prediction, as the power prediction

Figure 11. MAE and Std of the direct prediction model for long-term wind farm power by MLP: (a) MAE; (b) Std

Table 13. Error statistics of integrated prediction models for long-term wind farm power

Prediction horizon   MAE (%)   Std (%)
t + 3                 6.22      4.44
t + 9                 9.57      9.35
t + 15               11.11      9.01
t + 21               10.52      8.15
t + 27               11.59      7.18
t + 33               13.34     13.83
t + 39               13.14      8.80
t + 42               12.98     11.37
t + 45               14.01     11.13
t + 51               11.93     11.88
t + 57               16.72     11.62
t + 63               14.37     12.04
t + 69               11.09      8.91
t + 75               12.56     10.81
t + 81                7.01      6.81
t + 84               11.62      9.65



accuracy is sensitive to the wind speed. A small error in wind speed prediction can lead to a large error in wind farm power prediction, so even accurate wind speed prediction and k-NN models cannot guarantee accurate wind farm power prediction.
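The comparison between the direct and integrated long-term models can be checked directly from the MAE values reported in Tables 12 and 13. The sketch below only restates those figures; the variable names are hypothetical, and the keys are the prediction horizons in hours.

```python
# MAE (%) of the direct MLP model, from Table 12 (key: horizon T in hours).
direct_mae = {
    3: 5.93, 9: 9.12, 15: 9.92, 21: 9.39, 27: 10.35, 33: 11.81,
    39: 11.63, 42: 11.49, 45: 12.87, 51: 10.97, 57: 13.82, 63: 11.88,
    69: 9.56, 75: 10.83, 81: 6.37, 84: 10.57,
}
# MAE (%) of the integrated MLP + k-NN model, from Table 13.
integrated_mae = {
    3: 6.22, 9: 9.57, 15: 11.11, 21: 10.52, 27: 11.59, 33: 13.34,
    39: 13.14, 42: 12.98, 45: 14.01, 51: 11.93, 57: 16.72, 63: 14.37,
    69: 11.09, 75: 12.56, 81: 7.01, 84: 11.62,
}

# Increase in MAE (percentage points) when moving from direct to integrated.
mae_increase = {h: integrated_mae[h] - direct_mae[h] for h in direct_mae}

integrated_always_worse = all(delta > 0 for delta in mae_increase.values())
worst_horizon = max(mae_increase, key=mae_increase.get)
mean_increase = sum(mae_increase.values()) / len(mae_increase)

print(integrated_always_worse)
print(worst_horizon, round(mae_increase[worst_horizon], 2))
print(round(mean_increase, 2))
```

For the 16 horizons reported, the integrated model has a higher MAE than the direct model at every horizon, which supports the conclusion drawn in the text.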

Conclusion

In this paper, a short-term prediction model with a maximum 12 h forecast length and a long-term prediction model with a maximum 84 h forecast length were built using weather forecasting data as predictors. The boosting tree algorithm and PCA transformation were used to reduce the predictor data dimension and enhance prediction accuracy. Among the five data-mining algorithms considered in this research, the MLP network algorithm (a neural network algorithm) outperformed the other four algorithms in building both long- and short-term prediction models. Two methods for building prediction models were compared. The integrated models (integrated k-NN and MLP models), which used the wind speed predicted by the MLP model as input for the k-NN model to predict wind farm power, turned out to provide less accurate and less stable predictions than the direct prediction models for wind farm power in both the short and the long term.

Both short- and long-term prediction models predicted the wind farm power well at different time scales and horizons. The accuracy of a prediction model depends highly on its predictors, the weather forecasting data: the more accurate the weather forecasting data, the better the prediction performance of the model. Unlike time series and persistence models, the prediction models based on weather forecasting data showed no obvious tendency for the error to increase as the prediction horizon became longer. However, for predictions within a 5 h horizon, a time series (persistence) prediction model using wind farm data as input could outperform the MLP direct prediction model using weather forecasting data as input. One limitation of this paper is that no specific information was available about the terrain, the wind farm location and the weather-forecasting grid points, nor any other information that would allow the data-mining results to be explained using existing theories. The other limitation is that only 3 months of SCADA data from one wind farm were available, and thus the seasonal performance of the proposed methodology could not be validated. These two limitations are due to the current data-sharing practices of the wind energy industry. However, once more data are accessible, the proposed models can be further tested.

The long-term prediction models are powerful tools for operation management in the wind energy market, and the short-term prediction model can be helpful to the on-site management of the wind farm. The ultimate goal of the research presented in this paper is to further improve the accuracy of prediction models for wind farms. One avenue to be pursued in future research is to incorporate on- and off-site observations other than weather forecasting data into the prediction model. It is likely that other data-mining algorithms, e.g. cascade neural networks or fuzzy logic, would further enhance performance.

