Performance assessment of photovoltaic modules based on daily
energy generation estimation
Jing-Yi Wang: Beihang University, Beijing, 100191, China (email: [email protected]);
Zheng Qian (corresponding author): Beihang University, Beijing, 100191, China (email:
Hamidreza Zareipour: Department of Electrical and Computer Engineering, University of Calgary, Calgary,
AB T2N1N4, Canada (email: [email protected])
David Wood: Department of Mechanical and Manufacturing Engineering, University of Calgary, Calgary,
AB T2N1N4, Canada (email: [email protected])
Declarations of interest: none.
Abstract: Performance assessment can improve photovoltaic (PV) plant economics by identifying the need
for timely corrective actions. Performance assessment of PV plants is usually based on the comparison
between measured and modeled outputs of PV plants, i.e., an alarm occurs for abnormal operation when a
significant difference is detected. For such methods, it is critical to estimate the potential electricity
generation of PV plants in normal operation. However, unpredictable conditions affecting solar modules
make it difficult to develop reliable PV models. This paper proposes a practical approach to improve the
estimation of daily energy generation of PV plants for performance assessment. It has two main
components: (i) a data preprocessing method; and (ii) sub-models for different weather conditions. The proposed
data preprocessing method detects outliers by comparing normalized outputs of adjacent inverters
instantaneously. It is robust against erroneous measurements in normal operation. Sub-models in different
weather conditions are developed using a Principal Component Analysis and Support Vector Machine
method for better representation of PV plant outputs. Results show that the proposed method can detect
deviations between the estimated and the measured daily energy generation of 10%. Moreover, false alarms,
i.e., an abnormal point identified while the system is operating normally, are significantly reduced.
Keywords: PV plant, performance assessment, daily energy generation estimation, data preprocessing,
weather conditions
Nomenclature:
|dG/dt| Absolute value of the derivative of irradiance
a,b Coefficients for the calculation of the expected power using Gβ
Apv Area of PV modules (m2)
Eest Daily energy estimation of PV modules (kWh)
Emeas Daily measurements of PV energy generation (kWh)
F1, …, F6 Indices defined in Appendix
f1, f2, f3 Constants to find ηc (%)
F7 7th index defined in equation (4)
G0 Extraterrestrial solar irradiance (W/m2)
G0,i Extraterrestrial solar irradiance of ith sample in a day (W/m2)
G0N,i Normalized value of extraterrestrial solar irradiance of ith sample in a day (W/m2)
Gd The deviation between G0 and Gs (W/m2)
Gs Solar irradiance on horizontal plane (W/m2)
Gs,i Solar irradiance on horizontal plane of ith sample in a day (W/m2)
GsN,i Normalized value of solar irradiance on horizontal plane of ith sample in a day (W/m2)
GSTC Solar irradiance at STC, 1000 W/m2
Gβ Measured solar irradiance on module surface (W/m2)
Gβ,i The measured solar irradiance on module surface (Gβ) of the ith sample (W/m2)
i Sample index
j Day index
Kfluc Fluctuation coefficient
kT Temperature coefficient (/°C)
N Number of samples in a day
Nd Number of days in test dataset
Padj_STC The output power adjusted to a constant PV cell temperature of 25 °C (W)
PDC_est Estimation of output power of PV modules (W)
PDC_meas PV output power measured by inverters (W)
Pinv Observed output power of one inverter normalized over all inverters at a time point (W)
PSTC Output power of PV modules tested at STC (W)
Q1 First quartile
Q3 Third quartile
resE Relative residual between Emeas and Eest (%)
Ta Ambient temperature (°C)
Tc PV cell temperature (°C)
TSTC Operating PV cell temperature at STC, 25 °C
Vf Wind speed (m/s)
w Mounting coefficient
αMP Temperature factor of maximum power (%/°C)
ηc PV efficiency (%)
ηref Reference PV efficiency at STC (%)
Abbreviations
CPR Corrected Performance Ratio
FA Firefly Algorithm
GA Genetic Algorithm
GWO Grey Wolf Optimizer
IQR Interquartile range (Q3–Q1)
MPP Maximum Power Point
MPPT Maximum Power Point Tracking
PCA Principal Component Analysis
PR Performance Ratio
PSO Particle Swarm Optimization
PV Photovoltaic
RBF Radial Basis Function
RE Relative Error
RMSD Root Mean Square Deviation
RMSE Root Mean Square Error
SCA Sine Cosine Algorithm
SSA Salp Swarm Algorithm
STC Standard Test Conditions: solar irradiance = 1000 W/m2, cell temperature = 25 °C, Air Mass 1.5
SVM Support Vector Machine
1. Introduction
While the cost of photovoltaic (PV) energy is dropping, improving energy production over the life of a
plant is still a key factor in building the business case for a PV plant. Different factors such as shadowing,
soiling, aging of modules, and faults in the components can lead to a significant loss of energy production if
not detected and corrected in time [1]. To this end, performance assessment can potentially improve PV plant
economics by identifying the need for timely corrective actions, and thus, preventing or reducing economic
loss [1]. Generally, an indicator is selected to represent the performance of a solar plant. Performance
assessment systems are designed to compare the measured indicator with its “potential” value, and produce
an alarm if there are meaningful differences. Potential values of the selected indicator are generally defined
as the expected values of this indicator in normal operation. Normal operation is when the system is
physically intact, there is no soiling, and no unaccounted interruptions interfere with energy production.
Systems with micro-inverters or charge controllers allow assessment of an individual module
performance by observing the measurements of micro-inverters [2]; for PV systems with central or string
inverters, performance assessment is more difficult but equally important. Various methodologies for PV
systems with central or string inverters have been reported [3-6]. Some approaches apply statistical analysis to
a measured performance indicator such as conversion efficiency, i.e., the ratio between output power
and incident solar irradiance [3]. Characteristics of a plant’s current-voltage (I-V) curves are also used [4].
Some others are based on comparison of normalized measurements from adjacent PV strings that are
measured simultaneously to detect a fault in one of the strings [5,6].
Another group of performance assessment algorithms is based on the comparison between measured
and modeled energy generation to identify anomalies [1,5,7-12]. For these methods, an accurate estimation
of the potential electricity production in normal operation is of great importance [11-14]. However, a large
number of unpredictable conditions, which affect the PV performance, pose a serious challenge to
developing a reliable model for estimation of PV output, and call for further research. Estimation models
for performance assessment of PV plants have therefore been built with different techniques, such as physical
models [5,7,8], regression models [1,11,13,15], artificial intelligence techniques [16], or commercial
simulation packages [17]. Model development can be based on information about the physical characteristics supplied on a
manufacturer datasheet and necessary meteorological measurements along with adjustable parameters
[5,7,8], remote monitoring obtained from meteorological and satellite observation data [9] or historical
measurements [1,11,13,15]. Fault classification and detection can rely on the rate of power or energy losses,
i.e., the relative deviation of the measured production from the expected value [1,5,7-12].
In this paper, a methodology is proposed to improve the estimation of the potential energy production
of a solar PV plant for performance assessment. A PV plant consists of PV modules, junction boxes, inverters,
sensors, communication devices, etc. In this paper, we focus only on the performance assessment of the PV
modules. A PV performance assessment system typically consists of: (i) measuring PV power and
meteorological variables, (ii) modeling PV output and estimating the potential energy production, and (iii)
performance assessment of PV modules. In this paper, we focus on the second part.
The proposed methodology is focused on two particular issues: making the data preprocessing method
robust against measured abnormal values in normal operation such as erroneous measurements; and
capturing the characteristics of PV behavior in different operating conditions. In the next section, we provide
further elaboration on why data preprocessing and considering weather conditions are important.
Accordingly, the main contributions of this paper are: (i) a data preprocessing method is designed to be
robust against erroneous measurements in normal operation; (ii) sub-models of PV plants are developed for
different weather conditions to improve the estimation of energy generation.
The rest of this paper is organized as follows. In Section 2, the background of existing approaches to
improve potential output estimation for performance assessment is introduced. Section 3 describes the
modeled PV plant and analyzes the specific aspects of the measurements used in developing the proposed
method. Section 4 presents the proposed data preprocessing method and the sub-models development in
different weather conditions, as well as the other stages of the performance assessment of PV systems. In
Section 5, numerical results and discussions are presented. Finally, Section 6 concludes the paper.
2. Background
This section presents a literature review of existing approaches improving potential output estimation
of PV plants for performance assessment. There are two commonly used approaches: data preprocessing and
sub-models development.
2.1 Data preprocessing
Data preprocessing is widely applied to exclude outliers from data sets used for developing the model
of PV modules for performance assessment, since measurements from a PV plant are not always suitable
for model development. For example, partial shading on PV arrays significantly reduces the incident
irradiance on certain PV modules; therefore, measurements with low irradiance values are
eliminated in the presence of shading to avoid its influence [18]. Observations corresponding to zero power
output have also been removed because they carry no information for model development [10]. In addition, some
common erroneous measurements have also been dealt with. For instance, the maximum power point
tracking (MPPT) of inverters frequently fails to follow the maximum power point (MPP) accurately when
the irradiance changes rapidly [1,19]. In relatively large PV plants, the local pyranometers are
aligned and maintained regularly. For residential or commercial PV systems, however, the pyranometers
may not be aligned with the PV modules, and the analyst may be unaware of the inaccurate measurements
[19].
In [11], the outliers were identified through visual inspection. In [1], the MPPs of PV modules are
roughly estimated using parameters in the datasheet. Then measurements differing by more than ±10% of
the estimation results are eliminated. In [19], the linear relationship between the MPP current and the solar
irradiance is used to check if the pyranometer is aligned with PV modules. The measurements with a non-
linear current-irradiance characteristic are eliminated. In [12,19,20] only the data for clear days were used
for modeling to avoid more uncertainty in changing weather conditions. However, these methods are not
robust against those erroneous measurements in normal operation, and may lead to a possible false alarm.
2.2 Sub-models development
In order to better estimate PV potential output, sub-models of PV plants for performance assessment
have been developed in the literature. The PV efficiency changes with the irradiance and cell temperature. In [11],
multiple models corresponding to different sunlight levels perform more accurately than a global model. In
[15], regression of different models over irradiance intervals is applied as a piecewise approach to capture
the variability owing to irradiance levels. In [21], the regression model is developed at different irradiance
values ranging from 0 to 1200 W/m2 in 10 W/m2 steps, as well as ambient temperature values from -10 to
+40 °C in 5 °C steps.
For PV power forecasting, weather is widely taken into consideration [22,24]. Solar radiation is stable
on sunny days, but can fluctuate significantly on cloudy and rainy days [23]. PV power responds
instantaneously to changes in solar irradiance and is therefore intermittent and undispatchable when solar
irradiance is varying [23]. As a result, the forecasting accuracy may be significantly different for sunny days
versus cloudy days [24].
In terms of PV performance assessment, weather is also important. The accuracy of PV models strongly
depends on the accuracy of measurements used for model development. If the pyranometer is misaligned
with the PV array, the errors of solar irradiance measurements are different on a sunny day and a partly
cloudy day. This is analyzed in Section 3.2. Besides, MPPs can be better identified on a sunny day. On a
partly cloudy day, MPPT may not be able to track MPPs accurately due to the fast changing irradiance [1],
which directly influences the accuracy of the developed model.
3. Data analysis
In this section, we introduce the PV plant and provide a discussion on its historical data used for
numerical simulations. Our method is proposed based on the discussions.
3.1 PV plant
The methodology is developed and tested using historical measurements from a 9.8 MW roof-mounted
PV plant located in Southeast China (longitude 120°33’37”E, latitude 30°35’53”N, altitude 46m). It has a
subtropical monsoon climate with two seasons: a rainy summer and a dry winter.
Figure 1 shows the layout of the PV plant consisting of 38,786 TSM-PC05A polycrystalline silicon
modules installed on the roofs of eight buildings marked in the figure. Every 22 modules are connected in
series to form a string and every 16 strings are in parallel and connected to one junction box. The plant
contains 17 inverters. The numbers of junction boxes connected to the inverters are different.
Fig. 1 The overall layout of the PV plant
The characteristics of the solar module are in Table 1. The electrical parameters are found in the
datasheet provided by the manufacturer. The modules have been in operation since the plant was completed
and commissioned before January 2014. In this case study, we use the DC output power of inverters and local
meteorological data collected from June 24, 2015 to June 23, 2016. The modules had therefore been
light-exposed for more than one year before the start of the dataset, and their performance is relatively
stable. The power generation of the PV modules degrades by 0.8% every year. We did not consider the
impact of this degradation because it is negligible over the relatively short duration of the data.
For longer term use, the impact of performance degradation needs to be considered. The performance
degradation of PV modules commonly has two types of causes. The first type is long-term use,
which is unavoidable and affects all the PV modules in the PV plant. The presented method does not aim to
detect this type of performance degradation. Consequently, data with this type of degradation will not be
considered as outliers. One potential solution is to refit the model annually to update its
parameters. The second type is damage to certain PV modules, such as cracks. In the data
preprocessing stage, which is presented later in Section 4.2, the comparison among the measurements of DC
output power of the different inverters is able to identify salient performance degradation due to damage
to certain modules.
Table 1 Characteristics of PV module
Manufacturer Trina Solar
Type Polysilicon
Model TSM-PC05A
Peak power 255 W
Reference PV efficiency (ηref) 15.6%
Temperature coefficient of power (αMP) -0.41%/°C
The PV modules are installed close to the flat roofs which are all tilted at the same angle as the latitude
(30°35’53”) and facing due South (azimuth = 180°). There is no mutual shading or shading by adjacent
objects.
The power of PV modules measured by inverters and the meteorological measurements (one
pyranometer and one anemometer) are both averaged and sampled every minute. The measurements and
equipment are shown in Table 2.
Table 2 Data and Measurement Methods
Measurements Equipment
Power of PV modules MPPT of inverters (Sungrow SG630MX)
Solar irradiance Delta ohm LP PYRA 02 Pyranometer
Ambient temperature Pt100 (class B) temperature sensors (-50 °C to +70 °C)
Wind speed Thies Clima small wind sensor (0-40 m/s) at 2 m above the PV array
3.2 Discussion of practical issues with measurements
The implementation of the PV model development poses several challenges that we discuss in this
section. One such problem is erroneous measurements. Among all the measurements used for PV model
development, the estimation accuracy is strongly dependent on irradiance and PV array output power
measured by inverters. For PV plants without regular checks, erroneous measurements of irradiance
may be due to a misaligned or faulty pyranometer. Errors may likewise occur in the measurements of the PV
array’s MPPs, since the MPPT may not track the real MPPs precisely in fast changing irradiance
conditions [1,19]. This section analyzes the error sources for the PV plant studied here. Our
methods are proposed based on this analysis.
1) Erroneous measurements of irradiance
The measured solar irradiance and output power of one inverter on a sunny day and a partly cloudy day
are presented in Figure 2. Note that the peaks of the measured solar irradiance and output power occur at
different times on the sunny day but coincide on the partly cloudy day.
a. Mar 1, 2016, sunny day, the peaks are indicated by vertical lines
b. Nov 3, 2015, partly cloudy day
Fig. 2. Measured output power and solar irradiance
There are several reasons to explain the difference between the two figures. The PV power peaks at
12:18, and the measured solar irradiance peaks at 12:36 in Figure 2-a. The real local solar noon was 12:18
on March 1, 2016. This indicates a possible error in the measurement of the solar irradiance. This error could
be caused by an error in the azimuth angle of the pyranometer relative to the PV array. Note that solar
modules receive more direct solar radiation on clear days and more diffuse solar radiation on cloudy days.
Direct solar radiation is sensitive to changes in the azimuth angle, while diffuse solar radiation is
barely affected by it. Therefore, the time difference between the peaks of the measured solar irradiance and
power is more obvious on a sunny day.
On the other hand, PV cell temperature strongly affects the PV performance, which should also be
analyzed. PV cell temperature (Tc) can be estimated using:
$$T_c = T_a + w \left( \frac{0.32}{8.91 + 2 V_f} \right) G_\beta \qquad (1)$$
where Ta is the ambient temperature, Gβ is the irradiance incident on PV modules, Vf is the wind speed, and w is
the mounting coefficient, given as 1, 1.2, 1.8 and 2.4 for free standing, flat roof, sloped roof, and façade
integrated installation types, respectively [25]. We adjust the measured DC power of inverters (PDC_meas) to
a constant cell temperature of 25°C (TSTC), the cell temperature at Standard Test Conditions (STC), using the
temperature coefficient of maximum power (αMP). The adjusted power (Padj_STC) is calculated using:
$$P_{adj\_STC} = \frac{P_{DC\_meas}}{1 + \alpha_{MP} \left( T_c - T_{STC} \right)} \qquad (2)$$
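Equations (1) and (2) can be illustrated with a minimal Python sketch. The function names, the example inputs, and the default w = 1.2 (flat roof) are assumptions made here for illustration only; the text does not state which mounting coefficient was used for this plant. The value of αMP is taken from Table 1 (-0.41%/°C).

```python
def cell_temperature(t_ambient, g_beta, wind_speed, w=1.2):
    """Equation (1): T_c = T_a + w * 0.32 / (8.91 + 2 * V_f) * G_beta.

    w=1.2 (flat roof) is an assumed default, not a value confirmed by the paper.
    """
    return t_ambient + w * 0.32 / (8.91 + 2.0 * wind_speed) * g_beta


def adjust_power_to_stc(p_dc_meas, t_cell, alpha_mp=-0.0041, t_stc=25.0):
    """Equation (2): P_adj_STC = P_DC_meas / (1 + alpha_MP * (T_c - T_STC)).

    alpha_mp is expressed per degree C (Table 1 gives -0.41 %/degree C).
    """
    return p_dc_meas / (1.0 + alpha_mp * (t_cell - t_stc))


# Hypothetical one-minute sample: 20 degree C ambient, 800 W/m2, 1 m/s wind, 500 kW DC.
t_c = cell_temperature(t_ambient=20.0, g_beta=800.0, wind_speed=1.0)
p_adj = adjust_power_to_stc(p_dc_meas=500e3, t_cell=t_c)
```

Because αMP is negative, a cell temperature above 25 °C yields an adjusted power larger than the measured power.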
Figure 3-a depicts the measurements of Gβ, Vf and Ta for March 1, 2016, which was a sunny day. Figure
3-b shows the adjusted power at STC versus the measurements of power. The power at STC peaks at 12:19,
which coincides closely with the local solar noon. This indicates that the PV modules are installed facing
south.
a. Measured meteorological parameters
b. Measured power versus adjusted power at STC
Fig. 3 Mar 1, 2016, Meteorological measurements and output power
In order to further validate the azimuthal misalignment of the pyranometer, we used TRNSYS software [26] to
simulate irradiance on two surfaces with different azimuthal angles. The results showed that an azimuthal
angle difference of 25° between the two surfaces would lead to a displacement of 18 min between the peaks
of the two simulated curves of irradiance. Accordingly, we concluded that the pyranometer was misaligned
with an azimuthal angle error of about 25° relative to the PV array.
2) Erroneous measurements of output power measured by inverters
The accuracy of the MPP power measured by inverters may be low due to poor MPPT when the
irradiance changes quickly [1,19]. As shown in Figure 4, the adjusted power of the PV array at STC has a
quasi-linear dependence on the solar irradiance on a partly cloudy day. However, some points
are relatively dispersed, especially those with fast changing irradiance. This is because the MPPT is not
able to track the MPPs accurately on a partly cloudy day due to the fast changing irradiance [1]. We use the
absolute value of the derivative of irradiance to quantify the fluctuation of irradiance, which is denoted by
|dG/dt|. The value of |dG/dt| of the ith sample, i.e., |dG/dt|i, is calculated by equation (3).
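$$\left| \frac{dG}{dt} \right|_i = \begin{cases} 0, & i = 1 \\ \left| G_{\beta,i} - G_{\beta,i-1} \right|, & i > 1 \end{cases} \qquad (3)$$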
where i denotes the ith sample; and Gβ,i denotes the measured solar irradiance on module surface (Gβ) of the
ith sample. Note that a majority of data with high values of |dG/dt| cause non-linear power-irradiance
characteristics in Figure 4.
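As a minimal sketch of equation (3), the fluctuation measure is a first difference of the one-minute irradiance samples. The function name and the example values are hypothetical.

```python
def irradiance_fluctuation(g_beta):
    """Return |dG/dt|_i for each sample per equation (3).

    The first sample has no predecessor, so its value is 0; every later
    sample is the absolute difference from the previous sample.
    """
    if not g_beta:
        return []
    return [0.0] + [abs(g_beta[i] - g_beta[i - 1]) for i in range(1, len(g_beta))]


# Hypothetical irradiance samples in W/m2, one per minute.
fluctuation = irradiance_fluctuation([500.0, 520.0, 300.0, 310.0])
```

A large value, such as the 220 W/m2 step in the third sample above, marks a point where MPPT errors are more likely.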
Fig. 4 Nov 3, 2015, example of output power at STC versus solar irradiance on a partly cloudy day
In summary, the erroneous measurements come from two sources: (i) the azimuthal misalignment of
the pyranometer relative to the PV array, and (ii) poor MPPT for fast changing irradiance. As stated in
Section 2.1, the common data preprocessing methods may detect the erroneous measurements in normal
operation as outliers, and thus lead to false alarms. Therefore, there is a need to develop a proper data
preprocessing method which is not only able to exclude outliers but also robust against the erroneous
measurements in normal operation, which is the focus of this paper.
3) Data analysis using Corrected Performance Ratio (CPR)
Before further analysis, we need to check whether there are anomalies in the collected raw data.
The most important metric for the performance assessment of PV plants is the Performance Ratio (PR)
defined in IEC 61724 [27]. PR is defined as follows [18]:
$$PR = \frac{\sum_{i=1}^{N} P_{DC\_meas}}{\sum_{i=1}^{N} P_{STC} \left( G_\beta / G_{STC} \right)} \qquad (4)$$
where PSTC denotes the output power of PV modules at STC; GSTC denotes the STC irradiance, i.e., 1000 W/m2;
and N denotes the number of samples within a day.
Since the definition of PR does not incorporate any factors except irradiance, a more precise metric,
the daily CPR, which considers the effects of PV cell temperature [20], is used instead. It is defined as follows:
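$$CPR = \frac{\sum_{i=1}^{N} P_{DC\_meas}}{\sum_{i=1}^{N} P_{STC} \left( G_\beta / G_{STC} \right) \left( 1 + \alpha_{MP} \left( T_c - T_{STC} \right) \right)} \qquad (5)$$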
CPR values are compared to a predefined threshold to detect significant energy production loss. The
CPR values of the collected dataset are shown in Figure 5. The CPR threshold is set empirically to 70%.
Three days are detected as having significant degradation of energy
production: January 27, 2016; February 1, 2016; and February 2, 2016. Note that some CPR values exceed 1
in Figure 5, which seems unreasonable, but most of them are caused by erroneous irradiance
measurements whose values are lower than the real values.
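The daily PR and CPR of equations (4) and (5) can be sketched as follows. The function names and the two-sample example are hypothetical; the 70% threshold is the empirical value quoted above.

```python
def performance_ratio(p_dc_meas, g_beta, p_stc, g_stc=1000.0):
    """Equation (4): measured energy over irradiance-scaled rated output."""
    expected = sum(p_stc * g / g_stc for g in g_beta)
    return sum(p_dc_meas) / expected


def corrected_performance_ratio(p_dc_meas, g_beta, t_cell, p_stc,
                                alpha_mp=-0.0041, g_stc=1000.0, t_stc=25.0):
    """Equation (5): as PR, with the expected output corrected for cell temperature."""
    expected = sum(
        p_stc * g / g_stc * (1.0 + alpha_mp * (tc - t_stc))
        for g, tc in zip(g_beta, t_cell)
    )
    return sum(p_dc_meas) / expected


# Hypothetical day with two samples and a 1 kW array at 45 degree C cell temperature.
pr = performance_ratio([450.0, 900.0], [500.0, 1000.0], p_stc=1000.0)
cpr = corrected_performance_ratio([450.0, 900.0], [500.0, 1000.0], [45.0, 45.0], p_stc=1000.0)
is_degraded = cpr < 0.70  # empirical CPR threshold from the text
```

In this example the CPR is higher than the PR because part of the apparent loss is explained by the hot cells rather than by a fault.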
Fig. 5 Daily CPR of the PV plant
4. Methodology
In this section the proposed framework for estimation of daily PV energy production for performance
assessment is introduced.
4.1 General Framework
The flowchart of performance assessment is shown in Figure 6, in which the proposed PV power
production model is highlighted in red. Briefly, the proposed method to model and estimate potential PV
power production is divided into three steps:
Step 1) Data Collection and preprocessing: Data for 240 days are available from June 24, 2015 to June
23, 2016. The collected data are divided into a training dataset and a test dataset. Because only
three days show significant degradation in CPR, these three abnormal days are placed in the test
dataset in order to validate the effectiveness of the proposed method in detecting anomalies. The remaining
days are divided randomly with uniform probability. The training dataset contains 167 days (about
70% of the total data), and the test dataset contains 73 days (about 30% of the total data). Then the proposed
data preprocessing approach is applied on the training dataset to eliminate outliers.
Step 2) Weather classification: In this stage, a weather classification model is constructed to classify
the weather conditions for each day. The features used to train the classification model are extracted from
solar irradiance measurements. Then Principal Component Analysis (PCA) is used to select the most
informative features. A machine learning algorithm, i.e., Support Vector Machine (SVM) is used to develop
a classifier.
Step 3) PV system modeling: In this stage, sub-models for different weather conditions are developed
using least square fits.
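The split described in Step 1 can be sketched as below: the abnormal days are forced into the test set and the remaining days are shuffled uniformly at random. The function name, the integer day indices, and the fixed seed are hypothetical choices for reproducibility, not details given in the text.

```python
import random


def split_days(all_days, abnormal_days, n_test=73, seed=0):
    """Split days into train/test sets, keeping all abnormal days in the test set."""
    abnormal = set(abnormal_days)
    normal = [d for d in all_days if d not in abnormal]
    rng = random.Random(seed)          # seeded only so the sketch is repeatable
    rng.shuffle(normal)                # uniform random permutation of normal days
    n_extra = n_test - len(abnormal)   # normal days needed to fill the test set
    test = sorted(abnormal | set(normal[:n_extra]))
    train = sorted(normal[n_extra:])
    return train, test


# 240 available days indexed 0..239; the three abnormal indices are placeholders.
train_days, test_days = split_days(range(240), abnormal_days=[217, 222, 223])
```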
Fig. 6 Structure of PV performance assessment
The performance assessment block is not the focus here, but it is used to demonstrate the proposed
method. This block is further discussed in Section 4.5.
4.2 Data preprocessing
The model development requires a pre-filtering of the collected historical data. Observations
corresponding to low irradiance values, for which the measurement accuracy is significantly
reduced, are eliminated [11]. Accordingly, we define the lower limit of irradiance values as 400 W/m2, and
eliminate data with irradiance values less than 400 W/m2. The limit is found by analyzing the regression
results of PV output power, which are based on the quasi-linear relationship between PV power and
irradiance over a range of irradiance values. We use different lower limits of irradiance values ranging from
100 to 800 W/m2 in 100 W/m2 steps. Different regression models of PV power are then developed using the data
that remain after filtering out the observations with irradiance values lower than each limit. Our findings
showed that the deviations of the PV power estimated by the regression models from
the measured PV power decreased as the lower limit of irradiance increased. However, this
downward trend flattened when the lower limit of irradiance exceeded 400 W/m2. Besides, the number of
data points available for model development on some days would be too small with a lower limit
higher than 400 W/m2. Accordingly, the lower limit of irradiance is set
to approximately 400 W/m2.
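The threshold sweep described above can be sketched as follows. This is an illustration under assumptions: a simple linear fit of power against irradiance stands in for the regression model, the root mean square deviation is used as the deviation measure, and all names and data are hypothetical.

```python
import math


def fit_line(x, y):
    """Ordinary least squares fit of y = a * x + b."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    a = sxy / sxx
    return a, mean_y - a * mean_x


def sweep_lower_limits(g_beta, power, limits=range(100, 900, 100)):
    """For each lower irradiance limit, refit the line and report (points kept, RMSD)."""
    results = {}
    for limit in limits:
        kept = [(g, p) for g, p in zip(g_beta, power) if g >= limit]
        if len({g for g, _ in kept}) < 2:
            continue  # not enough distinct irradiance values to fit a line
        gs, ps = zip(*kept)
        a, b = fit_line(gs, ps)
        rmsd = math.sqrt(sum((p - (a * g + b)) ** 2 for g, p in kept) / len(kept))
        results[limit] = (len(kept), rmsd)
    return results


# Synthetic data: linear above 400 W/m2, biased below it to mimic low-irradiance error.
irradiance = list(range(50, 1050, 50))
measured = [0.5 * g + (30.0 if g < 400 else 0.0) for g in irradiance]
sweep = sweep_lower_limits(irradiance, measured)
```

Inspecting both entries of each result shows the trade-off noted in the text: raising the limit lowers the deviation but leaves fewer points for fitting.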
Subsequently, we compare the outputs of different inverters measured simultaneously to detect outliers
[5,6], which aims to be robust against the erroneous measurements in normal operation. Note that
erroneous measurements of irradiance affect the whole PV plant equally rather than particular inverters.
Therefore, they will not be detected by comparing the outputs of
different inverters. Accordingly, this method is robust against erroneous measurements of irradiance.
Fast changing irradiance may affect the MPPT of different inverters differently, which may
lead to different degrees of error in the DC power measurements of the inverters. Some of the erroneous
measurements of power may then be detected as outliers by the proposed data preprocessing method. However,
it is hard to quantify and evaluate these errors. This issue poses a challenge for the proposed data
preprocessing method to be robust against erroneous measurements of power. We discuss it further in
Section 5.1.
In this paper, we use the boxplot rule to compare the normalized DC power of all the
inverters measured simultaneously. The boxplot rule classifies an observation as an outlier if
$P_{inv} > Q_3 + 1.5\,IQR$ or $P_{inv} < Q_1 - 1.5\,IQR$ [3]. Here, Pinv is the observed power of one inverter
normalized over all inverters at a time point; Q1 and Q3 are the first and third quartiles of all values of
Pinv at this time point; and IQR is the interquartile range (i.e., Q3-Q1).
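A minimal sketch of the boxplot rule at a single time point is given below. It assumes the inverter powers have already been normalized (the exact normalization is not reproduced here) and uses the "inclusive" quartile convention of Python's statistics module, which may differ slightly from the convention used by the authors. The function name and sample values are hypothetical.

```python
import statistics


def boxplot_outliers(p_inv):
    """Flag each inverter whose normalized power lies outside the boxplot limits."""
    q1, _, q3 = statistics.quantiles(p_inv, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    return [p < lower or p > upper for p in p_inv]


# Hypothetical normalized powers of eight inverters at one time point.
flags = boxplot_outliers([0.80, 0.81, 0.79, 0.82, 0.80, 0.81, 0.78, 0.45])
```

Only the last inverter, whose output is far below the others, is flagged in this example.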
It is noteworthy that the boxplot rule may erroneously classify normal data as outliers. For example,
given the area of the plant, some modules may be shaded by clouds while others are not. In this case, the
inverters connected to the shaded modules may show a significant drop in output power with respect to other
inverters, such that the data of the shaded modules are classified as outliers. To reduce such
misclassification, we consider that only multiple consecutive
points outside the limits represent an outlier [11]. This strategy reduces the probability that brief low power
production periods are falsely considered as outliers. At the same time, it may also lower the outlier
detection rate.
A proper choice of the number of consecutive points can limit this negative influence
on the outlier detection rate. An analysis of the shading periods caused by clouds in the
irradiance data reveals that the duration of shading periods rarely exceeds five minutes [28]. Therefore, we
adopt the rule that five consecutive points outside the limits represent an outlier.
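The consecutive-point rule can be sketched as a run-length filter over the per-minute boxplot flags of one inverter. The sketch interprets the rule as "a run of at least five flagged samples", which is an assumption about how longer runs are handled; the function name is hypothetical.

```python
def confirm_outliers(flags, min_run=5):
    """Keep only flags that belong to a run of at least min_run consecutive flags."""
    confirmed = [False] * len(flags)
    run_start = None
    # A trailing False sentinel closes any run that reaches the end of the series.
    for i, flagged in enumerate(list(flags) + [False]):
        if flagged and run_start is None:
            run_start = i
        elif not flagged and run_start is not None:
            if i - run_start >= min_run:
                for k in range(run_start, i):
                    confirmed[k] = True
            run_start = None
    return confirmed


# A two-minute dip (ignored) followed by a five-minute dip (confirmed).
raw = [True, True, False, True, True, True, True, True, False]
confirmed = confirm_outliers(raw)
```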
The irradiance and power measurements follow a strong linear relationship in normal operation over a
range of irradiance values [11]. In order to distinguish the erroneous measurements in normal operation from
abnormal measurements caused by faults, we apply a rough and simple rule based on the Relative Error
(RE) between the output power adjusted to 25 °C and the expected power [19]:
$$RE = \frac{P_{adj\_STC} - \left( a G_\beta + b \right)}{a G_\beta + b} \qquad (6)$$
$$-15\% \le RE \le 15\% \qquad (7)$$
where a and b are coefficients calculated by least square regression with the training dataset in normal
operation; the expected power is roughly estimated using the measured solar irradiance, which is denoted by
aGβ+b. As noted earlier, the erroneous measurements cause non-linear power-irradiance characteristics, and so
does faulty data [11]. What distinguishes the erroneous measurements in normal operation from faulty data is
the degree of the deviation of the measured power from the expected power. Commonly, faulty data
lead to a larger RE than the erroneous measurements in normal operation do. Thus, we need to leave
some room for the erroneous measurements when detecting outliers. The data points inside the ±15% limits
are considered as normal operation; otherwise, they are detected as anomalies and excluded from the
training dataset.
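Equations (6) and (7) amount to a band test around the fitted line. In the sketch below the coefficients a and b are assumed to have been obtained beforehand by least square regression; the function names and numerical values are hypothetical.

```python
def relative_error(p_adj_stc, g_beta, a, b):
    """Equation (6): relative deviation of adjusted power from expected power a*G+b."""
    expected = a * g_beta + b
    return (p_adj_stc - expected) / expected


def flag_anomalies(p_adj_stc, g_beta, a, b, limit=0.15):
    """Equation (7): a point is anomalous when its RE falls outside +/- limit."""
    return [
        abs(relative_error(p, g, a, b)) > limit
        for p, g in zip(p_adj_stc, g_beta)
    ]


# Hypothetical fit a=0.5 kW per W/m2, b=0; four samples at 1000 W/m2.
anomalies = flag_anomalies([500.0, 440.0, 400.0, 600.0], [1000.0] * 4, a=0.5, b=0.0)
```

The second sample deviates by 12% and is kept as normal operation, while the 20% deviations of the last two samples are treated as anomalies.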
Figure 7 shows the collected data and the ±15% limits. The detected faulty data are mostly below the
-15% limit. Some faulty data represent faults due to which the PV system produces significantly less
power than it should at the given irradiance level. Besides, faults may happen in the process of data recording
and storage. In these circumstances, the missing records are commonly filled in using
interpolation methods, which may cause significant errors when there is a large time gap
between the two known points used for interpolation. For example, the faulty points inside the green
ellipse are such interpolated points.
Fig. 7 Fault detection based on the linear irradiance-power characteristics.
Moreover, outliers in the meteorological measurements are excluded during the fitting of the regression
model. First, the PV model is fitted using the whole training dataset after excluding outliers in the
measurements of PV output power. Then the points whose residual confidence intervals do not cover zero
are considered outliers and are excluded from the training dataset. The process is repeated until no
outlier remains.
4.3 Classification based on weather identification
The goal of this stage is to classify measurements into different categories and to develop sub-models
that better estimate the energy production of a PV plant. Since the relation between solar irradiance
and PV power varies significantly with weather conditions for the case described in Section 3.2,
sub-models are developed for different weather conditions. To assign each measurement correctly, it is
necessary to choose attributes that maximize the discrepancy between the categories of measurements. We
therefore propose a PCA plus SVM method for the classification of weather conditions.
4.3.1 Feature extraction
In this stage, inputs to the classifier called feature vectors are extracted to represent each class. The
features are extracted from the incident solar irradiance on PV modules in this work to represent weather
conditions, referring to [23].
The differences between the global solar radiation on the earth’s surface Gs and the extraterrestrial
solar radiation at the top of the atmosphere G0 reflect the variation of solar irradiation and weather
conditions [23]. Indexes defined by the extraterrestrial and surface solar irradiance are therefore used
as feature vectors to describe different weather conditions.
The extracted features are defined in detail in [23]. The first feature, named clearness index, is the
proportion of the extraterrestrial solar radiation that passes through the atmosphere to the earth’s
surface. The second is the root mean square deviation (RMSD) of G0 and Gs, describing the shape
difference between them. The third is the maximum value of the 3rd-order derivative of Gd (Gd = G0 - Gs),
related to the fluctuations of Gs. The ratio of maximum Gs to maximum G0 is selected as the fourth index.
The fifth index is the mean of the squared differences of Gd from the mean of Gd. The sixth index
describes the inconsistency between the variation tendencies of Gs and G0.
In addition to the six indexes described in [23] (see Appendix), the seventh index, denoted by Kfluc, is
the indicator proposed in this paper. It is defined by
$$F_7 = K_{fluc} = \sum_{i=2}^{N-1} ff_i(G_{s,i})$$
where
$$ff_i(G_{s,i}) = \begin{cases} 1, & \text{if } (G_{s,i} - G_{s,i-1})(G_{s,i} - G_{s,i+1}) > 0 \\ 0, & \text{if } (G_{s,i} - G_{s,i-1})(G_{s,i} - G_{s,i+1}) \le 0 \end{cases} \tag{8}$$
where Gs,i-1, Gs,i and Gs,i+1 denote Gs of the (i-1)th, ith and (i+1)th samples, respectively. The
function ffi(Gs,i) indicates the fluctuation tendency of Gs in the sampling interval [i-1, i+1]: if
ffi(Gs,i) = 0, Gs does not fluctuate at sampling point i; if ffi(Gs,i) = 1, it does. The fluctuation
tendency of Gs over a day is measured by F7; the more Gs fluctuates in a day, the larger the value of F7
for that day.
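As an illustration, F7 can be computed by counting the interior samples at which Gs changes direction, i.e., where a rise is followed by a fall or vice versa. The sketch below assumes that Gs is available as a list of equally spaced daily samples; the function name fluctuation_index is hypothetical, and flat segments are assumed not to count as fluctuations.

```python
def fluctuation_index(gs):
    """Count interior samples where Gs reverses direction (local peak or valley)."""
    count = 0
    for i in range(1, len(gs) - 1):
        change_in = gs[i] - gs[i - 1]    # change arriving at sample i
        change_out = gs[i + 1] - gs[i]   # change leaving sample i
        if change_in * change_out < 0:   # opposite signs: direction reversal
            count += 1
    return count


# A smooth bell-shaped profile has a single reversal (the midday peak),
# whereas a zigzag profile reverses at almost every sample.
smooth_day = [0, 1, 2, 3, 2, 1, 0]
zigzag_day = [0, 2, 1, 3, 1, 2, 0]
```

Under these assumptions, the smooth profile yields F7 = 1 and the zigzag profile yields F7 = 5.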
4.3.2 Classification model
In this stage, we implement PCA plus SVM to develop a classifier for better estimation of energy
production considering weather conditions. The classification model is implemented in MATLAB.
1) Data dimension reduction
In this step, the PCA algorithm is used for variable selection to avoid overfitting of the classification
model [29]. PCA is a statistical procedure that uses an orthogonal transformation to convert a set of
observations of possibly correlated variables into a set of values of linearly uncorrelated variables called
principal components (PC). The importance of a variable in a PC model is indicated by the size of its residual
variance [30]. In this paper, the candidate PCs are ranked, and then the first 95% of the candidate PCs are
selected. The 95% limit comes from trial and error and should be carefully chosen based on the available
data.
2) Classification model development
In this step, SVM is used to develop a classifier that assigns measurements to the appropriate groups.
Note that SVM requires training inputs whose category membership is known [23]. The local weather can be
divided into two typical types: sunny and cloudy. In the case in Section 3.2, the relationship between
the output power of PV modules and the measured solar irradiance on a sunny day differs from that on a
cloudy day. An example is given in Figure 8 to demonstrate this.
a. Mar 1, 2016, sunny day
b. Nov 3, 2015, partly cloudy day
Fig. 8 Examples of weather classification based on the linear irradiance-power characteristics.
Figure 8-a shows that the misalignment of the pyranometer can cause a non-linear power-irradiance
characteristic on a sunny day, because the measured irradiance differs from the irradiance incident on
the PV modules. Moreover, these deviations occur over the whole irradiance range. On partly cloudy days,
in contrast, the PV array maximum power is measured inaccurately, owing to poor MPPT of the inverter,
only when irradiance changes fast. Nonlinearities between power and irradiance then appear only at the
points with fast changing irradiance. Accordingly, the daily percentage of data with non-linear
power-irradiance characteristics is higher on a sunny day than on a partly cloudy day.
As noted above, the training dataset can be classified as a sunny day or a cloudy day based on the linear
power-irradiance characteristics. To this end, we use RE between the measurements of power and the
expected power, and set limits of ±5% for RE to classify weather conditions:
$$RE = \frac{P_{adj\_STC} - (aG_\beta + b)}{aG_\beta + b} \tag{9}$$
$$RE < -5\% \quad \text{or} \quad RE > 5\% \tag{10}$$
where the coefficients a and b used for equation (9) are fitted by least square regression with daily
measurements without excluding low irradiance values.
RE measures how closely the measurements of irradiance and output power of the PV modules follow a
linear relationship. The greater the absolute value of RE, the further the data point lies from the
irradiance-power straight line. If the daily percentage of data satisfying the rule in equation (10)
exceeds a threshold set to 70%, the analyzed day is considered a sunny day and labeled as Class 1.
Otherwise, it is considered cloudy and labeled as Class 2.
We use Figure 8 to explain the selection of the ±5% limits and the threshold of 70%. More than 70% of
the data points on the sunny day lie outside the ±5% limits owing to inaccurate measurements of
irradiance, while less than 70% of the data points on the partly cloudy day lie outside the limits.
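The labeling rule described above can be sketched as follows. The sketch is illustrative only: the function names (fit_line, daily_re, label_day) are hypothetical, the daily measurements are assumed to be plain lists, and the line is fitted to each day's own data as stated for equation (9).

```python
def fit_line(g, p):
    """Ordinary least-squares fit p ~ a*g + b on one day's measurements."""
    n = len(g)
    mean_g = sum(g) / n
    mean_p = sum(p) / n
    sxx = sum((x - mean_g) ** 2 for x in g)
    sxy = sum((x - mean_g) * (y - mean_p) for x, y in zip(g, p))
    a = sxy / sxx
    return a, mean_p - a * mean_g


def daily_re(g, p):
    """Relative error of each sample with respect to the day's fitted line."""
    a, b = fit_line(g, p)
    return [(pi - (a * gi + b)) / (a * gi + b) for gi, pi in zip(g, p)]


def label_day(re_values, limit=0.05, threshold=0.70):
    """Return (label, share): Class 1 (sunny) if the share of samples
    outside the +/-limit band exceeds the threshold, else Class 2 (cloudy)."""
    outside = sum(1 for re in re_values if abs(re) > limit)
    share = outside / len(re_values)
    return (1 if share > threshold else 2), share
```

For example, a day on which four of five samples have |RE| above 5% has a share of 0.8 and is labeled Class 1, whereas a day with one of four samples outside the band is labeled Class 2.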
For a non-linear classification, SVM uses kernel functions that map the inputs into a high-dimensional
feature space to convert a non-separable problem into a separable one [23]. In the present work, a Radial
Basis Function (RBF) kernel is applied.
To optimize the classification result of SVM, two parameters, i.e., the limiting term and the kernel
parameter, can be tuned. In the present work, these two parameters are selected using a Genetic
Algorithm (GA). The application of SVM plus GA is described in detail in [31].
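For reference, a commonly used form of the RBF kernel is K(x, z) = exp(-γ‖x - z‖²). The exact parameterization used in this work is not stated, so the sketch below is an assumption: gamma stands for the kernel parameter, and rbf_kernel is a hypothetical helper name.

```python
import math


def rbf_kernel(x, z, gamma):
    """Gaussian RBF kernel exp(-gamma * ||x - z||^2) for two feature vectors."""
    sq_dist = sum((xi - zi) ** 2 for xi, zi in zip(x, z))
    return math.exp(-gamma * sq_dist)
```

The kernel equals 1 for identical feature vectors and decays towards 0 as they move apart; a larger gamma makes the decay faster, which is why this parameter is worth tuning together with the limiting term.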
4.4 Derivation of the output estimation equation
In this work, we apply linear regression models based on historical measurements to estimate the energy
production of a PV plant, because of their low complexity and high accuracy [11]. Generally, the PV
efficiency (ηc) [25] can be estimated as:
$$\eta_c = \eta_{ref}\,[1 - \beta_{MP}(T_c - T_{STC})] \tag{11}$$
where ηref is the reference PV efficiency under STC listed in Table 1.
We modify (11) to allow ηc to be estimated using Gβ, Ta and Vf. Tc is estimated from equation (1).
Then ηc can be formulated as:
$$\eta_c = \eta_{ref}(1 + \beta_{MP}T_{STC}) - \eta_{ref}\beta_{MP}T_a - \eta_{ref}\beta_{MP}\,w\,\frac{0.32\,G_\beta}{8.91 + 2V_f} \tag{12}$$
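The symbolic constants in equation (12) are then grouped as:
$$f_1 = \eta_{ref}(1 + \beta_{MP}T_{STC}) \tag{13}$$
$$f_2 = -\eta_{ref}\beta_{MP} \tag{14}$$
$$f_3 = -0.32\,\eta_{ref}\beta_{MP}\,w \tag{15}$$
Finally, equation (12) is simplified to:
$$\eta_c = f_1 + f_2 T_a + f_3\,\frac{G_\beta}{8.91 + 2V_f} \tag{16}$$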
The actual conversion efficiency of PV modules can be calculated by equation (17), where Apv denotes
the area of the PV array:
$$\eta_c(\%) = \frac{P_{DC\_meas}}{A_{pv}G_\beta} \times 100 \tag{17}$$
Accordingly, the coefficients f1, f2 and f3 of (16) are determined by least squares regression from the
calculated ηc and the measurements of Ta, Gβ and Vf. The daily energy production of all PV modules
(Eest) in a PV plant on a new day is then estimated from the estimated ηc by:
$$E_{est} = \sum_{i=1}^{N}(\eta_c A_{pv} G_\beta) \tag{18}$$
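The fitting and estimation steps can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in this work: the function names are hypothetical, the efficiency is handled as a fraction rather than a percentage, the regression is solved through the normal equations, and the daily sum follows equation (18) without a sampling-interval factor.

```python
def wind_term(g, vf):
    """Regressor G / (8.91 + 2*Vf) that multiplies f3 in equation (16)."""
    return g / (8.91 + 2.0 * vf)


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            factor = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= factor * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (a[r][3] - s) / a[r][r]
    return x


def fit_efficiency(ta, g, vf, eta):
    """Least-squares estimate of (f1, f2, f3) from measured efficiencies."""
    rows = [(1.0, t, wind_term(gi, v)) for t, gi, v in zip(ta, g, vf)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * e for r, e in zip(rows, eta)) for i in range(3)]
    return solve3(xtx, xty)


def estimate_daily_energy(f, ta, g, vf, area):
    """Sum of eta_c * A_pv * G over the samples of one day (equation (18))."""
    f1, f2, f3 = f
    total = 0.0
    for t, gi, v in zip(ta, g, vf):
        eta_c = f1 + f2 * t + f3 * wind_term(gi, v)
        total += eta_c * area * gi
    return total
```

Because equation (16) is linear in f1, f2 and f3, ordinary least squares has a closed-form solution; a library routine would normally replace the hand-written solver.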
4.5 Performance assessment of PV systems
In this stage, we set normal operation limits on the residual between the estimated and measured daily
energy production (Emeas) to detect abnormal operation. The residual (resE) is formulated as:
$$res_E = \frac{E_{est} - E_{meas}}{E_{est}} \times 100\% = \frac{\sum_{i=1}^{N}(\eta_c A_{pv}G_\beta - P_{DC\_meas})}{\sum_{i=1}^{N}\eta_c A_{pv}G_\beta} \times 100\% \tag{19}$$
The normal operation limits are determined by analysis of the probability distribution of residuals. If
the calculated value of residual lies outside the normal operation limits, this point will be labelled as
abnormal.
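A minimal sketch of this check is given below. The function names are hypothetical, the inputs are assumed to be lists of estimated and measured power samples for one day, and the default limits of ±10% are the ones adopted later in Section 5.2; they would normally be derived from the residual distribution of the plant at hand.

```python
def energy_residual(p_est, p_meas):
    """Residual of daily energy in percent: (E_est - E_meas) / E_est * 100."""
    e_est = sum(p_est)
    e_meas = sum(p_meas)
    return (e_est - e_meas) / e_est * 100.0


def is_abnormal(res_e, lower=-10.0, upper=10.0):
    """Flag a day whose residual lies outside the open interval (lower, upper)."""
    return not (lower < res_e < upper)
```

For instance, a day with estimated samples [100, 100] and measured samples [95, 100] has a residual of 2.5% and is treated as normal, while measured samples [80, 80] give 20% and are flagged.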
5 Numerical Results
In order to prove the effectiveness of the proposed data preprocessing and the sub-models, we use them
for performance assessment of the PV modules described in Section 3. Moreover, several data preprocessing
and model development methods in literature are applied for comparison. We evaluate the applied methods
from two aspects: (i) the accuracy of the estimation of the potential energy output, and (ii) the results of
performance assessment of PV modules.
5.1 Results of estimation of the potential energy output
In this section, the proposed methods are used to estimate the potential energy output for the test days.
1) The results of the proposed data preprocessing method
We applied the proposed data preprocessing method to the training data set to exclude outliers. The
distribution of the percentages of the detected outliers of one inverter within one day is shown in Figure 9.
Note that in most cases, the percentages of the detected outliers in one day are less than 2%.
It is noteworthy that large daily percentages of detected outliers occur slightly more frequently on
cloudy days than on sunny days. As mentioned in Section 4.2, the proposed data preprocessing method may
detect erroneous power measurements of certain inverters as outliers. Such erroneous measurements occur
mostly on cloudy days when irradiance changes fast. Therefore, the detected outliers on cloudy days
include some of the erroneous power measurements, while those on sunny days do not. However, the
difference between sunny days and cloudy days is not significant, so the erroneous power measurements on
cloudy days have not been extensively detected as outliers. From this point of view, the proposed data
preprocessing method is quasi-robust against erroneous measurements of power.
Fig. 9 Distribution of the percentages of the detected outliers within one day
As an illustration, the daily result of the proposed data preprocessing method of one inverter is shown
in Figure 10. This unit’s data for February 3, 2016 is chosen to present the result of data preprocessing since
it has a large number of outliers, i.e., 10.89% of the data are detected as outliers.
Fig. 10 Identification of outliers on February 3, 2016
Figure 10 shows that the upper bound and lower bound fluctuate strongly from 9:00 to 10:30. It is
probably caused by large changes of incident irradiance on the PV arrays due to moving clouds. Then the
bounds change more slowly from 10:30 to 13:00 since irradiance tends to be smoother then. However, the
DC power of the inverter still fluctuates. Accordingly, the points outside the limits from about 10:30 to 13:00
are detected as outliers, which may be caused by system faults or interpolation errors. Meanwhile, the
abnormal measurements caused by the fast changing irradiance but normal operation from 9:00 to 10:00 are
kept. Thus, the outliers owing to possible abnormal operation are detected, and erroneous measurements in
normal operation are kept.
2) The comparison of estimation results with other methods
To demonstrate the effectiveness of the proposed method, we also apply other methods for comparison.
Another data preprocessing method, which excludes from the training dataset the erroneous measurements
caused by poor MPPT of the inverters when the irradiance changes rapidly [12,19,20], is used to
demonstrate the effectiveness of the proposed data preprocessing method. Two other methods, i.e., a
global regression model and separate models for different irradiance ranges, are used to demonstrate the
effectiveness of the proposed sub-models. For the separate models for different irradiance ranges, we
assume that irradiance above 250 W/m² is high irradiance. If the percentage of time with high irradiance
during a day exceeds a threshold that is empirically set to 40%, the day is used to train the first
model. The remaining days are used to train the second model.
Combining the data preprocessing methods with the model development methods yields the following
scenarios:
Scenario (i) EECG: Excluding erroneous measurements by poor MPPT of inverters [12,19,20,32]
combined with the global model [11,33,34];
Scenario (ii) DPCG: The proposed data preprocessing method combined with the global model
[11,33,34];
Scenario (iii) DPCI: The proposed data preprocessing method combined with the separate models for
different irradiance ranges [11];
Scenario (iv) DPCW: The proposed data preprocessing method combined with the proposed sub-
models for varying weather conditions.
These methods are evaluated in this section by the accuracy of their estimates of the energy production
of the PV plant. We apply the Root Mean Square Error (RMSE) to measure the accuracy of the models, which
is defined as:
$$RMSE = \sqrt{\frac{\sum_{j=1}^{N_d} res_E^2}{N_d}} \times 100\% \tag{20}$$
where Nd is the number of test days and resE is the residual between the estimated and the measured
daily energy production of day j.
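The RMSE over the test days can be computed directly from the list of daily residuals. The sketch below assumes the residuals are already expressed in percent, so the result is in percent as well; the function name rmse is hypothetical.

```python
import math


def rmse(residuals):
    """Root mean square of the daily residuals (same unit as the inputs)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))
```

For example, four test days with residuals of 10%, -10%, 10% and -10% give an RMSE of 10%: positive and negative residuals do not cancel.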
The RMSE values calculated for the four scenarios, i.e. EECG, DPCG, DPCI and DPCW, are 18.57%,
14.03%, 13.78% and 13.25% respectively. Observe that the proposed DPCW method performs better than
others.
Note that the RMSE of the DPCG method is 4.54 percentage points lower than that of the EECG method,
although they use the same global model. This is mainly because the EECG model is developed only on data
from clear days, after the erroneous power measurements in fast changing irradiance conditions are
excluded. Consequently, this model estimates the energy generation on cloudy days inaccurately. Thus, the
DPCG method, which uses the proposed data preprocessing, leads to a more accurate estimation of the
energy production of the PV array.
Moreover, the proposed method for sub-models development considering weather conditions improves the
estimation of daily energy production compared to the other methods. The improvement is shown by
comparing the estimation results of the proposed DPCW method with those of the DPCG and DPCI methods,
since all three use the same data preprocessing method. Although the improvement of the DPCW method is
not significant from an RMSE viewpoint, the effectiveness of the proposed sub-models development method
is demonstrated in the application to performance assessment of the PV modules in Section 5.2.
We have also applied several metaheuristic optimization methods, i.e., Genetic Algorithm (GA) [35],
Particle Swarm Optimization (PSO) [36], Firefly Algorithm (FA) [37], Grey Wolf Optimizer (GWO) [38],
Sine Cosine Algorithm (SCA) [39] and Salp Swarm Algorithm (SSA) [40], to the DPCW method for parameter
optimization of the models. Compared to least squares fitting, the metaheuristic techniques did not
significantly improve the accuracy of the developed models and ran more slowly. However, this may not
always be the case, depending on the dataset. We suggest that the model fitting method be chosen to suit
the data at hand.
5.2 Results of performance assessment
We apply the estimated potential energy outputs for performance assessment of PV modules in this
section. The residual between the measured and the estimated daily energy production is used as the indicator
representing the power losses. The days with residual exceeding the predefined threshold are defined as
abnormal. Based on our analysis, over 80% of the residuals fall within the (-10%, 10%) interval. Thus, we
arbitrarily use this interval as the boundaries of normal operation.
Figure 11 depicts the results of the residuals of daily energy production (resE) expressed as a percent
for the test days. As Figure 11 reveals, residuals of the three days with abnormal operation, i.e., January 27,
2016; February 1, 2016 and February 2, 2016, exceed the 10% limit. Besides, it is noteworthy that these days
are among those detected by either of EECG, DPCG, DPCI and DPCW methods. However, the methods
falsely identify a number of other normal operation days as abnormal - see other days whose residuals fall
outside the ±10% bands.
Fig. 11 Performance assessment results using EECG, DPCG, DPCI and DPCW methods based on residual of daily energy
production
Thus, we determine the number of false alarms for each model. A false alarm occurs when a normal
operation day is identified as an abnormal operation day. A high false alarm rate would waste
considerable human and material resources, so reducing false alarms is critical. There are 70 normal
operation points in the test dataset; the EECG, DPCG, DPCI and DPCW methods produce 33, 13, 11 and 7
false alarms, respectively. The proposed DPCW method leads to fewer false alarms than the other
approaches.
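The counts above can be turned into false alarm rates as a quick arithmetic check. The sketch assumes that the rate is defined as the number of false alarms divided by the number of normal operation points (70); the variable names are illustrative.

```python
# False alarm counts reported for the four scenarios.
false_alarms = {"EECG": 33, "DPCG": 13, "DPCI": 11, "DPCW": 7}
normal_points = 70  # normal operation points in the test dataset

# False alarm rate in percent for each scenario.
rates = {name: 100.0 * n / normal_points for name, n in false_alarms.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.1f}%")
```

Under this definition the rates are approximately 47.1%, 18.6%, 15.7% and 10.0% for EECG, DPCG, DPCI and DPCW, respectively.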
Comparing the scenarios shows the effectiveness of the proposed data preprocessing method. The DPCG
method and the EECG method apply different data preprocessing methods, while both use the global model.
The results show that, by using the proposed data preprocessing method, the DPCG method reduces the
number of false alarms from 33 to 13, i.e., by about 60%, compared to the EECG method.
Moreover, the proposed sub-models development shows significant improvements in the final performance
assessment results when compared to other methods. The DPCW method produces 6 and 4 fewer false alarms
than the DPCG and DPCI methods, respectively. Since DPCG, DPCI and DPCW use the same proposed data
preprocessing method, the improvements are due to the proposed sub-models development considering
weather conditions.
Note that once the misalignment error of the pyranometer is quantified, the erroneous measurements of
irradiance can be corrected prior to developing the estimation model to further improve the estimation
accuracy. To this end, future work will first divide global irradiance into direct beam and diffuse
irradiance, and then correct the azimuthal error on the direct irradiance. The division of global
irradiance into direct beam and diffuse irradiance can depend upon variables such as the clearness
index, solar elevation, atmospheric precipitable water, etc. [41].
6 Conclusion
This paper develops a method to improve the estimation of daily energy generation of PV modules for
performance assessment. Our method has two main contributions, i.e., a data preprocessing method and a
sub-models development methodology considering weather conditions. The proposed data preprocessing
method is robust against the erroneous measurements in normal operation. It identifies outliers in the
measured DC power of certain inverters by comparing the DC power of all inverters that are measured
simultaneously when there is no shading. Besides, we propose separate sub-models for sunny days and
cloudy days. The data are classified by a Principal Component Analysis and Support Vector Machine
algorithm.
The proposed method is evaluated using data from a roof-mounted PV plant located in Southeast China.
The results show that the proposed methods can improve the performance assessment of PV modules when
compared to other methods in the literature.
Appendix. Equations used for feature extraction in weather conditions identification
The following formulas are the first six indexes defined in [23]:
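$$F_1 = \frac{\int_0^{24} G_s(t)\,dt}{\int_0^{24} G_0(t)\,dt} \approx \frac{\sum_{i=1}^{N-1}(G_{s,i+1} + G_{s,i})}{\sum_{i=1}^{N-1}(G_{0,i+1} + G_{0,i})} \tag{1}$$
$$F_2 = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(G_{0N,i} - G_{sN,i})^2} \tag{2}$$
where
$$G_{0N,i} = 100 \times \frac{G_{0,i}}{\max_{i=1,\dots,N} G_{0,i}}, \qquad G_{sN,i} = 100 \times \frac{G_{s,i}}{\max_{i=1,\dots,N} G_{s,i}}, \qquad i = 1,\dots,N$$
$$F_3 = \max_{i=1,\dots,N}\left\{\frac{d^3(G_{0,i} - G_{s,i})}{dt^3}\right\} \tag{3}$$
$$F_4 = \frac{\max_{i=1,\dots,N}\{G_{s,i}\}}{\max_{i=1,\dots,N}\{G_{0,i}\}} \tag{4}$$
$$F_5 = \frac{1}{N}\sum_{i=1}^{N}\left[(G_{0,i} - G_{s,i}) - \frac{1}{N}\sum_{k=1}^{N}(G_{0,k} - G_{s,k})\right]^2 \tag{5}$$
$$F_6 = f(G_s, G_0) = \sum_{i=2}^{N} f_i(G_{s,i}, G_{0,i}) \tag{6}$$
where
$$f_i(G_{s,i}, G_{0,i}) = \begin{cases} 1, & \text{if } (G_{s,i} - G_{s,i-1})(G_{0,i} - G_{0,i-1}) < 0 \\ 0, & \text{if } (G_{s,i} - G_{s,i-1})(G_{0,i} - G_{0,i-1}) \ge 0 \end{cases}$$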
Acknowledgment
We gratefully acknowledge the financial support from the Natural Science Foundation of China (61573046).
References
[1] Spataru S, Sera D, Kerekes T and Teodorescu R. Photovoltaic array condition monitoring based on
online regression of performance model. In Photovoltaic Specialists Conference (PVSC). 2013. p. 815-
20.
[2] Kim Y S, Winston R. Power conversion in concentrating photovoltaic systems: central, string, and
micro‐inverters[J]. Progress in Photovoltaics Research & Applications, 2015, 22(9):984-92.
[3] Mallor F, León T, De Boeck L, Van Gulck S, Meulders M and Van der Meerssche B. A method for
detecting malfunctions in PV solar panels based on electricity production monitoring. Solar Energy,
2017. 153(Supplement C): p. 51-63.
[4] Firth SK, Lomas KJ and Rees SJ. A simple model of PV system performance and
its use in fault detection. Solar Energy, 2010. 84(4): p. 624-35.
[5] Chouder A and Silvestre S. Automatic supervision and fault detection of PV systems based on power
losses analysis. Energy Conversion and Management, 2010. 51(10): p. 1929-37.
[6] Miwa M, Yamanaka S, Kawamura H, Ohno H and Kawamura H. Diagnosis of a Power Output
Lowering of PV Array with a (-dI/dV)-V Characteristic. In 2006 IEEE 4th World Conference on
Photovoltaic Energy Conference. 2006. p. 2442-5.
[7] Hariharan R, Chakkarapani M, Ilango GS and Nagamani C. A method to detect photovoltaic array faults
and partial shading in PV systems. IEEE Journal of Photovoltaics, 2016. 6(5): p. 1278-85.
[8] Dhimish M and Holmes V. Fault detection algorithm for grid-connected photovoltaic plants. Solar
Energy, 2016. 137(Supplement C): p. 236-45.
[9] Tadj M, Chouder A, Guerriero P, Pavan AM, Mellit A, Moeini R, et al. Improving the performance of
PV systems by faults detection using GISTEL approach. Energy Conversion and Management, 2014.
80(Supplement C): p. 298-304.
[10] Daliento S, Chouder A, Guerriero P, Pavan AM, Mellit A, Moeini R, et al. Monitoring, diagnosis, and
power forecasting for photovoltaic fields: a review. International Journal of Photoenergy, 2017: p.13.
[11] Platon R, Martel J, Woodruff N and Chau TY. Online fault detection in PV systems. IEEE Transactions
on Sustainable Energy, 2015. 6(4): p. 1200-7.
[12] Ventura C and Tina GM. Utility scale photovoltaic plant indices and models for on-line monitoring and
fault detection purposes. Electric Power Systems Research, 2016. 136(Supplement C): p. 43-56.
[13] Clarke CA and Golnas A. Photovoltaic performance characterization: Optimization by regression basis
with application to health management. In 2013 IEEE 39th Photovoltaic Specialists Conference (PVSC).
2013: p. 759-62.
[14] Rus-Casas C, Aguilar JD, Rodrigo P, Almonacid F, Pérez-Higueras PJ. Classification of methods for
annual energy harvesting calculations of photovoltaic generators, Energy Conversion and Management,
2014. 78: p. 527-36.
[15] Hooda N, Azad AP, Kumar P, Saurav K, Arya V and Petra MI. PV power predictors for condition
monitoring. In 2016 IEEE International Conference on Smart Grid Communications (SmartGridComm).
2016. p. 212-7.
[16] Olivencia Polo FA, Ferrero Bermejo J, Gómez Fernández JF, Crespo Márquez A. Failure mode
prediction and energy forecasting of PV plants to assist dynamic maintenance tasks by ANN based
models. Renewable Energy, 2015. 81(Supplement C): p. 227-38.
[17] Quesada B, Sánchez C, Cañada J, Royo R, Payá J, Experimental results and simulation with TRNSYS
of a 7.2kWp grid-connected photovoltaic system. Applied Energy, 2011. 88(5): p. 1772-83.
[18] Kichou S, Silvestre S, Nofuentes G, Torres-Ramírez M, Chouder A and Guasch D. Characterization
of degradation and evaluation of model parameters of amorphous silicon photovoltaic modules under
outdoor long term exposure. Energy, 2016, 96(3):231-241.
[19] Spataru SV, Gavriluta A, Sera D, Maaloe L and Winther O. Development and implementation of a PV
performance monitoring system based on inverter measurements. In 2016 IEEE Energy Conversion
Congress and Exposition (ECCE). 2016: p. 1-7.
[20] Ventura C and Tina GM. Development of models for on-line diagnostic and energy assessment analysis
of PV power plants: the study case of 1 MW Sicilian PV plant. Energy Procedia, 2015. 83: p. 248-57.
[21] Santos-Martin D and Lemon S. SoL–A PV generation model for grid integration analysis in distribution
networks. Solar Energy, 2015. 120: p. 549-64.
[22] Yang H, Huang CM, Huang YC and Pai YS. A weather-based hybrid method for 1-day ahead hourly
forecasting of PV power output. IEEE transactions on sustainable energy, 2014. 5(3): p. 917-26.
[23] Wang F, Zhen Z, Mi Z, Sun H, Su S and Yang G. Solar irradiance feature extraction and support vector
machines based weather status pattern recognition model for short-term photovoltaic power forecasting.
Energy and Buildings, 2015. 86(Supplement C): p. 427-38.
[24] Shi J, Lee WJ, Liu Y, Yang Y and Wang P. Forecasting power output of photovoltaic systems based on
weather classification and support vector machines. IEEE Transactions on Industry Applications, 2012.
48(3): p. 1064-9.
[25] Gökmen N, Hu W, Hou P, Chen Z, Sera D and Spataru S. Investigation of wind speed cooling effect on
PV panels in windy locations. Renewable Energy, 2016. 90(Supplement C): p. 283-90.
[26] Beckman W A, Broman L, Fiksel A, Stanford A K, Thornton J. TRNSYS The most complete solar
energy system modeling and simulation software. Renewable Energy, 2014, 5(1): p. 486-8.
[27] International Electrotechnical Commission (IEC). Photovoltaic system performance monitoring-
guidelines for measurement, data exchange and analysis. IEC 61724, 1998.
[28] Lappalainen K, Valkealahti S. Analysis of shading periods caused by moving clouds. Solar Energy,
2016, 135: p. 188-196.
[29] Peng H, Long F and Ding C. Feature selection based on mutual information criteria of max-dependency,
max-relevance, and min-redundancy. IEEE Transactions on pattern analysis and machine intelligence,
2005. 27(8): p.1226-38.
[30] Wold S, Esbensen K and Geladi P. Principal component analysis. Chemometrics and intelligent
laboratory systems, 1987. 2(1-3): p.37-52.
[31] Wu CH, Tzeng GH and Lin RH. A Novel hybrid genetic algorithm for kernel function and parameter
optimization in support vector regression. Expert Systems with Applications, 2009. 36(3, Part 1):
p.4725-35.
[32] Shireen T, Shao C, Wang H, Li J, Zhang X and Li M. Iterative multi-task learning for time-series
modeling of solar panel PV outputs. Applied Energy, 2018. 212: p.654-62.
[33] Tahri F, Tahri A, Oozeki T. Performance evaluation of grid-connected photovoltaic systems based on
two photovoltaic module technologies under tropical climate conditions. Energy Conversion &
Management, 2018. 165: p.244-52.
[34] Stegner C, Dalsass M, Luchscheider P, Brabec CJ. Monitoring and assessment of PV generation based
on a combination of smart metering and thermographic measurement. Solar Energy, 2018. 163: p.16-
24.
[35] Maulik U, Bandyopadhyay S. Genetic algorithm-based clustering technique. Pattern Recognition, 2000.
33(9): p.1455-65.
[36] Shi Y, Eberhart RC. Empirical study of particle swarm optimization. 2002: p. 1945-50.
[37] Yang XS. Multiobjective firefly algorithm for continuous optimization. Engineering with Computers,
2013, 29(2): p.175-84.
[38] Mirjalili S, Mirjalili S, Lewis A. Grey Wolf Optimizer. Advances in Engineering Software, 2014. 69: p.
46-61.
[39] Mirjalili S. SCA: A Sine Cosine Algorithm for solving optimization problems. Knowledge-Based
Systems, 2016. 96: p.120-33.
[40] Mirjalili S, Mirjalili SZ, Saremi S, Faris H, Mirjalili S. Salp Swarm Algorithm: A bio-inspired optimizer
for engineering design problems. Advances in Engineering Software, 2017. 114: p. 163-91.
[41] Garrison J, Sahami K. A study of the division of global irradiance into direct beam and diffuse irradiance
at seven Canadian sites. Fuel & Energy Abstracts, 1995. 55(37): p. 201.