Forecasting solid waste generation in Juba Town, South ...

Forecasting weekly SW generation using ANNs and ARMA models in Juba Town, South Sudan

JEWM

Forecasting solid waste generation in Juba Town, South Sudan using Artificial Neural Networks (ANNs) and Autoregressive Moving Averages (ARMA) 1David Lomeling* and 2Santino Wani Kenyi

1,2Department of Agricultural Sciences, College of Natural Resources and Environmental Studies (CNRES), University of Juba, P.O. Box 82, Juba, South Sudan

Prediction of solid waste generation is critical for any long term sustainable waste management, especially of a fast-growing municipality. Lack of, or inaccurate solid waste generation records poses unparalleled challenges in developing cohesive and workable waste management strategies for any concerned authorities, as this is influenced by several interlinked demo-graphic, economic, and socio-cultural factors. The objective of this study was to compare two models in forecasting of MSW generation and how this would be built into an effective MSW management program. Two models, the Autoregressive Moving Average (ARMA 1,1) and the Artificial Neural Networks (ANNs) were tested for their ability to predict weekly waste generation of 14 households in Juba Town, Central Equatoria State (CES), South Sudan. Results showed that both the artificial intelligence model ANNs and the traditional ARMA model had good prediction performances; where for ANNs the RMSE, MAPE and r² were 0.080, 10.64%, 0.238 respectively, whereas for ARMA the RMSE, MAPE and r² were 0.102, 6.98% and 0.274 respectively. Both models showed no significant differences and could be therefore be used for Solid Waste (SW) forecasting. Based on the results, the weekly SW generated 52 weeks later (end of year) had reached 0.365 kg/capita indicating a 18.2% rise from 0.3 kg/capita at the start of the study. Under the current consumption rate, the weekly SW per capita in Juba Town is expected to reach 0.596 kg by 2020.

Keywords: Artificial Neural Networks, Autoregressive Moving Averages, Continuous Wavelet Transform, Waste Generation Forecasting, INTRODUCTION

South Sudan witnessed rapid economic growth rates with Gross Domestic Product (GDP) estimates of over $16billion (World Bank Report 2008) immediately after the signing of the Comprehensive Peace Agreement, CPA in 2005. Huge oil revenues from crude oil sales attracted investments and rapid population growth especially in urban centers like Juba. Conversely, this economic and population growth put enormous strain on the local environment and on the availability of natural resources (Lomeling et al., 2016) in terms of increased demand for areas for settlement as well as depleting the forest resources as cheap energy source. With the rapid population increase in Juba, waste generation

inadvertently also increased. This meant that more land and resources would have to be required for planned waste disposal site in the short to long term, if serious environmental pollution through indiscriminate waste disposal is to be abated.

*Corresponding author: David Lomeling, Department of Agricultural Sciences, College of Natural Resources and Environmental Studies (CNRES), University of Juba, P.O. Box 82, Juba, South Sudan. Email: [email protected]

Journal of Environment and Waste Management

Vol. 4(2), pp. 211-223, August, 2017. © www.premierpublishers.org. ISSN: XXXX-XXXX

Research Article

http://www.premierpublishers.org/


X2

Xn

w1

w2

wn

y

b

Lomeling and Kenyi 212 Prediction or forecasting of solid waste generation in Juba Town has become indispensable and crucial, if available resources are to be effectively deployed in the sustainable management of SW. Time series offers an important area of stochastic forecasting in which past observations of a specific variable are analyzed to develop a model that can be used to make future projections. Over the past decades, much effort has been undertaken in the development and improvement of time series models that can be applied for forecasting like the Artificial Neural Networks (ANNs) and the Auto-Regressive Moving Averages (ARMA) as compared to classical and traditional methods like linear and multiple regressions or polynomial functions. During the last two decades, several stochastic models have been used to predict SW waste like the support vector machine, (Abbasi et al., 2013); artificial neural networks (Abdoli et al., 2011); (Antanasijevic et al., 2013); hybrid procedure (Xu et al., 2013); time series analysis, (Mwenda et al., 2014); multi-step chaotic model (Song and He, 2014); principal component analysis and gamma testing (Noori et al., 2010); grey fuzzy dynamic modeling (Chen and Chang, 2010); Fourier series (Darko et al., 2016); simulated annealing based hybrid forecast (Song et al., 2014). In general, most ANN prediction models have clearly outlined architecture with specific number of input variables at the input layer and corresponding number of expected outputs at the output layer. Depending on the problem to be modeled, several input variables may be chosen for a given number of anticipated outputs. The choice of any one single, all-purpose model under any prevailing conditions is therefore unrealistic. Structurally, such a multi-purpose model would require more complex algorithms capable of handling several calculations simultaneously for some desired number of input variables and then come up with optimal predictions. The objective of this study was to compare the effectiveness of machine learning method (Artificial Neural Networks ANNs) and the stochastic linear model (Autoregressive Moving Average – ARMA) for medium to long-term weekly forecasting of solid waste generation in some households of Juba Town, South Sudan. The integration of Continuous Wavelet Transform (CWT) with both ANNs and ARMA models was primarily used for easy visualization and interpretation of input signals as well as frequencies in the time series. Using both models, a good estimate of the weekly SW per capita was to be made during the 2012-2020 forecasting period.

Forecasting Models

Artificial Neural Networks (ANN) ANNs is basically a computational approach whose architecture is mostly composed of three layers: an input, hidden and output layers mimicking the way biological neurons receive, transfer and output signals. Each neural

unit or perceptron is linked with many others and can either be enforcing once the summation function has surpassed some threshold value to be propagated or inhibitory in their effect once below the summation function value. The single neuron is illustrated by the McCulloch-Pitts Model (1943).

(input e.g. SW (b, bias that increases (output or

data) or lowers the net input of forecasted activation/sigmoid function) value) Figure 1: A model of a three-layers perceptron.

Mathematically, the output variable, y is the sum of the

individual weighted variables and bias that influences the

activation function S: (x1w1 + x2w2 + ⋯ xnwn)+b=y=𝑺 ∑ (xj

nj=1 wj + b)

Equation (1) The underlying concept is to arrive at a function that minimizes the error (E) between the input (actual) and output (forecasted) variables thereby enhancing the accuracy in the forecasting or prediction.

Emin(x,b)=𝑓[∑ x;w,b−yn

j=1 ]² Equation (2)

Usually the sums of each input signal (x1, x2, …xn) and intensity or weighted values (w1, w2,…wn) are passed on through a non-linear function known as an activation or transfer function that usually has a sigmoid shape, that is bounded, and differentiable as:

𝑺(x) =1

(1+e−x) Equation (3)

Many theoretical and experimental works have shown that a single hidden layer (with one or more several hidden nodes) is sufficient for ANN to approximate any complex nonlinear function (Dreiseitl and Ohno-Machado, 2002; Chattopadhyay and Bandyopadhyay 2007; Matias et al., 2013; Vishwakarma and Gupta, 2011; Aggarwal and Kumar 2015). A more plausible argument is that, the low number of nodes in the hidden layer directly linked to the input neuron have a low bias (b) and would tend to increase the input of the activation function. This in turn would enhance large changes in their weights and learn very quickly and so incur less errors as manifested by the high correlation coefficients for both training and test sets. In this study, a model based on a feedforward neural network with a single hidden layer was used. Hereby, the learning process in understanding hidden and strongly non-linear dependencies in the time series of the observed

X1

S

https://en.wikipedia.org/wiki/Non-linear_function

https://en.wikipedia.org/wiki/Activation_function

https://en.wikipedia.org/wiki/Transfer_function

https://en.wikipedia.org/wiki/Sigmoid_function


J. Environ. Waste Manag. 213

Figure 2a. Run plot sequence of weekly disposed solid waste showing seasonality in the time series

Figure 2b. Effects of alpha term (α) on exponential smoothing or the exponentially weighted moving average (EWMA)

and modeled data in the training and test were faster and the forecasting made easier. However, such forecasts can only be made for shorter prediction times and not for extensively longer future times as the error would tend to increase. Auto-Regressive Moving Average, ARMA model

The first step in developing the ARMA model was determining the stationarity of the time series in which case the mean and variance are time invariable. The autocorrelation function (ACF) may signify stationarity of the time series, if it cuts off or decomposes quite rapidly towards zero. Conversely, if the ACF decomposes very slowly and gradually towards zero, this would indicate non-stationarity and would need to be transformed or differenced to obtain stationarity by stabilizing the variance of a time series. Differencing helps stabilize the mean of a time series by removing changes in the level of a time series, and so eliminating trend and seasonality.

On the other hand, a time series that shows seasonality as in Figure 2a can be exponentially smoothened by an exponentially weighted non-parametric value (α) to “smoothen” the value X_t to a new value X ̂ recursively (Figure 2b):

X̂ = αXt + (1 − α)Xt−1 Equation (4)

where 0 ≤ α ≤ 1 Our data showed that the best seasonal exponential smoothing was when α=0,2 with r²=0,26; α=0,5 with r²=0,08 and α =0,8 with r²=0,04. The value α =0,2 best approximated the mean value and regression constant of the time series and therefore gave a better trend analysis. Owing to its simplicity, the exponential smoothing only “de-seasonalizes” a time series thereby assuming the nature of a linear regression equation between two variables. However, it´s predictive ability is inadequate in more complex stochastic and non-linear processes with more

y = 0,0011x + 0,3077r² = 0,04

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 8 15 22 29 36 43

Kg

of S

W p

er

ho

use

ho

ld

Time.: Weeks

Obs. a=0,2 a=0,5 a=0,8

y = 0,0011x + 0,3077r² = 0,04

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 8 15 22 29 36 43 50

Wee

kly

SW

/ho

use

ho

ld

Time. Weeks


Lomeling and Kenyi 214

than 3 or more input variables. Although some authors (Mwenda et al., 2014; Petridis et al., 2016); Rimaityte et al., 2011; Karpušenkaitė et al., 2016) have mentioned the theory behind single exponential smoothing, however, not much in terms of its practical applicability have yet been reported. The ARMA (p,q) model used herein is made up of an Autoregressive AR(p) and Moving Average MA(q) components. This means that, the forecast value of (X) at time (t) in a time series is a function of both linear combination of past X-innovations and a moving average

of series (𝜀t), known as white noise process characterized

by zero mean (𝜇 ) and variance (𝜎).

Xt = c + 𝜀t + ∑ αiXt−1 + ∑ βi𝑞i=1

𝑝i=1 𝜀t−i

Equation (5) With the values p and q identified from the ACF and PACF, the model parameters (αi) and (βi) can then be estimated. Once we had confirmed the stationarity of the time series, the autocorrelation (ACF) and partial autocorrelation functions (PACF) were used to determine the correlation and model structure of the data. MATERIALS AND METHOD This study focused on solid waste generated by single persons in 14 households in Kator residential area of Kator Payam, Juba County of Central Equatoria State in South Sudan. Collected waste was placed into a container whose tare weight was initially determined using the hanging scale. The net weight of the solid waste was then determined. The data were collected weekly over a period of 31 weeks as from June 2010 till January 2011. Each household had on average 6 persons with monthly income of about 650 SDG (Sudanese Guinee equivalent to 145$/month as of June 2010). About 40-60% of the collected waste was made up of predominantly degradable organic component consisting mainly of food residues and partly cartons and newspapers. The rest was made up of PET plastic water bottles. The waste was collected at the end of each week, weighed and the daily amounts per capita generated (kg/capita/week) was then calculated This work attempted to predict the weekly waste generated in the remaining 21 weeks till July 2011 based on the previous data. For the training set, data from the first 20 weeks were used representing 90% of the actual data. The rest 10 weeks representing 10% of the actual data were then used as test set. Data description and analysis From the weekly solid waste data reports, the Excel-based Alyuda Forecaster XL software was used to make future projections in the time series. Its algorithm allows an easy

data preprocessing of the neural networks. Additionally, the Continuous Wavelet Transfer (CWT) using the PAST3 software was used to illustrate through the spectral power the peaks or spikes of weekly solid waste disposal in the time series. Model identification

The effect of model choice on both correlation functions is shown in Figure 3 (a) and (b). The spikes presented in the ACF and PACF showed a correlation in the data every 2 lag units. The model identification revealed that with the cut-off at lag 1, the autoregressive of order p and the moving average of order q was also 1 as the ACF was got below zero after first lag. For illustrative purposes, the ARMA (1,2) as opposed to ARMA (1,1) was also used to compare the parameter values and how these influenced the model choice. The ARMA (1,2) in Figure 4 (a) and (b) as compared to ARMA (1,1) showed dissimilar AC and PAC functions at lag 1 and was outside the 95% confidence limits. The ARMA (1,1) was then chosen and there was therefore no need for any differencing.

Training Set: From this, 32 data entries of the weekly solid waste disposed per household were trained through a process of finding values for the weights (w) and biases (b) whereby the error between the measured and predicted values was minimized. The back-propagation algorithm was used here. The accuracy of the resulting training process was then applied for making projections in neural network model. Figure 5 shows the accuracy of the training set between the observed and forecasted values with a high r²=0.99.

Testing Set: This data set consisted of entries of the last weekly solid waste disposed per household and was used to test whether, or not the accuracy derived from the training process would provide the most appropriate solution and hence confirm the predictive power of the network model. Similarly, the test set showed high correlation coefficient of r²=0.99 as shown in the training set (Table 2).

Table 2: Comparison of the learning ability (MSE) between the training and test set of a ANN.

Training set Test set

Nr. of rows 32 12 Nr. of Good Forecast 30(94%) 12(100%) Average MSE 8,91E-05 1,25E-05 r² 0,99 0,99

Validation Set: Generally, this data set is used to minimize overfitting in the model. As in our case, given the high correlation coefficients of both the training and test sets and the high predictive power of the neural network model, there was therefore no need for a validation set. Figure 6 shows simulation runs of both the training and test sets and confirmed the accuracy of the ANN in predicting future solid waste data.



Figure 3. Autocorrelation plot of weekly per capita solid waste disposed in Juba (red dotted lines are upper and lower 95% confidence

limits). ACF plot of residuals of ARMA (1,1) with α1 = -0,999 and β1 = -0,800 (a) and PACF plot of residuals of ARMA (1,1) with α1 = -

0,999 and β1 = -0,800 (b).

Figure 4: ACF and PACF plot of residuals of ARMA (1,2) with AR (1) at α1 = -0,999 and AM (2) at β1 = -0,526 and β2 = -0,800

-0.60

-0.40

-0.20

0.00

0.20

0.40

0.60

0.80

1.00

1.20

0 1 2 3 4 5 6 7 8 9 10 11 12 13

Au

toc

orr

ela

tio

n F

un

cti

on

, A

CF

Lag (a)

ACF residuals ARMA (1,1)

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

0 1 2 3 4 5 6 7 8 9 10 11 12 13

Part

ial A

uto

co

rrela

tio

n

Fu

nc

tio

n, P

AC

F

Lag (b)

PACF residuals ARMA (1,1)

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

0 1 2 3 4 5 6 7 8 9 10 11 12 13

Au

toc

orr

ela

tio

n F

un

cti

on

, A

CF

Lag (a)

acf residuals of ARMA (1,2)

-0.6

-0.4

-0.2

0

0.2

0.4

0.6

0.8

1

1.2

0 1 2 3 4 5 6 7 8 9 10 11 12 13

Au

toc

orr

ela

tio

n F

un

cti

on

, A

CF

Lag (b)

pacf residuals of ARMA (1,2)



Figure 5. Scatter plot of a ANN for both training and test set of a weekly per capita solid waste in Juba Town

Figure 6. Simulation of the training and test data sets

Figure 7. Error distribution of training and test sets Figure 7 shows the error distribution of both the training and test sets. Whereas the error margin by the training set varied between -0,03 and 0,03, for the test set, this was between -0,006 and 0,006. The error magnitude in the training set was ten-fold more than that in the test set. Presumably, the larger the data set, the larger the variance between the observed and forecasted and hence the larger is the error margin and vice versa.

RESULTS ANN Model Our simulation was based on a simple network architecture that returned the smallest MSE and therefore the best prediction accuracy. The MSE and (r²) recorded for both the training and test sets have already been

y = 0,9798x + 0,0069r² = 0,99

0.1

0.2

0.3

0.4

0.5

0.6

0.1 0.2 0.3 0.4 0.5 0.6

Fore

cast

ed

Actual

Training set

y = 0,9855x + 0,0069r² = 0,99

0.43

0.45

0.47

0.49

0.51

0.43 0.48 0.53

Fore

cast

ed

Actual

Test set

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 4 7 10 13 16 19 22 25 28 31 34 37

Fore

scas

ted

Act

ual

Rows

Actual Forecasted

0.4

0.45

0.5

0.55

0.4

0.45

0.5

0.55

1 4 7 10 13

Fore

cast

ed

Act

ual

Rows

Actual Forecasted

-0.03

-0.02

-0.01

0

0.01

0.02

0.03

1 7 13 19 25 31 37Erro

r

Rows

Error by training set n=40

-0.006

-0.004

-0.002

0

0.002

0.004

0.006

1 2 3 4 5 6 7 8 9 10 11 12 13 14Err

or

Rows

Error by test set n=12


J. Environ. Waste Manag. 217 Table 3. Parameters of ARMA (1,1) and (1,2) models of weekly SW generated in Juba Town, South Sudan

Parameters Standard Error, SE t-statistics p-Value AIC LLF

ARMA

(1,1)

AR (1)

AM (1)

-0,999 (α1)

-0,888 (β1)

0,073

7,643

7,66E-10

-

96,56

50,28

ARMA

(1,2)

AR (1)

AM (1)

AM (2)

-0,999 (α1)

-0,526 (β1)

-0,278 (β2)

0,000

12,738

1,08E-16

-

96,34

51,17

ARMA

(1,3)

AR (1)

AM (1)

AM (2)

AM (3)

-0,999 (α 1)

-0,632 (β1)

-0,420 (β2)

0,289 (β3)

0,000

13,464

3,304E-17

-

99,05

53,53

presented in Table 2, from where we observed 1-1-1 (1 input layer, 1 hidden layer, and 1 output layer) gives an accurate prediction of the weekly solid waste output. Applying the rule-of-thumb method for estimating the number of neurons in the hidden layer reported by Karsoliya (2012), we can assume that the number of neurons in the hidden layer is approximately 1. (or 70-90%) of the input layer. Noticeably, the number of hidden neurons is equal to the number of input nodes whereby the larger number of neurons would ostensibly lead to “overfitting” whereas with a relatively smaller number of neurons would lead to “underfitting”. The good fit (r²) for both training and test sets shows that the ANN modeled the observed data quite accurately. There is generally no “golden rule” for the number of hidden layers that is applicable for all non-linear time series. Whereas during the training process other problems are best predicted with two hidden layers (Srinivasan et al., 1994; Zhang, 1994; Baron, 1994). Other studies (Wanas, et al., 1998) showed that the best performance of a neural network occurred when the number of hidden neurons was equal to log (N), where N is the number of training samples. Another study conducted by Mishra and Desai (2006) showed that the optimal number of hidden neurons is (2n+1), where n is the number of input neurons. ARMA Model

Although model identification as per Box-Jenkins methodology clearly showed an ARMA (1,1) autoregressive (p) and moving average (q), we experimented with different q values to see to what extent this influenced not only model estimation but also the diagnostic checking and consequently the forecast. Three different model MA values were varied while the AR was kept constant. i.e. ARMA (1,1); (1,2) and (1,3) respectively. The best model parameters were selected based on the model that gave the least Akaike Information Criterion

(AIC) value and highest likelihood estimation here denoted as Logarithmic Likelihood Function (LLF) Table 3. Judging by the values of AIC and LLF, it is evident that experimented values at ARMA (1,2) and (1,3) are close to the actual ARMA (1,1) values and suggested the adequacy of the ARMA (1,1) model. The scatter plot in Figure 8 shows a positive trend with several points around the trend line. The relatively low (r²) values suggested that there was neither an under- or overestimation for both ANN and ARMA models with most points between the 0,3-0,4 kg/week for both the observed and ANN-ARMA models. Whereas the ANN model had a r²-value of 0,238 this was 0,274 for ARMA model, these r²-values for both models were not significantly different from each other. The comparatively lower r²-value of the ANN would suggest the inability of a linear function in describing an entirely non-linear time series data. Performance evaluation The performance of either of the models was determined by measuring the difference between the observed and predicted values in the time series. Best estimates were those error values that were closer to zero, indicating less differences between the measured and observed values. Three accuracy measures were used as follows:

(1) Mean Absolute Percentage Error (MAPE):

MAPE

=1

𝑛∑ |

Pobs − Ppred

Pobs| 100

𝑛

𝑡=1

Equation (6)

(2) Root Mean Square Error (RMSE):



Figure 8. Comparing the ARMA (p, q) and ANN for the observed and forecasted weekly SW per capita

RMSE = √1

n(Pobs − Ppred)² Equation (7)

Weekly SW generation for the first 32 weeks which included both the training and test sets were later combined with the lead time 20 weeks and the errors between observed and predicted assessed as in Table 3. Based on the MAPE, RMSE performance comparison between both models, there was however no significant difference between both models. The slightly higher MAPE value for the ANN model could be due to inherent drawback of the MAPE in overestimating error value especially when the difference between observed and predicted values is zero. (Pobs. = 0). Table 4. Comparison of the model performances in terms of MAPE and RMSE of both ANNS and ARMA models.

ANN ARMA

MAPE (%) 10,60 6,98

RMSE 0,080 0,102

Ideally, the ARMA model would have shown poor performances in both MAPE and RMSE due to its inability to model non-linear variables as the ANN model. In complex and non-linear data in the time series, the ARMA model would inevitably lose predictive accuracy as it is unable to adequately capture the errors between the observed and forecasted values in the time series and so the prediction errors would increase (see Figure HH). Conversely, the ANN model is capable of identifying nonlinearity in the time series by having an additional computation or hidden layer that allows for better curve-

fitting with minimal errors between the forecasted and observed data. Diagnostic checking Based on the autocorrelation plot, the Ljung-Box (1978) test attempts to establish the overall data randomness within a time series at some chosen or predetermined lags. Basically, the null hypothesis (H0) assumes that the data are random or independently distributed whereas this is

the contrary for the alternative hypothesis (Ha) that assumes the non-randomness nature of the data. We then performed the Ljung-Box statistic test as:

Q (LB)= ∑𝑝𝑘

2

𝑛−𝑘

𝑁𝐾

𝑘=1 Equation (8)

Where n=sample size, �̂�𝑘2=sample correlation at lag k,

𝑁𝐾=number of lags being tested. Choosing the �̂�𝑘2 at lag 7

and testing at the p=0,05 confidence level, the Q(LB) value at 0,645 was less than the (chi-square) at 2.013 thereby

reaffirming the H0 and adequacy of the ARMA (1,1) model. As aforementioned the Q(LB) for randomness is reinforced by the autocorrelation functions as in Figures 3 and 4 Hereby, all the autocorrelation at the subsequent lags fall within the 95% confidence limits, other than that at lag 0. Model verification Model verification dealt with ascertaining whether the residuals of the ARMA model as expressed by the ACF and PACF had any discernible and systematic patterns

y = 0,3139x + 0,2314r² = 0,274

y = 0,3355x + 0,2367r² = 0,238

0

0.1

0.2

0.3

0.4

0.5

0.6

0

0.1

0.2

0.3

0.4

0.5

0.6

0 0.1 0.2 0.3 0.4 0.5 0.6 0.7

AR

MA

AN

N

Actual

ARMA Forecast ANN Forecast



Figure 9. Forecasting SW using the ANN and ARMA (1, 1) models

Figure 10. Weekly SW forecast of households in Juba Town using the ARMA (1,1) model

with respect to the lags. Our study showed that, none of these correlations was significantly different from zero at the 95% confidence limit indicating the goodness of fit and appropriateness of the ARMA model. Forecasting The two forecasting models ARMA and ANNs presented in this paper (Figures 9 and 10) allowed us to predict the weekly generated SW with mean value of about 0.35 kg/household and upper and lower limits of about 0.63 and 0.16 kg/household respectively. Towards the 51st week, the generated SW was well below the 0.4 kg level and would under constant economic and political conditions remain below the 0.5 kg level till 2020. This forecasting is useful in mobilizing and optimizing available financial

resources and personnel needed for effective SW management.

2.2 Continuous Wavelet Transform (CWT)

One of the main goals of a CWT is to enable the easy visualization and interpretation of input signals and frequencies as a function of time. The CWT decomposes a continuous time function of a time series into the components called wavelets each with a different localized frequency. Although CWT are usually applied for non-stationary signals, we tried to apply both the Morlet wavelet transform and the Derivative of Gaussian (DOG) to account for the high instantaneous amplitude or signal outbursts in the time series as well as in the recognition of inherent frequency patterns. The Morlet wavelet transform (Goupillaud, et al., 1984) is given as:

y = 0,0011x + 0,3077r² = 0,04

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 8 15 22 29 36 43 50

AN

N

AR

MA

(1,

1)

Time. Weeks

Obs. ARMA (1, 1) ANN

y = 0,0011x + 0,3033r² = 0,06

-

0.10

0.20

0.30

0.40

0.50

0.60

0.70

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

1 6 11 16 21 26 31 36 41 46 51

Fo

reca

st

Actu

al

Time: Weeks

SW forecast using ARMA (1, 1) model

Actual Forecast



𝜗𝑤0(𝑡) = 𝐾𝑒𝑖𝑤0𝑡𝑒

−𝑡2

2 Equation (9)

Where 𝑤0 is the non-dimensional frequency and the vertical scale corresponds to the length of wavelet, i.e., the number of time steps used for the CWT. The cone of influence (COI) is the region where the wavelet power spectra are limited due to the influence of the end points of finite length signals also known as the e-folding time. Here, the signal discontinuity drops by a factor of (e-2) and ensures that edge effects are negligible beyond this point. The DOG with m=derivative was set at 6 and expressed as:

DOG = (−1)𝑚+1

√(𝑚+1

2)

𝑑𝑚

𝑑𝑚 (𝑒−2

2 ) Equation (10)

Values of m=2 or 4 using DOG basis functions did not describe the spectral decomposition in the time series adequately. The choice of the wavelet used for time-frequency decomposition in a time series is critical. The use of the Morlet wavelet for example, showed that the frequency resolution was “lumped” together and was localized within a 95% confidence limit. On the contrary, using Derivative of Gaussian (DOG) wavelet with m=6, (Figure 11 a and b), the result was good time localization with strong frequencies. The Morlet and Derivative of Gaussian (DOG) wavelets are plotted below. The choice for the type of wavelets in interpreting the frequency and intensity of data entries (xn) in a time series are shown in Figures 11 and 12. We used the forecasted data from both models as input data to generate the respective wavelets. For both models, (Figure 11a and b), the dominant power spectrum was characterized by smaller spikes between log scale 1.2 and 3.2 from week15 to 30. This time coincided with the highest weekly waste generation around Christmas season where expectedly, more households had much disposable incomes enabling an increased consumption and so increased solid waste generation. However, the ARMA as opposed to the ANNs model showed certain areas outside the COI, clearly an indication of model overestimation by the former. In both cases, using the Morlet basis function (Figure 11c and d) showed poor spectral decomposition as a function of time than the DOG m=6.

As aforementioned with the Ljung-Box randomness test in a time series with the ARMA model, the resulting power spectrum using the CWT can as well exhibit coherent and significant structures. A test of significance can be used to distinguish between significant and random structures (Mohr, 2003) in which case values in a power spectrum may be considered as statistically significant at the 95% level and therefore not random.

For the null hypothesis (H0), we assumed that the time series had a mean power spectrum. Spikes or peaks in the wavelet power spectrum above this background spectrum were shown as black contoured spectrum “significant at the 5% level” or equivalent to “the 95% confidence level”. Therefore, if the peak in the power spectrum of the Morlet and DOG wavelets are significantly larger than the background spectrum, it is then assumed to be a true feature. There was better interpretation of the spectral decomposition for the observed data as well as for both models when the DOG m=6 as opposed to Morlet.basis function was applied. SUMMARY AND CONCLUSION In this study on time series models, we analyzed and compared the Artificial Neural Networks (ANNs) and the Autoregressive Moving Averages (ARMA) in forecasting the weekly amounts of solid waste generated by single persons in fourteen households of Kator residential area of Juba town. For the ANNs model, the input training data used were the average weekly amounts of solid waste collected from during the month of June 2011. The test data of July 2011 were then used to validate the ANNs model. The result showed that ARMA (1,1) slightly outperformed the ANNs model in terms of MAPE but not in terms of the RMSE. However, considering both performance indicators, there was no significant differences between both models and so either model could be used to forecast the amount of the solid waste generation for the next weeks and years. Using both models is, however, for comparative reasons imperative in order qualify and quantify the extent of deviation of the estimated values from the observed mean. The projected values showed that by 2020, the weekly generation according to both models will have reached about 0.596 kg/capita with a 95% confidence interval lying between 0.2-0.6 kg/capita. Such projections may be used by Juba municipal or town council in the proper planning and management of solid waste. ACKNOWLEDGEMENTS My special gratitude goes to Mr. Santino Wani Kenyi for data collection as part of his BSc dissertation. We the authors also wish to extend our thanks to the households in Kator Payam for allowing the conduction of the interviews.



(a) (b)

(c) (d)

Figure 11. (a) Weekly solid waste output signals for the ARMA model when using the DOG m=6 (b) for the ANNs models. Whereas when using the Morlet basis function for ARMA (d) for ANNs model. Wavelet power spectrum showing cone of influence (COI) at the 95% confidence interval with areas of intense peaks and signals in red contoured in (black) while those with poor signals in (blue).



(a) (b)

Figure 12. (a) Weekly solid waste output signals using the Morlet basis function for the observed data and (b) using the DOG m=6 basis function. Wavelet power spectrum showing cone of influence (COI) at the 95% confidence interval with areas of intense peaks and signals in red contoured in (black) while those with poor signals in (blue).

REFERENCES Abbasi M, Abduli MA, Omidvar B, Baghvand A (2013).

Results uncertainty of support vector machine and hybrid of wavelet transform-support vector machine models for solid waste generation forecasting. Environmental Progress and Sustainable Energy. 33:220.

Abdoli, MA, Nezhad MF, Sede RS, Sadegh B (2011). Environmental Progress and Sustainable Energy. 31 (4): 628–636.

Aggarwal R, Kumar K (2015). Effect of Training Functions of Artificial Neural Networks (ANN) on Time Series Forecasting. International Journal of Computer Applications. (0975 – 8887), 109(3): 14-17.

Antanasijevic D, Pocajt, V, Popovic I, Redzic N, Ristic M (2013). The forecasting of municipal waste generation

using neural networks and sustainability indicators. Sustain Science. 8: 37-46.

Asante-Darko D, Adabor ES, Amponsah SK (2012). A Fourier series model for forecasting solid waste generation in the Kumasi metropolis of Ghana. Transactions on Ecology and The Environment. 202: 173-185.

Barron, AR (1994). A comment on ‘‘Neural networks: A review from a statistical perspective’’. Statistical Science. 9 (1): 33–35.

Chattopadhyay, S and Bandyopadhyay, G (2007). Single hidden layer artificial neural network models versus multiple linear regression model in forecasting the time series of total ozone. Int. J. Environ. Sci. Tech. 4 (1): 141-149.

Chen HW and Chang NB (2000). Prediction analysis of solid waste generation based on grey fuzzy dynamic modeling. Resour. Conserv. Recycling. 29, 1-18.


J. Environ. Waste Manag. 223 Dreiseitl S, Ohno-Machado L. (2002). Logistic regression

and artificial neural network classification models: a methodology review. J Biomed Inform. 35: 352–359.

Foster G (1996). Wavelets for period analysis of unevenly sampled time series. Astron J. 112: 1709-1729.

Frick P, Baliunas SL, Galyagin, D (1997). Wavelet analysis of stellar chromospheric activity variations. Astrophysics J. 483: 426-434.

Goupillaud, P., Grossman, A., Morlet, J (1984). Cycle-octave and related transforms in seismic signal analysis. Geo-exploration. 23: 85.102.

Karpušenkaitė A, Denafas G, Ruzgas T. (2016). Forecasting Hazardous Waste Generation using Short Data Sets: Case Study of Lithuania. Environmental Protection Engineering. 8(4): 357–364.

Karsoliya S. (2012). Approximating Number of Hidden layer neurons in Multiple Hidden Layer BPNN Architecture. International Journal of Engineering Trends and Technology. 3(6): 714-717.

Ljung, GM, Box, GEP (1978). On a Measure of a Lack of Fit in Time Series Models”. Biometrika. 65(2), 297–303.

Lomeling D, Modi, A. L., Kenyi, M. S., Kenyi, M. C., Silvestro, G. M., Yieb, J. L. L. (2016): Comparing the Macroaggregate Stability of Two Tropical Soils: Clay Soil (Eutric Vertisol) and Sandy Loam Soil (Eutric Leptosol). International Journal of Agriculture and Forestry. 6(4): 142-151.

Matias T, Souza F. Araújo R, Carlos AH (2013). Learning of a single-hidden layer feedforward neural network using an optimized extreme learning machine. Neurocomputing. 129: 428–436.

McCulloch W and Pitts W (1943). A logical calculus of the ideas immanent in nervous activity. Bulletin of Mathematical Biophysics. 5:115–133.

Mishra AK, Desai VR. (2006). Drought forecasting using feed-forward recursive neural network. Ecological Modelling. 198 (1–2): 127–138.

Mohr, LB (2003). Understanding significance testing. Sage Pub., Inc., 2003.

Mwenda A, Kuznetsov D, Mirau S (2014). Time series forecasting of solid waste generation in Arusha city- Tanzania. Mathematical Theory and Modelling. 4 (8): 29–39.

Noori R, Karbassi A, Salman SM. (2010). Evaluation of PCA and Gamma test techniques on ANN operation for weekly solid waste. J Environ Management. 91(3):767-71. doi: 10.1016/j.jenvman.2009.10.007.

Petridis NE, Stiakakis E, Petridis K, Dey P (2016). Estimation of computer waste quantities using forecasting techniques. Journal of Cleaner Production. 112: 3072-3085.

Rimaityte I, Ruzgas T, Denafas G, Racys V, Martuzevicius D (2011). Application and evaluation of forecasting methods for municipal solid waste generation in an eastern-European city. Waste Management & Research. 30(1) 89–98.

Song J, He J, Zhu M, Tan D, Zhang Y, Ye S, Shen D, Zou P (2014). Simulated Annealing Based Hybrid Forecast for Improving Daily Municipal Solid Waste Generation Prediction. The Scientific World Journal. 1-7.

Song J and He J (2014). A Multistep Chaotic Model for Municipal Solid Waste Generation Prediction. Environmental Engineering Science. 31(8): 461-468.

Srinivasan D, Liew AC, Chang, CS. (1994). A neural network short-term load forecaster. Electric Power Systems Research. 28: 227–234.

Sweldens W. (1998). The lifting scheme, a construction of second generation wavelets. SIAM J Math. Anal. 8(29): 511-546.

Vishwakarma VP and Gupta MN (2011). A New Learning Algorithm for Single Hidden Layer Feedforward Neural Networks. International Journal of Computer Applications. (0975 – 8887), 28(6):26-33.

Wanas N, Auda G, Kamel MS, Karray F. (1998). On the optimal number of hidden nodes in a neural network. Proceedings of the IEEE Canadian Conference on Electrical and Computer Engineering., 918–921.

Xu L, Gao P, Cui S, Liu C (2013). A hybrid procedure for MSW generation forecasting at multiple time scales in Xiamen City, China. Waste Management, 33(6):1324-31.

Zhang X. (1994). Time series analysis and prediction by networks. Optimization Methods and Software. 4: 151–170.

Accepted 23 August 2017

Citation: Lomeling D and Kenyi SW (2017) Forecasting

solid waste generation in Juba Town, South Sudan using

Artificial Neural Networks (ANNs) and Autoregressive

Moving Averages (ARMA). Journal of Environment and

Waste Management 4(2): 211-223.

Copyright: © 2017 Lomeling and Kenyi. This is an open-

access article distributed under the terms of the Creative

Commons Attribution License, which permits unrestricted

use, distribution, and reproduction in any medium,

provided the original author and source are cited.

http://dx.doi.org/10.1093/biomet/65.2.297



http://www.sciencedirect.com/science/article/pii/S0925231213009314

http://www.sciencedirect.com/science/article/pii/S0925231213009314

http://www.sciencedirect.com/science/journal/09252312

http://www.sciencedirect.com/science/journal/09252312/129/supp/C

https://en.wikipedia.org/wiki/Warren_McCulloch

https://en.wikipedia.org/wiki/Walter_Pitts

https://www.ncbi.nlm.nih.gov/pubmed/?term=Noori%20R%5BAuthor%5D&cauthor=true&cauthor_uid=19913989

https://www.ncbi.nlm.nih.gov/pubmed/?term=Karbassi%20A%5BAuthor%5D&cauthor=true&cauthor_uid=19913989

https://www.ncbi.nlm.nih.gov/pubmed/?term=Salman%20Sabahi%20M%5BAuthor%5D&cauthor=true&cauthor_uid=19913989

https://www.ncbi.nlm.nih.gov/pubmed/19913989/

Forecasting solid waste generation in Juba Town, South ...

Documents

Transcript of Forecasting solid waste generation in Juba Town, South ...