Testing Crude Oil Market Efficiency Using Artificial Neural Networks
Manel HAMDI
International Finance Group Tunisia, Faculty of Management and Economic Sciences of
Tunis, Tunisia, El Manar University, Tunis cedex, C.P. 2092, El Manar Tunisia
Phone: +21697551459
Email: [email protected]
Abstract
This paper evaluates the weak-form efficiency of the crude oil market using an artificial
neural network (ANN) model. Based on daily historical data of the West Texas
Intermediate (WTI) crude oil spot price over the period 02 January 1986 - 31 December
2013, the model was trained using the backpropagation algorithm. The output of the neural
network represents the predicted prices, which are treated as trading signals (buy or sell)
for investors. Furthermore, an empirical investigation of profitability was conducted.
Compared with a naïve trading strategy based on the Random Walk (RW), the profitability
results show that the ANN model outperformed the RW model. Therefore, the crude oil market
is inefficient according to the Efficient Market Hypothesis (EMH). From these findings, we
can argue that it is possible to earn excess profits with a trading strategy based on the
information embedded in historical crude oil prices. Finally, the proposed neural
network-based approach offers an interesting trading rule for practitioners to make or
support their investment decisions.
JEL Classification: G14, Q47, C45.
Keywords: Efficient market hypothesis, crude oil price prediction, artificial neural network.
1. Introduction
The Efficient Market Hypothesis (EMH) is of special interest to financial institutions and
organizations. According to Fama (1970, 1991), a market is weak-form efficient if asset
prices immediately reflect all newly available and relevant information.
Although this is an interesting research topic, few studies have addressed it,
especially for the oil commodity. First, Tabak and Cajueiro (2007) investigated
the efficiency of the crude oil markets, including WTI and Brent. The authors estimated the
fractal structure of these time series using the time-varying Hurst exponent test for a data
sample of daily closing prices covering the period (May, 1983 - July 2004). Their results
pointed out the presence of a fractal structure in oil price time series. Moreover, they showed
that crude oil market efficiency changes over time and that these markets have become more
efficient over time. According to these authors, the WTI crude oil price seems to be more
weak-form efficient than Brent prices. Similar conclusions were provided by Alvarez-Ramirez
et al. (2008), who concluded that over long horizons the crude oil market is consistent with
the weak-form efficient market hypothesis (WFEMH hereafter). The authors examined the
autocorrelation of international crude oil markets by estimating the Hurst exponent dynamics
for the returns of several typical oil mixtures (Europe Brent, WTI Cushing and Dubai)
over the sample period 1987-2007. In another study, Maslyuk and Smyth (2008)
also analyzed the WFEMH in the crude oil markets based on unit root tests. Using
weekly spot and futures prices for WTI and Europe Brent (January 1991 - December 2004),
the authors showed that future spot and futures prices cannot be predicted from
historical price data and concluded that crude oil markets seem to be weak-form efficient.
In a more recent study, Charles and Darné (2009) investigated the WFEMH by testing the
random walk hypothesis with variance ratio tests. More precisely, the authors employed
the non-parametric variance ratio tests suggested by Wright (2000) and Belaire-Franch and
Contreras (2004), as well as the wild-bootstrap variance ratio tests developed by Kim (2006).
Using daily closing spot prices for two crude oil markets (the US WTI and the UK Brent) over
the period June 1982 - July 2008, the authors revealed that the Brent crude oil market is
weak-form efficient while the WTI crude oil market seems to be inefficient over the period
1994-2008. In another work, Alvarez-Ramirez et al. (2010) used a lagged version of
detrended fluctuation analysis to study the efficiency of the crude oil market and to detect
delay effects in spot WTI price autocorrelations over the period 1986 to 2009. Based on their
empirical findings, they concluded that negative or positive autocorrelations can be concealed
by delay effects. Using weekly spot FOB crude oil prices for four OPEC members that are also
members of the Gulf Cooperation Council (GCC), namely Kuwait, Qatar, Saudi Arabia and the
United Arab Emirates (UAE), Arouri et al. (2010) applied a state space model and found
strong evidence of short-term predictability in crude oil price movements over time.
Nevertheless, the hypothesis of convergence towards weak-form informational efficiency
could not be verified for all markets. More recently, Ortiz-Cruz et al. (2012) employed
multiscale entropy analysis techniques to investigate the informational efficiency of the crude
oil markets. Results based on daily closing spot prices of WTI from January 1, 1986 to
March 15, 2011 showed that the crude oil market is informationally efficient over the whole
period except during the US economic recessions of the early 1990s and the late 2000s. In this
context, we apply a neural network approach to market forecasting and trading.
In section 2, we describe the proposed ANN model used to test the EMH and the data sample
used for this purpose. The empirical investigation and its results are presented in the same
section. Finally, we conclude in section 3.
2. Test of WTI crude oil market weak-form efficiency: Empirical investigation
2.1. Data sample description
A sample of WTI crude oil spot prices (see Fig. 1), covering the period from January 2,
1986 to December 31, 2013, is used to predict the future value of the WTI crude oil price. The
daily data were obtained from the US Energy Information Administration website. 80% of the
data set (5651 observations) forms the training sample, which is used to estimate the
parameters of the network (synaptic weights and biases), and the remainder (1413 observations)
is the checking sample, which is used to test the predictive ability of the network.
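The 80/20 partition described above can be sketched as follows. This is a minimal illustration, assuming the split is chronological (the first 80% of days for training, the remainder for checking); the function name `chronological_split` and the placeholder series are hypothetical and stand in for the actual WTI data.

```python
import numpy as np

def chronological_split(series, train_fraction=0.8):
    """Split a time series into training and checking parts without shuffling."""
    series = np.asarray(series, dtype=float)
    n_train = int(len(series) * train_fraction)  # floor of 7064 * 0.8 is 5651
    return series[:n_train], series[n_train:]

# Placeholder values standing in for the 7064 daily WTI prices (not real data).
prices = np.linspace(10.0, 100.0, 7064)
train, check = chronological_split(prices)
print(len(train), len(check))  # 5651 1413
```

With 7064 observations, this reproduces the sample sizes reported in the text (5651 and 1413).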
Figure 1. The crude oil spot price of WTI (Time: 7064 working days)
As illustrated in Figure 1, the crude oil market is characterized by high volatility and
marked by pronounced peaks and falls due to unpredictable events (wars, embargoes, crises,
revolutions…) that have occurred in the history of the oil market.
Following Ghaffari and Zare (2009), we introduce a smoothing algorithm in order to
reduce the effect of unforeseen short-term disturbances of the oil market while preserving
the main long-term characteristics of its dynamics. In this study, we use the default
smoothing method provided by the Matlab software package, a fifth-order moving average
filter (see Fig. 2).
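As a rough illustration of this smoothing step, the sketch below applies a centered five-point moving average in Python rather than Matlab. The handling of the first and last points (shrinking the window symmetrically) is an assumption, since the text does not say how the series endpoints are treated; the function name `moving_average` and the sample values are hypothetical.

```python
import numpy as np

def moving_average(prices, span=5):
    """Centered moving average; the window shrinks symmetrically near the edges."""
    prices = np.asarray(prices, dtype=float)
    half = span // 2
    n = len(prices)
    smoothed = np.empty(n)
    for t in range(n):
        k = min(half, t, n - 1 - t)  # largest symmetric half-width available at t
        smoothed[t] = prices[t - k : t + k + 1].mean()
    return smoothed

# A single spike is spread over its neighbours, which lowers the peak.
actual = np.array([0.0, 0.0, 10.0, 0.0, 0.0])
smoothed = moving_average(actual)
error = actual - smoothed  # difference between actual and smoothed values
print(smoothed[2], error[2])  # 2.0 8.0
```

The same mechanism explains why a sharp price peak is lower in the smoothed series than in the actual one.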
Figure 2. The smoothed crude oil spot price of WTI (Time: 7064 working days)
The actual and smoothed time series of WTI from January 2, 1986 to
December 31, 2013 are depicted in Fig. 3. The error (see Fig. 4) is the difference between
the actual and smoothed price values.
Figure 3. Actual vs. smoothed crude oil spot price of WTI (Time: 7064 working days)
Figure 4. Error between actual and smoothed crude oil spot price of WTI (Time: 7064
working days)
[Figure 3 plot: WTI crude oil price (US$ per barrel) vs. working day; series: actual crude oil price, smoothed crude oil price]
[Figure 4 plot: error vs. working day]
To demonstrate the usefulness of the smoothing function, we plot the actual and smoothed
prices (see Fig. 5) only for July 2008, the month in which the price of crude oil reached
its highest value (145.31 US$/barrel). After applying the smoothing procedure, this peak is
reduced by 3.81 US$/barrel. We conclude that the smoothing procedure reduces the effect of
short-term noise.
Figure 5. Actual vs. Smoothed crude oil spot price of WTI (July 2008)
2.2. Artificial neural network model
2.2.1. ANN Structure
An artificial neural network is a nonlinear model inspired by the functioning of the human
brain: it acquires knowledge through a learning process. The standard design of an ANN
generally consists of an input layer (which contains, in our case, the historical
smoothed prices P_t of WTI crude oil), one or more hidden layers and an output layer (which
gives the predicted future prices P_{t+1} of WTI crude oil), interconnected as
depicted in Figure 6.
Figure 6. Fully interconnected neural network with one hidden layer
[Figure 5 plot: WTI crude oil price (US$ per barrel) vs. day of July 2008; series: actual oil price, smoothed oil price; marked points (day, price): (3, 145.3), (3, 141.5), (9, 145.2), (9, 141), (14, 131.4), (14, 128.2), (21, 126.7), (21, 124.6)]
The state of the output neuron is determined by the following formula:
$$P_{t+1} = g_2\left( \sum_{j=0}^{m} w_{kj}^{(2)} \, g_1\left( \sum_{i=0}^{N} w_{ij}^{(1)} P_t + b_1 \right) + b_2 \right) \tag{7}$$
where $P_t$ are the inputs of the network; $N$ is the total number of observations of input prices; $m$ is the
number of nodes in the hidden layer; $k$ is the number of units in the output layer; $g_1$ and $g_2$ are the
transfer/activation functions; $w^{(1)}$ is the weight matrix of the hidden layer; $w^{(2)}$ is the weight
matrix of the output layer; and $b_1$ and $b_2$ are the bias vectors of the hidden layer and output layer,
respectively.
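The output formula can be illustrated with a small forward pass. This is only a sketch under stated assumptions: the hidden transfer function g1 is the hyperbolic tangent and the output transfer function g2 is linear (as chosen later in the paper), while the number of input lags (5), the random weights and the function name `forward` are hypothetical and are not taken from the study.

```python
import numpy as np

def forward(p_inputs, w1, b1, w2, b2):
    """One-hidden-layer network: tanh hidden units (g1), linear output unit (g2)."""
    hidden = np.tanh(w1 @ p_inputs + b1)  # g1 applied to the weighted inputs plus bias b1
    return w2 @ hidden + b2               # g2 is the identity, so the output is linear

rng = np.random.default_rng(0)
n_inputs, n_hidden = 5, 10                # illustrative sizes only
w1 = rng.normal(size=(n_hidden, n_inputs))
b1 = rng.normal(size=n_hidden)
w2 = rng.normal(size=(1, n_hidden))
b2 = rng.normal(size=1)

p_next = forward(np.ones(n_inputs), w1, b1, w2, b2)
print(p_next.shape)  # (1,)
```

Training (for example with backpropagation) would adjust `w1`, `b1`, `w2` and `b2`; the sketch only shows how a prediction is produced once the weights are fixed.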
2.2.2. ANN Topology
Considerable expert knowledge and many experiments are needed to obtain an
optimal ANN topology, because there are no established rules for finding the best ANN
configuration for a particular problem (Lackes et al., 2009).
Several factors must therefore be controlled to select the optimal network architecture:
- The number of hidden layers
In our experiment, we used one or two hidden layers, which is regarded as the ideal
architecture for obtaining good forecasting results (Zhang et al., 1998).
- The choice of activation function
According to Haykin (1999), the sigmoid and the hyperbolic tangent functions are the most
widely used in financial applications. In this study, we use the hyperbolic tangent function
as the transfer function (g1) of the network, as in recent financial applications
(Yonaba et al., 2010; Jammazi and Aloui, 2012).
- Learning rate and training algorithms
The network was trained with the backpropagation algorithm, specifically the
Levenberg-Marquardt algorithm, as it is the fastest training function (Kulkarni and Haidar,
2009). After several experiments, we set the learning rate to 0.01 because this value gave
the best solution. We note that Haidar et al. (2008) used the same learning rate.
- The number of hidden nodes
There are no universal standards for defining the number of hidden neurons. The ideal is to
use the smallest number of units that achieves the best prediction results, as too many nodes
could induce an overfitting problem and too few could cause an underfitting problem
(Kaastra and Boyd, 1996).
In this study, we follow an approach similar to that of Rosiek and Batlles (2010) and
Haidar et al. (2008) to determine the number of hidden neurons. This approach consists of
training and testing the network for a fixed number of iterations (1000 iterations in our
experiment), beginning with a small number of units and increasing it gradually until the
optimal number of nodes is reached. In this work, we start with one hidden node and try up to
a maximum of 20 hidden neurons. According to Table 1 and Figure 7, the best out-of-sample
result, i.e. the minimum of
$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{Target}_i - \mathrm{Output}_i \right)^2,$$
is obtained with 10 hidden neurons.
Number of hidden nodes   MSE value   Number of hidden nodes   MSE value
1                        1.8426294   11                       2.0841049
2                        1.8986168   12                       2.2036816
3                        1.9962819   13                       2.204476
4                        2.1632104   14                       2.9094379
5                        2.1104935   15                       1.8470987
6                        2.1625132   16                       2.5390865
7                        2.2531985   17                       1.8554263
8                        2.3961847   18                       2.7154749
9                        2.5702552   19                       2.3407171
10                       1.3171001   20                       1.7725174
Table 1. MSE statistics vs. the increase of hidden neurons
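The selection rule behind Table 1 can be sketched as a simple search over candidate sizes. In the sketch below, `select_hidden_nodes` and `score_fn` are hypothetical names; in a real experiment `score_fn` would train a network with the given number of hidden nodes and return its out-of-sample MSE, whereas here it simply looks up the values reported in Table 1.

```python
def select_hidden_nodes(candidates, score_fn):
    """Return the candidate size with the lowest out-of-sample MSE and that MSE."""
    scores = {m: score_fn(m) for m in candidates}
    best = min(scores, key=scores.get)
    return best, scores[best]

# MSE values copied from Table 1 (number of hidden nodes -> out-of-sample MSE).
table1_mse = {
    1: 1.8426294, 2: 1.8986168, 3: 1.9962819, 4: 2.1632104, 5: 2.1104935,
    6: 2.1625132, 7: 2.2531985, 8: 2.3961847, 9: 2.5702552, 10: 1.3171001,
    11: 2.0841049, 12: 2.2036816, 13: 2.204476, 14: 2.9094379, 15: 1.8470987,
    16: 2.5390865, 17: 1.8554263, 18: 2.7154749, 19: 2.3407171, 20: 1.7725174,
}

best_m, best_mse = select_hidden_nodes(range(1, 21), table1_mse.__getitem__)
print(best_m, best_mse)  # 10 1.3171001
```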
Figure 7. MSE evolution vs. the increase of hidden neurons
[Figure 7 plot: MSE vs. number of hidden nodes]
In conclusion, the proposed model is a backpropagation neural network with a single hidden
layer of ten neurons, the hyperbolic tangent as the activation function in the hidden layer
and a linear transfer function (g2) in the output layer.
2.3. Empirical results and interpretation
Once the training process was completed, we assess the prediction quality of the
ANN model. To do this, we compare the estimated (predicted) and
the real (actual/target) values of the crude oil price.
Before this step, we must verify the quality of the ANN training. The
performance of a trained network can be measured by the correlation coefficient (R-value)
derived from the regression analysis presented in Fig. 8.
Figure 8. Comparison between the estimated and actual values of crude oil price : in training
part
In Fig. 8, the dashed line indicates the perfect fit (ANN outputs equal to
targets). The circles represent the data points and the solid blue line shows the best
linear fit between network responses and targets. In this empirical study, it is difficult to
distinguish the best linear fit line from the perfect fit line because the fit is so good.
Moreover, the closer the R-value is to 1, the stronger the correlation between targets and
outputs. In our study, the R-value (0.99972) is very close to 1, which indicates a good fit.
After checking the quality of the network training, we analyse the prediction results based on
the Root Mean Square Error (RMSE) and Mean Absolute Error (MAE) performance
measures, and also on the correlation coefficient (R-value) as a decisive factor in this
specific problem (see Table 2).
MSE      RMSE     MAE      R
1.3171   1.1476   0.8740   0.9988
Table 2. Performance criteria
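As accuracy is the most important criterion for judging forecasting models, we select the two
main metrics, RMSE and MAE, which can be expressed as follows:
$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{Target}_i - \mathrm{Output}_i \right)^2} \tag{1}$$
$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| \mathrm{Target}_i - \mathrm{Output}_i \right| \tag{2}$$
where N = 1413 is the number of observations in the checking sample (i = 1, …, N).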
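A minimal sketch of how these error measures can be computed is given below. The function name `forecast_errors` is hypothetical, and the three sample pairs are the first real and predictive values listed in the paper's table of 30 trading days, used purely as an illustration rather than as the full checking sample.

```python
import numpy as np

def forecast_errors(target, output):
    """Return MSE, RMSE and MAE between target and output values."""
    target = np.asarray(target, dtype=float)
    output = np.asarray(output, dtype=float)
    diff = target - output
    mse = np.mean(diff ** 2)
    return {"MSE": mse, "RMSE": np.sqrt(mse), "MAE": np.mean(np.abs(diff))}

# Three (real, predictive) pairs taken from the 30-day illustration table.
real = [100.318, 100.808, 101.726]
predicted = [100.2859769, 100.6994656, 101.1467217]
errors = forecast_errors(real, predicted)
print(errors)
```

Because the RMSE is the square root of the MSE, the reported RMSE of 1.1476 agrees with the reported MSE of 1.3171.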
According to equations 1 and 2, the RMSE and MAE values of the proposed neural network
are 1.1476 and 0.8740, respectively. The MAE is the mean of the absolute differences
between target and output values over the observations in the test part; the value of
0.8740 therefore reflects the good accuracy of the neural network model. The RMSE is
always larger than or equal to the MAE (Caner et al., 2011); in our case the RMSE value
(1.1476) is slightly larger than the MAE value. This finding confirms the performance of the
network in the forecasting task. The R-value was chosen as a further indicator to verify
this point. As in the regression analysis carried out in the learning phase, the same
analysis was conducted in the testing phase (see Fig. 9).
Figure 9. Comparison between the estimated and actual values of crude oil price : in checking
part
The network responses are plotted against the targets in Fig. 9. The regression analysis
returns three quantities. The first, x, is the slope of the best linear
regression relating targets to network outputs; the slope would be 1 if there were a perfect
fit (outputs exactly equal to targets). The second, y, is the intercept of that
regression, which would be close to 0 for a good fit.
Finally, the third is the R-value, which would be very close to one. Our results (x =
1.0012, y = -0.5061 and R = 0.9988) show that the ANN is a good prediction model.
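The three regression quantities (slope, intercept and R-value) can be obtained from an ordinary least-squares fit of outputs on targets, as in the sketch below. The function name `regression_summary` and the synthetic data are hypothetical; this is not the Matlab routine used in the study, only an equivalent calculation under that assumption.

```python
import numpy as np

def regression_summary(targets, outputs):
    """Fit outputs = slope * targets + intercept and return (slope, intercept, R)."""
    targets = np.asarray(targets, dtype=float)
    outputs = np.asarray(outputs, dtype=float)
    slope, intercept = np.polyfit(targets, outputs, deg=1)  # highest power first
    r_value = np.corrcoef(targets, outputs)[0, 1]            # Pearson correlation
    return slope, intercept, r_value

# Synthetic check: outputs that equal the targets give slope 1, intercept 0, R = 1.
targets = np.array([100.0, 102.0, 104.0, 106.0, 108.0])
slope, intercept, r_value = regression_summary(targets, targets)
print(round(slope, 6), round(abs(intercept), 6), round(r_value, 6))
```

A slope near 1, an intercept near 0 and an R-value near 1 together indicate that network outputs track the targets closely.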
The following figure illustrates the plot of comparison between actual and forecasted crude oil
price.
Figure 10. A plot of actual and forecasted crude oil price : in checking part
To illustrate our empirical results, we have arbitrarily selected 30 consecutive trading
days from our test sample (see Table 3).
[Figure 10 plot: WTI crude oil price (US$ per barrel) over the checking part (1413 observations); series: predictive value, real value]
Date           Predictive value   Real value   Error
Feb 13, 2012   100.2859769        100.318       0.03202314
Feb 14, 2012   100.6994656        100.808       0.1085344
Feb 15, 2012   101.1467217        101.726       0.5792783
Feb 16, 2012   102.3083843        102.824       0.51561568
Feb 17, 2012   103.9067445        103.858      -0.04874452
Feb 21, 2012   105.0155665        104.982      -0.03356653
Feb 22, 2012   105.7721108        106.394       0.62188922
Feb 23, 2012   106.8475541        107.438       0.59044593
Feb 24, 2012   108.1399367        107.58       -0.55993669
Feb 27, 2012   108.339128         107.798      -0.54112803
Feb 28, 2012   108.6459273        108.062      -0.58392733
Feb 29, 2012   109.0106511        107.52       -1.49065113
Mar 01, 2012   108.2547535        107.162      -1.09275348
Mar 02, 2012   107.7618855        106.786      -0.97588546
Mar 05, 2012   107.2833911        106.602      -0.68139107
Mar 06, 2012   107.0700915        106.18       -0.8900915
Mar 07, 2012   106.6394228        106.324      -0.3154228
Mar 08, 2012   106.777172         106.252      -0.52517198
Mar 09, 2012   106.707126         106.65       -0.05712604
Mar 12, 2012   107.1242766        106.516      -0.60827661
Mar 13, 2012   106.9756504        106.224      -0.75165041
Mar 14, 2012   106.6805212        106.15       -0.53052119
Mar 15, 2012   106.6118926        106.5        -0.11189259
Mar 16, 2012   106.9584571        106.296      -0.66245707
Mar 19, 2012   106.7496508        106.572      -0.17765077
Mar 20, 2012   107.03676          106.53       -0.50676002
Mar 21, 2012   106.9907919        106.41       -0.58079191
Mar 22, 2012   106.8639597        106.206      -0.65795974
Mar 23, 2012   106.6636039        106.534      -0.12960394
Mar 26, 2012   106.9951347        106.24       -0.75513473
Table 3. The difference between actual and forecasted WTI crude oil price for 30 consecutive
trading days
According to the above table, the neural network is a highly accurate forecasting model, as
the maximum absolute difference between actual and forecast prices is about 1.5 US$.
These findings indicate the strong ability of the neural network model to predict the crude
oil market; an efficient market, however, would have no predictability. Therefore, the crude
oil market is inefficient in the Fama sense.
Following Shambora and Rossiter (2007), the preferred way of testing predictability is
to see whether the model would have been more profitable than a naïve model such as the
random walk (RW).
To verify the conclusions drawn from the neural network analysis, we carry out a
profitability analysis.
To compute profitability, we took each day’s prediction and made a “trade” based on this
prediction. For example, if the model predicts a down day, the investor sells one unit of oil
short. We add each day’s profit (loss) in percentage terms to obtain the total
percentage gain or loss. We did this for each of the trading strategies (see Table 4).
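The trading rule described above can be sketched as follows. Several details are assumptions rather than statements from the paper: the investor goes long one unit when the forecast is above the current price and short one unit otherwise, positions are held for one day, and transaction costs are ignored. The function name `cumulative_return` and the toy prices are hypothetical.

```python
import numpy as np

def cumulative_return(prices, predictions):
    """Sum of daily returns from trading one unit on each next-day forecast.

    prices[t] is the price on day t; predictions[t] is the forecast of prices[t + 1],
    so predictions has one element fewer than prices.
    """
    prices = np.asarray(prices, dtype=float)
    predictions = np.asarray(predictions, dtype=float)
    today, tomorrow = prices[:-1], prices[1:]
    # +1: buy when an up day is predicted; -1: sell short when a down day is predicted
    position = np.where(predictions > today, 1.0, -1.0)
    daily_return = position * (tomorrow - today) / today
    return daily_return.sum()

# Toy example: a correct long call (+10%) followed by a correct short call (+10%).
total = cumulative_return([100.0, 110.0, 99.0], [105.0, 100.0])
print(round(total, 4))  # 0.2
```

A total of 0.2 corresponds to a cumulative gain of 20% in this toy example; multiplying by 100 converts the fraction to a percentage.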
Trading strategy   2008      2009     2010     2011     2012     2013      Total period (2008-2013)
ANN                0.9852    1.2123   1.5698   0.9635   1.1114   0.8457    6.6879
RW                 -0.1669   0.5234   0.8126   0.7391   0.3335   -0.2487   1.9930
Table 4. Analysis of profitability (%)
According to Table 4, the active ANN-based strategy far outperformed the naïve trading
strategy: the profitability over the whole test period equals 668.79% for the ANN and
199.30% for the RW. Moreover, examining profitability on a year-by-year basis,
we find no losing years for the ANN and two losing years (2008 and 2013) for the RW.
Overall, we conclude that the ANN is the best in terms of both predictability and
profitability; therefore, we can reject the EMH due to the presence of arbitrage
opportunities in crude oil markets.
3. Conclusion
In this paper, we applied an ANN to test the EMH. As inputs, we used a smoothed crude
oil price time series in order to reduce noise effects. Moreover, we determined the
optimal ANN design to obtain the best prediction results. In terms of
predictability, the model shows high accuracy; therefore, we can reject the EMH. In
addition, the network responses (the forecasted prices) are treated as trading signals (buy
or sell), which are used in the profitability analysis. Our empirical results show strong
evidence of short-term predictability in crude oil price variations, and the weak-form EMH
cannot be verified. These findings are compatible with the results of Elder and Serletis
(2008), Alvarez-Ramirez et al. (2008) and Shambora and Rossiter (2007), who find evidence of
oil price predictability for short time horizons.
References
Alvarez-Ramirez, J., Alvarez, J., Rodriguez, E. (2008). Short-term predictability of crude oil
markets: a detrended fluctuation analysis approach. Energy Economics, 30, 2645-2656.
Alvarez-Ramirez, J., Alvarez, J., Solis, R. (2010). Crude oil market efficiency and modeling:
Insights from the multiscaling autocorrelation pattern. Energy Economics, 32, 993-1000.
Arouri, M.H., Dinh, T.H., Nguyen, D.K. (2010). Time-varying predictability in crude-oil
markets: The case of GCC countries. Energy Policy, 38, 4371-4380.
Belaire-Franch, J., Contreras, D. (2004). Ranks and signs-based multiple variance ratio tests.
Working paper, Department of Economic Analysis, University of Valencia.
Caner, M., Gedik, E., Keçebaş, A. (2011). Investigation on thermal performance calculation
of two type solar air collectors using artificial neural network. Expert Systems with
Applications, 38(3), 1668–1674.
Charles, A., Darné, O. (2009). The efficiency of the crude oil markets: Evidence from
variance ratio tests. Energy Policy, 37, 4267-4272.
Elder, J., Serletis, A. (2008). Long memory in energy futures prices. Review of Financial
Economics, 17, 146-155.
Fama, E.F. (1970). Efficient capital markets: a review of theory and empirical work. Journal
of Finance, 25, 383-417.
Fama, E.F. (1991). Efficient capital markets: II. Journal of Finance, 46, 1575-1617.
Ghaffari, A., Zare, S. (2009). A novel algorithm for prediction of crude oil price variation
based on soft computing. Energy Economics, 31, 531-536.
Haidar, I., Kulkarni, S., Pan, H. (2008). Forecasting model for crude oil prices based on
artificial neural networks. Proceedings of the International Conference on Intelligent Sensors,
Sensor Networks and Information Processing (ISSNIP ‘2008), 103-108.
Jammazi, R., Aloui, C. (2012). Crude oil price forecasting: Experimental evidence from
wavelet decomposition and neural network modeling. Energy Economics, 34(3), 828-841.
Kaastra, I., Boyd, M. (1996). Designing a neural network for forecasting financial and
economic time series. Neurocomputing, 10, 215-236.
Kim, J.H. (2006). Wild bootstrapping variance ratio tests. Economics Letters, 92, 38–43.
Kulkarni, S., Haidar, I. (2009). Forecasting model for crude oil price using artificial neural
networks and commodity futures prices. International Journal of Computer Science &
Information Security, 2(1), 81-88.
Lackes, R., Börgermann, C., Dirkmorfeld, M. (2009). Forecasting the Price Development of
Crude Oil with Artificial Neural Networks. Lecture Notes in Computer Science, 5518, 248-
255.
Maslyuk, S., Smyth, R. (2008). Unit root properties of crude oil spot and futures prices.
Energy Policy, 36, 2591-2600.
Ortiz-Cruz, A., Rodriguez, E., Ibarra-Valdez, C., Alvarez-Ramirez, J. (2012). Efficiency of
crude oil markets: Evidences from informational entropy analysis. Energy Policy, 41, 365-
373.
Rosiek, S., Batlles, F. J. (2010). Modelling a solar-assisted air-conditioning system installed
in CIESOL building using an artificial neural network. Renewable Energy, 35(12), 2894–
2901.
Shambora, W.E., Rossiter, R. (2007). Are there exploitable inefficiencies in the futures
market for oil? Energy Economics, 29, 18-27.
Tabak, B.M., Cajueiro, D.O. (2007). Are the crude oil markets becoming weakly efficient
over time? A test for time-varying long-range dependence in prices and volatility. Energy
Economics, 29, 28-36.
Wright, J.H. (2000). Alternative variance-ratio tests using ranks and signs. Journal of Business
and Economic Statistics, 18, 1–9.
Yonaba, H., Anctil, F., Fortin, V. (2010). Comparing sigmoid transfer functions for neural
network multistep ahead streamflow forecasting. Journal of Hydrologic Engineering, 15(4),
275– 283.
Zhang, B. (2013). Are the crude oil markets becoming more efficient over time? New
evidence from a generalized spectral test. Energy Economics, 40, 875-881.
Zhang, G., Patuwo, E. B., Hu, M. Y. (1998). Forecasting with artificial neural networks: The
state of the art. International Journal of Forecasting, 14, 35-62.