
Testing Crude Oil Market Efficiency Using Artificial Neural Networks

Manel HAMDI

International Finance Group Tunisia, Faculty of Management and Economic Sciences of

Tunis, Tunisia, El Manar University, Tunis cedex, C.P. 2092, El Manar Tunisia

Phone: +21697551459

Email: [email protected]

Abstract

This paper evaluates the weak-form efficiency of the crude oil market using an artificial neural network (ANN) model. Based on daily historical data of the West Texas Intermediate (WTI) crude oil spot price over the period 2 January 1986 to 31 December 2013, the model was trained using the backpropagation algorithm. The output of the neural network represents the predicted prices, which are treated as trading signals (buy or sell) for investors. Furthermore, an empirical investigation of profitability was conducted. The profitability results show that the ANN model outperformed a naïve trading strategy based on the Random Walk (RW) model. Therefore, the crude oil market is inefficient according to the Efficient Market Hypothesis (EMH). From these findings, we can argue that it is possible to earn excess profits by building a trading strategy on the information embedded in historical crude oil prices. Finally, the proposed neural network-based approach offers practitioners an interesting trading rule to make or support their investment decisions.

JEL Classification: G14, Q47, C45.

Keywords: Efficient market hypothesis, crude oil price prediction, artificial neural network.


1. Introduction

The Efficient Market Hypothesis (EMH) is of special interest to financial institutions and organizations. According to Fama (1970, 1991), a market is weak-form efficient if asset prices fully and immediately reflect all information contained in historical prices.

Although the topic is of clear interest, few studies have addressed it, especially for the oil commodity. First, Tabak and Cajueiro (2007) investigated the efficiency of the crude oil markets, including WTI and Brent. The authors estimated the fractal structure of these time series using the time-varying Hurst exponent for a sample of daily closing prices covering the period May 1983 to July 2004. Their results pointed to the presence of a fractal structure in oil price time series. Moreover, they showed that crude oil market efficiency changes over time and that these markets have become more efficient over time. According to these authors, WTI crude oil prices appear to be more weak-form efficient than Brent prices. Similar conclusions were reached by Alvarez-Ramirez et al. (2008), who concluded that over long horizons the crude oil market is consistent with the weak-form efficient market hypothesis (WFEMH hereafter). The authors examined the autocorrelation of international crude oil markets by estimating the Hurst exponent dynamics for the returns of several typical oil blends (Europe Brent, WTI Cushing and Dubai) over the sample period 1987-2007. In another study, Maslyuk and Smyth (2008) also analyzed the WFEMH in the crude oil markets based on unit root tests. Using weekly spot and futures prices for WTI and Europe Brent (January 1991 to December 2004), the authors showed that future spot and futures prices cannot be predicted from historical price data and concluded that crude oil markets appear to be weak-form efficient. In more recent research, Charles and Darné (2009) investigated the WFEMH by testing the random walk hypothesis with variance ratio tests. More precisely, the authors employed the non-parametric variance ratio tests suggested by Wright (2000) and Belaire-Franch and Contreras (2004), as well as the wild-bootstrap variance ratio tests developed by Kim (2006). Using daily closing spot prices for two crude oil markets (US WTI and UK Brent) over the period June 1982 to July 2008, the authors found that the Brent crude oil market is weak-form efficient, while the WTI crude oil market appears to be inefficient over the period 1994-2008. In another work, Alvarez-Ramirez et al. (2010) used a lagged version of detrended fluctuation analysis to study the efficiency of the crude oil market and to detect delay effects in spot WTI price autocorrelations over the period 1986 to 2009. Based on their empirical findings, they concluded that negative or positive autocorrelations can be concealed by delay effects. Using weekly spot FOB crude oil prices for four OPEC members that are also members of the Gulf Cooperation Council (GCC), namely Kuwait, Qatar, Saudi Arabia and the United Arab Emirates (UAE), Arouri et al. (2010) applied a state space model and found strong evidence of short-term predictability in crude oil price movements over time. Nevertheless, the hypothesis of convergence towards weak-form informational efficiency could not be verified for all markets. More recently, Ortiz-Cruz et al. (2012) employed multiscale entropy analysis to investigate the informational efficiency of the crude oil markets. Results based on daily closing spot prices of WTI from January 1, 1986 to March 15, 2011 showed that the crude oil market is informationally efficient over the whole period except during the early 1990s and late 2000s US economic recessions. In this context, a neural network approach is applied here to market forecasting and trading.

Section 2 describes the proposed ANN model used to test the EMH and the data sample, and presents the empirical investigation and its results. Section 3 concludes.

2. Test of WTI crude oil market weak-form efficiency: Empirical investigation

2.1. Data sample description

A sample of WTI crude oil spot prices (see Fig. 1) for the period from January 2, 1986 to December 31, 2013 is used to predict the future value of the WTI crude oil price. The daily data were obtained from the US Energy Information Administration website. The training sample comprises 80% of the data set (5651 observations) and is used to estimate the network parameters (synaptic weights and biases); the remaining 1413 observations form the checking sample, which is used to test the predictive ability of the network.
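As a rough illustration of the split described above, the following Python sketch cuts a series into training and checking parts. It assumes a chronological split (the earliest 80% of observations for training), which the text does not state explicitly; the function name `chronological_split` and the placeholder series are illustrative only.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series into (training, checking) parts.

    Assumption: the earliest observations are used for training and the
    most recent ones for checking.
    """
    n_train = int(len(series) * train_fraction)  # truncate toward zero
    return series[:n_train], series[n_train:]


# Placeholder series with the same length as the WTI sample (7064 days).
prices = list(range(7064))
train, check = chronological_split(prices)
print(len(train), len(check))  # 5651 1413
```

With 7064 observations, truncating 0.8 × 7064 = 5651.2 reproduces the 5651/1413 split reported in the text.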

Figure 1. The crude oil spot price of WTI (Time:7064 working day)


As illustrated in Figure 1, the crude oil market is characterized by high volatility and marked by pronounced peaks and falls due to unpredictable events (wars, embargoes, crises, revolutions…) that have occurred in the history of the oil market. Following Ghaffari and Zare (2009), we introduce a smoothing algorithm in order to reduce the effect of unforeseen short-term disturbances while preserving the main long-term characteristics of crude oil market dynamics. In this study, we use the default smoothing method provided by the Matlab software package, a moving average filter of order 5 (see Fig. 2).
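The smoothing step can be sketched in Python as follows. This is a minimal sketch rather than the Matlab routine itself: it assumes a centered 5-point moving average whose window shrinks symmetrically near the two ends of the series, and the function name `moving_average_smooth` is hypothetical.

```python
def moving_average_smooth(prices, span=5):
    """Centered moving average with an odd span.

    Assumption: near the ends of the series the window shrinks
    symmetrically, so the first and last points are left unchanged.
    """
    if span < 1 or span % 2 == 0:
        raise ValueError("span must be a positive odd integer")
    half = span // 2
    n = len(prices)
    smoothed = []
    for t in range(n):
        # Largest symmetric half-width that fits inside the series.
        k = min(half, t, n - 1 - t)
        window = prices[t - k : t + k + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single spike (9) is pulled toward its neighbours.
print(moving_average_smooth([1, 2, 9, 4, 5]))  # [1.0, 4.0, 4.2, 6.0, 5.0]
```

The toy example shows the same effect reported for July 2008: an isolated peak is lowered because it is averaged with the surrounding days.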

Figure 2. The smoothed crude oil spot price of WTI (Time:7064 working day)

The actual and smoothed time series of WTI from January 2, 1986 to December 31, 2013 are depicted in Fig. 3. The error (see Fig. 4) is the difference between the actual and smoothed price values.

Figure 3. Actual vs. smoothed crude oil spot price of WTI (7064 working days; x-axis: working day; y-axis: WTI crude oil price in US$ per barrel)

Figure 4. Error between actual and smoothed crude oil spot price of WTI (7064 working days; x-axis: working day; y-axis: error)


To demonstrate the usefulness of the smoothing function, we plot the actual and smoothed prices (see Fig. 5) for July 2008 only, the month in which the price of crude oil reached its highest value (145.31 US$/barrel). After smoothing, this peak is reduced by 3.81 US$/barrel. We conclude that the smoothing procedure helps reduce the effect of short-term noise.

Figure 5. Actual vs. Smoothed crude oil spot price of WTI (July 2008)

2.2. Artificial neural network model

2.2.1. ANN Structure

An artificial neural network is a nonlinear model inspired by the functioning of the human brain: it acquires knowledge through a learning process. The standard ANN design generally consists of an input layer (which contains, in our case, the historical smoothed prices (Pt) of WTI crude oil), one or more hidden layers, and an output layer (which gives the predicted future prices (Pt+1) of WTI crude oil), all interconnected as depicted in Figure 6.

Figure 6. Fully interconnected neural network with one hidden layer


The state of the output neuron is determined by the following formula:

$$P_{t+1} = g_2\left( \sum_{j=0}^{m} w_{kj}^{(2)} \, g_1\left( \sum_{i=0}^{N} w_{ij}^{(1)} P_t + b_1 \right) + b_2 \right) \qquad (7)$$

where $P_t$ are the inputs of the network; $N$ is the total number of observations of input prices; $m$ is the number of nodes in the hidden layer; $k$ is the number of units in the output layer; $g$ is the transfer/activation function; $w^{(1)}$ is the weight matrix of the hidden layer; $w^{(2)}$ is the weight matrix of the output layer; and $b_1$ and $b_2$ are the bias vectors of the hidden layer and output layer, respectively.
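To make the formula concrete, the forward pass of a network with one hidden layer can be sketched in Python. The sketch assumes a hyperbolic tangent in the hidden layer and a linear (identity) output unit, with a single output; the function name `forward_pass` and the weight layout (one row of input weights per hidden node) are illustrative choices, not taken from the paper.

```python
import math


def forward_pass(inputs, w1, b1, w2, b2):
    """Compute the network output for one input vector.

    inputs: list of past prices fed to the input layer
    w1:     hidden-layer weights, one row (list) per hidden node
    b1:     hidden-layer biases, one per hidden node
    w2:     output-layer weights, one per hidden node
    b2:     output-layer bias (scalar)
    """
    # Hidden layer: weighted sum of inputs plus bias, passed through tanh.
    hidden = [
        math.tanh(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(w1, b1)
    ]
    # Output layer: linear combination of hidden activations plus bias.
    return sum(w * h for w, h in zip(w2, hidden)) + b2


# Two inputs, one hidden node: with zero inputs only the output bias remains.
print(forward_pass([0.0, 0.0], [[1.0, 1.0]], [0.0], [2.0], 0.5))  # 0.5
```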

2.2.2. ANN Topology

Obtaining an optimal ANN topology requires considerable expert knowledge and experimentation, because there are no scientific rules for finding the best ANN configuration for a particular problem (Lackes et al., 2009).

Several factors must therefore be controlled to select the optimal network architecture:

- The number of hidden layers

In our experiments, we used one or two hidden layers, which is regarded as the ideal architecture for obtaining good forecasting results (Zhang et al., 1998).

- The choice of activation function

According to Haykin (1999), the sigmoid and hyperbolic tangent functions are the most widely used in financial applications. In this study, we use the hyperbolic tangent function as the transfer function (g1) of the network, as in recent applications (Yonaba et al., 2010; Jammazi and Aloui, 2012).

- Learning rate & training algorithms

The network was trained with the backpropagation algorithm, specifically the Levenberg-Marquardt algorithm, as it is the fastest training function (Kulkarni and Haidar, 2009). After several experiments, we set the learning rate to 0.01, because this value gave the best solution. Haidar et al. (2008) used the same learning rate.

- The number of hidden nodes

There are no universal standards for defining the number of hidden neurons. The ideal is to use the smallest number of units that achieves the best prediction results, as too many nodes could induce overfitting and too few could cause underfitting (Kaastra and Boyd, 1996).

In this study, we follow an approach similar to that of Rosiek and Batlles (2010) and Haidar et al. (2008) to determine the number of hidden neurons. This approach consists of training and testing the network for a fixed number of iterations (1000 in our experiment), beginning with a small number of units and increasing it gradually until the optimal number of nodes is reached. In this work, we start with one hidden node and try up to a maximum of 20 hidden neurons. According to Table 1 and Figure 7, the best out-of-sample result, that is, the minimum of

$$MSE = \frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{Target}_i - \mathrm{Output}_i \right)^2,$$

is obtained with 10 hidden neurons.

Number of hidden nodes   MSE value   Number of hidden nodes   MSE value
1                        1.8426294   11                       2.0841049
2                        1.8986168   12                       2.2036816
3                        1.9962819   13                       2.204476
4                        2.1632104   14                       2.9094379
5                        2.1104935   15                       1.8470987
6                        2.1625132   16                       2.5390865
7                        2.2531985   17                       1.8554263
8                        2.3961847   18                       2.7154749
9                        2.5702552   19                       2.3407171
10                       1.3171001   20                       1.7725174

Table 1. MSE statistics vs. the increase of hidden neurons
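The selection step amounts to picking the number of hidden nodes with the lowest out-of-sample MSE. The short Python sketch below applies that rule to the values of Table 1; the helper name `best_hidden_size` is illustrative, and the training runs that produced the MSE values are not reproduced here.

```python
# Out-of-sample MSE for each number of hidden nodes, copied from Table 1.
mse_by_nodes = {
    1: 1.8426294, 2: 1.8986168, 3: 1.9962819, 4: 2.1632104,
    5: 2.1104935, 6: 2.1625132, 7: 2.2531985, 8: 2.3961847,
    9: 2.5702552, 10: 1.3171001, 11: 2.0841049, 12: 2.2036816,
    13: 2.204476, 14: 2.9094379, 15: 1.8470987, 16: 2.5390865,
    17: 1.8554263, 18: 2.7154749, 19: 2.3407171, 20: 1.7725174,
}


def best_hidden_size(mse_table):
    """Return the number of hidden nodes with the smallest MSE."""
    return min(mse_table, key=mse_table.get)


best = best_hidden_size(mse_by_nodes)
print(best, mse_by_nodes[best])  # 10 1.3171001
```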

Figure 7. MSE evolution vs. the increase of hidden neurons (x-axis: number of nodes; y-axis: MSE)


In conclusion, the proposed model is a backpropagation neural network with a single hidden layer of ten neurons, the hyperbolic tangent as activation function in the hidden layer, and a linear transfer function (g2) in the output layer.

2.3. Empirical results and interpretation

Once the training process is complete, we assess the prediction quality of the ANN model by comparing the estimated (predicted) values of the crude oil price with the real (actual/target) values.

Before this step, we must verify the quality of the ANN training. The performance of a trained network can be measured by the correlation coefficient (R-value) derived from the regression analysis presented in Fig. 8.

Figure 8. Comparison between the estimated and actual values of crude oil price: training part

In Fig. 8, the dashed line indicates the perfect fit (ANN outputs equal to targets). The circles represent the data points, and the solid blue line is the best linear fit between network responses and targets. In this empirical study, the fit is so good that the best linear fit line is difficult to distinguish from the perfect fit line. Moreover, the closer the R-value is to 1, the stronger the correlation between targets and outputs. In our study, the R-value (0.99972) is very close to 1, which indicates a good fit.

After checking the quality of the network's learning, we analyse the prediction results using the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) performance measures, as well as the correlation coefficient (R-value), which is a decisive factor in this specific problem (see Table 2).


MSE      RMSE     MAE      R
1.3171   1.1476   0.8740   0.9988

Table 2. Performance criteria

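As accuracy is the most important criterion for judging forecasting models, we select two main metrics, RMSE and MAE, which can be expressed as follows:

$$RMSE = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \mathrm{Target}_i - \mathrm{Output}_i \right)^2} \qquad (1)$$

$$MAE = \frac{1}{N} \sum_{i=1}^{N} \left| \mathrm{Target}_i - \mathrm{Output}_i \right| \qquad (2)$$

where N = 1413 is the total number of observations in the checking sample (i = 1, …, N).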
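For readers who want to reproduce these measures, a minimal Python sketch of the MSE, RMSE and MAE computations is given below. The function name `error_metrics` and the three-point toy data are illustrative only; they are not the paper's data.

```python
import math


def error_metrics(targets, outputs):
    """Return (MSE, RMSE, MAE) for paired target and output values."""
    if len(targets) != len(outputs) or not targets:
        raise ValueError("targets and outputs must be non-empty and equal in length")
    errors = [t - o for t, o in zip(targets, outputs)]
    n = len(errors)
    mse = sum(e * e for e in errors) / n   # mean of squared errors
    rmse = math.sqrt(mse)                  # square root of the MSE
    mae = sum(abs(e) for e in errors) / n  # mean of absolute errors
    return mse, rmse, mae


# Toy data: two exact forecasts and one forecast that is off by 2.
mse, rmse, mae = error_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
```

In the toy example a single large error dominates the RMSE more than the MAE, which is why the RMSE can never be smaller than the MAE. The same relation links the values in Table 2, where the square root of the MSE (1.3171) is approximately the reported RMSE (1.1476).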

According to equations (1) and (2), the RMSE and MAE values of the proposed neural network are 1.1476 and 0.8740, respectively. The MAE is the sum of the absolute differences between target and output values divided by the total number of observations in the test part; the value obtained (0.8740) therefore reflects the good accuracy of the neural network model. The RMSE is always greater than or equal to the MAE (Caner et al., 2011); in our case, the RMSE value (1.1476) is slightly larger than the MAE value. This finding confirms the performance of the network in the forecasting task. The R-value was chosen as a further indicator to verify this point. As in the learning phase, a regression analysis was conducted in the testing phase (see Fig. 9).

Figure 9. Comparison between the estimated and actual values of crude oil price: checking part


The network responses are plotted against the targets in Fig. 9. The regression analysis returns three values. The first, x, is the slope of the best linear regression relating targets to network outputs; the slope would be 1 if there were a perfect fit (outputs exactly equal to targets). The second, y, is the intercept of the same regression, which would be 0 for a perfect fit. The third is the R-value, which would be 1 for a perfect fit. Our results (x = 1.0012, y = -0.5061 and R = 0.9988) show that the ANN is a good prediction model.
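The three quantities can be obtained with an ordinary least-squares fit. The Python sketch below assumes the regression takes the form output ≈ slope × target + intercept, which is an interpretation of the text rather than something it states; the function name `regression_summary` and the toy data are hypothetical.

```python
import math


def regression_summary(targets, outputs):
    """Least-squares fit of outputs on targets.

    Returns (slope, intercept, r), where r is the Pearson correlation
    coefficient between targets and outputs.
    """
    n = len(targets)
    mean_t = sum(targets) / n
    mean_o = sum(outputs) / n
    # Centered sums of squares and cross-products.
    s_tt = sum((t - mean_t) ** 2 for t in targets)
    s_oo = sum((o - mean_o) ** 2 for o in outputs)
    s_to = sum((t - mean_t) * (o - mean_o) for t, o in zip(targets, outputs))
    slope = s_to / s_tt
    intercept = mean_o - slope * mean_t
    r = s_to / math.sqrt(s_tt * s_oo)
    return slope, intercept, r


# Outputs identical to targets give slope 1, intercept 0 and R 1.
slope, intercept, r = regression_summary([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 4.0])
```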

Figure 10 compares the actual and forecasted crude oil prices over the checking sample.

Figure 10. Actual and forecasted crude oil price: checking part (x-axis: checking part, 1413 observations; y-axis: WTI crude oil price in US$ per barrel; series: predictive value and real value)

To illustrate our empirical results, we arbitrarily selected 30 consecutive trading days from the test sample (see Table 3).

Date           Predictive value   Real value   Error
Feb 13, 2012   100.2859769        100.318       0.03202314
Feb 14, 2012   100.6994656        100.808       0.1085344
Feb 15, 2012   101.1467217        101.726       0.5792783
Feb 16, 2012   102.3083843        102.824       0.51561568
Feb 17, 2012   103.9067445        103.858      -0.04874452
Feb 21, 2012   105.0155665        104.982      -0.03356653
Feb 22, 2012   105.7721108        106.394       0.62188922
Feb 23, 2012   106.8475541        107.438       0.59044593
Feb 24, 2012   108.1399367        107.58       -0.55993669
Feb 27, 2012   108.339128         107.798      -0.54112803
Feb 28, 2012   108.6459273        108.062      -0.58392733
Feb 29, 2012   109.0106511        107.52       -1.49065113
Mar 01, 2012   108.2547535        107.162      -1.09275348
Mar 02, 2012   107.7618855        106.786      -0.97588546
Mar 05, 2012   107.2833911        106.602      -0.68139107
Mar 06, 2012   107.0700915        106.18       -0.8900915
Mar 07, 2012   106.6394228        106.324      -0.3154228
Mar 08, 2012   106.777172         106.252      -0.52517198
Mar 09, 2012   106.707126         106.65       -0.05712604
Mar 12, 2012   107.1242766        106.516      -0.60827661
Mar 13, 2012   106.9756504        106.224      -0.75165041
Mar 14, 2012   106.6805212        106.15       -0.53052119
Mar 15, 2012   106.6118926        106.5        -0.11189259
Mar 16, 2012   106.9584571        106.296      -0.66245707
Mar 19, 2012   106.7496508        106.572      -0.17765077
Mar 20, 2012   107.03676          106.53       -0.50676002
Mar 21, 2012   106.9907919        106.41       -0.58079191
Mar 22, 2012   106.8639597        106.206      -0.65795974
Mar 23, 2012   106.6636039        106.534      -0.12960394
Mar 26, 2012   106.9951347        106.24       -0.75513473

Table 3. The difference between actual and forecasted WTI crude oil price for 30 consecutive trading days (error = real value minus predictive value, in US$ per barrel)

According to Table 3, the neural network is a highly accurate forecasting model, as the maximum absolute difference between actual and forecast prices is about 1.5 US$.

These findings demonstrate the ability of the neural network model to predict the crude oil market, whereas an efficient market would exhibit no predictability. Therefore, the crude oil market is inefficient in the sense of Fama.


According to Shambora and Rossiter (2007), the preferred way of testing predictability is to see whether the model would have been more profitable than a naïve model such as the random walk (RW).

To verify the conclusions drawn from the neural network analysis, we carry out a profitability analysis.

To compute profitability, we took each day’s prediction and made a “trade” based on it. For example, if the model predicts a down day, the investor sells one unit of oil short. We add each day’s profit (or loss) in percentage terms to arrive at the total percentage gain or loss. We did this for each of the trading strategies (see Table 4).
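The trading rule described above can be sketched in Python. Several details are assumptions, since the text does not specify them: the position is one unit long when the forecast is above the previous day's price and one unit short when it is below, no position is taken on a tie, transaction costs are ignored, and daily returns are summed without compounding. The function name `cumulative_return` is hypothetical.

```python
def cumulative_return(actual, predicted):
    """Sum of daily returns from trading on one-day-ahead forecasts.

    actual[t]    : observed price on day t
    predicted[t] : forecast of the day-t price made on day t-1
                   (predicted[0] is never used)
    The result is a fraction; multiply by 100 for a percentage.
    """
    total = 0.0
    for t in range(1, len(actual)):
        previous = actual[t - 1]
        if predicted[t] > previous:
            position = 1    # forecast of an up day: buy one unit
        elif predicted[t] < previous:
            position = -1   # forecast of a down day: sell one unit short
        else:
            position = 0    # assumption: stay out of the market on a tie
        daily_return = (actual[t] - previous) / previous
        total += position * daily_return
    return total


# Toy series: both forecasts point in the right direction, so each day gains.
gain = cumulative_return([100.0, 110.0, 99.0], [100.0, 105.0, 100.0])
```

In the toy series, the long position earns 10% on the first trade and the short position earns 10% on the second, for a summed return of about 0.2.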

Trading strategy   2008      2009     2010     2011     2012     2013      Total period (2008-2013)
ANN                0.9852    1.2123   1.5698   0.9635   1.1114   0.8457    6.6879
RW                 -0.1669   0.5234   0.8126   0.7391   0.3335   -0.2487   1.9930

Table 4. Analysis of profitability (%)

According to Table 4, the active ANN-based strategy far outperformed the naïve trading strategy: profitability over the whole test period is 668.79% for ANN against 199.30% for RW. Moreover, examining profitability on a year-by-year basis, we find no losing years for ANN and two losing years for RW (2008 and 2013). Overall, we conclude that ANN is the best in terms of both predictability and profitability. We can therefore reject the EMH, owing to the presence of arbitrage opportunities in crude oil markets.
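The totals and losing years quoted above follow directly from the yearly figures in Table 4, as the short Python check below shows. The values are copied from the table; the helper name `summarize` is illustrative.

```python
# Yearly profitability figures copied from Table 4.
annual_returns = {
    "ANN": {2008: 0.9852, 2009: 1.2123, 2010: 1.5698,
            2011: 0.9635, 2012: 1.1114, 2013: 0.8457},
    "RW": {2008: -0.1669, 2009: 0.5234, 2010: 0.8126,
           2011: 0.7391, 2012: 0.3335, 2013: -0.2487},
}


def summarize(returns_by_year):
    """Return (total over all years, list of years with a negative result)."""
    total = sum(returns_by_year.values())
    losing_years = [year for year, value in returns_by_year.items() if value < 0]
    return total, losing_years


for name, returns in annual_returns.items():
    total, losing_years = summarize(returns)
    print(name, round(total, 4), losing_years)
```

The ANN row sums to 6.6879 with no losing year, and the RW row sums to 1.9930 with losses in 2008 and 2013, matching the last column of Table 4.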

3. Conclusion

In this paper, we applied an ANN to test the EMH. As inputs, we used a smoothed crude oil price time series in order to reduce noise effects. We then determined the optimal ANN design to obtain the best prediction results. In terms of predictability, the model shows high accuracy; therefore, we can reject the EMH. In addition, the network responses (the forecasted prices) are treated as trading signals (buy or sell) and used in the profitability analysis. Our empirical results show strong evidence of short-term predictability in crude oil price variations, and the weak-form EMH cannot be verified. These findings are consistent with the results of Elder and Serletis (2008), Alvarez-Ramirez et al. (2008) and Shambora and Rossiter (2007), who find evidence of oil price predictability over short time horizons.


References

Alvarez-Ramirez, J., Alvarez, J., Rodriguez, E. (2008). Short-term predictability of crude oil

markets: a detrended fluctuation analysis approach. Energy Economics, 30, 2645-2656.

Alvarez-Ramirez, J., Alvarez, J., Solis, R. (2010). Crude oil market efficiency and modeling:

Insights from the multiscaling autocorrelation pattern. Energy Economics, 32, 993-1000.

Arouri, M.H., Dinh, T.H., Nguyen, D.K. (2010). Time-varying predictability in crude-oil

markets: The case of GCC countries, Energy Policy, 38, 4371-4380.

Belaire-Franch, J., Contreras, D. (2004). Ranks and signs-based multiple variance ratio tests. Working paper, Department of Economic Analysis, University of Valencia.

Caner, M., Gedik, E., Keçebaş, A. (2011). Investigation on thermal performance calculation

of two type solar air collectors using artificial neural network. Expert Systems with

Applications, 38(3), 1668–1674.

Charles, A., Darné, O. (2009). The efficiency of the crude oil markets: Evidence from

variance ratio tests. Energy Policy, 37, 4267-4272.

Elder, J., Serletis, A. (2008). Long memory in energy futures prices. Review of Financial

Economics, 17, 146-155.

Fama, E.F. (1970). Efficient capital markets: a review of theory and empirical work. Journal

of Finance, 25, 383-417.

Fama, E.F. (1991). Efficient capital markets: II. Journal of Finance, 46, 1575-1617.

Ghaffari, A., Zare, S. (2009). A novel algorithm for prediction of crude oil price variation

based on soft computing. Energy Economics, 31, 531-536.

Haidar, I., Kulkarni, S., Pan, H. (2008). Forecasting model for crude oil prices based on

artificial neural networks. Proceedings of the International Conference on Intelligent Sensors,

Sensor Networks and Information Processing (ISSNIP ‘2008), 103-108.

Jammazi, R., Aloui, C. (2012). Crude oil price forecasting: Experimental evidence from

wavelet decomposition and neural network modeling. Energy Economics, 34(3), 828-841.

Kaastra, I., Boyd, M. (1996). Designing a neural network for forecasting financial and

economic time series. Neurocomputing, 10, 215-236.

Kim, J.H. (2006). Wild bootstrapping variance ratio tests. Economics Letters, 92, 38–43.


Kulkarni, S., Haidar, I. (2009). Forecasting model for crude oil price using artificial neural

networks and commodity futures prices. International Journal of Computer Science &

Information Security, 2(1), 81-88.

Lackes, R., Börgermann, C., Dirkmorfeld, M. (2009). Forecasting the Price Development of

Crude Oil with Artificial Neural Networks. Lecture Notes in Computer Science, 5518, 248-

255.

Maslyuk, S., Smyth, R. (2008). Unit root properties of crude oil spot and futures prices,

Energy Policy, 36, 2591-2600.

Ortiz-Cruz, A., Rodriguez, E., Ibarra-Valdez, C., Alvarez-Ramirez, J. (2012). Efficiency of

crude oil markets: Evidences from informational entropy analysis. Energy Policy, 41, 365-

373.

Rosiek, S., Batlles, F. J. (2010). Modelling a solar-assisted air-conditioning system installed in CIESOL building using an artificial neural network. Renewable Energy, 35(12), 2894–2901.

Shambora, W.E., Rossiter, R. (2007). Are there exploitable inefficiencies in the futures

market for oil? Energy Economics, 29, 18-27.

Tabak, B.M., Cajueiro, D.O. (2007). Are the crude oil markets becoming weakly efficient

over time? A test for time-varying long-range dependence in prices and volatility. Energy

Economics, 29, 28-36.

Wright, J.H. (2000). Alternative variance-ratio tests using ranks and signs. Journal of Business

and Economic Statistics, 18, 1–9.

Yonaba, H., Anctil, F., Fortin, V. (2010). Comparing sigmoid transfer functions for neural

network multistep ahead streamflow forecasting. Journal of Hydrologic Engineering, 15(4),

275– 283.

Zhang, B. (2013). Are the crude oil markets becoming more efficient over time? New

evidence from a generalized spectral test. Energy Economics, 40, 875-881.

Zhang, G., Patuwo, E. B., & Hu, M. Y. (1998). Forecasting with artificial neural network: The

state of the art. International Journal of Forecasting, 14, 35-62.
