Equity Trading with an Ensemble Neural Network System

Combining factor selection by non-parametric statistics with

evolutionary algorithms and artificial intelligence

Emil Tingström

April 16, 2012

Abstract

This paper relates to the use of mathematical models in order to produce excess return, or

alpha, when trading equity derivatives. With the advent of computers, speculators and

investors have been trying to harness their computational power to create more efficient and

profitable portfolio allocation strategies. Computers provide the option to test complex and

calculation-intensive trading methods that are objective, in contrast to the usual investment

decisions made by humans, which are subjective.

Several possibly predictive factors from three categories are tested and analyzed using a

non-parametric hypothesis test called Bootstrap resampling as well as a parametric one called

Student’s t-test. From the distribution of daily returns of each factor tested, the p-value can

be derived, which is the probability that a mean return as extreme as the one observed could

be the result of random chance alone. The resulting factors that prove to be statistically

significant during an eight-year period predicting the exchange-traded fund XACT OMXS30

were implemented as inputs to an ensemble of neural networks with weights and thresholds

evolved by a genetic algorithm to achieve optimal positive p-value in the in-sample period.

The ensemble was then tested on an out-of-sample period and showed highly

significant positive returns.

Preface

This project was carried out within the mathematics and computer science specialization of the

Natural Science Programme at Donnergymnasiet. The work follows the goals set for the

programme by applying theoretical models, with a scientific approach,

in an attempt to describe reality mathematically. Through experimental studies

the models are refined and the results are interpreted from an objective point of view.

I would like to thank my supervisors Leif Duveborg and Per Stumle, my English teacher Meg

Johansson and my programming teachers Joakim Wassberg and Johan Sköldh.

Klintehamn, 16 April

Emil Tingström

Contents

1. Introduction
1.1 Speculation
1.2 Backtesting and the scientific method
1.2.1 Data mining and data snooping
1.2.2 De-trending data
1.2.3 Student’s t-test
1.2.4 Bootstrap method
1.3 Data and software
2. Momentum or Mean Reversion
2.1 Return
2.2 Close in relation to the range
3. Seasonal Tendencies
3.1 By day of the week
3.2 By day of the month
4. Intermarket analysis
4.1 Sector analysis
4.2 Sector spread analysis
5. Combining the Predictive Factors into a Learning Ensemble System
5.1 Ensemble Trading Model
5.1.1 Artificial Neural Networks
5.1.2 Genetic Algorithms
5.1.3 Ensemble learning
5.2 Method
5.3 Results
5.3.1 In-sample performance
5.3.2 Out-of-sample performance
6. Conclusion
7. References
Appendix A – Excel Testing
Appendix B – C# code for implementing the neural networks
Appendix C – Bootstrap code

1. Introduction

The following part serves as an introduction to the use of statistics to analyze price

movements in the equity market.

1.1 Speculation

Speculation has always intrigued humans as it offers the opportunity to make a profit solely

by buying at a lower price than you sell at. One of the most common ways to speculate is

through the stock market where shares representing ownership in different companies are

traded. The movements of these shares are based on the different price agreements buyer

and seller make at the stock exchange. Predictions of the movements have been subject to a

lot of work by many people throughout the centuries. Many have been successful in their

endeavor, but far more have lost as their edge over the other market participants might only

have been an illusion or might have disappeared because of exploitation. The entire

economic system is a highly interconnected, complex and dynamic entity and the number

of factors that affect it is vast. The price movement of a single stock reflects not just the

current news about the company but also the aggregate news of every related security. If

the market were truly efficient, the news would be incorporated into the price

immediately without any lag. This however would assume that all market participants are

always rational as well as always fast enough to react to every change in the known

information. Only by the use of historical analysis can we know if that is the case.

1.2 Backtesting and the scientific method

Backtesting is the process of evaluating a strategy, theory, or model by applying it to

historical data (Wikipedia, 2011). Prediction is decision-making based on a view that is

assumed to hold true in the future and therefore the more confidence one has in the view

the more certain one can be in the prediction. Karl Popper argues that any theory that is to

be considered scientific (and hence true) must be falsifiable (Thornton, 1997). Only if the

theory can be disproved by objective tests can it hold any value. David Aronson takes this

view further and concludes that there are two views that people base their investment

decisions on; one subjective based on gut feeling and one objective based on verifiable rules.

The subjective view is, according to Aronson, always inferior to the objective view: while

the objective view might be proven false, the subjective view can never be tested at all, thus

rendering it devoid of any useful meaning (Bukey, 2007). The proper way to make

investment decisions is therefore to follow an objective theory that is

verified by statistical tests. However, the establishment of such a theory poses many issues.

This paper examines what predictive factors might exist in the Swedish stock market with

non-parametric statistical methods. The resulting factors are then tested as inputs into a

trading system based on the aggregated output of a large number of artificial neural

networks optimized by simulated evolution. The results are then evaluated for both the

training period and the validation period to see if the ensemble can generate significant

outperformance.

This paper is structured as follows: Chapter 2 examines the effect that price return has on

future returns, Chapter 3 looks at anomalies in the returns distribution on specific dates and

Chapter 4 evaluates the effect the price return of sector indices has on the instrument

tested. Chapter 5 then uses the resulting factors that have proven to be statistically

significant as inputs into an ensemble of neural networks and examines the result.

1.2.1 Data mining and data snooping

When dealing with historical data many issues need to be considered in order to secure the

validity of the test. Stock market returns are by their nature very noisy and contain a large

amount of randomness. Whenever a test of a rule is conducted there is a chance that the

results are due to chance alone. This risk grows larger when data mining for

profitable rules to test. Data mining is the process of discovering new patterns from data

sets by “mining” from them using a wide range of possible methods. The more rules that are

tested, the more the final or the best rule runs the risk of data snooping. Data snooping (or

data dredging, data fishing) is the inappropriate use of data mining to uncover misleading

relationships in data. Data snooping is most likely to occur when the data sample is too

small, leaving the rules tested without the ability to make robust generalizations out-of-

sample. A small sample is especially susceptible to being data snooped when the function

used to describe it contains a large number of variables. The more variables in a function (or

rule, strategy), the bigger the parameter space and hence the larger the number of combinations

that could be tested and found profitable.

Another issue when backtesting a strategy across several stocks is that the sample might not

fully represent all stocks available at the time. For example, consider a backtest performed

on the stocks that make up a stock index today. Since it is likely that the members that

constitute the index will have changed over time as companies rise and fall, that backtest

will be biased toward a positive return, as the companies dropped from the index will be those

that performed the worst in the past. This is called survivorship bias and needs to be

accounted for in order to have a realistic simulation. (Hassler, 2011)

1.2.2 De-trending data

Historical trends in stock prices will influence any test done by shifting the expected mean

for a strategy that trades randomly away from zero. In order to neutralize this bias the

average daily return is subtracted from the return of each day in the test period. This creates

a detrended data sample with a mean arithmetic return of zero on which tests can be

conducted.
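
As a minimal sketch of this de-trending step (written in Python for illustration rather than the Excel and C# used for the tests; the function name `detrend` and the sample returns are hypothetical), the average daily return can be subtracted as follows:

```python
from statistics import fmean

def detrend(returns):
    """Subtract the average daily return from every daily return."""
    average = fmean(returns)
    return [r - average for r in returns]

# Hypothetical daily returns expressed as fractions (0.01 = 1 %).
daily_returns = [0.012, -0.004, 0.007, -0.010, 0.005]
detrended = detrend(daily_returns)
```

After this step a strategy that trades at random has an expected mean return of zero on the sample, which is the baseline the later tests compare against.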

1.2.3 Student’s t-test

In order to determine with any confidence that the return earned by a rule in a backtest is not due to chance, a

hypothesis test is made. One common test is the t-test. When testing the null

hypothesis that the sample’s mean is equal to a specified value $\mu_0$, one uses the statistic

$$t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$

where $\bar{x}$ is the sample mean, $s$ is the sample standard deviation and $n$ is the

sample size.

The sample standard deviation is defined as

$$s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}$$

where $\{x_1, x_2, \ldots, x_n\}$ is a vector of the observed values in the sample.

Once a t value is determined, a p-value can be found, given the degrees of freedom $n - 1$,

using a table of values from Student's t-distribution. The p-value represents the probability

of obtaining a test statistic at least as extreme as the one observed if the null hypothesis is true. Typically the threshold for

statistical significance is set at 0.05 or 0.01; when the p-value falls below the threshold, the null hypothesis is rejected in favor of the

alternative hypothesis.
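
The calculation can be sketched in Python as below. This is an illustration under stated assumptions, not the procedure used for the paper: the function names and the sample are hypothetical, and instead of a table lookup in Student's t-distribution the p-value is approximated with the standard normal distribution, which is only reasonable when the sample is large.

```python
import math
from statistics import NormalDist, fmean, stdev

def t_statistic(sample, mu0=0.0):
    """t = (mean - mu0) / (s / sqrt(n)), with s the sample standard deviation."""
    n = len(sample)
    # statistics.stdev divides by n - 1, matching the definition of s above.
    return (fmean(sample) - mu0) / (stdev(sample) / math.sqrt(n))

def approx_upper_tail_p(t):
    """Upper-tail p-value from the normal approximation (assumes large n)."""
    return 1.0 - NormalDist().cdf(t)

# Hypothetical daily returns; far too few for the approximation to be accurate.
sample = [0.012, -0.004, 0.007, -0.010, 0.005, 0.003, -0.002, 0.009]
t = t_statistic(sample)
p = approx_upper_tail_p(t)
```

With several thousand daily returns, as in the tests in this paper, the t-distribution with $n - 1$ degrees of freedom is very close to the normal distribution, which is why the approximation is used in the sketch.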

1.2.4 Bootstrap method

The stock market is a highly complex and dynamic system and even if it has been assumed to

follow a Gaussian or “normal” distribution, Mandelbrot suggested that the distribution of

stock market returns is far more prone to “fat tails” (outliers such as stock market crashes)

(Kaplan, 2004). This might make a parametric hypothesis test such as Student's t-test show

inaccurate results since Student's t-distribution is derived from a normal distribution. A

more robust test could then be a non-parametric test such as the Bootstrap method in which

the reference distribution is derived from the observed sample itself (Aronson, 2006).

The Bootstrap procedure goes like this:

Compute the mean

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$$

of our return vector $\{x_1, x_2, \ldots, x_n\}$, where $n$ is the number of days derived from the backtest, and

subtract this from every value in it, thus giving it a mean of zero, a process known as

zero-centering (not to be confused with de-trending, which we have already done). From this new

vector we create a large number of resamples, each by drawing $n$ values at random with replacement. To

compute the p-value, we calculate the probability that the mean $\bar{x}^*$ of any resample is

greater than or equal to $\bar{x}$, $P(\bar{x}^* \geq \bar{x})$.

By using the actual sample to create a distribution from which we can make comparisons,

we only make the assumption that the sample is a randomly drawn sample from a larger

population that is independent and identically distributed. Since each resample is drawn

at random, the p-value is only an estimate that becomes exact as the number of

resamples approaches infinity. For the sake of accuracy, the number of resamples has to be very large but still achievable

within a given timeframe. For the tests conducted in this paper it was set to 500,000

resamples.
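
The procedure can be sketched in Python as follows. This is not the C# code from Appendix C; the function name, the default number of resamples and the fixed seed are assumptions chosen so that the example runs quickly and reproducibly.

```python
import random
from statistics import fmean

def bootstrap_p_value(returns, n_resamples=10_000, seed=0):
    """Estimate P(resample mean >= observed mean) under a zero-mean null.

    The sample is zero-centred, resampled with replacement, and the share of
    resample means at least as large as the observed mean is returned.
    """
    rng = random.Random(seed)
    n = len(returns)
    observed = fmean(returns)
    centred = [r - observed for r in returns]
    hits = 0
    for _ in range(n_resamples):
        resample = rng.choices(centred, k=n)  # draw n values with replacement
        if fmean(resample) >= observed:
            hits += 1
    return hits / n_resamples

# Hypothetical daily returns of a rule, as fractions.
rule_returns = [0.012, -0.004, 0.007, -0.010, 0.005, 0.003, -0.002, 0.009]
p_value = bootstrap_p_value(rule_returns, n_resamples=2_000)
```

For a rule with a negative observed mean the comparison would be reversed, counting resample means less than or equal to the observed mean.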

1.3 Data and software

All historical prices for Swedish stocks, ETFs and indices were obtained from NASDAQ OMX

Nordic’s website. Whenever there was a missing value for any specific day that day was

excluded from the backtest. By using the data directly from the exchange it can be assumed

to be more historically accurate than from other data vendors. The tests were performed in

Microsoft Excel 2010 and the Bootstrap test for statistical significance, the genetic algorithm

and the neural networks were all done in Microsoft Visual C# 2010 Express.

2. Momentum or Mean Reversion

2.1 Return

The simplest choice when considering the factors that might impact future price movements

would be past returns, as that probably is what most investors will examine before making

any investment decision. Numerous academic studies indicate that

momentum, or the tendency for the price to correlate positively with its past return, is a

significant effect in stock markets all over the world. This is not just limited to the stock

market but also to commodities markets, fixed income markets and currency markets. The

opposite, mean reversion, i.e. the tendency for prices to revert to their means, has also been

observed in some markets and time frames.

In order to observe the effects of past return on future return, we first have to quantify past

return in an easily manageable form. To start off, the specific time period during which the

return will be measured needs to be specified. A lot of research has been focused on longer

time periods (months and years) to observe momentum within economic cycles; however

this will introduce the problem of data snooping due to a lack of data. Testing a strategy that

trades once a year on ten years' worth of data will probably fail to meet any basic

requirements of statistical significance. This is often countered by performing the test across

a wide variety of markets, but as the tests in this paper are only conducted on one market

the time period specified will have to be short to get enough trades to analyze. For all future

purposes, the time period measured will be all five intervals between one and five days. The

formula for this will be:

$$R(N) = \frac{1}{N} \sum_{i=1}^{N} \frac{C_{t-i+1} - C_{t-i}}{C_{t-i}}$$

where $C_t$ is the closing price today and $C_{t-N}$ is the closing price $N$ days ago, where $N \in \{1, 2, 3, 4, 5\}$.

$R(N)$ is then normalized to shifting levels of volatility and trend biases by ranking the $R(N)$

today by the same values for the past 252 days (roughly corresponding to one year in trading

days). The percentile rank can be expressed by the formula:

$$PR(R(N)) = \frac{c_\ell + 0.5 f_i}{M}$$

where $c_\ell$ is the count of all past $R(N)$s less than the $R(N)$ of interest, $f_i$ is the frequency of

the $R(N)$ of interest and $M$ is the number of examinees in the sample (here 252). By ranking today’s return

to past return you get the relative standing of that value as a percentage, which will be the

standard method for expressing a factor in this paper from this point forward. The

advantage of doing this is that the analysis of the factor is less likely to be influenced by

differences in volatility and trend in the test sample, thereby reducing the danger of

data snooping. The percentile rank could also be inferred from the z-score1, but as I

discussed earlier the data sample might not conform to a normal distribution and thus the

non-parametric method is preferable. Another advantage is that the factors are more easily

combined, which will prove useful in later chapters.
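
A minimal Python sketch of the return and percentile-rank calculation described above is given here. It assumes that the return factor is the average of the last N one-day simple returns and that the ranking window includes today's value; the function names and the sample data are hypothetical.

```python
def average_daily_return(closes, n):
    """Mean of the last n one-day simple returns in `closes` (oldest first)."""
    if len(closes) < n + 1:
        raise ValueError("need at least n + 1 closing prices")
    start = len(closes) - n
    daily = [closes[i] / closes[i - 1] - 1.0 for i in range(start, len(closes))]
    return sum(daily) / n

def percentile_rank(value, history):
    """(count below + 0.5 * count equal) / len(history), a fraction in [0, 1]."""
    below = sum(1 for h in history if h < value)
    equal = sum(1 for h in history if h == value)
    return (below + 0.5 * equal) / len(history)

# Hypothetical closing prices and a tiny ranking window (the paper uses 252 days).
closes = [100.0, 101.0, 102.01]
two_day_average = average_daily_return(closes, 2)
rank = percentile_rank(0.2, [0.1, 0.2, 0.2, 0.4])
```

Because the rank only depends on the ordering of the values in the window, it is unaffected by the scale of the returns, which is what makes differently scaled factors comparable.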

The tests will be conducted on XACT OMXS30, which is a tradable proxy for OMXS30 GI

which in turn represents the thirty most traded stocks at the Stockholm Exchange (Fonder).

XACT OMXS30 is a so-called Exchange Traded Fund, or ETF, that can be traded just like any

stock at NASDAQ OMX. The accurate price relative to the underlying of the fund (NAV, or

Net Asset Value) is guaranteed by independent market makers. The cost involved when

trading it would be the management fee at 0.3% and standard trading commission at the

broker. Since I am testing on an index, survivorship bias is not an issue and since the rules I

am using for testing are simple and generalized the risk of data snooping should be low.

1 The standard score, or z-score, is simply the number of standard deviations an observation is above or below the mean. This value could then be checked against a normal distribution to get the percentile rank, since the normal distribution and the actual distribution are assumed to be the same.

Figure 2.1.1 Chart of XACT OMXS30’s price adjusted for dividends during the testing period.

The backtests are first conducted on the price history for XACT OMXS30 from the 21st of

January 2004 to the 17th of February 2012. Since XACT OMXS30 gives its owners dividends

every year which are not included in the price history provided by NASDAQ OMX Nordic’s data

feed, a synthetic price history was created to account for the lack of dividends. Using the

history at XACT’s webpage (Xact, 2012), I simply added the dividend to the prices starting

from the ex-dividend day. This way the dividend is included in the returns as they would

have been in reality, with the only difference that they were paid the same day the NAV was

adjusted for it. At most 257 days are required to calculate the ranked return value. XACT

OMXS30 has a history going back to October 2000, but due to inadequate data with missing

values and low trading volume for the ETF only the history starting from January 2002 will be

used in this paper. The test starts at the 21st of January 2004 to coincide with the start of

the later tests conducted in this paper (which required more history to start). The means are

calculated on detrended data to offset any bias due to trends in the sample and based on

buying at the closing price when the ranked return value was below 0.5 (or, in a separate test,

above 0.5) for that day and selling at the close of the next day. This means that there would

have been some slippage in reality as the trades are taken at the same price that the signal is

generated from. And as there is no closing call auction2 for ETFs as there is with stocks bid

2 A closing call auction is the time of the day that the final price is determined by grouping together all the outstanding orders to find out at what price the maximum number of shares can be exchanged. The reason for doing this is usually to get a “fair” price from which the values of the index or derivatives can be calculated.

and ask spreads are a concern. Also, trading commission will have had a negative impact on

the returns calculated. These issues will not be addressed in the tests as their impact depends

on factors that cannot be modeled accurately without massive resources and full knowledge

of how the trading is conducted.

In the following table the average arithmetic daily return is shown for look-back periods from

one to five days, together with their respective significance derived from the Bootstrap resampling

method.

N    Mean       P-value
1    0.0690%    0.0494
2    0.0611%    0.0786
3    0.0612%    0.0761
4    0.0790%    0.0314
5    0.0700%    0.0543

Table 2.1.1 Bootstrapped significance for a ranked N-day return below 0.5

All the rules tested generated an average daily return more than 0.06 percentage points above the expected

value of zero, with p-values between 0.03 and 0.08. Only N = 1 and N = 4 are significant at the 0.05 level,

so the possibility that the results are a fluke cannot be ruled out for every N.

Given that values below 0.5 show positive excess return, values above should be expected to

show negative excess returns.

N    Mean        P-value
1    -0.0705%    0.0337
2    -0.0637%    0.0415
3    -0.0641%    0.0433
4    -0.0818%    0.0150
5    -0.0734%    0.0205


For every N tested the p-value is below the 0.05 threshold, which indicates that the

null hypothesis can be rejected in favor of the alternative hypothesis. Since the

mean lies on the left (negative) side of the distribution, the Bootstrap test

calculates the p-value based on the probability $P(\bar{x}^* \leq \bar{x})$, i.e. the likelihood that a

resample drawn with replacement has a mean $\bar{x}^*$ lower than or equal

to the observed mean $\bar{x}$.

Below are the equity curves of non-compounding3 portfolios trading the percentile ranked

return in the sample period on the detrended price history.

Figure 2.1.1 Non-compounding portfolios trading when the ranked return is below 0.5 and above 0.5

respectively, with linear regressions.



3 Since I am going to do a linear regression to measure the consistency of returns, the equity cannot compound

i.e. the returns will not be reinvested.

As evidenced by the backtests, the Swedish stock market has exhibited significant mean

reversion tendencies in the past decade in the short term. This is in line with what Stokes

(2009) found when testing the short-term behavior of large cap equity indices in the

United States. Stokes also notes that this contrarian behavior does not extend to longer look-back

periods, as the effect seems to be limited to the twenty-first century. Historically, stock

markets have had a tendency to “trend” in the short term but this momentum effect faded

away, possibly with the advent of computerized trading.

Distinguishing the oversold levels from the overbought using the median leaves the question

whether more extreme levels might show greater returns and significance. For example, it is

unlikely that the exact distinction between when to expect negative returns and when to

expect positive returns lies exactly at a ranked return value of 0.5. It makes much more sense that the

expected return for the next day is negatively correlated with the value, i.e. that the higher

the value the stronger negative return can be expected. In order to test whether this holds

true and whether this negative correlation is strong enough to justify additional complexities

regarding threshold values, I divide the dataset into ten deciles. A decile is simply a point

taken at regular intervals in the distribution, more specifically for each ten percent. Using

the existing values already percent-ranked for the past 252 days, I calculate the mean return

of the following day for each of these ten bins. I then perform a linear regression against the

means to see how well they correlate to the deciles.
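
The binning and regression steps can be sketched in Python as follows. The function names are hypothetical, the ranks are assumed to be fractions between 0 and 1, and the sample data is constructed to be perfectly linear purely so the result is easy to check.

```python
def decile_means(ranks, next_returns):
    """Mean next-day return for each of the ten rank bins [0, 0.1), ..., [0.9, 1.0]."""
    bins = [[] for _ in range(10)]
    for rank, ret in zip(ranks, next_returns):
        index = min(int(rank * 10), 9)  # a rank of exactly 1.0 goes in the top bin
        bins[index].append(ret)
    return [sum(b) / len(b) if b else None for b in bins]

def linear_fit(xs, ys):
    """Ordinary least squares slope, intercept and R^2 for y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    syy = sum((y - mean_y) ** 2 for y in ys)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared

# Hypothetical data: one observation per decile, returns falling as the rank rises.
ranks = [0.05 + 0.1 * k for k in range(10)]
next_returns = [0.009 - 0.002 * k for k in range(10)]
means = decile_means(ranks, next_returns)
slope, intercept, r_squared = linear_fit(list(range(1, 11)), means)
```

A negative slope corresponds to mean reversion, and $R^2$ measures how much of the variation between the decile means the straight line accounts for.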

Figure 2.1.6 Mean daily return per decile for .

Figure 2.1.7 Mean daily return per decile for




Interesting to note is that the $R^2$ value, which is the squared correlation between the regression line

and the data points, seems to increase with $N$. There is a negative correlation for

each N tested, which supports the view that past returns do influence future returns, but as evidenced by

the low $R^2$ this relationship is disturbed by a lot of noise, as one would expect considering the

efficiency and multidimensionality of the market.

2.2 Close in relation to the range

Another way to measure the distance the price has traveled would be by considering the

closing price in relation to the high and the low for the same day. The formula for expressing

the relative close goes like this:

$$RC(N) = \frac{1}{N} \sum_{i=0}^{N-1} \frac{C_{t-i} - L_{t-i}}{H_{t-i} - L_{t-i}}$$

In effect, the difference between the closing price $C$ and the lowest price $L$ for the day is

scaled by the difference between the highest price $H$ and the low. As there are days,

especially early in the sample, when the high and the low are the same price, the relative close for those days was

set to 0.5 to avoid division by zero. The average of $N$ days is then taken and ranked as a

percentage in relation to the values for the past 252 days, $PR(RC(N))$. I then test the factor

on the price history for XACT OMXS30 from the 21st of January 2004 to the 17th of February

2012 and compute the p-value with a Bootstrap test.
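
A short Python sketch of the relative close, including the adjustment to 0.5 when the high equals the low, is shown below. The function names and the sample bars are hypothetical, and each bar is assumed to be a (high, low, close) tuple.

```python
def relative_close(high, low, close):
    """(close - low) / (high - low); 0.5 when high equals low, as described in the text."""
    if high == low:
        return 0.5
    return (close - low) / (high - low)

def average_relative_close(bars, n):
    """Mean relative close of the last n (high, low, close) bars, oldest first."""
    recent = bars[-n:]
    return sum(relative_close(h, l, c) for h, l, c in recent) / len(recent)

# Hypothetical daily bars: a close at the high, then a close at the low.
bars = [(10.0, 8.0, 10.0), (10.0, 8.0, 8.0)]
two_day_value = average_relative_close(bars, 2)
```

The resulting value would then be percentile-ranked against its own values for the past 252 days in the same way as the return factor.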

N    Mean       P-value
1    0.0535%    0.0964
2    0.0363%    0.1886
3    0.0814%    0.0294
4    0.0357%    0.1973
5    0.0207%    0.3124

N    Mean        P-value
1    -0.0531%    0.0897
2    -0.0394%    0.1574
3    -0.0819%    0.0143
4    -0.0353%    0.1809
5    -0.0200%    0.3028


As expected by the mean reversion tendencies confirmed in the previous subsection the

relative close value is negatively correlated to the next day’s return. This effect is however

not statistically significant other than for an N-day period of 3. Nevertheless, the tests show

that the mean reversion effect extends to the close in relation to the range as well.

3. Seasonal Tendencies

This chapter examines the impact that the specific calendar date has on returns. Just like the

demand and availability of an item changes throughout the year, the behavior of investors

could also change.

3.1 By day of the week

A simple test of seasonal tendencies could be done by testing the effect the day of the week

has on the return of that day. The possibility exists that there might be signs of excessive

buying or selling depending on which day in the week it is. I tested this by first calculating

the mean daily return for each day of the week and then ran a Bootstrap test to check for

statistical significance on the detrended price history for XACT OMXS30 from the 21st of

January 2004 to the 17th of February 2012.

Weekday    Mean        P-value
Mon        -0.0067%    0.4620
Tue        -0.0381%    0.2706
Wed        0.0765%     0.1117
Thu        -0.0304%    0.3150
Fri        -0.0017%    0.4892

Table 3.1.1 Bootstrapped significance for returns per weekday

As indicated by the test results, no day of the week showed returns that would not be

expected to occur as a result of random chance.

3.2 By day of the month

When examining the historical returns for U.S. equities around the turn of the month,

McConnell and Xu found that there has been a strong tendency for positive return around

the turn of each month (McConnell, 2006). More precisely, from the last day to the third day

of each month returns have on average been so strong that an investor would receive no

reward for having exposure during the other trading days of the month. To test this on

Swedish equities, I first divide the trading days in the dataset according to their distance

from the turn of the month. For example, $t = -1$ would be the last trading day of the month and

$t = 1$ is the first trading day of the month. Since the number of trading days in each month will

change from month to month, only the last and the first eight days will be examined. This is

the maximum number of days I can include in the test by distance from the turn of the

month, without excluding certain months due to lack of trading days (the month that had

the least days had seventeen). I then calculate the mean return on the detrended price

history for XACT OMXS30 and use the distribution of each $t$ to get the Bootstrapped

statistical significance.
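
The labelling of trading days by their distance from the turn of the month can be sketched in Python as follows. The function name is hypothetical, the input is assumed to be a list of trading dates covering complete months, and the example simply uses the weekdays of one month without modelling exchange holidays.

```python
from datetime import date
from itertools import groupby

def turn_of_month_offsets(trading_days, window=8):
    """Map each trading day to t: 1..window for the first days of its month,
    -window..-1 for the last days; days in between are left out."""
    offsets = {}
    by_month = groupby(sorted(trading_days), key=lambda d: (d.year, d.month))
    for _, group in by_month:
        days = list(group)
        for i, day in enumerate(days):
            if i < window:
                offsets[day] = i + 1          # first trading day gets t = 1
            elif i >= len(days) - window:
                offsets[day] = i - len(days)  # last trading day gets t = -1
    return offsets

# Hypothetical calendar: all weekdays of January 2012.
january = [date(2012, 1, d) for d in range(1, 32) if date(2012, 1, d).weekday() < 5]
offsets = turn_of_month_offsets(january)
```

The daily returns can then be grouped by their value of $t$ and each group tested separately.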

t     Mean        P-value
-8    -0.1313%    0.1908
-7    0.0349%     0.3951
-6    -0.1128%    0.1968
-5    0.0398%     0.3712
-4    0.0384%     0.3636
-3    0.2046%     0.0367
-2    0.1064%     0.2129
-1    0.0523%     0.2911
1     0.3139%     0.0151
2     0.0644%     0.3210
3     0.0014%     0.4965
4     -0.2186%    0.0660
5     -0.0324%    0.3948
6     -0.0521%    0.3716
7     -0.1437%    0.1333
8     -0.0429%    0.3635

Table 3.2.1 Bootstrapped significance for returns at the turn of the month

The test confirms McConnell and Xu’s finding that there is a positive bias around the turn of

each month. Although the positive bias began five days before and stretched three days

after the turn of the month, the effect was only statistically significant at $t = -3$ and $t = 1$.

At $t = 4$ the effect turned to the opposite, with a negative mean daily return that is

significant only at the 0.10 level (p = 0.0660).

The following graph shows the equity curves of two portfolios trading the detrended XACT

OMXS30 price based on the day of the month.

Figure 3.1.2 Non-compounding portfolios trading when and respectively with

linear regressions.

This effect is more easily illustrated by a bar chart. The positive blue bars are centered

uniformly around the sign change of $t$, with every other day in red.

Figure 3.2.1 Average daily return at the turn of the month

In conclusion, the turn of the month effect is very much alive in the Swedish market. The

majority of the profit is concentrated three days before and one day after the turn, and four

days after it some of the profit is taken back by the market.

4. Intermarket analysis

Investors and portfolio managers might not just trade one instrument at a time but several

simultaneously as a way to diversify their returns and risks. At the same time, the

profitability of one security or stock might be heavily correlated to others as they might

share similar economic factors. The intricate net of factors that have an impact on share

prices makes intermarket analysis a relevant field of study for predicting the former. Given

that there might be a lag in the accurate pricing between stocks that correlate, alpha could be

generated by trading on the discrepancy. In order to keep the number of factors

that will be considered in this paper low, only sectors of OMXSPI will be analyzed.

4.1 Sector analysis

The following sector indices will be analyzed and backtested as part of an intermarket

strategy:

OMX Stockholm Financials PI

OMX Stockholm Health Care PI

OMX Stockholm Industrials PI

OMX Stockholm Consumer Services PI

OMX Stockholm Consumer Goods PI

OMX Stockholm Utilities PI

OMX Stockholm Basic Materials PI

OMX Stockholm Technology PI

OMX Stockholm Telecommunications PI

There are actually several more sectors that are represented as an index at NASDAQ OMX,

but due to lack of recorded price history those were left out when testing. Notice how all of

these are Price Index (PI) in contrast to the index that XACT OMXS30 is trying to replicate,

which is a Gross Index (GI). While Price Indices only track the price of stocks, Gross Indices

also track the returns earned on dividends and such.

To test the impact the returns of each sector has on the overall stock market, I perform the

same calculation previously made when testing for mean reversion tendencies.

$$\bar{r}_N(t) = \frac{1}{N}\sum_{i=0}^{N-1}\frac{C_{t-i}-C_{t-i-1}}{C_{t-i-1}}$$

where $C_t$ is the closing price today and $C_{t-N}$ is the closing price N days ago, where

$N \in \{1, 2, 3, 4, 5\}$. The average daily return for the past N days is then ranked as a percentile by the same

values for the past 252 days and then used to analyze the returns of the ETF XACT OMXS30.
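As an illustration only, the calculation described above could be sketched in Python as follows. The function names `average_daily_return` and `percentile_rank` are hypothetical, and two details are assumptions rather than something stated in the paper: returns are simple (not logarithmic) returns, and the percentile is taken as the share of the previous 252 values that lie strictly below today's value.

```python
def average_daily_return(closes, n):
    """Mean of the last n one-day simple returns (None until n+1 closes exist)."""
    out = [None] * len(closes)
    for t in range(n, len(closes)):
        daily = [closes[t - i] / closes[t - i - 1] - 1.0 for i in range(n)]
        out[t] = sum(daily) / n
    return out


def percentile_rank(values, lookback=252):
    """Share of the previous `lookback` values strictly below today's value.

    Assumption: today's value is not part of its own reference window.
    """
    out = [None] * len(values)
    for t in range(lookback, len(values)):
        window = values[t - lookback:t]
        if values[t] is None or any(v is None for v in window):
            continue  # not enough history yet
        out[t] = sum(v < values[t] for v in window) / lookback
    return out
```

A rank close to 1 then means that the recent N-day return is high relative to the past year, and a rank close to 0 that it is low.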

Sector                       N=1       N=2       N=3       N=4       N=5
Financials PI            0.0228%   0.0571%   0.0702%   0.0476%   0.0446%
Health Care PI           0.0120%   0.0405%   0.0204%   0.0019%   0.0178%
Industrials PI           0.0245%   0.0194%   0.0476%   0.0362%   0.0343%
Consumer Services PI    -0.0183%   0.0347%   0.0516%   0.0496%   0.0341%
Consumer Goods PI        0.0183%   0.0394%   0.0540%   0.0160%   0.0299%
Utilities PI             0.0397%   0.0115%   0.0605%   0.0076%   0.0220%
Basic Materials PI       0.0429%   0.0573%   0.0636%   0.0504%   0.0433%
Technology PI           -0.0123%   0.0214%   0.0068%   0.0140%   0.0263%
Telecommunications PI    0.0487%   0.0662%   0.0747%   0.0685%   0.0824%


Table 4.1.1 Average daily returns for ( )

Due to computational constraints when performing Bootstrap permutations, I instead use

the t-test to test the hypothesis that the return earned by the rule is not significantly

different from zero. To test both negative and positive means, the following formula is used.

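$$t = \frac{|\bar{x} - \mu_0|}{s / \sqrt{n}}$$

Here $\bar{x}$ is the sample mean of the daily returns, $s$ their sample standard deviation and $n$ the number of observations. $\mu_0$ is set to 0 since the data the tests are performed on is already detrended.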
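A minimal sketch of this test in Python is given below. The function names are hypothetical. The p-value uses a normal approximation to Student's t distribution, which is my simplification and only reasonable when the number of daily observations is large. The one-sided form is an assumption that fits the tables, where no p-value exceeds 0.50.

```python
import math
from statistics import NormalDist, mean, stdev


def abs_t_statistic(returns, mu0=0.0):
    """Absolute one-sample t statistic of the mean return against mu0."""
    n = len(returns)
    standard_error = stdev(returns) / math.sqrt(n)
    return abs(mean(returns) - mu0) / standard_error


def one_sided_p_value(t_abs):
    """Upper-tail probability of |t| (normal approximation, assumption)."""
    return 1.0 - NormalDist().cdf(t_abs)
```

Because the absolute value of t is used, a strongly negative mean return produces the same small p-value as an equally strong positive one.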

Sector                       N=1       N=2       N=3       N=4       N=5
Financials PI             0.2956    0.0922    0.0496    0.1269    0.1459
Health Care PI            0.3881    0.1673    0.3116    0.4818    0.3371
Industrials PI            0.2814    0.3267    0.1406    0.2003    0.2166
Consumer Services PI      0.3279    0.2070    0.1135    0.1256    0.2157
Consumer Goods PI         0.3314    0.1776    0.0995    0.3544    0.2422
Utilities PI              0.1514    0.3908    0.0682    0.4281    0.3034
Basic Materials PI        0.1618    0.0974    0.0750    0.1234    0.1635
Technology PI             0.3827    0.3074    0.4352    0.3692    0.2596
Telecommunications PI     0.1282    0.0645    0.0394    0.0523    0.0270

Table 4.1.2 P-values from Student’s t-test for ( )

Highlighted in green are the p-values that fall below a 0.10 threshold. All sectors but four fall

below that in at least one setting. It seems that the mean reversion effect previously

explored is also present in this analysis. Each sector can be assumed to correlate well with

OMXS30 since some of the components of the sectors are also components of the major

index, so this should not stand out from what could be expected. However, the strong and

significant performance of the Telecommunications sector index indicates that the major

part of the effect from mean reversion in OMXS30 comes from one sector. With an average

p-value of 0.0623 across all tested values of N, the performance of Telecommunications PI is

unlikely to be the result of data mining.

Sector                       N=1       N=2       N=3       N=4       N=5
Financials PI           -0.0227%  -0.0595%  -0.0728%  -0.0503%  -0.0484%
Health Care PI          -0.0120%  -0.0415%  -0.0208%  -0.0019%  -0.0181%
Industrials PI          -0.0252%  -0.0198%  -0.0478%  -0.0362%  -0.0357%
Consumer Services PI     0.0185%  -0.0353%  -0.0528%  -0.0512%  -0.0353%
Consumer Goods PI       -0.0181%  -0.0404%  -0.0570%  -0.0163%  -0.0314%
Utilities PI            -0.0486%  -0.0116%  -0.0631%  -0.0080%  -0.0231%
Basic Materials PI      -0.0438%  -0.0580%  -0.0636%  -0.0496%  -0.0426%
Technology PI            0.0128%  -0.0218%  -0.0073%  -0.0149%  -0.0281%
Telecommunications PI   -0.0487%  -0.0697%  -0.0764%  -0.0707%  -0.0850%

Table 4.1.3 Average daily returns for ( )

Sector                       N=1       N=2       N=3       N=4       N=5
Financials PI             0.2759    0.0542    0.0265    0.0958    0.0995
Health Care PI            0.3773    0.1405    0.2974    0.4799    0.3180
Industrials PI            0.2532    0.2965    0.0918    0.1664    0.1618
Consumer Services PI      0.3198    0.1755    0.0801    0.0825    0.1685
Consumer Goods PI         0.3199    0.1425    0.0677    0.3318    0.2009
Utilities PI              0.1250    0.3832    0.0570    0.4177    0.2684
Basic Materials PI        0.1162    0.0529    0.0386    0.0897    0.1190
Technology PI             0.3715    0.2827    0.4249    0.3491    0.2392
Telecommunications PI     0.0973    0.0270    0.0216    0.0317    0.0115

Table 4.1.4 P-values from Student’s t-test for ( )

Highlighted in red are the p-values that fall below a 0.10 threshold. Both

Telecommunications PI and Basic Materials PI had an average p-value below 0.10 with

0.0378 and 0.0833 respectively.

Below, the equity curves of non-compounding portfolios trading the Telecommunications PI

factor are displayed. The returns are detrended.

Figure 4.1.1 Non-compounding portfolios trading when

( ) and respectively with and linear

regression.


4.2 Sector spread analysis

After examining the impact of individual sectors on XACT OMXS30, I will test the impact of

the relative performance between two sectors as a different measurement of market

sentiment. The theory behind this is that some sectors might be correlated to the general

sentiment among investors, and if so outperformance relative to sectors that correlate to

the opposite will be a positive sign of future price return.

I first calculate the average daily return for each sector:

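$$\bar{r}_{s_1}(t) = \frac{1}{N}\sum_{i=0}^{N-1}\frac{C^{s_1}_{t-i}-C^{s_1}_{t-i-1}}{C^{s_1}_{t-i-1}}$$

$$\bar{r}_{s_2}(t) = \frac{1}{N}\sum_{i=0}^{N-1}\frac{C^{s_2}_{t-i}-C^{s_2}_{t-i-1}}{C^{s_2}_{t-i-1}}$$

where $C^{s}_{t}$ is the closing price of sector index $s$ on day $t$. Both values are then percentile ranked by the distribution of values for the past 252 days

and then the difference is taken.

$$d(t) = \operatorname{rank}\big(\bar{r}_{s_1}(t)\big) - \operatorname{rank}\big(\bar{r}_{s_2}(t)\big)$$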

This difference is then again ranked by the values for the past 252 days to account for any

trend and volatility bias that might exist between the normalized returns from the sectors.

The final value ( ) is then calculated.
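The two ranking steps could look roughly like the following sketch. The names `percentile_rank` and `spread_rank` are hypothetical, and the percentile definition (share of the previous values strictly below today's value) is an assumption, since the exact ranking rule is not spelled out here.

```python
def percentile_rank(values, lookback):
    """Share of the previous `lookback` values strictly below today's value."""
    out = [None] * len(values)
    for t in range(lookback, len(values)):
        window = values[t - lookback:t]
        if values[t] is None or any(v is None for v in window):
            continue  # not enough history yet
        out[t] = sum(v < values[t] for v in window) / lookback
    return out


def spread_rank(avg_return_s1, avg_return_s2, lookback=252):
    """Rank each sector's average return, take the difference, rank it again."""
    rank_s1 = percentile_rank(avg_return_s1, lookback)
    rank_s2 = percentile_rank(avg_return_s2, lookback)
    difference = [
        None if a is None or b is None else a - b
        for a, b in zip(rank_s1, rank_s2)
    ]
    return percentile_rank(difference, lookback)
```

Note that the second ranking needs a further 252 days of history, so the first usable value appears only after roughly two years of data.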

Since the number of combinations is too large to run through with Bootstrap

permutations, the t-test (with absolute t) will be used to calculate the statistical significance.

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.21    0.09    0.26    0.48    0.34    0.32    0.48    0.20    0.05
Health Care PI            0.20       -    0.39    0.42    0.48    0.44    0.03    0.36    0.40    0.11
Industrials PI            0.08    0.37       -    0.42    0.12    0.23    0.08    0.50    0.10    0.00
Consumer Services PI      0.24    0.46    0.41       -    0.43    0.31    0.31    0.28    0.30    0.06
Consumer Goods PI         0.47    0.44    0.14    0.43       -    0.49    0.24    0.47    0.43    0.03
Utilities PI              0.34    0.41    0.22    0.33    0.49       -    0.34    0.43    0.36    0.38
Basic Materials PI        0.29    0.03    0.07    0.34    0.20    0.35       -    0.25    0.08    0.32
Technology PI             0.48    0.38    0.50    0.26    0.48    0.44    0.23       -    0.21    0.07
Telecommunications PI     0.19    0.39    0.10    0.31    0.41    0.36    0.07    0.23       -    0.20
XACT OMXS30               0.08    0.11    0.00    0.09    0.02    0.39    0.34    0.13    0.21       -

Table 4.2.1 P-values from Student’s t-test for ( ) with

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.10    0.07    0.40    0.17    0.25    0.47    0.37    0.34    0.39
Health Care PI            0.09       -    0.22    0.26    0.25    0.26    0.38    0.50    0.15    0.10
Industrials PI            0.02    0.24       -    0.24    0.36    0.27    0.15    0.30    0.13    0.00
Consumer Services PI      0.37    0.23    0.16       -    0.44    0.47    0.41    0.39    0.18    0.20
Consumer Goods PI          ...    0.24    0.45    0.40       -    0.43    0.06    0.27    0.03    0.06
Utilities PI              0.23    0.27    0.28    0.45    0.43       -    0.16    0.32    0.26    0.47
Basic Materials PI         ...    0.44    0.17    0.48    0.10    0.17       -    0.27    0.38    0.21
Technology PI             0.37    0.49    0.29    0.40    0.29    0.32    0.22       -    0.07    0.04
Telecommunications PI     0.31    0.15    0.09    0.25    0.03    0.25    0.39    0.08       -    0.23
XACT OMXS30               0.37    0.09    0.00    0.20    0.05    0.47    0.23    0.06    0.19       -


Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.10    0.03    0.18    0.35    0.21    0.19    0.11    0.38    0.31
Health Care PI            0.06       -    0.21    0.34    0.19    0.43    0.12    0.12    0.09    0.03
Industrials PI            0.03    0.25       -    0.42    0.14    0.47    0.04    0.45    0.11    0.02
Consumer Services PI      0.17    0.31    0.45       -    0.47    0.32    0.18    0.27    0.15    0.06
Consumer Goods PI          ...    0.14    0.06    0.47       -    0.30    0.41    0.12    0.27    0.15
Utilities PI              0.21    0.41    0.47    0.33    0.31       -    0.14    0.37    0.18    0.16
Basic Materials PI         ...    0.11    0.06    0.22    0.41    0.16       -    0.05    0.28    0.43
Technology PI             0.10    0.11    0.44    0.28    0.13    0.37    0.05       -    0.05    0.13
Telecommunications PI     0.38    0.06    0.09    0.20    0.23    0.15    0.25    0.05       -    0.29
XACT OMXS30               0.31    0.04    0.03    0.10    0.19    0.16    0.39    0.14    0.33       -

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.16    0.24    0.42    0.38    0.19    0.30    0.28    0.20    0.41
Health Care PI            0.20       -    0.43    0.39    0.47    0.30    0.37    0.40    0.43    0.14
Industrials PI            0.18    0.43       -    0.31    0.31    0.46    0.10    0.49    0.10    0.01
Consumer Services PI      0.37    0.34    0.27       -    0.46    0.49    0.44    0.36    0.23    0.15
Consumer Goods PI          ...    0.50    0.26    0.47       -    0.48    0.41    0.36    0.12    0.03
Utilities PI              0.23    0.32    0.48    0.50    0.48       -    0.23    0.42    0.16    0.33
Basic Materials PI         ...    0.39    0.14    0.37    0.48    0.23       -    0.24    0.44    0.32
Technology PI             0.28    0.43    0.50    0.36    0.38    0.43    0.24       -    0.03    0.21
Telecommunications PI     0.23    0.42    0.10    0.22    0.11    0.14    0.43    0.03       -    0.26
XACT OMXS30               0.34    0.12    0.01    0.17    0.02    0.33    0.30    0.22    0.25       -


Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.21    0.08    0.15    0.35    0.27    0.32    0.26    0.35    0.43
Health Care PI            0.19       -    0.42    0.42    0.37    0.45    0.30    0.42    0.22    0.30
Industrials PI            0.08    0.44       -    0.47    0.26    0.47    0.27    0.38    0.06    0.01
Consumer Services PI      0.12    0.48    0.45       -    0.26    0.35    0.39    0.40    0.08    0.02
Consumer Goods PI          ...    0.39    0.20    0.26       -    0.38    0.41    0.16    0.26    0.25
Utilities PI              0.28    0.45    0.45    0.37    0.39       -    0.16    0.33    0.32    0.27
Basic Materials PI        0.30    0.33    0.29    0.47    0.41    0.15       -    0.18    0.33    0.25
Technology PI             0.25    0.44    0.41    0.40    0.16    0.29    0.16       -    0.15    0.33
Telecommunications PI     0.33    0.19    0.03    0.11    0.21    0.33    0.29    0.16       -    0.20
XACT OMXS30               0.47    0.27    0.02    0.06    0.30    0.24    0.19    0.33    0.30       -


The same test is then performed on all values .

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.20    0.08    0.24    0.48    0.34    0.32    0.48    0.19    0.04
Health Care PI            0.20       -    0.39    0.41    0.48    0.44    0.03    0.35    0.40    0.12
Industrials PI            0.09    0.38       -    0.41    0.11    0.21    0.08    0.50    0.10    0.00
Consumer Services PI      0.26    0.46    0.42       -    0.43    0.31    0.32    0.28    0.30    0.07
Consumer Goods PI          ...    0.43    0.14    0.43       -    0.49    0.25    0.46    0.44    0.02
Utilities PI              0.35    0.41    0.24    0.33    0.49       -    0.35    0.43    0.37    0.38
Basic Materials PI         ...    0.03    0.07    0.33    0.18    0.34       -    0.24    0.07    0.32
Technology PI             0.48    0.39    0.50    0.25    0.48    0.44    0.24       -    0.23    0.09
Telecommunications PI     0.20    0.39    0.09    0.30    0.40    0.35    0.08    0.21       -    0.20
XACT OMXS30               0.08    0.10    0.00    0.08    0.02    0.39    0.34    0.10    0.21       -

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.09    0.05    0.39    0.15    0.24    0.47    0.37    0.33    0.38
Health Care PI            0.10       -    0.22    0.25    0.26    0.27    0.39    0.50    0.15    0.10
Industrials PI            0.02    0.23       -    0.20    0.35    0.27    0.15    0.30    0.12    0.00
Consumer Services PI      0.38    0.25    0.19       -    0.44    0.47    0.41    0.39    0.19    0.24
Consumer Goods PI          ...    0.23    0.45    0.39       -    0.43    0.09    0.28    0.03    0.08
Utilities PI              0.24    0.27    0.28    0.45    0.43       -    0.18    0.32    0.26    0.47
Basic Materials PI         ...    0.43    0.16    0.48    0.07    0.15       -    0.25    0.36    0.21
Technology PI             0.38    0.49    0.29    0.39    0.27    0.32    0.24       -    0.08    0.05
Telecommunications PI     0.31    0.15    0.09    0.24    0.03    0.25    0.41    0.07       -    0.24
XACT OMXS30               0.38    0.09    0.00    0.16    0.04    0.47    0.22    0.05    0.18       -


Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.09    0.02    0.17    0.35    0.22    0.20    0.10    0.37    0.31
Health Care PI            0.07       -    0.21    0.33    0.18    0.43    0.11    0.11    0.07    0.03
Industrials PI            0.03    0.25       -    0.41    0.10    0.47    0.05    0.45    0.09    0.02
Consumer Services PI      0.19    0.32    0.46       -    0.47    0.33    0.21    0.29    0.17    0.09
Consumer Goods PI          ...    0.15    0.10    0.46       -    0.30    0.42    0.12    0.25    0.15
Utilities PI              0.20    0.40    0.47    0.33    0.30       -    0.15    0.37    0.15    0.16
Basic Materials PI        0.20    0.12    0.05    0.19    0.40    0.14       -    0.04    0.25    0.42
Technology PI             0.11    0.12    0.44    0.26    0.13    0.38    0.06       -    0.05    0.13
Telecommunications PI     0.39    0.08    0.11    0.18    0.24    0.18    0.27    0.05       -    0.30
XACT OMXS30               0.31    0.03    0.03    0.07    0.18    0.16    0.40    0.13    0.32       -


Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.18    0.23    0.42    0.38    0.19    0.32    0.28    0.17    0.41
Health Care PI            0.18       -    0.43    0.38    0.47    0.32    0.37    0.39    0.43    0.14
Industrials PI            0.19    0.43       -    0.29    0.29    0.46    0.11    0.49    0.08    0.00
Consumer Services PI      0.38    0.35    0.29       -    0.46    0.49    0.44    0.37    0.24    0.18
Consumer Goods PI          ...    0.50    0.27    0.47       -    0.48    0.42    0.35    0.11    0.03
Utilities PI              0.22    0.30    0.48    0.50    0.48       -    0.22    0.42    0.15    0.33
Basic Materials PI         ...    0.39    0.13    0.36    0.48    0.24       -    0.22    0.43    0.30
Technology PI             0.28    0.44    0.50    0.35    0.38    0.43    0.25       -    0.03    0.23
Telecommunications PI     0.25    0.42    0.11    0.21    0.11    0.16    0.44    0.03       -    0.28
XACT OMXS30               0.34    0.11    0.01    0.14    0.02    0.33    0.32    0.20    0.22       -


Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.21    0.08    0.13    0.35    0.28    0.34    0.25    0.34    0.43
Health Care PI            0.19       -    0.43    0.41    0.38    0.45    0.30    0.42    0.21    0.31
Industrials PI            0.08    0.44       -    0.47    0.24    0.47    0.28    0.38    0.04    0.01
Consumer Services PI      0.14    0.48    0.45       -    0.28    0.36    0.40    0.40    0.08    0.04
Consumer Goods PI          ...    0.39    0.21    0.24       -    0.39    0.42    0.15    0.23    0.27
Utilities PI              0.28    0.45    0.45    0.36    0.38       -    0.15    0.33    0.32    0.26
Basic Materials PI         ...    0.32    0.28    0.47    0.40    0.16       -    0.16    0.31    0.23
Technology PI             0.26    0.44    0.41    0.40    0.17    0.29    0.18       -    0.15    0.34
Telecommunications PI     0.35    0.21    0.05    0.10    0.23    0.34    0.32    0.15       -    0.21
XACT OMXS30               0.47    0.27    0.02    0.03    0.27    0.26    0.21    0.32    0.29       -


In order to better determine which sector spreads are useful as a predictive factor of

OMXS30, the following table contains the average p-value of all five tested values for N. It is

reasonable to assume that if the factor really does contain useful information, its

performance would not be limited to one setting alone. A factor that is highly dependent on

any specific parameter is probably not stable when going out-of-sample since the best

parameter can change very rapidly in the market.

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.15    0.10    0.28    0.35    0.25    0.32    0.30    0.29    0.32
Health Care PI            0.15       -    0.34    0.36    0.35    0.38    0.24    0.36    0.26    0.14
Industrials PI            0.08    0.35       -    0.37    0.24    0.38    0.13    0.42    0.10    0.01
Consumer Services PI      0.25    0.37    0.35       -    0.41    0.39    0.35    0.34    0.19    0.10
Consumer Goods PI          ...    0.34    0.22    0.41       -    0.42    0.31    0.28    0.22    0.10
Utilities PI              0.26    0.37    0.38    0.40    0.42       -    0.21    0.37    0.26    0.32
Basic Materials PI         ...    0.26    0.15    0.38    0.32    0.21       -    0.20    0.30    0.30
Technology PI             0.30    0.37    0.42    0.34    0.29    0.37    0.18       -    0.10    0.16
Telecommunications PI     0.29    0.24    0.08    0.22    0.20    0.25    0.29    0.11       -    0.24
XACT OMXS30               0.31    0.13    0.01    0.12    0.12    0.32    0.29    0.17    0.26       -

Table 4.2.11 Average p-values from Student’s t-test for ( ) with

Sector s1-s2              Fin.  Health Indust. C.Serv. C.Goods   Util.  B.Mat.   Tech. Telecom  OMXS30
Financials PI                -    0.15    0.09    0.27    0.34    0.25    0.33    0.29    0.28    0.31
Health Care PI            0.15       -    0.34    0.36    0.36    0.38    0.24    0.36    0.25    0.14
Industrials PI            0.08    0.35       -    0.36    0.22    0.38    0.13    0.42    0.09    0.01
Consumer Services PI      0.27    0.37    0.36       -    0.42    0.39    0.36    0.35    0.20    0.12
Consumer Goods PI          ...    0.34    0.23    0.40       -    0.42    0.32    0.28    0.21    0.11
Utilities PI              0.26    0.37    0.38    0.39    0.42       -    0.21    0.37    0.25    0.32
Basic Materials PI         ...    0.26    0.14    0.36    0.31    0.21       -    0.18    0.29    0.30
Technology PI             0.30    0.37    0.43    0.33    0.29    0.37    0.19       -    0.11    0.17
Telecommunications PI     0.30    0.25    0.09    0.21    0.20    0.25    0.30    0.10       -    0.25
XACT OMXS30               0.31    0.12    0.01    0.10    0.11    0.32    0.30    0.16    0.25       -

Table 4.2.12 Average p-values from Student’s t-test for ( ) with

Based on the observations in the tables, there is only one sector spread that passes a

threshold of 0.05: Industrials PI and XACT OMXS30. Overall the industrial companies seem to

be strong drivers of OMXS30, exhibiting the mean reversion tendency that also drives

OMXS30. To illustrate this I drew the equity chart of a non-compounding portfolio trading

the spread on detrended data.


[Figures: equity curves of non-compounding portfolios trading the Industrials PI vs. XACT OMXS30 spread on detrended data, with linear regressions.]


5. Combining the Predictive Factors into a Learning Ensemble System

Based on the research in the previous chapters, the following factors were robust enough to

warrant further analysis.

Price Change of XACT OMXS30

Trading Day of the Month

Price Change of Telecommunications PI

Price Change of Industrials PI vs. Price Change of XACT OMXS30

These four factors all prove to be predictive of the price change the next day for XACT

OMXS30 below a 0.05 significance level. In order to incorporate them into a final trading

model ready for use, three different techniques will be used.

5.1 Ensemble Trading Model

When building the final trading model, three methods will be used that are described here.

5.1.1 Artificial Neural Networks

Inspired by the way biological neural networks in the brains of animals work, Artificial Neural

Networks are mathematical models used to process information by feeding it into an

interconnected group of artificial neurons (Wikipedia, 2012). The interconnected nature of

neural networks is what accounts for the complex behavior displayed by animals. The main

strength of neural networks is their ability to detect hidden non-linear patterns between

the inputs and the outputs. This has made them popular for applications such as e-mail spam

filtering, handwritten text recognition and time series prediction.

Figure 5.1.1 Illustration of a neural network with four layers, one input layer with eight

neurons, two hidden layers with eleven and five neurons respectively and one output layer

with one output neuron.4

One of the main disadvantages of artificial neural networks is their black-box nature. Since

the number of connections increases with the number of neurons added, no human will be

able to figure out exactly how even a small neural network makes its decisions. One of the

implications of this is that the risk of overfitting the model to the data increases. This makes

them inefficient to apply when data is scarce.

The usual network consists of neurons with connections, each one with a weight. The

neuron sums the inputs $x_i$ from the connections multiplied by their weights $w_i$.

$$s = \sum_{i} w_i x_i$$

The signal is then transferred by the neuron’s activation function. This function could have

many forms, for example it could be a threshold function.

$$f(s) = \begin{cases} 1 & \text{if } s > \theta \\ 0 & \text{otherwise} \end{cases}$$

A binary number is sent to the following connections depending on whether the summed

value exceeds the threshold value $\theta$ for that specific neuron. By adjusting the weights and

the threshold values the network can “learn” patterns and produce the sought output.
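A single neuron of this kind is small enough to write out in full. The sketch below is only an illustration, and the function name `threshold_neuron` is hypothetical.

```python
def threshold_neuron(inputs, weights, threshold):
    """Fire (return 1) when the weighted sum of the inputs exceeds the threshold."""
    weighted_sum = sum(w * x for w, x in zip(weights, inputs))
    return 1 if weighted_sum > threshold else 0
```

With weights 1.0 and -0.5 and a threshold of 0.5, the inputs (0.9, 0.2) give a weighted sum of 0.8 and the neuron fires, while the inputs (0.1, 0.2) give 0.0 and it stays silent.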

5.1.2 Genetic Algorithms

Inspired by the Darwinian idea of natural selection, Genetic Algorithms, a subset of

Evolutionary Algorithms, are an optimization technique designed to evolve the best solution

through mechanisms such as recombination and mutation (Brownlee, 2011).

The general outline of most genetic algorithms can be expressed in the following

procedure:

1. Initialize a population of random solutions.

2. Evaluate the fitness of each solution in the population.

3. Repeat the following steps until some objective is met.

   1. Select the best solutions in the population for reproduction.

   2. Create a new population of new solutions by recombining the best
      from the former and apply occasional mutations to their chromosomes.

   3. Evaluate the fitness of each new solution.

4 Picture retrieved from http://www.optimaltrader.net/neural_network.htm 2012-03-31.

As the populations are derived from the fittest from the last population, the average fitness

will increase as the algorithm runs. With time this increase in fitness will fade as the

population converges on a maximum that is hopefully the global maximum of the

parameter space5.
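The general procedure can be sketched as a small generational algorithm. This is a generic illustration, not the configuration used later in this paper: the function name `evolve`, the population size, the mutation rate and the choice to carry the better half of the population over unchanged are all my assumptions. Here a higher fitness is better.

```python
import random


def evolve(fitness, n_genes, pop_size=30, generations=50,
           mutation_rate=0.05, seed=0):
    """Generational genetic algorithm that maximizes `fitness`."""
    rng = random.Random(seed)
    # Step 1: random initial population with genes in [-1, 1].
    population = [[rng.uniform(-1, 1) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    best_per_generation = []
    for _ in range(generations):
        # Steps 2 and 3.3: evaluate and sort, best first.
        ranked = sorted(population, key=fitness, reverse=True)
        best_per_generation.append(fitness(ranked[0]))
        # Step 3.1: the better half becomes the parent pool
        # (assumption: parents also survive into the next generation).
        parents = ranked[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            # Step 3.2: recombine gene by gene, then mutate occasionally.
            child = [rng.choice(pair) for pair in zip(mother, father)]
            child = [g + rng.uniform(-1, 1) if rng.random() < mutation_rate else g
                     for g in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness), best_per_generation
```

Because the parents survive, the best fitness in `best_per_generation` can never decrease from one generation to the next, which makes the flattening of the curve near a maximum easy to observe.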

Figure 5.1.2 General outline of an evolutionary algorithm.

5.1.3 Ensemble learning

In order to achieve better prediction or classification through a learning model, multiple

models can be combined. This process is called ensemble learning and can be of great use

when dealing with models that are trained on a sample so small that generalization

becomes a problem (Robi Polikar, 2011). A small sample might not fully represent the data

distribution and therefore the model might learn patterns that will not persist on a

validation set. Another use of ensemble modeling could be on non-stationary data, such as

price history. It is likely that the point that represents the global maximum derived from an

optimization of the parameters will not stay at the peak in the parameter space going

forward since the most profitable strategy will change with shifting conditions. By averaging

the output from the points around the area of the global maximum, and maybe for the

points around the local maxima as well, superior robustness can be achieved.

5 A vector with N parameters can be thought of as a specific point in an N-dimensional space. This would be the parameter space.

Figure 5.1.3 Example of a three-dimensional parameter space, based on the total return for

a strategy trading long/short OMXS30 when a crossover occurs between two M- and N-day

moving averages of the closing price.

5.2 Method

The following trading system presented will be based on the aggregated output of several

hundred artificial neural networks trained by a genetic algorithm. There are four inputs to

the networks, the percentile-ranked values for the N-day price return of XACT OMXS30, the

percentile-ranked N-day return for the Telecommunications sector index and the percentile-

ranked N-day relative return of the Industrial sector index and XACT OMXS30 as well as the

value of 1 whenever today is the first or third last trading day of the month (else 0). Since all

inputs will be within the range of 0 and 1, the initial weights and thresholds will be set at any

value between -1 and 1. This allows the network to have maximum flexibility in regards to its

inputs.

The four input neurons are connected to four neurons in the hidden layer. The number of

neurons in the hidden layer was chosen based on the number of input neurons. More

neurons might give greater complexity, but with the tradeoff that the networks might fail to

properly generalize. Fewer neurons might not catch all of the patterns that exist. The general

architecture of the neural networks is illustrated in the following figure.

Figure 5.2.1 Neural network topology
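A forward pass through such a network, with four inputs, four hidden neurons and one output neuron, could be sketched as below. The layout of the gene vector (hidden weights first, then hidden thresholds, then output weights, then the output threshold, 25 values in total) is purely an assumption for this illustration, and the names `step` and `network_signal` are hypothetical.

```python
def step(value, threshold):
    """Threshold activation: 1 if the value exceeds the threshold, else 0."""
    return 1 if value > threshold else 0


def network_signal(inputs, genes, n_inputs=4, n_hidden=4):
    """Return 1 (buy) or 0 (sell) for one day's inputs.

    Assumed gene layout: hidden weights, hidden thresholds,
    output weights, output threshold.
    """
    expected = n_hidden * n_inputs + n_hidden + n_hidden + 1
    if len(genes) != expected:
        raise ValueError(f"expected {expected} genes, got {len(genes)}")
    pos = 0
    hidden_weights = []
    for _ in range(n_hidden):
        hidden_weights.append(genes[pos:pos + n_inputs])
        pos += n_inputs
    hidden_thresholds = genes[pos:pos + n_hidden]
    pos += n_hidden
    output_weights = genes[pos:pos + n_hidden]
    pos += n_hidden
    output_threshold = genes[pos]

    hidden = [
        step(sum(w * x for w, x in zip(weights, inputs)), threshold)
        for weights, threshold in zip(hidden_weights, hidden_thresholds)
    ]
    return step(sum(w * h for w, h in zip(output_weights, hidden)), output_threshold)
```

Keeping all weights and thresholds in one flat list is convenient because a genetic algorithm can then treat the whole network as a single chromosome.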

The activation function used for the neurons in the hidden layer and in the output layer is

the threshold function described earlier. The values that are to be optimized can be

expressed as a vector. * + where is the number of neurons. The total

number of values in the vector will then be . In order to optimize

this I use a steady-state genetic algorithm with tournament selection and a population size

of 100 individuals. An output of 1 from the output neuron will be seen as a buy signal for the

portfolio and 0 will be seen as a sell signal. The fitness function that will be used is the

right-tailed p-value from a Student’s t-test of the network’s returns, i.e. the probability that a

mean return at least this high would arise by random chance alone, so a lower fitness is better.

In order to avoid the problem that the sample size might be too small for proper evaluation of

the significance, the fitness is set to 1 (the worst possible value) if the number of days on

which the network fired was smaller than 100. This will eliminate all candidates that might

just have high significance due to overfitting a small number of occurrences.
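The fitness rule can be sketched as follows. The function name `fitness` and the constant `MIN_ACTIVE_DAYS` are hypothetical. As assumptions of mine, the p-value uses a normal approximation to Student's t distribution, the returns passed in are taken to be already detrended, and a guard handles the degenerate case of identical returns.

```python
import math
from statistics import NormalDist, mean, stdev

MIN_ACTIVE_DAYS = 100  # networks firing on fewer days get the worst fitness


def fitness(signals, next_day_returns):
    """Right-tailed p-value of the returns earned on days with a buy signal.

    Lower is better. Uses a normal approximation to the t distribution
    (assumption), which is reasonable for 100 or more observations.
    """
    traded = [r for s, r in zip(signals, next_day_returns) if s == 1]
    if len(traded) < MIN_ACTIVE_DAYS:
        return 1.0
    spread = stdev(traded)
    if spread == 0.0:
        # Guard added for this sketch: identical returns on every traded day.
        return 0.0 if traded[0] > 0 else 1.0
    t = mean(traded) / (spread / math.sqrt(len(traded)))
    return 1.0 - NormalDist().cdf(t)
```

A network that fires rarely but luckily is therefore rejected outright instead of being rewarded for a handful of profitable days.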

A steady-state genetic algorithm differs from a generational one in the respect that the

population is continually updated, instead of generating a new population (generation) each

iteration. The tournament selection method works by first selecting at random two

individuals $a$ and $b$ from the population. If the fitness value of $a$ is less than that of $b$,

$f(a) < f(b)$, then $a$ is selected as one of the parents for the new individual

and $b$ is selected as the individual about to be replaced. This process is then repeated to

select the other parent. The offspring is created by recombining the network weights and

thresholds vector of each parent at random, so for example each weight in the offspring

has a fifty percent chance of coming from either parent. This is called Uniform Crossover

and should approximately give the offspring half the genes from one parent and half the

genes from the other. In order to promote diversity and continual evolution there is a small

probability that a mutation occurs in the genes of the new individual. With a probability of

0.5% any random value between -1 and 1 is added to the value in the gene, and by that the

population extends its search for the global maximum. 0.5% was selected so that rapid

mutations would not deteriorate the average fitness of the population6. The genetic

algorithm is initiated with individuals with random values for their genes and terminated

when 200,000 iterations have been made or when 15,000 iterations have been made

without an improvement in the best individual’s fitness (indicating that the algorithm has

reached a minimum). The final chromosome (or network setting) is the best individual in the

final population i.e. the one with the lowest fitness and hence the lowest p-value.
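One iteration of the steady-state algorithm described above might look like the sketch below. The function names are hypothetical. The text does not say which tournament's loser is replaced when the process is repeated for the second parent, so replacing the loser of the first tournament is an assumption. A lower fitness is better, in line with the p-value based fitness.

```python
import random


def tournament(population, fitness, rng):
    """Pick two individuals at random; return (winner index, loser index)."""
    a, b = rng.sample(range(len(population)), 2)
    if fitness(population[a]) < fitness(population[b]):
        return a, b
    return b, a


def steady_state_step(population, fitness, rng, mutation_rate=0.005):
    """Replace one tournament loser with the offspring of two winners."""
    parent_1, loser = tournament(population, fitness, rng)
    parent_2, _ = tournament(population, fitness, rng)
    # Uniform crossover: each gene comes from either parent with equal chance.
    child = [rng.choice(pair)
             for pair in zip(population[parent_1], population[parent_2])]
    # Mutation: with a small probability add a random value in [-1, 1].
    child = [g + rng.uniform(-1, 1) if rng.random() < mutation_rate else g
             for g in child]
    population[loser] = child  # assumption: the first tournament's loser is replaced
```

Calling `steady_state_step` in a loop, while tracking the best fitness seen and stopping after the iteration limits given above, reproduces the overall training loop in outline.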

To test the network’s ability to properly learn patterns and generalize on unseen data, the

data set will be divided into one in-sample period, where the fitness for the genetic

algorithm is calculated and the networks are optimized, and one out-of-sample period

where the network is tested. In order to improve the network’s ability to generalize, the

output of 2,500 independently optimized neural networks, 500 for each N-day period setting

for the inputs, will be aggregated into a continuous signal ranging from 0 to 2,500.

5.3 Results

To evaluate the results of the strategy, two performance metrics will be used. The first one is

the Compound Annual Growth Rate, or CAGR. CAGR is defined as:

$$\mathrm{CAGR}(t_0, t_n) = \left(\frac{V(t_n)}{V(t_0)}\right)^{252/n} - 1$$

where $V(t_0)$ is the start value for the portfolio, $V(t_n)$ is the last value and $n$ is the

number of days. The numerator in the exponent indicates that the CAGR for daily returns is

scaled to yearly, with 252 being the number of trading days in a year.

6 In nature, massive mutations caused by e.g. radioactive fallout are never beneficial as many mutations simultaneously are extremely unlikely to be anything other than negative for the carrier. Small gradual steps however, are what support long term evolution of species.

The second one is the Sharpe Ratio, which is a metric used to measure the return in relation

to the risk, defined as the standard deviation of returns. It is named after the Nobel Prize

winner William Forsyth Sharpe. Its definition is:

$$\mathrm{Sharpe}(t_0, t_n) = \frac{\mathrm{CAGR}(t_0, t_n)}{\sigma\sqrt{252}}$$

where $\sigma\sqrt{252}$ is the standard deviation of daily returns annualized. Usually the risk free

rate of return is also factored into the formula so that the numerator is the excess return,

but in this paper the risk free return will be equal to zero.
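Both metrics can be computed from a series of daily portfolio values, as in the sketch below. The function names are hypothetical. Two points are assumptions: the number of days is taken to be the number of daily returns in the series, and the numerator of the Sharpe Ratio is taken to be the CAGR with a risk free return of zero.

```python
import math
from statistics import stdev

TRADING_DAYS = 252


def cagr(values):
    """Compound annual growth rate of a series of daily portfolio values."""
    n_days = len(values) - 1  # assumption: one return per day after the first
    return (values[-1] / values[0]) ** (TRADING_DAYS / n_days) - 1.0


def sharpe_ratio(values):
    """CAGR divided by the annualized standard deviation of daily returns."""
    daily = [b / a - 1.0 for a, b in zip(values, values[1:])]
    return cagr(values) / (stdev(daily) * math.sqrt(TRADING_DAYS))
```

For a portfolio that grows steadily from 100 to 110 over exactly 252 trading days, `cagr` returns 10%, as expected.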

5.3.1 In-sample performance

2,500 artificial neural networks were optimized for XACT OMXS30 during the period from

the 21st of January 2004 to the 4th of January 2010. The aggregated signal ranges from 0 to

2,500 with a higher value indicating that a greater number of networks are firing a buy

signal. By having the exposure to the market scale with the signal’s strength,

$$\text{exposure} = \frac{1}{2500}\sum_{i=1}^{2500} o_i$$

where $o_i$ is the output of network $i$, all information contained in the signal can be used.

For example, a signal of 1,250 means that 50% of the portfolio will be invested and the rest

stays in cash. The following chart illustrates the performance using this approach.

Figure 5.3.1 Portfolios with scaling position-size based on output of the neural network

ensemble.

Due to portfolio constraints, this approach might not be possible for some traders. Splitting

the capital into 2,500 equal parts without getting too much slippage from the commission

paid for investing only small parts of the portfolio at once will require a lot of capital. A more

realistic way of trading the aggregated signal from the ensemble would then be by waiting

for a higher value to initiate a trade, for example 1,250, which would be the majority of the

networks firing on the same day.
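The two ways of trading the aggregated signal can be written down compactly. The function names are hypothetical, and treating a signal of exactly half the networks as enough to trade is an assumption based on the example value of 1,250.

```python
def scaled_exposure(outputs):
    """Fraction of the portfolio to invest: the share of networks firing."""
    return sum(outputs) / len(outputs)


def majority_vote(outputs, threshold=None):
    """All in (1) when at least `threshold` networks fire, otherwise out (0)."""
    if threshold is None:
        threshold = len(outputs) / 2  # 1,250 for an ensemble of 2,500
    return 1 if sum(outputs) >= threshold else 0
```

With 1,250 of 2,500 networks firing, `scaled_exposure` gives 0.5 and `majority_vote` gives 1. With only 1,000 firing, the scaled exposure is 0.4 while the majority vote stays out of the market.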

Figure 5.3.2 Portfolios trading when the output of the neural network ensemble indicates

that the majority of the networks are firing.

Ensemble system exposure method    CAGR      Sharpe Ratio    Significance
Scaling Position-size              26.75%    3.47            0.0000
Majority Vote                      30.91%    3.24            0.0000

Table 5.3.1 Performance metrics for the in-sample period.

The significance based on a Bootstrap test (on detrended daily returns) is extremely low,

which is what one would expect since the ensemble was evolved for its significance in the

sample. Both the Sharpe Ratio and the CAGR are very high, with higher Sharpe but lower

CAGR when scaling into position as opposed to being either all in or out. This would be

because adjusting the position-size to the signal’s strength smooths the daily return and

hence the standard deviation is lower.

5.3.2 Out-of-sample performance

When dealing with optimization one always has to consider that the results are designed and

that the output of the optimized solution was changed in hindsight. In order to test the

validity of the solution it has to be cross validated on an independent data set. This was done

by leaving out 537 of the days in the sample period from the fitness calculation when

running the genetic algorithm to train the neural networks. On the period from 5th of

January 2010 to 17th of February 2012 the ensemble is tested again to analyze its ability to

generalize the patterns that it has learnt during the in-sample period. Other than the factors

selected as input to the networks, no part of the ensemble trading system was changed as a

result of the data in-sample. Below is the performance of the two ways the aggregated signal

might be traded.

Figure 5.3.3 Portfolios with scaling position-size based on output of the neural network

ensemble.

Figure 5.3.4 Portfolios trading when the output of the neural network ensemble indicates

that the majority of the networks are firing.

Ensemble system exposure method    CAGR      Sharpe Ratio    Significance
Scaling Position-size              14.83%    2.15            0.0058
Majority Vote                      17.37%    1.88            0.0077

Table 5.3.2 Performance metrics for the out-of-sample period.

The significance (p-value) from the Bootstrap test (with detrended daily returns) is very low both for the portfolio with a scaling position-size and for the portfolio that trades on what the majority of the neural networks say. The significance was lower for the ensemble system than for every individual factor tested, which suggests that its ability to learn and combine the inputs is superior to simple rule combinations.
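The Bootstrap significance referred to here can be summarized in a compact function. The sketch below follows the logic of the program in Appendix C (centre the returns on zero, resample with replacement, count how often the resampled mean reaches the observed mean), but it is a condensed illustration rather than the original program; the names BootstrapTest and PValue and the seed parameter are assumptions added to make the result reproducible.

```csharp
using System;
using System.Linq;

static class BootstrapTest
{
    // Estimates the probability that a mean at least as large as the observed one
    // arises by chance, by resampling the zero-centred returns with replacement.
    public static double PValue(double[] returns, int resamples, int seed)
    {
        int n = returns.Length;
        if (n == 0 || resamples <= 0) throw new ArgumentException("Empty input.");
        double observedMean = returns.Average();
        double[] centred = returns.Select(r => r - observedMean).ToArray();
        var rng = new Random(seed);

        int atLeastAsLarge = 0;
        for (int i = 0; i < resamples; i++)
        {
            double sum = 0.0;
            for (int j = 0; j < n; j++) sum += centred[rng.Next(n)];
            if (sum / n >= observedMean) atLeastAsLarge++;
        }
        return (double)atLeastAsLarge / resamples;
    }

    static void Main()
    {
        double[] returns = { 0.012, -0.003, 0.007, 0.004, -0.001, 0.009 }; // made-up data
        Console.WriteLine("p-value: " + PValue(returns, 10000, 1));
    }
}
```

A small p-value means that few of the resampled series with a true mean of zero produced a mean as high as the one observed.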

6. Conclusion

This paper started out by introducing the topic of speculation and the difference between objective and subjective decisions. It then presented the framework for testing and verifying objective trading decisions on which all the following chapters were based. The result of the factor selection process, using the non-parametric hypothesis test called Bootstrap resampling as well as Student’s t-test, was that four factors in particular displayed a tendency to predict the future returns of the Swedish stock market, with a low probability that the results were due to random chance alone. The four factors were then used as inputs to artificial neural networks that were trained during an in-sample period from 2004 to 2010 using a genetic algorithm with steady-state tournament selection. 2,500 of these networks were trained, and the aggregated output showed strong results during both the in-sample period and the out-of-sample period from 2010 to 2012. The significance during the out-of-sample period, calculated by bootstrap resampling, was low enough to make me confident that the results were not due to chance. Investing based on what the majority of the networks in the ensemble said generated an annual return of 17% during the period unused by the training method, with a Sharpe ratio of 1.88, indicating that the returns obtained were high relative to the risk taken.

One thing that was not examined in this paper is how the factor selection and the trading system perform on other liquid securities. For example, future papers could examine what effect the factors tested in this paper have on the stocks within the OMXS30, and perhaps do the same for other stock indices around the world. Also, the inputs to the neural network could be complemented with the current day's volume and volatility of the security that is being optimized, so that more of the information contained in the price could be used to make the prediction. The optimization method could also take into account more realistic effects of slippage, so that the ensemble can evolve the optimal solution for the given commission structure. This could be done by changing the type of network trained from a feedforward one to a recurrent one, so that the network also takes its current position as an input.

In conclusion, it appears possible to achieve higher risk-adjusted returns by trading objectively on the basis of historical tests. However, the results can only be verified once the trading has been done in real time with real money.

7. References

Wikipedia. (2011, October 19). Retrieved November 13, 2011, from Wikipedia:

http://en.wikipedia.org/wiki/Backtesting

Aronson, D. R. (2006). Hypothesis Tests and Confidence Intervals. In D. R. Aronson, Evidence-

Based Technical Analysis: Applying the Scientific Method and Statistical Inference to

Trading Signals (pp. 217-255). Chichester: John Wiley & Sons Ltd.

Bukey, D. (2007, February 1). David Aronson: STRUCK BY SCIENCE, PDF. Retrieved December

31, 2011, from InvivoAnalytics.com: http://invivoanalytics.com/wp-

content/uploads/2008/03/ARONSON_200702.pdf

Fonder, X. (n.d.). Fonder. Retrieved February 19, 2012, from XACT:

http://www.xact.se/fonder/bred-aktiemarknad/xact-omxs30/

Hassler, F. (2011, November 23). The impact of survivorship bias. Retrieved February 19,

2012, from Engineering Returns: http://engineering-returns.com/2011/11/23/the-

impact-of-survivorship-bias/

Kaplan, I. (2004, October 1). Book reviews. Retrieved January 1, 2012, from Bear Products

International Home Page:

http://www.bearcave.com/bookrev/misbehavior_of_markets.html

McConnell, X. (2006, August 20). Equity Returns at the Turn of the Month. Retrieved March

31, 2012, from Social Science Research Network:

http://papers.ssrn.com/sol3/papers.cfm?abstract_id=925589

Brownlee, J. (2011, January 1). Nature-Inspired Programming Recipes. Retrieved March 31, 2012,

from Clever Algorithms: http://www.cleveralgorithms.com/nature-

inspired/index.html

Robi Polikar, R. U. (2011, October 21). Ensemble learning. Retrieved March 30, 2012, from

Scholarpedia : http://www.scholarpedia.org/article/Ensemble_learning

Stokes, M. (2009, February 12). Short-term Mean-Reversion Becoming Stronger: Part II (The

Why). Retrieved February 22, 2012, from MarketSci Blog:

http://marketsci.wordpress.com/2009/02/12/short-term-mean-reversion-becoming-

stronger-part-ii-the-why/

Thornton, S. (1997, November 13). Karl Popper. Retrieved December 31, 2011, from

Stanford Encyclopedia of Philosophy:

http://plato.stanford.edu/entries/popper/#SciKnoHisPre

Wikipedia. (2012, March 20). Artificial neural network. Retrieved March 26, 2012, from

Wikipedia, the free encyclopedia:

http://en.wikipedia.org/wiki/Artificial_neural_network

Xact. (2012, February 17). Utbildning - Historiska utdelningar. Retrieved February 27, 2012,

from http://www.xact.se/: http://www.xact.se/utbildning/utdelning-fran-

fonderna/historiska-utdelningar/#

Appendix A – Excel Testing

This appendix describes the formulas used when testing the factors in the paper in Microsoft Excel 2010.

Given that the closing price is in column D, the formula for the daily return is:

The mean daily return can be calculated over the cells in the range F510 to F2541 with the formula for averages:

=AVERAGE(F510:F2541)

The percentile rank is calculated as the value in H258 in relation to all the values in the range H7:H258.

=PERCENTRANK(H7:H258, H258, 10)

The last value in the formula, 10, specifies the number of decimals.

To test the value, a logical function is used.

=IF(I509>0.5, G510, "")

If the value in the cell I509 exceeds 0.5, the value from G510 is returned; otherwise no value at all is returned.

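To calculate the t-significance, the number of occurrences and the sample standard deviation have to be calculated. Writing returns for the range of cells that holds the returned values, the t-statistic has the form:

=(AVERAGE(returns)-0)/(STDEV(returns)/SQRT(COUNT(returns)))

The zero specifies that we are testing the hypothesis that the mean is equal to zero (since the data is detrended).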

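The significance is then calculated by the built-in formula in Excel for Student’s t-distribution. Writing t for the cell that holds the t-statistic, n for the number of occurrences and tails for the number of tails tested, it has the form:

=TDIST(ABS(t), n-1, tails)

The absolute value of the value returned from the previous formula is checked against a table with n - 1 degrees of freedom.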
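For readers who prefer code to spreadsheet formulas, the t-statistic described above can be sketched in C#. This is an illustration only; the name TStatistic and the sample data are hypothetical, and the lookup of the significance from the t-distribution is left out (Appendix B obtains it from the Meta.Numerics library).

```csharp
using System;
using System.Linq;

static class TStatistic
{
    // One-sample t-statistic for the hypothesis that the mean equals hypothesizedMean.
    // The matching degrees of freedom are sample.Length - 1.
    public static double Compute(double[] sample, double hypothesizedMean)
    {
        int n = sample.Length;
        if (n < 2) throw new ArgumentException("At least two observations are needed.");
        double mean = sample.Average();
        double sumOfSquares = sample.Sum(x => (x - mean) * (x - mean));
        double stdDev = Math.Sqrt(sumOfSquares / (n - 1)); // sample standard deviation
        return (mean - hypothesizedMean) / (stdDev / Math.Sqrt(n));
    }

    static void Main()
    {
        double[] returns = { 0.012, -0.003, 0.007, 0.004, -0.001, 0.009 }; // made-up data
        double t = Compute(returns, 0.0);
        Console.WriteLine("t = " + t + " with " + (returns.Length - 1) + " degrees of freedom");
    }
}
```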

Appendix B – C# code for implementing the neural networks

using System;

using System.Collections.Generic;

using System.Linq;

using System.Text;

using Meta.Numerics;

using Meta.Numerics.Statistics;

using System.IO;

namespace Final_Neural_Network_for_The_Project_Nova_Paper

{

class Program

{

static void Main(string[] args)

{

int PopulationSize = 100;

int MaxNumberOfTrainingIterations = 200000;

int MaxNumberOfIterationsFromLastImprovment = 15000;

double MutationRate = 0.005d;

int ValuesPerInput = 2032;

//Counts the number of rows in the input-file

int n = 0;

using (TextReader r = File.OpenText("C:/Input to Project Nova.txt"))

{

while (r.ReadLine() != null)

{

n++;

}

}

Console.WriteLine("There are " + n + " values in the file.");

//Puts the values from the txt-file into an array

double[] values = new double[n];

using (TextReader r = File.OpenText("C:/Input to Project Nova.txt"))

{

string line;

for (int i = 0; i < n; i++)

{

line = r.ReadLine();

double.TryParse(line, out values[i]);

}

}

int[] OldPredictions = new int[2032];

for (int Ensemble = 0; Ensemble < 500; Ensemble++)

{

Console.Clear();

Console.WriteLine(Ensemble);

//Create the initial population and assign it random values

double[] DNA = new double[25 * PopulationSize];

double[] Fitness = new double[PopulationSize];

Random random = new Random();

for (int i = 0; i < (25 * PopulationSize); i++)

{

DNA[i] = (random.NextDouble() * (random.Next(0, 2) * 2 - 1));

}

//Evaluates the entire population by their significance from a one-sample t-test.

//Computes the neural network.

for (int i = 0; i < PopulationSize; i++)

{

//declares and sets an array that contains the individual

double[] IndividualEvaluated = new double[25];

for (int d = 0; d < 25; d++)

{

IndividualEvaluated[d] = DNA[d + 25 * i];

}

//declares a sample that will contain the daily returns that we will later calculate the significance of

Sample tsample = new Sample();

//tests the neural network settings

for (int g = 0; g < 1495; g++)

{

double[] HiddenLayer = new double[4];

double Output = 0;

int gene = 0;

//sums the weights multiplied by the inputs for each neuron in the hidden layer

for (int k = 0; k < 4; k++)

{

for (int p = 1; p < 5; p++)

{

HiddenLayer[k] += values[g + ValuesPerInput

* p] * IndividualEvaluated[gene];

gene++;

}

}

//performs the neurons' activation function, sends a binary number if the threshold value is exceeded

for (int k = 0; k < 4; k++)

{

if (HiddenLayer[k] > IndividualEvaluated[gene])

{

HiddenLayer[k] = 1;

}

else

{

HiddenLayer[k] = 0;

}

gene++;

}

//sums the weights times the signals from the hidden layer in the output neuron

for (int k = 0; k < 4; k++)

{

Output += HiddenLayer[k] *

IndividualEvaluated[gene];

gene++;

}

//if the output neuron fires, the tsample adds the return for that day

if (Output > IndividualEvaluated[gene])

{

tsample.Add(values[g]);

}

}

//checks that the sample size is not too low

if (tsample.Count > 100)

{

TestResult fitness = tsample.StudentTTest(0);

//assigns the individual its significance

Fitness[i] = fitness.RightProbability;

}

else

{

Fitness[i] = 1;

}

}

//starts the breeding

int IterationsWhenLastImprovement = 0;

for (int i = 0; i < MaxNumberOfTrainingIterations && (i -

IterationsWhenLastImprovement) < MaxNumberOfIterationsFromLastImprovment;

i++)

{

int Parent1;

int Parent2;

int tIndividual1;

int tIndividual2;

int Exiled;

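//selects two random individuals; the one with the lowest fitness (t-significance) becomes the first parent

//the loser in the tournament is the one being replaced (Exiled)

tIndividual1 = random.Next(0, PopulationSize);

tIndividual2 = random.Next(0, PopulationSize);

if (Fitness[tIndividual2] < Fitness[tIndividual1])

{

Parent1 = tIndividual2;

Exiled = tIndividual1;

}

else

{

Parent1 = tIndividual1;

Exiled = tIndividual2;

}

//selects the other parent in a second tournament

tIndividual1 = random.Next(0, PopulationSize);

tIndividual2 = random.Next(0, PopulationSize);

if (Fitness[tIndividual2] < Fitness[tIndividual1])

{

Parent2 = tIndividual2;

}

else

{

Parent2 = tIndividual1;

}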

//declares and sets an array that contains the individual by recombining the parents and performs random mutations to its chromosomes

double[] IndividualEvaluated = new double[25];

for (int d = 0; d < 25; d++)

{

if (random.Next(0, 2) == 0)

{

IndividualEvaluated[d] = DNA[d + 25 * Parent1];

}

else

{

IndividualEvaluated[d] = DNA[d + 25 * Parent2];

}

if (random.NextDouble() < MutationRate)

{

IndividualEvaluated[d] += (random.NextDouble()

* (random.Next(0, 2) * 2 - 1));

}

}

//moves on and tests the new network

Sample tsample = new Sample();

for (int g = 0; g < 1495; g++)

{

double[] HiddenLayer = new double[4];

double Output = 0;

int gene = 0;

//sums the weights multiplied by the inputs for each neuron in the hidden layer

for (int k = 0; k < 4; k++)

{

for (int p = 1; p < 5; p++)

{

HiddenLayer[k] += values[g + ValuesPerInput

* p] * IndividualEvaluated[gene];

gene++;

}

}



for (int k = 0; k < 4; k++)

{

if (HiddenLayer[k] > IndividualEvaluated[gene])

{

HiddenLayer[k] = 1;

}

else

{

HiddenLayer[k] = 0;

}

gene++;

}

//sums the weights times the signals from the hidden layer in the output neuron

for (int k = 0; k < 4; k++)

{

Output += HiddenLayer[k] *

IndividualEvaluated[gene];

gene++;

}

//if the output neuron fires, the tsample adds the return for that day

if (Output > IndividualEvaluated[gene])

{

tsample.Add(values[g]);

}

}

//find the best in the population

int best = 0;

for (int j = 0; j < PopulationSize; j++)

{

if (Fitness[j] < Fitness[best])

{

best = j;

}

}

//checks that the sample size is not too low

if (tsample.Count > 100)

{

TestResult fitness = tsample.StudentTTest(0);

//assigns the individual (the one selected to leave) its significance

Fitness[Exiled] = fitness.RightProbability;

}

else

{

Fitness[Exiled] = 1;

}

//checks if the new individual is better than the last best individual; if so, an improvement was made

if (Fitness[Exiled] < Fitness[best])

{

IterationsWhenLastImprovement = i;

}

//the new individual replaces the less fit one

for (int d = 0; d < 25; d++)

{

DNA[d + 25 * Exiled] = IndividualEvaluated[d];

}

}

//find the best in the final population

int fbest = 0;

for (int i = 0; i < PopulationSize; i++)

{

if (Fitness[i] < Fitness[fbest])

{

fbest = i;

}

}

//test the best and save its predictions for each day in a text file

double[] BestIndividual = new double[25];

for (int d = 0; d < 25; d++)

{

BestIndividual[d] = DNA[d + 25 * fbest];

}

//this array will hold the predictions made by this particular network

int[] predictions = new int[2032];

//tests the individual on the entire sample

for (int g = 0; g < 2032; g++)

{

double[] HiddenLayer = new double[4];

double Output = 0;

int gene = 0;

//sums the weights multiplied by the inputs for each neuron in the hidden layer

for (int k = 0; k < 4; k++)

{

for (int p = 1; p < 5; p++)

{

HiddenLayer[k] += values[g + ValuesPerInput *

p] * BestIndividual[gene];

gene++;

}

}



for (int k = 0; k < 4; k++)

{

if (HiddenLayer[k] > BestIndividual[gene])

{

HiddenLayer[k] = 1;

}

else

{

HiddenLayer[k] = 0;

}

gene++;

}

//sums the weights times the signals from the hidden layer in the output neuron

for (int k = 0; k < 4; k++)

{

Output += HiddenLayer[k] * BestIndividual[gene];

gene++;

}

//if the output neuron fires, the predictions-array will hold the output for later usage

if (Output > BestIndividual[gene])

{

predictions[g] = 1;

}

else

{

predictions[g] = 0;

}

}

//combine the array that holds all predictions from all previously evolved networks with the predictions from the last one

for (int k = 0; k < 2032; k++)

{

OldPredictions[k] = predictions[k] + OldPredictions[k];

}

//creates the file that will contain the final prediction output

using (FileStream stream = new FileStream(@"C:/Prediction - Project Nova.txt", FileMode.Create))

using (TextWriter writer = new StreamWriter(stream))

{

writer.WriteLine("");

}

//puts the predictions in the text file

for (int t = 0; t < 2032; t++)

{

File.AppendAllText("C:/Prediction - Project Nova.txt",

OldPredictions[t] + Environment.NewLine);

}

//continues the loop until the ensemble is clear

}

Console.ReadKey();

}

}

}
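The forward pass that the listing above repeats three times (for the initial population, for each offspring and for the final evaluation) can be isolated in one function. The sketch below is a refactoring for illustration only and is not part of the original program; the names ThresholdNetwork and Fires are hypothetical, and the genome layout is the one implied by the order in which the listing reads the 25 genes.

```csharp
using System;

static class ThresholdNetwork
{
    // Genome layout assumed from the listing above (25 genes):
    //   0..15   weights from the 4 inputs to the 4 hidden neurons
    //   16..19  thresholds of the hidden neurons
    //   20..23  weights from the hidden neurons to the output neuron
    //   24      threshold of the output neuron
    public static bool Fires(double[] genome, double[] inputs)
    {
        if (genome.Length != 25 || inputs.Length != 4)
            throw new ArgumentException("Expected 25 genes and 4 inputs.");

        int gene = 0;
        double[] hidden = new double[4];
        for (int k = 0; k < 4; k++)
            for (int p = 0; p < 4; p++)
                hidden[k] += inputs[p] * genome[gene++];

        // Step activation: 1 if the weighted sum exceeds the neuron's threshold.
        for (int k = 0; k < 4; k++)
            hidden[k] = hidden[k] > genome[gene++] ? 1.0 : 0.0;

        double output = 0.0;
        for (int k = 0; k < 4; k++)
            output += hidden[k] * genome[gene++];

        return output > genome[gene];
    }

    static void Main()
    {
        double[] genome = new double[25];
        for (int i = 0; i < 16; i++) genome[i] = 1.0;   // all input weights set to 1
        for (int i = 16; i < 20; i++) genome[i] = 0.5;  // hidden thresholds
        for (int i = 20; i < 24; i++) genome[i] = 1.0;  // output weights
        genome[24] = 3.5;                               // output threshold
        Console.WriteLine(Fires(genome, new[] { 0.2, 0.2, 0.2, 0.2 }));
    }
}
```

Keeping the evaluation in one place would also make it harder for the three copies in the listing to drift apart.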

Appendix C – Bootstrap code

using System;

using System.Collections.Generic;

using System.Linq;

using System.IO;

using System.Timers;

using System.Text;

namespace Bootstrap

{

class Program

{

static void Main()

{

int n=0;

using (TextReader r = File.OpenText("C:/bst.txt"))

{

string line;

while ((line = r.ReadLine()) != null)

{

n++;

}

Console.WriteLine("There are "+n+" values in the file");

}

double[] tabl = new double[n];

Random place = new Random();

double mean=0;

int maxits=500000;

DateTime tid;

using (TextReader r = File.OpenText("C:/bst.txt"))

{

string line;

int x=-1;

while ((line = r.ReadLine()) != null)

{

x = x + 1;

//Console.WriteLine(line);

double.TryParse(line, out tabl[x]);

}

}

//calculate mean

for(int i=0;i<n;i++)

{

mean += tabl[i];

}

mean = (mean / n);

Console.WriteLine("The mean that will be tested is "+mean);

//subtract mean from array values

for (int i = 0; i < n; i++)

{

tabl[i] -= mean;

}

int pvalue=0;

tid=DateTime.Now;

for(int its=0;its<maxits;its++)

{

double sum=0;

for(int i=0;i<n;i++)

{

int r = place.Next(0, n);

sum += tabl[r];

//Console.WriteLine(r);

}

if((sum/n) >= mean)

{

pvalue++;

}

}

TimeSpan elsp = DateTime.Now - tid;

Console.WriteLine(elsp);

Console.WriteLine("The p-value is estimated to be " +

((decimal)pvalue/(decimal)maxits));

Console.WriteLine(1-((decimal)pvalue / (decimal)maxits));

Console.ReadKey();

}

}

}
